Hello all!

A month or so ago Lachlan and I had a discussion about how browsers interpret different ways of specifying charsets.
I've written a small test page, located at: http://ne.keryx.se/xhtmldemo/encoding_tests.php (That server is primarily intended for the use of my students, and I'd encourage you to set up a page of your own if you intend to use this more than once or twice.)
According to my tests Firefox *will* use the charset specified in the http-header over the one in the XML-prologue if a page is sent as application/xhtml+xml. (Or more exactly, regardless of whether the page is sent as text/html or application/xhtml+xml.) So will Opera.
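The precedence observed above can be written down as a small PHP sketch. It only models what the tests showed for Firefox and Opera (http-header first, then XML-prologue) and makes no claim about any specification. The function names `charsetFromContentType`, `encodingFromXmlPrologue` and `effectiveCharset` are made up for illustration and are not part of the attached script.

```php
<?php
/**
 * Extract the charset parameter from a Content-Type header value,
 * e.g. "application/xhtml+xml; charset=UTF-8". Returns null if absent.
 */
function charsetFromContentType(string $contentType): ?string
{
    if (preg_match('/;\s*charset\s*=\s*"?([^";\s]+)"?/i', $contentType, $m)) {
        return $m[1];
    }
    return null;
}

/**
 * Extract the encoding declared in an XML prologue, if there is one.
 */
function encodingFromXmlPrologue(string $prologue): ?string
{
    if (preg_match('/^<\?xml[^>]*\bencoding\s*=\s*["\']([^"\']+)["\']/i', $prologue, $m)) {
        return $m[1];
    }
    return null;
}

/**
 * Model of the observed behaviour: the http-header wins over the XML-prologue.
 * Returns null when neither names a charset (that case is an assumption-free
 * "unknown" here, since the browser fallback was not part of the tests).
 */
function effectiveCharset(string $contentType, string $prologue): ?string
{
    return charsetFromContentType($contentType) ?? encodingFromXmlPrologue($prologue);
}
```

For example, `effectiveCharset('application/xhtml+xml; charset=UTF-8', '<?xml version="1.0" encoding="iso-8859-1"?>')` returns `'UTF-8'`, matching the case where the header and the prologue disagree.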
The meta-tag mostly seems to be ignored, at least for pages sent over http. There might be some issues with the server's default http-header settings or with the browser's cache that I have not tested. The results differ somewhat when I test on my laptop (Apache on Win) and on the public server (Apache on Linux), but the Apache response headers are exactly the same for both!
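One way to double-check that two servers really send the same headers is to compare them mechanically instead of by eye. The following is a minimal sketch; the helper names `normalizeHeaders` and `diffHeaders` are hypothetical, and the input is assumed to be a list of raw header lines such as the ones PHP's built-in `get_headers()` returns for a URL.

```php
<?php
/**
 * Turn raw header lines ("Name: value") into a map keyed by lower-cased name.
 */
function normalizeHeaders(array $lines): array
{
    $map = [];
    foreach ($lines as $line) {
        if (strpos($line, ':') === false) {
            continue; // skip the status line, e.g. "HTTP/1.1 200 OK"
        }
        [$name, $value] = explode(':', $line, 2);
        $map[strtolower(trim($name))] = trim($value);
    }
    return $map;
}

/**
 * Return the headers whose values differ, or that exist on one side only,
 * as name => [value on first server, value on second server].
 */
function diffHeaders(array $first, array $second): array
{
    $a = normalizeHeaders($first);
    $b = normalizeHeaders($second);
    $diff = [];
    foreach (array_unique(array_merge(array_keys($a), array_keys($b))) as $name) {
        $left = $a[$name] ?? null;
        $right = $b[$name] ?? null;
        if ($left !== $right) {
            $diff[$name] = [$left, $right];
        }
    }
    return $diff;
}
```

Headers such as Date and Server will normally show up as differences between any two machines, so the interesting entry is Content-Type.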
Lars Gunther

P.S. My script is attached. Make sure there is no whitespace before the opening php-tag, as that will make it impossible to send http-headers.

Title: Encoding demo

<?php
/**
 * @license Creative Commons Share Alike
 */
define('ISO','iso-8859-1');
define('UTF','UTF-8');

// Set headers
$httpHeaderChar = 'Not set';
if ( isset($_POST['httpHeaderChar']) ) {
    if (strcmp($_POST['httpHeaderChar'],'ISO') == 0 ) {
        $httpHeaderChar = ISO;
    } elseif (strcmp($_POST['httpHeaderChar'],'UTF') == 0 ) {
        $httpHeaderChar = UTF;
    }
}

$httpHeaderHtml = 'text/html'; // default
if ( isset($_POST['httpHeaderType']) && strcmp($_POST['httpHeaderType'],'xhtml') == 0 ) {
    $httpHeaderHtml = 'application/xhtml+xml';
}

// Append the charset parameter only if one was chosen
$headerString = $httpHeaderHtml;
$headerString.= ( $httpHeaderChar != 'Not set' ) ? '; charset='.$httpHeaderChar : '';
header('Content-type: '.$headerString);
ob_start();

// Set XML-prologue
$xmlPrologue = 'Not set';
if ( isset($_POST['xmlPrologue']) ) {
    if (strcmp($_POST['xmlPrologue'],'ISO') == 0 ) {
        $xmlPrologue = '<'.'?xml version="1.0" encoding="'.ISO."\"?>\n";
    } elseif (strcmp($_POST['xmlPrologue'],'UTF') == 0 ) {
        $xmlPrologue = '<'.'?xml version="1.0" encoding="'.UTF."\"?>\n";
    } elseif (strcmp($_POST['xmlPrologue'],'NOCHAR') == 0 ) {
        $xmlPrologue = '<'.'?xml version="1.0" '."?>\n";
    }
}
if ( $xmlPrologue != 'Not set' ) {
    echo $xmlPrologue;
}
?>
Test of different ways of specifying the encoding
The next paragraph should read "å, Å, ä, Ä, ö, Ö". But if the browser interprets the text using the wrong charset, it will look weird or be squares or question marks.
å, Å, ä, Ä, ö, Ö.
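Why a wrong charset produces "weird" characters, squares or question marks can be shown at the byte level. This is a small sketch that assumes the mbstring extension is available; the variable names are illustrative and not taken from the attached script.

```php
<?php
// "å" encoded as UTF-8 is the two bytes C3 A5; as ISO-8859-1 it is the single byte E5.
$utf8 = "\xC3\xA5";
$latin1 = "\xE5";

// Reading the UTF-8 bytes as if they were ISO-8859-1 gives two characters ("Ã¥").
// Converting "from ISO-8859-1 to UTF-8" simulates that misreading.
$misread = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');

// The lone ISO-8859-1 byte is not valid UTF-8, so a browser that assumes
// UTF-8 has nothing sensible to draw for it.
$validAsUtf8 = mb_check_encoding($latin1, 'UTF-8');

echo bin2hex($misread), "\n";                  // c383c2a5
echo $validAsUtf8 ? 'valid' : 'invalid', "\n"; // invalid
```

The first case is the typical "two odd letters per character" symptom, the second the "squares or question marks" one.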
Current values
- a.k.a. ISO-8859-1';
echo '- Real encoding - '.$realEncoding." \n";
echo '- Http-headers - '.$headerString." \n";
echo '- XML-prologue - '.htmlspecialchars($xmlPrologue)." \n";
echo '- Meta-tag - '.htmlspecialchars($metaString)." \n";
?>
Apache response headers
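Regarding the P.S. about whitespace before the opening php-tag: a script can be checked for this before it is uploaded, and PHP's built-in `headers_sent()` can report the problem at run time. The sketch below is only an illustration; `hasOutputBeforeOpeningTag` and `headersAlreadySentMessage` are hypothetical helper names, and also catching a UTF-8 byte order mark is my own addition, since it has the same effect as leading whitespace.

```php
<?php
/**
 * Return true if anything (whitespace, a UTF-8 byte order mark, ...)
 * precedes the opening php-tag in the given source text.
 * Short open tags are deliberately not accepted in this sketch.
 */
function hasOutputBeforeOpeningTag(string $source): bool
{
    return strncmp($source, '<?php', 5) !== 0;
}

/**
 * Run-time counterpart: if output has already started, say where,
 * otherwise return null. Call it just before header().
 */
function headersAlreadySentMessage(): ?string
{
    if (headers_sent($file, $line)) {
        return "Output already started in $file on line $line";
    }
    return null;
}
```

A call such as `hasOutputBeforeOpeningTag(file_get_contents('encoding_tests.php'))` would then flag a stray space, newline or byte order mark at the top of the file.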
