Curt,
I tried different capitalizations of ISO-8859-1, but none of them worked.
That gave me the idea that other aliases for the internal transcoders might
work. After digging through the Xerces code for a little while, I found
where the aliases for the internal transcoders are declared. It looks like
LATIN1 works. LATIN1 appears to be just an alias for ISO-8859-1, so I have
no idea why it works and ISO-8859-1 doesn't. Thanks for your help.
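
For what it's worth, the alias relationship itself can be checked outside Xerces. The sketch below uses Python's codec registry purely as a stand-in: it is not the Xerces alias table, and Python matches encoding names case-insensitively, which may not be how Xerces behaves.

```python
import codecs

# Several spellings that should all refer to the same 8-bit encoding.
names = ["ISO-8859-1", "iso-8859-1", "LATIN1", "latin1"]

# codecs.lookup() resolves an alias to the codec's canonical name.
canonical = {name: codecs.lookup(name).name for name in names}

for name, resolved in canonical.items():
    print(f"{name!r} resolves to {resolved!r}")

# Decoding the same byte through the alias and the official name
# should give the same character.
sample = b"\xe9"
print(sample.decode("LATIN1") == sample.decode("ISO-8859-1"))
```

If the two names resolve to one codec, as they do here, a difference in behaviour between them would point at the name lookup rather than at the transcoder.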
Scott Paulinski
>From: "Arnold, Curt" <[EMAIL PROTECTED]>
>Reply-To: [EMAIL PROTECTED]
>To: "'[EMAIL PROTECTED]'" <[EMAIL PROTECTED]>
>Subject: RE: Xerces Internationalization
>Date: Fri, 17 Aug 2001 13:55:50 -0600
>
> > 1) Include no header in the XML file being read. This results in
> > non-English characters being read in as a ? character.
>
>I'm surprised that you didn't get an encoding exception since ISO-8859-1
>code points would rarely be legal UTF-8.
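
A quick sketch of why an error would be expected here. With no XML declaration and no byte order mark, a parser has to assume UTF-8, and an accented ISO-8859-1 byte followed by plain ASCII is not a well-formed UTF-8 sequence. The sample text and the helper name is_valid_utf8 are made up for illustration, and this is Python rather than Xerces:

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


text = "caf\u00e9 se\u00f1or"  # contains e-acute and n-tilde

# The same text stored in two different encodings.
latin1_bytes = text.encode("iso-8859-1")  # one byte per character
utf8_bytes = text.encode("utf-8")         # two bytes per accented character

print(is_valid_utf8(latin1_bytes))
print(is_valid_utf8(utf8_bytes))
```

In ISO-8859-1 the accented letters are single bytes at or above 0xC0, which UTF-8 treats as the start of a multi-byte sequence. The ASCII byte that follows is not a valid continuation byte, so a strict decoder rejects the input.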
>
> >
> > 2) Including the header <?xml version="1.0" encoding="iso-8859-1" ?>.
> > This causes the file not to be read at all. Looking at the Xerces code I
> > was able to track down one of the problems to the way Xerces detects
> > codepages in Win32TransService.cpp. In the constructor it checks for the
> > codepages on the machine by looking in the registry under
> > HKCR\MIME\Database\Codepage (and Charset), which doesn't exist on a base
> > Windows 95 system.
>
>There does seem to be an internal ISO-8859-1 transcoder, but it looks like
>it might be sensitive to capitalization. What happens if you use
>encoding="ISO-8859-1"?
>
> >
> > I was able to add this set of registry keys by installing IE 4.01, but
> > the iso-8859-1 encoding still doesn't work for non-English characters.
> > In this case Xerces ignores the entire file if it contains such
> > characters. Unfortunately, the 1252 codepage (which is what iso-8859-1
> > looks like it is mapped to) appears to be the only one installed on this
> > version of Windows 95. The 1252 codepage is named "Western European
> > (Windows)" in the registry which sounds like the character set I am
> > looking for. Looking at Xerces documentation it looks like they support
> > iso-8859-1 as "ISO Latin 1" which sounds promising as well. So it looks
> > like I am using the proper codepage, but it just isn't working for some
> > reason.
>
>CP-1252 is ISO-8859-1 plus a few additional characters between 0x82 and
>0x8C, between 0x91 and 0x9C, and at 0x9F.
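
To see exactly where the two code pages part ways, the sketch below decodes every byte value under both encodings and records the differences. It relies on Python's built-in "iso-8859-1" and "cp1252" codecs, so the result reflects Python's tables rather than whatever the Windows 95 registry maps to.

```python
differing = []   # bytes that decode to different characters
undefined = []   # bytes that cp1252 leaves unassigned

for value in range(256):
    raw = bytes([value])
    latin1_char = raw.decode("iso-8859-1")  # defined for all 256 byte values
    try:
        cp1252_char = raw.decode("cp1252")
    except UnicodeDecodeError:
        undefined.append(value)
        continue
    if cp1252_char != latin1_char:
        differing.append(value)

print("differ:   ", " ".join(f"0x{v:02X}" for v in differing))
print("undefined:", " ".join(f"0x{v:02X}" for v in undefined))
```

Every difference falls in the 0x80 to 0x9F block, where ISO-8859-1 has control codes. The accented Western European letters sit at 0xC0 and above and are identical in both, so the choice between the two code pages should not by itself explain the failure on those characters.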
> >
> > On a side note, I found that using iso-8859-3 (1254) does allow Xerces
> > to use these non-English characters. Though this encoding is not
> > installed on these Windows 95 systems. If anyone knows an easy way to
> > install this encoding (without installing a whole application like IE)
> > that would be helpful as well.
> >
> > Any help is greatly appreciated.
>
>There are also a few unnecessary dependencies on IE 4 components (urlmon
>and wininet) in the COM wrapper.
>
>For equivalence with MSXML, the COM wrapper provides an XMLHttpRequest
>object that is implemented using WININET. Unfortunately, this causes
>xml4com.dll not to load if IE4+ isn't present even if you weren't planning
>on using XMLHttpRequest. I have a personal copy that has rewritten
>XMLHttpRequest so that it dynamically loads WININET if and only if you try
>to do something with XMLHttpRequest.
>
>Also, XMLDOMDocument makes calls to PathIsURL, PathIsRelative and
>URLDownloadToCacheFile in urlmon. PathIsURL and PathIsRelative can both be
>trivially implemented locally. For URLDownloadToCacheFile, my proxy will
>return the local file name if the URL is a local file and dynamically load
>urlmon if the URL is remote. This at least allows you to parse local files
>without having IE present.
>
>Since the COM wrapper is moderately comatose and Win95 without IE 4 even
>more so, I haven't prep'd these changes for inclusion in the CVS. However,
>if you would like them as is, let me know.
>
>---------------------------------------------------------------------
>To unsubscribe, e-mail: [EMAIL PROTECTED]
>For additional commands, e-mail: [EMAIL PROTECTED]
>