Hi Liam,

I misdiagnosed the problem. It actually seems to be that the XML file I am parsing declares a file entity whose path contains a Unicode character that needs to be escaped.
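
To make "escaped" concrete, here is a minimal sketch in plain C of percent-escaping the high-order bytes of a UTF-8 path. The helper name escape_high_bytes is hypothetical and is not part of libxml2. It only illustrates the transformation; it is an assumption, not something established in this thread, that handing libxml2 a path escaped this way would make the entity resolve.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, NOT a libxml2 function.
 * Returns a newly allocated copy of `path` (assumed to be UTF-8) in which
 * every byte >= 0x80 is replaced by %XX. ASCII bytes are copied unchanged.
 * Returns NULL if allocation fails. The caller frees the result. */
char *escape_high_bytes(const char *path)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t len = strlen(path);
    /* Worst case: every input byte expands to three output characters. */
    char *out = malloc(len * 3 + 1);
    char *dst = out;

    if (out == NULL)
        return NULL;

    for (const unsigned char *src = (const unsigned char *)path; *src; ++src) {
        if (*src >= 0x80) {
            *dst++ = '%';
            *dst++ = hex[*src >> 4];
            *dst++ = hex[*src & 0x0F];
        } else {
            *dst++ = (char)*src;
        }
    }
    *dst = '\0';
    return out;
}
```

Applied to the entity path in the XML quoted below, the two UTF-8 bytes of the pound sign (C2 A3) would become %C2%A3, so the directory name would read uc%C2%A3_html_files.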
Here is the XML I am trying to parse:

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN" "W:/matlab/sys/namespace/docbook/v4/dtd/docbookx.dtd" [
<!ENTITY sect-002 SYSTEM "./uc£_html_files/image-000-chapter.xfrag">
]>
<book lang="en">
<?dbhtml filename="uc£.html"?>
<bookinfo><title></title><subtitle></subtitle><pubdate>31-Jul-2022 11:08:41</pubdate></bookinfo>&sect-002;</book>

Here is the error returned by the parser:

"Entity 'sect-002' failed to parse\n"

The parser escapes high-order characters in the URL for the main XML file but apparently does not do the same for file entities declared in the DTD.

I am currently trying to convert a Xerces-c/Xalan-c application to libxml/xslt, because Xalan-c is unable to execute the DocBook FO stylesheet. My Xerces-c implementation uses a custom entity resolver to resolve file entities. I might need a custom entity resolver to fix the problem with the libxml2 implementation. However, libxml2 does not seem to support custom entity resolvers. At least, I have not been able to find this feature in the documentation or in the libxml2 code base on GitHub.

I would appreciate any help you can give in finding a solution.

Regards,
Paul

From: Liam R E Quin <l...@holoweb.net>
Sent: Saturday, July 30, 2022 4:02 PM
To: Paul Kinnucan <pa...@mathworks.com>; xml@gnome.org
Subject: Re: [xml] How can I parse an XML file whose filesystem path is a Unicode string?

On Sat, 2022-07-30 at 17:15 +0000, Paul Kinnucan via xml wrote:
> Hi,
>
> I need to parse XML files whose paths may contain Unicode characters,
> for example,
>
> W:\jtbug\uc£\mydoc£.xml
>
> What is the best way to do this with libxml2?

Sounds like you are using Microsoft Windows and are going to use the C API?

How far have you got? What problems are you having exactly? What errors do you get?

--
Liam Quin, https://www.delightfulcomputing.com/ https://www.paligo.net/
Available for XML/Document/Information Architecture/XSLT/
XSL/XQuery/Web/Text Processing/A11Y training, work & consulting.
Antique illustrations, stock images, text: http://www.fromoldbooks.org
_______________________________________________
xml mailing list, project page http://xmlsoft.org/
xml@gnome.org
https://mail.gnome.org/mailman/listinfo/xml