Kevin Flynn wrote:
> You guessed correctly that I use a "SYSTEM" DTD declaration, but I'm
> confused as to what you mean when you say the "current working directory"
> isn't the same as the temporary directory created to run the process
> command. I hadn't assumed that it was - I thought that it would be the same
> as the current working directory of the document into which the exported
> content is being pasted.
>
> - My original document has a SYSTEM DTD declaration let's say
>   <!DOCTYPE foo SYSTEM "bar.dtd">
> - I export some of it using copyDocument, which generates a document
>   with an effectively identical DTD declaration:
>   <!DOCTYPE foobar SYSTEM "bar.dtd">
> - I then read in the exported document using the read command and try to
>   paste it using the paste command. The DTD path specified in the clipboard
>   contents is (I assume) identical to the DTD path in the main document -
>   why is there a problem?

What follows is a Unix example. Windows also has a concept of current
working directory, but it is simpler to explain with Unix.

[1] After logging in on my Linux box, my current working directory is
/home/hussein.

---
$ pwd
/home/hussein
---

[2] From there, I start xxe from an xterm terminal.

---
$ xxe &
---

[3] I open document /home/hussein/docsrc/userguide/guide.xml in XXE.
This document starts with:

---
<!DOCTYPE guide SYSTEM 'userguide.dtd'>
---

and there is a file called 'userguide.dtd' in
/home/hussein/docsrc/userguide/.

[**] Reminder: to keep it simple, when a document is parsed, the URL of
the document is used to resolve all relative URLs found in the document.

[4] I use my super process-command to process
/home/hussein/docsrc/userguide/guide.xml. This process command creates a
copy of guide.xml in /tmp/xxe123456/. I have taken care to also copy
userguide.dtd to /tmp/xxe123456/ to avoid trouble with the complex Perl
script which will transform the copy. Let's call this copy
/tmp/xxe123456/__doc.xml.

[5] My super process-command has a "read" element which is used to load
/tmp/xxe123456/__doc.xml.

[6] My super process-command is invoked within a macro-command which is
used to paste the XML nodes found in the string returned by "read". As
expected, this string starts with:

---
<?xml version="1.0"?>
<!DOCTYPE guide SYSTEM 'userguide.dtd'>
---

But this string cannot be parsed as XML because:

* When a string is parsed by XXE, all relative URLs found in the string
  are resolved against the current working directory (which, in this
  example, is /home/hussein). Unlike a document, a string has no URL of
  its own. See [**].

* Processing "<!DOCTYPE guide SYSTEM 'userguide.dtd'>" occurs during
  parsing. The XML parser looks for /home/hussein/userguide.dtd, does
  not find it, and parsing fails.

Conclusion:
~~~~~~~~~~~

* Do not use <!DOCTYPE> without a corresponding entry in an XML catalog
  used by XXE.

* None of this happens with other types of schemas, which, unlike DTDs,
  do not mix validation and parsing (for expanding entities). Other
  types of schemas are pure grammars. A DTD is at the same time a
  grammar and a specification of textual macros (like C's #define,
  #include, etc.).
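
To illustrate the resolution rule from [**] and step [6], here is a minimal Python sketch. It is not XXE code and says nothing about how XXE is implemented; it only applies standard relative URL resolution to the paths used in the example above. The helper name resolve_system_id and the variables doc_url and cwd_url are made up for this illustration.

```python
from urllib.parse import urljoin


def resolve_system_id(system_id: str, base_url: str) -> str:
    """Resolve a possibly relative SYSTEM identifier against a base URL."""
    return urljoin(base_url, system_id)


# Case [3]: the document has a URL of its own, which serves as the base.
doc_url = "file:///home/hussein/docsrc/userguide/guide.xml"

# Case [6]: a string has no URL, so the base falls back to the current
# working directory (the trailing slash marks it as a directory).
cwd_url = "file:///home/hussein/"

# Resolved next to guide.xml, where userguide.dtd actually is.
print(resolve_system_id("userguide.dtd", doc_url))

# Resolved in the working directory, where there is no userguide.dtd.
print(resolve_system_id("userguide.dtd", cwd_url))
```

The first call yields a URL inside /home/hussein/docsrc/userguide/, the second one a URL directly under /home/hussein/, which is why the same DOCTYPE line works in the opened document but fails in the pasted string. An absolute system identifier is returned unchanged in both cases.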

