Kevin Flynn wrote:
> You guessed correctly that I use a "SYSTEM" DTD declaration, but I'm
> confused as to what you mean when you say the "current working directory"
> isn't the same as the temporary directory created to run the process
> command. I hadn't assumed that it was - I thought that it would be the same
> as the current working directory of the document into which the exported
> content is being pasted.
> 
> - My original document has a SYSTEM DTD declaration let's say
>       <!DOCTYPE foo SYSTEM "bar.dtd">
> - I export some of it using copyDocument, which generates a document
>   with an effectively identical DTD declaration:
>       <!DOCTYPE foobar SYSTEM "bar.dtd">
> - I then read in the exported document using the read command and try to
>   paste it using the paste command. The DTD path specified in the clipboard
>   contents is (I assume) identical to the DTD path in the main document -
>   why is there a problem?

What follows is a Unix example. Windows also has a concept of current 
working directory, but it is simpler to explain with Unix.

[1] After logging in on my Linux box, my current working directory is 
/home/hussein.

---
$ pwd
/home/hussein
---

[2] From there, I start XXE from an xterm terminal.

---
$ xxe &
---

[3] I open document /home/hussein/docsrc/userguide/guide.xml in XXE.

This document starts with:

---
<!DOCTYPE guide SYSTEM 'userguide.dtd'>
---

and there is a file called 'userguide.dtd' in 
/home/hussein/docsrc/userguide/.

[**] Reminder: put simply, when a document is parsed, the URL of the 
document is used as the base to resolve all relative URLs found in the 
document.
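As a rough illustration of this rule, here is a small sketch using 
Python's standard urllib module. It is not XXE's own resolver, only the 
generic relative-URL resolution that the reminder describes, applied to 
the document and DTD from step [3].

```python
from urllib.parse import urljoin

# URL of the document opened in step [3].
document_url = "file:///home/hussein/docsrc/userguide/guide.xml"

# System identifier taken from the DOCTYPE declaration.
system_id = "userguide.dtd"

# A relative reference is resolved against its base URL: the last path
# segment of the base (guide.xml) is replaced by the relative part.
dtd_url = urljoin(document_url, system_id)
print(dtd_url)
```

The result points into /home/hussein/docsrc/userguide/, which is where 
the DTD actually lives, so opening the document works.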

[4] I use my super process-command to process 
/home/hussein/docsrc/userguide/guide.xml.

This process command creates a copy of guide.xml in /tmp/xxe123456/.

I have taken care to also copy userguide.dtd to /tmp/xxe123456/ in order 
to avoid trouble with the complex Perl script which will transform the 
copy.

Let's call this copy /tmp/xxe123456/__doc.xml.

[5] My super process-command has a "read" element which is used to load 
/tmp/xxe123456/__doc.xml.

[6] My super process-command is invoked within a macro-command which is 
used to paste the XML nodes found in the string returned by "read".

As expected, this string starts with:

---
<?xml version="1.0"?>
<!DOCTYPE guide SYSTEM 'userguide.dtd'>
---

But this string cannot be parsed as XML because:

* When a string is parsed by XXE, all relative URLs found in the string 
are resolved against the current working directory (which, in the case 
of this example, is /home/hussein).

(Unlike a document, a string has no URL of its own. See [**])

* Processing "<!DOCTYPE guide SYSTEM 'userguide.dtd'>" occurs during 
parsing. /home/hussein/userguide.dtd is not found by the XML parser. 
Parsing fails.
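The difference between the two cases can be sketched in a few lines of 
Python. This is only a model of the behaviour described above, not XXE 
code: resolve_system_id, dtd_found and the EXISTING_FILES set are 
hypothetical names made up for the illustration, and the "file system" 
is simulated from the paths used in steps [3] and [4].

```python
from typing import Optional
from urllib.parse import urljoin

# Files that exist in this example (taken from steps [3] and [4]).
EXISTING_FILES = {
    "/home/hussein/docsrc/userguide/userguide.dtd",
    "/tmp/xxe123456/userguide.dtd",
}


def resolve_system_id(system_id: str, document_url: Optional[str], cwd: str) -> str:
    """Resolve a system identifier as described in the walkthrough.

    A document has a URL, which serves as the base. A string has none
    (document_url is None), so the current working directory is used.
    """
    if document_url is not None:
        base = document_url
    else:
        # Trailing slash so that urljoin treats the base as a directory.
        base = "file://" + cwd.rstrip("/") + "/"
    return urljoin(base, system_id)


def dtd_found(url: str) -> bool:
    """Check the resolved URL against the simulated file system."""
    return url[len("file://"):] in EXISTING_FILES


cwd = "/home/hussein"  # XXE was started from here in step [2]

# Case 1: the document opened in step [3] has a URL of its own.
opened = resolve_system_id(
    "userguide.dtd", "file:///home/hussein/docsrc/userguide/guide.xml", cwd
)

# Case 2: the string returned by "read" in step [5] has no URL.
pasted = resolve_system_id("userguide.dtd", None, cwd)

print(opened, dtd_found(opened))
print(pasted, dtd_found(pasted))
```

In the second case the DTD is looked up in /home/hussein/, where there 
is no userguide.dtd, even though copies exist both next to the original 
document and in /tmp/xxe123456/.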




Conclusion:
~~~~~~~~~~~

* Do not use <!DOCTYPE> without a corresponding entry in an XML catalog 
used by XXE.

* None of this happens with other types of schemas, which, unlike DTDs, 
do not mix validation and parsing (for expanding entities).

Other types of schemas are pure grammars. A DTD is at the same time a 
grammar and a specification of textual macros (like C's #define, 
#include, etc).
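To see why the DTD must be available at parsing time, here is a small 
sketch using Python's xml.etree.ElementTree (an assumption made for the 
sake of a runnable example; XXE uses its own Java parser). The entity 
name "title" and its value are made up. The entity is declared in an 
internal DTD subset so that the example does not depend on any file.

```python
import xml.etree.ElementTree as ET

# The internal DTD subset declares an entity, which acts as a text macro.
with_dtd = '<!DOCTYPE guide [<!ENTITY title "User Guide">]><guide>&title;</guide>'

# The same content, but the parser never sees the declaration.
without_dtd = "<guide>&title;</guide>"

# The entity is expanded while the document is being parsed.
expanded = ET.fromstring(with_dtd).text
print(expanded)

# Without the declaration, parsing itself fails: this is not a
# validation error reported after the fact.
try:
    ET.fromstring(without_dtd)
    parse_failed = False
except ET.ParseError as error:
    parse_failed = True
    print("parse error:", error)
```

A parser that cannot load the DTD is in the same position as the second 
case: it has no definition to substitute for the entity reference.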





