Alexander Dupuy wrote:
> In URLs, the character '#' is used to indicate a fragment-id.
> 
> However, when specifying a filename (rather than a URL), '#' is a valid 
> file character, although it may need to be quoted to protect it from the 
> shell.
> 
> Notwithstanding the conventions of URLs, it should be possible to 
> specify such a filename on the command line for XXE or the validation 
> tools dtdvalid, xsdvalid, etc.
> 
> Regardless of the existence of a file /tmp/#test.xml, I get:
> 
> $ dtdvalid /tmp/\#test.xml
> cannot load '/tmp/#test.xml': file:/tmp/#test.xml:1:0: syntax error
> 
> Running xxe /tmp/\#test.xml gives a similar error in a popup.
> 
> While it is possible to successfully %-encode the '#' for dtdvalid, e.g. 
> dtdvalid /tmp/%23test.xml, it means that if there were a file with the 
> name /tmp/%23test.xml, I would have to use the command dtdvalid 
> /tmp/%2523test.xml to validate it.  This is difficult and awkward in 
> shell scripts, and in any case, doesn't work with XXE, which gives a 
> pop-up with the error:
> 
> "/tmp/%23test.xml" is not an URL or a file name.
> 
> It is possible to get XXE to open file:/tmp/%23test.xml, but it will 
> only open it read-only (just as well, I have no idea what filename it 
> would write it out as if I saved it).
> 
> It seems to me that in any context where you accept a URL or a file 
> name, if there is no leading file: (or other URL scheme) you should 
> treat the name as a file name, and convert it to a URL by %-escaping any 
> reserved characters.
> 
> I realize this is a minor quibble, but until recently (XXE 2.4?) this 
> used to work correctly, and when it changed, it broke my CVS commit 
> validation scripts that ran validation on temporary files with # in the 
> pathname.  I've worked around this by eliminating the '#', but it may 
> cause confusion or problems for others in the future.

"/tmp/%23test.xml" is definitely a file called "/tmp/%23test.xml", not a 
file called "/tmp/#test.xml".
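
To illustrate why the two names stay distinct once everything is a URL, here is a minimal Python sketch of the %-escaping round trip. It is not XXE's code: the helper names filename_to_file_url and file_url_to_filename are hypothetical, and it assumes POSIX-style paths and the short "file:/path" form used in this thread.

```python
from urllib.parse import quote, unquote

def filename_to_file_url(path: str) -> str:
    """Hypothetical helper: %-escape a POSIX path and prefix 'file:'.

    quote() leaves '/' alone but escapes '#' (to %23) and '%' (to %25).
    """
    return "file:" + quote(path)

def file_url_to_filename(url: str) -> str:
    """Hypothetical inverse: strip 'file:' and undo the %-escapes."""
    if not url.startswith("file:"):
        raise ValueError("not a file: URL: " + url)
    return unquote(url[len("file:"):])

# A literal '#' in the name becomes %23 in the URL ...
print(filename_to_file_url("/tmp/#test.xml"))    # file:/tmp/%23test.xml
# ... while a literal '%23' in the name becomes %2523.
print(filename_to_file_url("/tmp/%23test.xml"))  # file:/tmp/%2523test.xml
```

Under this scheme each file name maps to exactly one URL and back again, which is why "/tmp/%23test.xml" taken as a file name cannot also mean "/tmp/#test.xml".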

Your problem comes from the fact that XXE works with URLs, not with file 
names: "/tmp/#test.xml" is internally converted to "file:/tmp/" 
(the "#test.xml" part is treated as a fragment and removed).
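
The same effect can be reproduced with any generic URL parser. The sketch below uses Python's urllib.parse, not XXE's own code, and the helper name split_like_a_url is hypothetical. It only assumes that an unescaped name is prefixed with "file:" and then parsed as a URL.

```python
from urllib.parse import urlsplit

def split_like_a_url(name: str):
    """Hypothetical helper: treat a raw file name as the tail of a file: URL.

    Returns (path, fragment) as a generic URL parser sees them.
    """
    parts = urlsplit("file:" + name)
    return parts.path, parts.fragment

# The unescaped '#' starts the fragment, so the path stops at '/tmp/'.
print(split_like_a_url("/tmp/#test.xml"))    # ('/tmp/', 'test.xml')
# With the '#' written as %23, the whole name stays in the path.
print(split_like_a_url("/tmp/%23test.xml"))  # ('/tmp/%23test.xml', '')
```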

Sorry, but you'll have to consider this a limitation of XXE.


