Alexander Dupuy wrote:
> In URLs, the character '#' is used to indicate a fragment-id.
>
> However, when specifying a filename (rather than a URL), '#' is a valid
> file character, although it may need to be quoted to protect it from the
> shell.
>
> Notwithstanding the conventions of URLs, it should be possible to
> specify such a filename on the command line for XXE or the validation
> tools dtdvalid, xsdvalid, etc.
>
> Regardless of the existence of a file /tmp/#test.xml, I get:
>
> $ dtdvalid /tmp/\#test.xml
> cannot load '/tmp/#test.xml': file:/tmp/#test.xml:1:0: syntax error
>
> Running xxe /tmp/\#test.xml gives a similar error in a popup.
>
> While it is possible to successfully %-encode the '#' for dtdvalid, e.g.
> dtdvalid /tmp/%23test.xml, it means that if there were a file with the
> name /tmp/%23test.xml, I would have to use the command dtdvalid
> /tmp/%2523test.xml to validate it. This is difficult and awkward in
> shell scripts, and in any case, doesn't work with XXE, which gives a
> pop-up with the error:
>
> "/tmp/%23test.xml" is not an URL or a file name.
>
> It is possible to get XXE to open file:/tmp/%23test.xml, but it will
> only open it read-only (just as well, I have no idea what filename it
> would write it out as if I saved it).
>
> It seems to me that in any context where you accept a URL or a file
> name, if there is no leading file: (or other URL scheme) you should
> treat the name as a file name, and convert it to a URL by %-escaping any
> reserved characters.
>
> I realize this is a minor quibble, but until recently (XXE 2.4?) this
> used to work correctly, and when it changed, it broke my CVS commit
> validation scripts that ran validation on temporary files with # in the
> pathname. I've worked around this by eliminating the '#', but it may
> cause confusion or problems for others in the future.

"/tmp/%23test.xml" is definitely a file called "/tmp/%23test.xml", not a file called "/tmp/#test.xml". Your problem comes from the fact that XXE works with URLs, not with file names, so "/tmp/#test.xml" is internally converted to "file:/tmp/" (the "#test.xml" part is treated as a fragment and removed). Sorry, but you'll have to consider this a limitation of XXE.
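
For scripts that hit this, one possible workaround is to build the file: URL yourself before calling the tool. The following is only a sketch in Python, not anything shipped with XXE: the helper name filename_to_file_url is made up for illustration, it assumes an absolute POSIX path, and it assumes the tool accepts the resulting file: URL (as noted in the quoted message, XXE opens such a URL read-only). The first part simply shows how a generic URL parser splits the raw name, which matches the fragment removal described above.

```python
from urllib.parse import quote, urlsplit


def filename_to_file_url(filename: str) -> str:
    """Hypothetical helper: percent-encode an absolute POSIX file name
    and return it as a file: URL."""
    if not filename.startswith("/"):
        raise ValueError("expected an absolute path such as /tmp/#test.xml")
    # quote() keeps '/' by default but escapes '#', '%', '?', spaces, etc.
    return "file://" + quote(filename)


# Treating the raw file name as a URL: '#' starts the fragment,
# so the path that remains is only the directory.
parts = urlsplit("file:/tmp/#test.xml")
print(parts.path)      # /tmp/
print(parts.fragment)  # test.xml

# Escaping first keeps the whole file name in the path component.
print(filename_to_file_url("/tmp/#test.xml"))    # file:///tmp/%23test.xml
# A literal '%' in the file name is escaped as well.
print(filename_to_file_url("/tmp/%23test.xml"))  # file:///tmp/%2523test.xml
```

A commit-validation script could pass the returned URL to the validator instead of the raw path, assuming the validator takes file: URLs on its command line.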

