Hi everyone

First of all, Gottfried, thanks a lot for your help.

So I understand that your solution requires changing the xerces/crimson code a
bit...

>>> i have removed some internal checks - so this code is not
>>> compilable.

...mmm, unfortunately that's the kind of thing I would like to avoid. What I
do now is just strip the <!DOCTYPE ...> line from the input before
processing it (using a specialization of FilterInputStream and the Jakarta ORO
regex library). That is not a nice solution either, but at least I avoid
modifying xerces internal code, which I would have to do for every new release.
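
A minimal sketch of this kind of DOCTYPE stripping, not the code actually used
here: it relies on java.util.regex (JDK 1.4) instead of Jakarta ORO and works
on a String rather than inside a FilterInputStream. The names DoctypeStripper
and stripDoctype are made up for illustration. The pattern assumes a single
declaration whose internal subset, if any, contains no ']' and whose quoted
identifiers contain no '>' or '['.

```java
import java.util.regex.Pattern;

// Hypothetical helper: removes the first <!DOCTYPE ...> declaration from an
// XML document that is already held in memory as a String.
class DoctypeStripper {
    // "<!DOCTYPE", the name and external ids, an optional [internal subset],
    // then the closing ">".
    private static final Pattern DOCTYPE =
            Pattern.compile("<!DOCTYPE[^\\[>]*(?:\\[[^\\]]*\\])?[^>]*>");

    static String stripDoctype(String xml) {
        return DOCTYPE.matcher(xml).replaceFirst("");
    }
}
```

The stripped string can then be handed to the parser through
new InputSource(new StringReader(...)). A document that relies on entities or
default attributes declared in the DTD will no longer parse the same way.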

I have got the impression that there is no way to do it nicely at the moment,
and that an application using xerces/crimson and dealing with doctypes whose
DTD URL points to the internet has to be online. Is that true?
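
For reference, the SAX EntityResolver contract says that returning null from
resolveEntity makes the parser open the system identifier itself, so staying
offline means returning a non-null InputSource. A minimal sketch, assuming a
non-validating parse in which the DTD contents are not needed (no entities or
default attributes come from the DTD): the resolver below answers every
external entity with empty content. OfflineEntityResolver and parseOffline
are hypothetical names, and the parser is obtained through JAXP.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

// Hypothetical resolver: every external entity (including the DTD) is
// answered with empty content, so the parser never opens a URL.
class OfflineEntityResolver implements EntityResolver {
    public InputSource resolveEntity(String publicId, String systemId) {
        InputSource empty = new InputSource(new StringReader(""));
        empty.setPublicId(publicId);
        empty.setSystemId(systemId);
        return empty;
    }

    // Parses an XML string through JAXP with the resolver installed.
    static Document parseOffline(String xml) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        builder.setEntityResolver(new OfflineEntityResolver());
        return builder.parse(new InputSource(new StringReader(xml)));
    }
}
```

The same resolver can be set on a SAX XMLReader with setEntityResolver. The
input keeps its <!DOCTYPE ...> line, so nothing has to be filtered beforehand.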

My application has to deal with about 200 XML files and merge them. Letting
xerces fetch the DTD every time, it needs about 10 minutes to do the job.
Avoiding the fetch (with my solution), it takes just 15 seconds!

I also wonder if there is any way of caching the DTDs to avoid fetching the
same one again and again; actually all the input XML files have the same DTD.
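
One way such a cache could be sketched, assuming the DTD is small enough to
keep in memory and does not change during the run: an EntityResolver that
downloads each system identifier once and serves later requests from a map.
CachingEntityResolver and its fetch method are hypothetical names, not part
of xerces or crimson.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.util.HashMap;
import java.util.Map;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

// Hypothetical resolver: fetches each external entity once and replays the
// cached bytes for every later request with the same system identifier.
class CachingEntityResolver implements EntityResolver {
    private final Map<String, byte[]> cache = new HashMap<String, byte[]>();

    public InputSource resolveEntity(String publicId, String systemId)
            throws IOException {
        if (systemId == null)
            return null; // nothing to cache; parser uses its default behaviour

        byte[] bytes;
        synchronized (cache) {
            bytes = cache.get(systemId);
            if (bytes == null) {
                bytes = fetch(systemId);
                cache.put(systemId, bytes);
            }
        }

        // A fresh stream per request, since a stream can only be read once.
        InputSource source = new InputSource(new ByteArrayInputStream(bytes));
        source.setPublicId(publicId);
        source.setSystemId(systemId);
        return source;
    }

    // Downloads the entity; protected so that it can be replaced,
    // for example by a lookup in a local directory.
    protected byte[] fetch(String systemId) throws IOException {
        InputStream in = new URL(systemId).openStream();
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1)
                out.write(buffer, 0, n);
            return out.toByteArray();
        } finally {
            in.close();
        }
    }
}
```

The same resolver instance has to be shared by all parses for the cache to
help. The first parse still needs the network, unless fetch is overridden to
read from a local copy.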

           regards, Valentin.

-----Original Message-----
From: Gottfried Szing [mailto:[EMAIL PROTECTED]
Sent: 16 April 2002 12:17
To: [EMAIL PROTECTED]
Subject: Re: Avoid network access fetching the DTD


On Tue, 2002-04-16 at 12:19, Valentin Ruano wrote:
> Hi everyone,
>
> Does anybody know how to avoid any network access when parsing an XML file
> with Xerces or Crimson? The application I am developing has to deal with
> fully qualified XML sources (I mean with public DTD URLs) and must work
> without a network connection. I know that I would lose the syntax check, but
> that is not important.

i am using a modified EntityResolver which first checks the location of
the dtd/xsd, and if this is not an http request, the default resolver is
called. the class Check is a local class which verifies if the file is
local. i have removed some internal checks - so this code is not
compilable.

import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

import java.io.File;
import java.io.IOException;
import java.io.InputStream;

/**
 * @author Gottfried Szing
 * @version $Revision: 1.5 $ $Name:  $
 */
public class ESIEntityResolver
        implements EntityResolver
{
    private EntityResolver defaultres = null;

    /**
     * inits the resolver
     */
    public ESIEntityResolver()
    {
        defaultres = new DefaultHandler();
    }

    /**
     * Resolves the entity identified by the given public and system ids.
     * Local entities are passed on to the default resolver; for anything
     * else this returns null.
     */
    public InputSource resolveEntity(String publicId, String systemId)
            throws SAXException, IOException
    {
        if (systemId != null || publicId != null)
        {
            // Check is a local helper class (not shown) that tests whether
            // an id refers to a local file.
            if (Check.isLocal(systemId) && Check.isLocal(publicId))
                return defaultres.resolveEntity(publicId, systemId);
        }

        return null;
    }

    /**
     * Return the file name of the DTD referenced by the systemId.
     */
    private static String getUrl(String systemId)
    {
        if (null == systemId)
            return null;

        File file = new File(systemId);
        return file.getName();
    }
}

