I think you can just set the http://apache.org/xml/features/nonvalidating/load-external-dtd feature to false. That's what we did to stop Xerces from requiring an internet connection.
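For what it's worth, here is a minimal sketch of how that feature can be switched off. It goes through the JAXP DocumentBuilderFactory rather than a Xerces class directly, which is my assumption about the setup; it relies on the underlying parser being Xerces (or the Xerces-derived parser bundled with the JDK), since other parsers may reject the feature name. The class name OfflineParse, the method name parseWithoutDtd and the example.invalid URL are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class OfflineParse {
    // Xerces-specific feature: when false (and validation is off), the
    // external DTD subset named in the DOCTYPE is not loaded at all.
    static final String LOAD_EXTERNAL_DTD =
        "http://apache.org/xml/features/nonvalidating/load-external-dtd";

    // Hypothetical helper: parses the given XML text without fetching its DTD.
    static Document parseWithoutDtd(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(false);                 // the feature only applies when not validating
        factory.setFeature(LOAD_EXTERNAL_DTD, false); // throws if the parser does not know the feature
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        // The DTD URL is deliberately unreachable; the parse should not try to fetch it.
        String xml = "<?xml version=\"1.0\"?>\n"
            + "<!DOCTYPE doc SYSTEM \"http://example.invalid/doc.dtd\">\n"
            + "<doc><item>hello</item></doc>";
        Document doc = parseWithoutDtd(xml);
        System.out.println(doc.getDocumentElement().getNodeName());
    }
}
```

Keep in mind that with the DTD skipped you also lose anything it would have supplied, such as default attribute values and entity declarations, not just the validity check.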

> -----Original Message-----
> From: Valentin Ruano [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, April 16, 2002 9:26 AM
> To: [EMAIL PROTECTED]
> Subject: RE: Avoid network access fetching the DTD
>
>
> Hi everyone
>
> First of all, Gottfried, thanks a lot for your help.
>
> So I understand that your solution requires changing the
> xerces/crimson code a bit...
>
> >>> i have removed some internal checks - so this code is not
> >>> compilable.
>
> ...mmm unfortunately that's the kind of thing I would like to
> avoid. What I do now is just get rid of the <!DOCTYPE ...>
> line from the input before processing it (using a
> specialization of FilterInputStream and the Jakarta-oro regex
> library), which is not a nice solution either, but at least I
> avoid modifying xerces internal code, which I would have to do
> for every new release.
>
> I have got the impression that there is no way to do it nicely at
> the moment, and that an application using xerces/crimson and
> dealing with doctypes whose DTD URL points to the internet has
> to be online. Is that true?
>
> My application has to deal with about 200 XML files and merge them.
> Letting xerces fetch the DTD every time, it needs about 10
> minutes to do the job. Avoiding it (with my solution), it
> takes just 15 seconds!
>
> I also wonder if there is any way of caching the DTDs to
> avoid fetching the same one again and again; actually all the
> input XML files have the same DTD.
>
> regards, Valentin.
>
> -----Original Message-----
> From: Gottfried Szing [mailto:[EMAIL PROTECTED]
> Sent: 16 April 2002 12:17
> To: [EMAIL PROTECTED]
> Subject: Re: Avoid network access fetching the DTD
>
>
> On Tue, 2002-04-16 at 12:19, Valentin Ruano wrote:
> > Hi everyone,
> >
> > Does anybody know how to avoid any network access when parsing an XML
> > file with Xerces or Crimson? The application I am developing has to deal
> > with fully qualified XML sources (I mean with public DTD URLs) and must
> > work without a network connection. I know that I would lose the syntax
> > check, but that is not important.
>
> i am using a modified EntityResolver which first checks the
> location of the dtd/xsd and, if this is not an http request,
> calls the default resolver. the class Check is a local
> class which verifies if the file is local. i have removed
> some internal checks - so this code is not compilable.
>
> import org.xml.sax.EntityResolver;
> import org.xml.sax.InputSource;
> import org.xml.sax.SAXException;
> import org.xml.sax.helpers.DefaultHandler;
>
> import java.io.File;
> import java.io.IOException;
> import java.io.InputStream;
>
> /**
>  * @author Gottfried Szing
>  * @version $Revision: 1.5 $ $Name: $
>  */
> public class ESIEntityResolver
>     implements EntityResolver
> {
>     private EntityResolver defaultres = null;
>
>     /**
>      * inits the resolver
>      */
>     public ESIEntityResolver()
>     {
>         defaultres = new DefaultHandler();
>     }
>
>     /**
>      * This attempts to resolve the entity associated with the specified
>      * public and system ids. If the systemId is empty, then we use the
>      * publicId to locate the URL of the cataloged DTD file.
>      */
>     public InputSource resolveEntity(String publicId, String systemId)
>         throws SAXException, IOException
>     {
>         if (systemId != null || publicId != null)
>         {
>             if (Check.isLocal(systemId) && Check.isLocal(publicId))
>                 return defaultres.resolveEntity(publicId,systemId);
>         }
>
>         return null;
>     }
>
>     /**
>      * Return the URL of the DTD corresponding to the systemId.
>      */
>     private static final String getUrl(String systemId)
>     {
>         if (null == systemId)
>             return null;
>
>         File file = new File(systemId);
>         String name = file.getName();
>         return name;
>     }
> }
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
