Gary,
Thanks for the pointers. A little sharing for the benefit of others that might struggle with this too:
I did a short-cut version of what you suggested:
// Added to hook in my EntityResolver
InputSource inputSource = new InputSource(reader);
XMLReader xmlReader = XMLReaderFactory.createXMLReader();
xmlReader.setEntityResolver(entityResolver);
// END Added to hook in my EntityResolver
SAXSource saxSource = new SAXSource(xmlReader, inputSource);
transformer.transform(saxSource, new StreamResult(xalanOutStream));
Where entityResolver is an instance of MyEntityResolver:
public class MyEntityResolver implements org.xml.sax.EntityResolver {
// Maintains a cache of entities so they only have to be fetched once.
// Each cache entry contains a byte array representing the contents of
// the entity, keyed by systemId.
Map entityCache;
// Constructor
public MyEntityResolver(){
// Initialize entityCache. Synchronized because multiple threads may be reading/writing the cache...
entityCache = Collections.synchronizedMap(new HashMap());
}
// Resolve an entity based on its systemId. Not sure what publicId is for here?
public org.xml.sax.InputSource resolveEntity (String publicId, String systemId)
throws org.xml.sax.SAXException, java.io.IOException {
BufferedInputStream sourceIn;
// Look in cache to see if we have already fetched this baby before.
byte bytesIn[] = (byte[])entityCache.get(systemId);
// If not previously fetched, fetch and cache.
if(bytesIn == null){
// Look in the configuration to see if this entity has been mapped to a local copy.
String localCopy = config.getEntity(systemId);
if(localCopy != null){
// return a special input source
String localFileName = config.getCachedEntityDir()+localCopy;
sourceIn = new BufferedInputStream(new FileInputStream(localFileName));
} else {
// Assumes the systemId is a URL...
URL u = new URL(systemId);
sourceIn = new BufferedInputStream(u.openStream());
}
// Read the bytes from the entity into a byte array and cache it.
// available() only estimates what can be read without blocking, so
// read until end of stream rather than trusting it for the total size.
ByteArrayOutputStream buffer = new ByteArrayOutputStream();
byte[] chunk = new byte[4096];
int numBytesRead;
while ((numBytesRead = sourceIn.read(chunk)) != -1){
buffer.write(chunk, 0, numBytesRead);
}
sourceIn.close();
bytesIn = buffer.toByteArray();
// Cache bytes for this entity...
entityCache.put(systemId, bytesIn);
}
return new org.xml.sax.InputSource(new ByteArrayInputStream(bytesIn));
}
}
MyEntityResolver does the following:
- Looks in local storage for entities that are specified in its configuration. The config object above contains this information which was read from an XML config file...
- Otherwise fetches the entity via its URL.
- Caches the bytes of each entity for quick retrieval. IO to the source entity is only done once.
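For anyone who wants to try the hook without my config object, here is a minimal self-contained sketch of the same short-cut. It is not my production code: the DTD address, the class name ResolverDemo and the in-memory DTD text are all made up for illustration, and it obtains the XMLReader through SAXParserFactory (an assumption on my part, since XMLReaderFactory is deprecated in recent JDKs). It runs an identity transform and counts how often the resolver is consulted.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

class ResolverDemo {
    // Hypothetical system identifier; the resolver answers for it, so it is never fetched.
    static final String DTD_ID = "http://example.invalid/note.dtd";

    // Counts how many times the parser consulted the resolver.
    static int resolverCalls = 0;

    static String transform(String xml) throws Exception {
        EntityResolver resolver = (publicId, systemId) -> {
            resolverCalls++;
            if (DTD_ID.equals(systemId)) {
                // Serve the DTD from memory instead of going to the network.
                return new InputSource(new StringReader("<!ELEMENT note (#PCDATA)>"));
            }
            return null; // let the parser fall back to its default resolution
        };

        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true);
        XMLReader xmlReader = spf.newSAXParser().getXMLReader();
        xmlReader.setEntityResolver(resolver);

        // Same hook as above: hand the transformer a SAXSource that carries our reader.
        SAXSource saxSource = new SAXSource(xmlReader, new InputSource(new StringReader(xml)));
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        StringWriter out = new StringWriter();
        identity.transform(saxSource, new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<!DOCTYPE note SYSTEM \"" + DTD_ID + "\"><note>hello</note>";
        System.out.println(transform(xml));
        System.out.println("resolver calls: " + resolverCalls);
    }
}
```

Returning null from resolveEntity hands the lookup back to the parser, so only the identifiers you recognise are redirected.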
Thanks for the suggestions!
Regards,
-Chris.
Chris Raber, Systems Engineer, AvantGo Inc.
v: 248-554-9330, cell: 810-839-3684
http://www.avantgo.com/
-----Original Message-----
From: Gary L Peskin [mailto:[EMAIL PROTECTED]]
Sent: Thursday, July 26, 2001 12:26 AM
To: Raber Chris
Cc: [EMAIL PROTECTED]
Subject: Re: How to turn of validation and resolution of DTD entities?
Chris --
There are a few things you need to consider when using EntityResolvers:
(1) Two DOM (or DTM) trees are built: one for the stylesheet and one
for the input document. Do you want an EntityResolver for both or do
you only need it for one or the other? I'm going to assume in this
example that you only want it for the input document. If you need it
for the stylesheet, we'll have to jazz up this example.
(2) At stylesheet creation time, additional stylesheets can be brought
in using xsl:include and xsl:import. New readers are created for these
things that don't use your EntityResolver. To trap this, you'll need to
create a URIResolver that creates and returns a SAXSource that has your
EntityResolver hooked into that XMLReader. The same holds true for XML
input documents brought in at runtime with the document() function.
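A rough sketch of what such a URIResolver might look like, untested against Xalan itself. The class name EntityAwareUriResolver is made up, and getting the XMLReader from SAXParserFactory is just one assumed way of creating it:

```java
import java.net.URI;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.Source;
import javax.xml.transform.TransformerException;
import javax.xml.transform.URIResolver;
import javax.xml.transform.sax.SAXSource;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

class EntityAwareUriResolver implements URIResolver {
    private final EntityResolver entityResolver;

    EntityAwareUriResolver(EntityResolver entityResolver) {
        this.entityResolver = entityResolver;
    }

    @Override
    public Source resolve(String href, String base) throws TransformerException {
        try {
            // Resolve href against the URI of the including document, if one is given.
            String systemId = (base == null || base.isEmpty())
                    ? href
                    : URI.create(base).resolve(href).toString();

            SAXParserFactory spf = SAXParserFactory.newInstance();
            spf.setNamespaceAware(true); // stylesheets need namespace support
            XMLReader reader = spf.newSAXParser().getXMLReader();
            reader.setEntityResolver(entityResolver);

            // The returned source carries the reader, so the EntityResolver
            // is used when the included document is parsed.
            return new SAXSource(reader, new InputSource(systemId));
        } catch (Exception e) {
            throw new TransformerException("Could not resolve " + href, e);
        }
    }
}
```

You would register it with TransformerFactory.setURIResolver to cover xsl:include and xsl:import, and with Transformer.setURIResolver to cover document().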
So, for this simple example, I'll assume that you only want the
EntityResolver at runtime and that there are no input documents brought
in with the document() function.
I'd code it like this (none of this is tested but should work :)). It's
taken more or less from the "Usage Patterns" page at
http://xml.apache.org/xalan-j/usagepatterns.html#sax
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;
import org.apache.xalan.serialize.SerializerFactory;
import org.apache.xalan.serialize.Serializer;
import org.apache.xalan.templates.OutputProperties;
import javax.xml.transform.Result;
import javax.xml.transform.sax.SAXResult;
import javax.xml.transform.stream.StreamSource;
// Instantiate a TransformerFactory.
TransformerFactory tFactory = TransformerFactory.newInstance();
// Cast the TransformerFactory to SAXTransformerFactory.
SAXTransformerFactory saxTFactory = (SAXTransformerFactory) tFactory;
// Create a Transformer ContentHandler to handle transformation of the
// XML Source
TransformerHandler transformerHandler
= saxTFactory.newTransformerHandler(new
StreamSource("foo.xsl"));
// Create an XMLReader and set its ContentHandler.
XMLReader reader = XMLReaderFactory.createXMLReader();
reader.setContentHandler(transformerHandler);
// Set the ContentHandler to also function as a LexicalHandler, which
// can process "lexical" events (such as comments and CDATA).
reader.setProperty("http://xml.org/sax/properties/lexical-handler",
transformerHandler);
// Set your EntityResolver into the reader
reader.setEntityResolver(myEntityResolver);
// Set up a Serializer to serialize the Result to a file.
Serializer serializer = SerializerFactory.getSerializer
(OutputProperties.getDefaultMethodProperties("xml"));
serializer.setOutputStream(new java.io.FileOutputStream("foo.out"));
// The Serializer functions as a SAX ContentHandler.
Result result = new SAXResult(serializer.asContentHandler());
transformerHandler.setResult(result);
// Parse the XML input document.
reader.parse("foo.xml");
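If you would rather not depend on the Xalan serializer classes, the same pipeline can probably be put together with plain JAXP by giving the TransformerHandler a StreamResult. The sketch below makes that assumption and is not the Usage Patterns code itself. The class name HandlerPipelineDemo, the inline stylesheet and the DTD address are made up, and the resolver shown simply answers every external entity with empty content, which is one way to avoid going to the network at all.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

class HandlerPipelineDemo {
    static String run(String xsl, String xml, EntityResolver resolver) throws Exception {
        // Build a TransformerHandler from the stylesheet and point it at a writer.
        SAXTransformerFactory stf = (SAXTransformerFactory) TransformerFactory.newInstance();
        TransformerHandler handler = stf.newTransformerHandler(new StreamSource(new StringReader(xsl)));
        StringWriter out = new StringWriter();
        handler.setResult(new StreamResult(out));

        // Create the reader, wire in the handler and the EntityResolver, then parse.
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true);
        XMLReader reader = spf.newSAXParser().getXMLReader();
        reader.setContentHandler(handler);
        reader.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
        reader.setEntityResolver(resolver);
        reader.parse(new InputSource(new StringReader(xml)));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xsl = "<xsl:stylesheet version=\"1.0\""
                + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
                + "<xsl:output method=\"text\"/>"
                + "<xsl:template match=\"/\"><xsl:value-of select=\"note\"/></xsl:template>"
                + "</xsl:stylesheet>";
        // Hypothetical DTD address; the resolver below answers for it.
        String xml = "<!DOCTYPE note SYSTEM \"http://example.invalid/note.dtd\"><note>hello</note>";
        // Answer every external entity with empty content.
        EntityResolver offline = (publicId, systemId) -> new InputSource(new StringReader(""));
        System.out.println(run(xsl, xml, offline));
    }
}
```

Note that an empty DTD also drops any entity declarations or default attribute values the real DTD would have supplied, so this only suits documents that do not rely on them.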
HTH,
Gary
Raber Chris wrote:
>
> I have a need to turn off resolution of DTD entities
> when not connected to a network. Also I am thinking
> that hitting http://www.w3.org/ every time we bump
> into a DTD reference is a lot of overhead anyway. =:-o
>
> Based on a bit of Googling, it appears that
> implementing an EntityResolver that redirects remote
> TCP/IP destinations to a local cache is the ticket. If
> there is another/easier way, please advise.
>
> Currently I am using StreamSource and StreamResult as
> arguments to Transformer.transform, which is most
> convenient. I'd like to avoid hooking together an
> underlying parser, etc., if possible. I've really
> appreciated the simplicity of using the higher level
> Trax apis, and would like to stay there if I can.
> Simple good...
>
> Is there a way to hook in my own
> org.xml.sax.EntityResolver via a property, or must I
> instantiate my own underlying SAX/DOM handlers... and
> explicitly call setEntityResolver on the XMLReaders?
>
> If the latter, can someone provide basic instructions
> on how to string this together? Is the SAX2SAX example
> a good place to start?
>
> And does anyone have an example EntityResolver they
> would be willing to share?
>
> TIA,
>
> -Chris.
>
> PS: It would be real cool if it were possible to hook
> this via property settings...
