Mark Schmit wrote:
I'm trying what I thought would be a very simple task: given an XPath
and a string containing an XML file's contents, produce a string
containing only the XPath-matching excerpts.  I tried the following,
based heavily on the SimpleXPathAPI and XPathWrapper samples:

================
XMLPlatformUtils::Initialize();
XPathEvaluator::initialize();
XalanNode* context_node = NULL;
{
  XPathEvaluator xpath_evaluator;
  XalanSourceTreeDOMSupport dom_support;
  XalanSourceTreeParserLiaison parser_liaison;
  dom_support.setParserLiaison(&parser_liaison);

  MemBufInputSource input_source(
      reinterpret_cast<const XMLByte*>(full_xml_contents.c_str()),
      full_xml_contents.length(), "what_is_buf_id");

  XalanDocument* doc = parser_liaison.parseXMLStream(input_source);
  ASSERT(doc) << "Failed to create doc";
  XalanDocumentPrefixResolver prefix_resolver(doc);
  context_node = xpath_evaluator.selectSingleNode(dom_support, doc,
                                                  XalanDOMString(context).c_str(),
                                                  prefix_resolver);
  if (!context_node) {
    LOG << "Failed to find node for " << context;
  } else {
    const XObjectPtr result(xpath_evaluator.evaluate(dom_support,
                                                     context_node,
                                                     XalanDOMString("*").c_str(),
                                                     prefix_resolver));
    ASSERT(!result.null());
    LOG << "Result type: " << result->getTypeString();
    const NodeRefListBase& nodeset = result->nodeset();
    for (int i = 0; i < nodeset.getLength(); ++i) {
      XalanNode* node = nodeset.item(i);
      LOG << "Node " << i << ": " << node->getNodeName();
      XalanDOMString str;
      DOMServices::getNodeData(*node, str);
      LOG << "Node " << i << " contents: " << str;
      // TODO: Append to a single return string
    }
  }
}
XPathEvaluator::terminate();
XMLPlatformUtils::Terminate();
================

This prints each of the nodes' contents, albeit with all of the tags'
contents stripped out.  My problems are that: A) I want to keep all
the XML tags, and B) I want to put them into one aggregated string.
For example, given the following XML file:
I'm a little confused about what you mean by "all of the tags' contents stripped out." Do you mean the tags are stripped out? If so, that's expected, because you're working with the XPath data model, and not with markup.


================
<?xml version="1.0" encoding="UTF-8"?>
<GenInfo xmlns="urn:mynamespace">
  <EntityId>ABC123</EntityId>
  <EntityName>My Favorite Entity</EntityName>
  <MembersInfo>
    <Member ID="123456">
      <Name>Bob Smith</Name>
    </Member>
    <Member ID="234567">
      <Name>Jane Doe</Name>
    </Member>
  </MembersInfo>
</GenInfo>
================

I'd like to produce this:
================
<Member ID="123456">
  <Name>Bob Smith</Name>
</Member>
<Member ID="234567">
  <Name>Jane Doe</Name>
</Member>
================
I think you're trying to take the nodes and serialize an external parsed entity. Take a look at the SerializeNodeSet sample for more information. Unfortunately, Xalan-C's serializer does not implement namespace fixup, so you will have trouble serializing documents that use namespaces.
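
To make the distinction between the XPath string-value and serialized markup concrete, here is a minimal standalone sketch. It does not use Xalan-C at all: the Element struct and the stringValue, serialize, serializeAll, and sampleMember functions are hypothetical names invented for this illustration, and character escaping and namespace handling are deliberately omitted. The string-value is roughly what the loop in the question prints; the serialized form is what the question asks for.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy element type for illustration only; not a Xalan-C class.
struct Element {
    std::string name;
    std::vector<std::pair<std::string, std::string> > attributes;
    std::string text;                // character data directly inside the element
    std::vector<Element> children;
};

// XPath-style string-value: the text of the element and all of its
// descendants, concatenated in document order, with no markup.
std::string stringValue(const Element& e) {
    std::string out = e.text;
    for (const Element& child : e.children) {
        out += stringValue(child);
    }
    return out;
}

// Serialization: regenerate markup from the tree.
// Assumption: no escaping of special characters and no namespace fixup.
std::string serialize(const Element& e) {
    assert(!e.name.empty());
    std::string out = "<" + e.name;
    for (const auto& attr : e.attributes) {
        out += " " + attr.first + "=\"" + attr.second + "\"";
    }
    out += ">" + e.text;
    for (const Element& child : e.children) {
        out += serialize(child);
    }
    out += "</" + e.name + ">";
    return out;
}

// Aggregate the serialized form of every node in a result set into one string.
std::string serializeAll(const std::vector<Element>& nodes) {
    std::string out;
    for (const Element& node : nodes) {
        out += serialize(node);
    }
    return out;
}

// Builds the first Member element from the sample document above.
Element sampleMember() {
    Element name;
    name.name = "Name";
    name.text = "Bob Smith";

    Element member;
    member.name = "Member";
    member.attributes.push_back(std::make_pair(std::string("ID"), std::string("123456")));
    member.children.push_back(name);
    return member;
}
```

With this sketch, stringValue(sampleMember()) yields only the text "Bob Smith", while serialize(sampleMember()) yields the element with its tags and attribute. The SerializeNodeSet sample is the place to look for how Xalan-C itself does the second step.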


I looked at the DOMServices and DOMSupport classes and couldn't find
anything that produced a string of XML from a given XalanNode.  Do I
need to use the XalanTransformer class in some way?  Should I
essentially be generating an XSL transformation rather than using
XPathEvaluator?
Well, running a stylesheet would certainly take care of a lot of the messy details of using XPathEvaluator and doing serialization, but it depends on whether your XPath expressions are dynamic or static.


Also, how does the 'urn:mynamespace' aspect figure into this?  Does
that get passed to the prefix resolver?
Yes, the XPath process will need to know the bindings for namespaces. Note there's a simple PrefixResolver implementation, called ElementPrefixResolverProxy that will take an instance of a XalanElement and resolve namespace bindings using the prefixes defined on that element. Note this only works if all of the prefixes you need are defined on a single element. Note also that it doesn't work with documents that use default namespace bindings, because XPath 1.0 requires you use a prefix for QNames.
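
The point about default namespaces can be sketched without Xalan-C. The code below is a standalone illustration, not the Xalan-C PrefixResolver interface: SimplePrefixResolver, ExpandedName, expandNameTest, and sampleResolver are hypothetical names, and the prefix "m" is an arbitrary choice. It shows why an unprefixed name test such as Member cannot match elements in urn:mynamespace, while a prefixed one such as m:Member can once the prefix is bound.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a prefix resolver; not a Xalan-C class.
class SimplePrefixResolver {
public:
    void bind(const std::string& prefix, const std::string& uri) {
        // XPath 1.0 name tests cannot use a default namespace,
        // so every binding needs a non-empty prefix.
        assert(!prefix.empty());
        bindings_[prefix] = uri;
    }

    // Returns the bound namespace URI, or an empty string if the prefix is unbound.
    std::string namespaceForPrefix(const std::string& prefix) const {
        const std::map<std::string, std::string>::const_iterator it = bindings_.find(prefix);
        return it == bindings_.end() ? std::string() : it->second;
    }

private:
    std::map<std::string, std::string> bindings_;
};

// A namespace URI plus a local name.
struct ExpandedName {
    std::string uri;
    std::string local;
};

// Expands the QName of an XPath 1.0 name test. An unprefixed name always
// expands to "no namespace", regardless of any default namespace declared
// in the source document.
ExpandedName expandNameTest(const std::string& qname, const SimplePrefixResolver& resolver) {
    ExpandedName result;
    const std::string::size_type colon = qname.find(':');
    if (colon == std::string::npos) {
        result.local = qname;
    } else {
        result.uri = resolver.namespaceForPrefix(qname.substr(0, colon));
        result.local = qname.substr(colon + 1);
    }
    return result;
}

// Resolver with one binding for the namespace used in the sample document.
SimplePrefixResolver sampleResolver() {
    SimplePrefixResolver resolver;
    resolver.bind("m", "urn:mynamespace");
    return resolver;
}
```

Under these assumptions, an expression for the sample document would be written along the lines of //m:Member rather than //Member, with the resolver supplying the binding for m.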


Finally, is there any documentation online that features descriptions
of the classes or do I need to infer everything from class names?  I
spent a ton of time trying to figure this out from the API docs but
the classes feature almost no descriptions of what role they actually
serve, or how to perform activities that seem to me to be pretty
fundamental (e.g. seeing the XML text representation of a given node
and its children).
We really need a "programmer's guide," but no one has ever volunteered to write one, and I just don't have the time to do it. The sample applications are really the best place to start. As for your specific example, "serialization" is the common term for generating markup from an instance of the data model, hence the "SerializeNodeSet" sample application.

Dave
