Hello Michael,
thanks a lot for your explanation, that helped a lot.
The purpose of iterating through that document is, at the moment,
just to get to know libxml2 and how to use its functions in principle.
I just made the changes you proposed and I can now see the
attributes/properties.
For reference, here is the new function show() with your suggestions.
I did not keep the formatting, as I only output it for learning
purposes:
#include <stdio.h>
#include <libxml/tree.h>

/* Print elements, their attributes and text nodes, indented by depth. */
void show(xmlNode* node, int indent) {
  xmlNode* n;
  int i;
  xmlAttr* attr;
  xmlChar* ac;
  xmlChar* val;

  for(n = node; n; n = n->next) {
    if(n->type == XML_ELEMENT_NODE) {
      for(i = 0; i < indent; i++) printf(" ");
      printf("<<%s>>\n", n->name);
      /* attributes live in the "properties" list of an element */
      attr = n->properties;
      while(attr) {
        ac = xmlGetProp(n, attr->name);   /* must be freed with xmlFree() */
        for(i = 0; i < indent+2; i++) printf(" ");
        printf("<%s><%s>\n", attr->name, ac);
        xmlFree(ac);
        attr = attr->next;
      }
      show(n->children, indent+2);
    }
    else if(n->type == XML_TEXT_NODE) {
      for(i = 0; i < indent; i++) printf(" ");
      val = xmlNodeGetContent(n);
      /* xmlStrlen() takes xmlChar* and returns int, matching %i */
      printf("c:%i:<%s>\n", xmlStrlen(val), val);
      xmlFree(val);
    }
  }
}
But it seems that too many text nodes are output: even for nodes that
do not have any content, there is a text node containing only whitespace
characters.
Do you know why this happens? How can I skip them?
Here is the XML file, and below it the output of the function above.
Text nodes are printed in the format "c:length:<text>".
<?xml version="1.0" encoding="UTF-8"?>
<root>
  <node1>content of node 1</node1>
  <node2/>
  <node3 attribute="yes" foo="bar">this node has attributes</node3>
  <node4>other way to create content (which is also a node)</node4>
  <node5>
    <node51 odd="no"/>
    <node52 odd="yes"/>
    <node53 odd="no"/>
  </node5>
  <node6>
    <node61 odd="no"/>
    <node62 odd="yes"/>
    <node63 odd="no"/>
  </node6>
</root>
Output:
<<root>>
c:3:<
<<node1>>
c:17:<content of node 1>
c:3:<
<<node2>>
c:3:<
<<node3>>
<attribute><yes>
<foo><bar>
c:24:<this node has attributes>
c:3:<
<<node4>>
c:50:<other way to create content (which is also a node)>
c:3:<
<<node5>>
c:5:<
<<node51>>
<odd><no>
c:5:<
<<node52>>
<odd><yes>
c:5:<
<<node53>>
<odd><no>
c:3:<
c:3:<
<<node6>>
c:5:<
<<node61>>
<odd><no>
c:5:<
<<node62>>
<odd><yes>
c:5:<
<<node63>>
<odd><no>
c:3:<
c:1:<
Thanks for any hints,
Torsten.
Regarding the text elements, I still have some issues; it seems there
are some
On Thursday, 14 June 2007 00:40 you wrote:
> Hello, Torsten -
>
> You'll probably get other replies from the list, but here's a couple
> quick pointers to help you get started.
>
> Libxml uses a "loose polymorphism" approach in the node tree, as you've
> already noted -- you need to inspect the "type" field of the node to
> determine what you're dealing with. The tree isn't entirely contained
> by the next and children nodes, however; depending on the type of the
> node, you sometimes need to statically cast the pointer to get at the
> internals.
>
> The default node type, "xmlNode", is also the "Element" type, which is
> convenient because that's the most common case. An additional confusing
> detail is that the attribute list is named "properties" for some reason,
> which is one of those historical details that nobody can change now.
>
> Also, make certain not to confuse the DTD structures in tree.h with the
> node structures -- "xmlElement" and "xmlAttribute" are the definitions
> in the DTD, while "xmlNode" and "xmlAttr" are the actual nodes.
>
> In your case, you want code that looks like this (I'm doing this from
> memory, so excuse me if I get some of the capitalization and names wrong):
>
> if (n->type == XML_ELEMENT_NODE) {
>     printf("<%s", n->name);
>     xmlAttr *attr = n->properties;
>     while (attr) {
>         xmlChar *attrVal = xmlGetProp(n, attr->name);
>         // Note that I am skipping the handling of namespaces here; use
>         // the "nsDef" field to figure those out
>         printf(" %s=\"%s\"", attr->name, attrVal);
>         xmlFree(attrVal);
>         attr = attr->next;
>     }
>     printf(">");
>     show(n->children, indent+2);
>     printf("</%s>", n->name);
> } else if (n->type == XML_TEXT_NODE) {
>     xmlChar *val = xmlNodeGetContent(n);
>     printf("%s", val);
>     xmlFree(val);
> } else ... (handle XML_CDATA_SECTION_NODE, COMMENT_NODE, PI_NODE, etc...)
>
> So, a couple interesting things to note about this:
> 1. Attributes are found by walking the "properties" list of the node.
> We know it's there because our type matched ELEMENT_NODE.
> 2. We can't just print out the value of the attribute, because it might
> contain entity references (things like &amp;). You could walk the list
> yourself if you were very clever, but it's much easier and safer to just
> call xmlGetProp which does all that for you. However, you need to free
> that memory when you're done with it, hence the call to xmlFree.
> 3. When we encounter a text node, we also need to resolve the entities,
> so we use the helpful "xmlNodeGetContent" function which does the same
> thing, and also needs to be cleaned up when we're done.
>
> Now, I should caution you that what you've done here is NOT the same as
> serializing the document back to XML! This effectively throws out all
> the careful entity escaping that was in the original document... you
> could have bogus attribute values, and bad characters in your text, as a
> result of this, so it's really not safe to treat this output as XML.
>
> If you really want to get the XML back, the easiest thing to do is to
> just serialize it out with one of the "xmlDocDump" or "xmlNodeDump"
> functions. There's a bunch of them and you can probably find one that
> does what you want.
>
> Hope that helps.
>
> Best -
> Michael
> --
> Cisco Systems/XML Engineering
> (formerly Reactivity, Inc.)
>
> Torsten Mohr wrote:
> > Now I wrote some code to read this file into memory and get its root node
> > and I'd like to output the document recursively. I want to do this to
> > get to know libxml2 and how to iterate through a document:
> >
> >
> > void show(xmlNode* node, int indent) {
> >   xmlNode* n;
> >   int i;
> >
> >   for(n = node; n; n = n->next) {
> >     if(n->type == XML_ELEMENT_NODE) {
> >       for(i = 0; i < indent; i++) printf(" ");
> >       printf("<%s> <%s>\n", n->name, xmlIsBlankNode(n) ? "<empty>" :
> >              xmlNodeGetContent(n));
> >       show(n->children, indent+2);
> >     }
> >     if(n->type == XML_ATTRIBUTE_NODE) {
> >       for(i = 0; i < indent; i++) printf(" ");
> >       printf("<%s>+<%s>\n", n->name, xmlIsBlankNode(n) ? "<empty>" :
> >              xmlNodeGetContent(n));
> >     }
> >   }
> > }
> >
> >
> > It does not exactly do what I want: I can't see any attributes like
> > foo="bar" or others. Also, for nodes that do not have text, some empty
> > lines are printed, not the string "<empty>" as I want it to be.
> >
> >
> > I hope I don't mix up names; I'm not sure when to use attribute and
> > when property.
> >
> >
> > For using libxml2 in my own program I'd like to know how to:
> > - test if a node has content or not
> > - test what attributes (or properties?) a node has
> >
> > It would be great if anybody could give me a hint on how to do this.
> >
> >
> > Best regards,
> > Torsten.
> > _______________________________________________
> > xml mailing list, project page http://xmlsoft.org/
> > [email protected]
> > http://mail.gnome.org/mailman/listinfo/xml