Title: Need help traversing DOM tree with xerces in Perl
Thanks a lot.  This did display my tree and gave me some pointers
on how to proceed.
 
PG
-----Original Message-----
From: ted sandler [mailto:[EMAIL PROTECTED]]
Sent: Thursday, April 18, 2002 6:04 PM
To: [EMAIL PROTECTED]
Subject: Re: Need help traversing DOM tree with xerces in Perl

FYI, Perl's object system is not so hot, so if you want to do any real OO programming (writing classes that use inheritance, etc.) in Perl, you should buy Damian Conway's book, Object Oriented Perl.  It provides workarounds for many of the shortcomings in Perl's object system.
 
As for traversing the DOM tree in Perl, I've included the code I used (it was initially based on some sample code provided in the Xerces-Perl distribution).  My program defines a generic tree-traversing subroutine that takes the root node of a DOM tree and a reference to a subroutine as arguments.  The traversal subroutine then applies the referenced subroutine to each node of the tree (much like the way Perl's `map' construct applies a block of code to every element in a list).  Anyway, the thing you'll be interested in is the DOM_traverse() function.
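
The callback pattern described above can be tried without Xerces installed.  What follows is only a minimal sketch, assuming a hypothetical tree built from plain hashes; the traverse name and the name/kids keys are made up for illustration and are not part of XML::Xerces.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for a DOM tree: each node is a hash holding a
# name and an array of child nodes.  No XML::Xerces calls are used.
my $tree = {
    name => 'root',
    kids => [
        { name => 'a', kids => [ { name => 'a1', kids => [] } ] },
        { name => 'b', kids => [] },
    ],
};

# Apply $visit to $node, then depth-first to each of its descendants.
sub traverse {
    my ($node, $visit, $depth) = @_;
    $depth ||= 0;
    $visit->($node, $depth);
    traverse($_, $visit, $depth + 1) for @{ $node->{kids} };
}

# Collect one indented line per node, then print them.
my @lines;
traverse($tree, sub {
    my ($node, $depth) = @_;
    push @lines, ('   ' x $depth) . $node->{name};
});
print "$_\n" for @lines;
```

Because the visitor is just a code reference, the same traverse routine can print, count, or collect nodes without being changed.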
 
Good luck,
-ted
 
---------------------------------------------
#!/usr/bin/perl -w
use strict;
 
use XML::Xerces;
use Getopt::Long;
 

## INIT STUFF ##
 
   our %OPTIONS;
   my $USAGE = <<EOU;
   USAGE: $0 [-v=xxx][-n][-s] file
   Options:
    -v=xxx      Validation scheme [always | never | auto*]
    -n          Enable namespace processing. Defaults to off.
    -s          Enable schema processing. Defaults to off.
 
    * = Default if not provided explicitly
 
EOU
 
   my $rc = GetOptions(\%OPTIONS, 'v=s', 'n', 's');
   die $USAGE unless $rc;
   die $USAGE unless scalar @ARGV;
   my $file = $ARGV[0];
   -f $file or die "File '$file' does not exist!\n";
   my $namespace = $OPTIONS{n} || 0;
   my $schema = $OPTIONS{s} || 0;
   my $validate = $OPTIONS{v} || 'auto';
 
   if (uc($validate) eq 'ALWAYS') {
     $validate = $XML::Xerces::DOMParser::Val_Always;
   } elsif (uc($validate) eq 'NEVER') {
     $validate = $XML::Xerces::DOMParser::Val_Never;
   } elsif (uc($validate) eq 'AUTO') {
     $validate = $XML::Xerces::DOMParser::Val_Auto;
   } else {
     die("Unknown value for -v: $validate\n$USAGE");
   }
## END INIT ##
 

my $parser = XML::Xerces::DOMParser->new();
 
$parser->setValidationScheme ($validate);
$parser->setDoNamespaces ($namespace);
$parser->setCreateEntityReferenceNodes(1);
$parser->setDoSchema ($schema);
 
my $error_handler = XML::Xerces::PerlErrorHandler->new();
$parser->setErrorHandler($error_handler);
 
 
 
eval { $parser->parse(XML::Xerces::LocalFileInputSource->new($file)) };
myDie($@) if $@;
 
my $doc = $parser->getDocument();
my $rootEl = $doc->getDocumentElement();
DOM_traverse($rootEl, \&printPretty);
 
exit(0);
 

sub DOM_traverse
{
    my ($node, $subr, $dpth) = @_;
 
    $dpth ||= 0;
 
    $subr->($node, $dpth) if ($subr);
 
    my $kid = $node->getFirstChild;
 
    while ( not $kid->isNull )
    {
        DOM_traverse($kid, $subr, $dpth + 1);
        $kid = $kid->getNextSibling();
    }
}
 

sub printPretty
{
    my ($node, $dpth) = @_;
 
    # return if empty text node
    return if ($node->getNodeName eq "#text" &&
               $node->getNodeValue =~ /^\s*$/);
 
    my $indent = ("   " x $dpth);
 
    # print node name and value
    print ($indent.$node->getNodeName."=".$node->getNodeValue."\n");
 
    # return unless an element-node (cuz only elements have attributes)
    return unless $node->isa('XML::Xerces::DOM_Element');
 
    my %attrs = $node->getAttributes->to_hash();
 
    return unless (%attrs); # return if no attributes
 
    print "$indent <";
    while (my ($name, $value) = each %attrs) {
        print " $name=\"$value\"";
    }
    print " >\n";
}
 

sub myDie
{
    my $err = shift;
 
    ref($err) and die $err->getMessage();
 
    die $err;
}
__END__
 
 
 
----- Original Message -----
From: Pam Gage
Sent: Thursday, April 18, 2002 8:24 PM
Subject: Need help traversing DOM tree with xerces in Perl

I know there's got to be a way to do this, but I am
having a dismal time trying to just print each node name
for an xml document.  Can someone please help?  My program
is breaking at the point marked by ***.  It can't find auto? 
What does this mean?

Environment:  Windows NT
              ActiveState v5.6.1
              Xerces v1.3.3 (also from ActiveState)

Also, if someone has an example of doing a tree traversal, it
would be very helpful.  I have done this many times in Java,
but cannot seem to get the Perl equivalent working.  I am not
very familiar with "object oriented" Perl.

Here's my program.  Any help is appreciated.

PG

--------------------

#!c:/ActiveState/bin/Perl -w

use XML::DOM;
use XML::Xerces;

$fullpath = "test.xml";

my $parser = new XML::Xerces::DOMParser;

$parser->setDoValidation(1);

$parser->parse(XML::Xerces::LocalFileInputSource->new($fullpath));

my $doc = $parser->getDocument();

# Get the root node
my $nodes = $doc->getElementsByTagName("component");
my $len =  $nodes->getLength;

# Loop through children and print their children
for($i = 0; $i < $len; $i++) {
    &printNode($doc, $nodes->item($i));
}

# Recursively go through the rest of the children
sub printNode {
    my($local_doc) = $_[0];
    my($pNode)     = $_[1];
   
    # Print data from current node
    my($name) = $pNode->getNodeName();
    print "Name: $name\n";

    my($cNodes) = $pNode->getChildNodes();
    my($clen) = $cNodes->getLength;

    for($j = 0; $j < $clen; $j++) {
        # *** Code breaks here.  Error is:
        # Can't locate auto/XML/Xerces/DOM_Text/item.al in @INC (@INC contains:
        # C:/ActiveState/lib C:/ActiveState/site/lib .) at print.pl line 39
        if($cNodes->item($j)->getNodeType() == $XML::Xerces::DOM_NODE::ELEMENT_NODE) {
            &printNode($local_doc, $cNodes->item($j));
        }
    }
}

exit(0);
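
A note on the error above: the message says Perl looked for a method named item in the class XML::Xerces::DOM_Text, so at that point $cNodes held a text node rather than a node list.  One possible cause, offered here as an assumption and not confirmed anywhere in this thread, is that getChildNodes is context-sensitive and that the parentheses in my($cNodes) = ... put the call in list context, so only the first child was kept.  The sketch below shows that general Perl behaviour with a hypothetical get_children routine; it is not the XML::Xerces source.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical context-sensitive accessor (NOT taken from XML::Xerces):
# returns the children as a list in list context, or a single array
# reference in scalar context.
sub get_children {
    my @kids = ('text-node', 'element-node');
    return wantarray ? @kids : \@kids;
}

my ($first) = get_children();   # parentheses force list context
my $all     = get_children();   # no parentheses: scalar context

print "list context kept only: $first\n";
print "scalar context holds:   ", scalar(@$all), " children\n";
```

If that is what is happening, writing my $cNodes = $pNode->getChildNodes(); without the parentheses would keep the node list.  Separately, $i and $j in the program above are package globals, so each recursive call to printNode overwrites the caller's loop counter; declaring them with my inside the for loops avoids that.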
