Author: jasons
Date: Mon Nov 10 10:12:02 2003
New Revision: 110

Modified:
   xml/xerces-p/trunk/docs/readme.xml
Log:
up to date
Modified: xml/xerces-p/trunk/docs/readme.xml
==============================================================================
--- xml/xerces-p/trunk/docs/readme.xml (original)
+++ xml/xerces-p/trunk/docs/readme.xml Mon Nov 10 10:12:02 2003
@@ -1,51 +1,361 @@
 <?xml version="1.0" standalone="no"?>
-<!DOCTYPE s1 SYSTEM "./dtd/document.dtd">
-
-<s1 title="&XercesPFullName;">
-
- <s2 title="&XercesPFullName; version &XercesPVersion;">
-
-<p>&XercesPName; delivers the benefits of the &XercesCName; DOM Parser in Perl5.
-&XercesPName; includes a collection of Perl5 wrapper objects that internally use
-their &XercesCName; counterparts for high-performance, scalable and localizable
-XML DOM parsing.</p>
-</s2>
-
- <s2 title="Applications of &XercesPName;">
-
-<p>&XercesPName; has rich generating and validating capabilities. The parser is used for:</p>
-
-<ul>
- <li>Building XML-savvy Web servers</li>
- <li>Building next generation of vertical applications that use XML as
- their data format</li>
- <li>On-the-fly validation for creating XML editors</li>
- <li>Ensuring the integrity of e-business data expressed in XML</li>
- <li>Building truly internationalized XML applications</li>
-</ul>
- </s2>
-
- <s2 title="Features">
-<ul>
-<li>Programmatic generation, validation of XML</li>
-<li>Conforms to DOM Level 1 Spec</li>
-<li>High performance</li>
-<li>Customizable error handling</li>
-<li>Source code, samples, and docs provided</li>
-<li>All the underlying benefits of &XercesCName;</li>
-</ul>
- </s2>
+<!DOCTYPE s1 SYSTEM "dtd/document.dtd" [
+<!ENTITY % VERSION SYSTEM "entities.ent">
+%VERSION;
+]>
+<s1 title="Xerces Perl: The Perl API to the Apache Xerces XML parser">
+ <s2 title="Current Release: &XERCES_P_NAME; &XERCES_P_VERSION;">
+ <p>
+&XERCES_P_NAME; is the Perl API to the Apache project's Xerces XML
+parser. It is implemented using the Xerces C++ API, and it provides
+access to <em>most</em> of the C++ API from Perl.
+ </p>
+ <p>
+Because it is based on &XERCES_C_NAME;, &XERCES_P_NAME; provides a
+validating XML parser that makes it easy to give your application the
+ability to read and write XML data. Classes are provided for parsing,
+generating, manipulating, and validating XML documents. &XERCES_P_NAME;
+is faithful to the XML 1.0 recommendation and associated standards
+(DOM levels 1, 2, and 3, SAX 1 and 2, Namespaces, and W3C XML
+Schema). The parser offers high performance, modularity, and
+scalability, and fully supports Unicode.
+ </p>
+ <p>
+&XERCES_P_NAME; implements the vast majority of the Xerces-C API (if
+you notice any discrepancies, please mail the <jump href="mailto:&XERCES_P_LIST;">
+list</jump>). The exceptions are functions in the C++ API that
+either have better Perl counterparts (such as file I/O) or
+manipulate internal C++ information that has no role in the Perl
+module.
+ </p>
+ <p>
+The majority of the API is generated automatically using the
+<jump href="http://www.swig.org/">Simplified Wrapper and Interface
+Generator (SWIG)</jump>. However, care has been taken to make most
+method invocations natural to Perl programmers, so a number of rough
+C++ edges have been smoothed over (see the <link
+anchor='perl-api'>Special Perl API Features</link> section).
+ </p>
+ </s2>
+ <s2 title="Available Platforms">
+ <p>
+The code has been tested on the following platforms:
+ </p>
+ <ul>
+ <li>Linux</li>
+ <li>Cygwin</li>
+ <li>Windows</li>
+ <li>Mac OS X</li>
+ <li>BSD</li>
+ <li>Solaris</li>
+ <li>AIX</li>
+ <li>Tru64</li>
+ </ul>
+ </s2>
+ <s2 title="Build Requirements">
+ <s3 title="ANSI C++ compiler">
+ <p>Builds are known to work with the GNU C compiler, and other
+ platform-specific compilers (such as VC++ on Windows and Forte on
+ Solaris). Contributions in this area are always welcome :-).
+ </p>
+ </s3>
+ <s3 title="Perl5">
+ <note>Required version: 5.6.0</note>
+ <p>&XERCES_P_NAME; now supports Unicode.
Since Unicode support wasn't
+ added to Perl until 5.6.0, you will need to upgrade in order to use this
+ and future versions of &XERCES_P_NAME;. Upgrading to at least
+ 5.6.1 is recommended.
+ </p>
+ <p>If you plan on using Unicode, I <em>strongly</em> recommend upgrading
+ to Perl-5.8.x, the latest stable version. There have been significant
+ improvements to Perl's Unicode support.
+ </p>
+ </s3>
+ <s3 title="The Apache Xerces C++ XML Parser">
+ <note>Required version: &XERCES_C_VERSION;</note>
+ <p>(which can be downloaded from <jump href="http://www.apache.org/dist/xml/xerces-c/">
+ the Apache archive</jump>). You'll need both the library and header files,
+ and you must set any environment variables that direct the
+ &XERCES_P_NAME; build to the directories where these reside.
+ </p>
+ </s3>
+ </s2>
+ <s2 title="Prepare for the build">
+ <s3 title="Download &XERCES_P_NAME;">
+ <p>Download the release and its digital signature from <jump href="http://xml.apache.org/dist/xerces-p/stable">
+ the Apache Xerces-Perl archive</jump>.
+ </p>
+ </s3>
+ <s3 title="Verify the archive">
+ <p>Optionally verify the release using the supplied digital signature (see
+ <jump href="http://xml.apache.org/xerces-p/download.html">the Apache
+ Xerces-Perl archive</jump> for details).
+ </p>
+ </s3>
+ <s3 title="Unpack the archive">
+ <p>Unpack the archive in a directory of your choice. Example
+ (for UNIX):
+ </p>
+ <ul>
+ <li><code>tar zxvf XML-Xerces-&XERCES_P_VERSION;.tar.gz</code></li>
+ <li><code>cd XML-Xerces-&XERCES_P_VERSION;</code></li>
+ </ul>
+ </s3>
+ <s3 title="Getting &XERCES_C_NAME;">
+ <p>If the Xerces-C library and header files are installed on your system
+ directly, e.g. via an rpm or deb package, proceed to the directions for
+ building &XERCES_P_NAME;.
+ </p>
+ <p>Otherwise, you must download &XERCES_C_NAME; from www.apache.org.
If
+ there is a binary available for your architecture, you may use it;
+ otherwise you must build it from source. If you wish to make
+ &XERCES_C_NAME; available to other applications, you may install it;
+ however, it is not necessary to do so in order to build &XERCES_P_NAME;.
+ To build &XERCES_P_NAME; from an uninstalled &XERCES_C_NAME;, set the
+ XERCESCROOT environment variable to the top-level directory of the source
+ tree (i.e. the same value it must have to build &XERCES_C_NAME;):
+ </p>
+ <source><![CDATA[
+ export XERCESCROOT=/home/jasons/xerces-2.3.0/
+ ]]></source>
- <s2 title="Supported Platforms">
+ <p>If you choose to install &XERCES_C_NAME; on your system, you
+ need to set the XERCES_INCLUDE and XERCES_LIB environment variables:
+ </p>
+ <source><![CDATA[
+ export XERCES_INCLUDE=/usr/include/xerces
+ export XERCES_LIB=/usr/lib
+ ]]></source>
+ </s3>
+ </s2>
+ <s2 title="Build &XERCES_P_NAME;">
+ <ol>
+ <li>Go to the XML-Xerces-&XERCES_P_VERSION; directory.</li>
+ <li>Build &XERCES_P_NAME; as you would any Perl package that you
+ might get from CPAN:</li>
+ <ul>
+ <li><code>perl Makefile.PL</code></li>
+ <li><code>make</code></li>
+ <li><code>make test</code></li>
+ <li><code>make install</code></li>
+ </ul>
+ </ol>
+ </s2>
+ <s2 title="Using &XERCES_P_NAME;">
+ <p>&XERCES_P_NAME; implements the vast majority of the Xerces-C API (if you
+ notice any discrepancies, please mail the list). Documentation for this API
+ is sadly not available in POD format, but the Xerces-C HTML documentation
+ is available <jump href="http://xml.apache.org/xerces-c/apiDocs/index.html">online</jump>.
+ </p>
+ <p>For more information, see the examples in the samples/ directory
+ and the test scripts located in the t/ directory.
+ </p>
+ </s2>
+ <s2 title="Special Perl API Features">
+ <p>Even though &XERCES_P_NAME; is based on the C++ API, it has been modified
+ in a few ways to make it more accessible to typical Perl usage, primarily in
+ the handling of:
+ </p>
+ <p><anchor name="perl-api"/></p>
 <ul>
- <li>Win32 (MSVC 6.0 compiler)</li>
+ <li><link anchor="string">String I/O</link> (Perl strings versus XMLCh arrays)</li>
+ <li><link anchor="list">List I/O</link> (Perl lists versus DOMNodeLists)</li>
+ <li><link anchor="hash">Hash I/O</link> (Perl hashes versus DOMNamedNodeMaps)</li>
+ <li><link anchor="list-hash-io">Combined List/Hash classes</link></li>
+ <li><link anchor="serialize">DOM Serialization API</link></li>
+ <li><link anchor="handlers">Implementing Perl handlers for C++ event callbacks</link></li>
+ <li><link anchor="exceptions">Handling C++ exceptions</link></li>
+ <li><link anchor="unicode-constants">XML::Xerces::XMLUni unicode constants</link></li>
 </ul>
- </s2>
+ <p><anchor name="string"/></p>
+ <s3 title="String I/O">
+ <p>Any functions in the C++ API that return <code>XMLCh</code> arrays will
+ return plain Perl strings in &XERCES_P_NAME;. This obviates calls
+ to <code>transcode</code> (in fact, it makes them entirely invalid).
+ </p>
+ </s3>
+ <p><anchor name="list"/></p>
+ <s3 title="List I/O">
+ <p>Any function that in the C++ API returns a <code>DOMNodeList</code>
+ (e.g. <code>getChildNodes()</code> and <code>getElementsByTagName()</code>)
+ returns a different type depending on whether it is called in a list
+ context or a scalar context. In a scalar context, these functions return a
+ reference to an <code>XML::Xerces::DOMNodeList</code>, just as in the C++
+ API. However, in a list context they return a Perl list of
+ <code>XML::Xerces::DOMNode</code> references.
For example:
+ </p>
+ <source><![CDATA[
+ # returns a reference to an XML::Xerces::DOMNodeList
+ my $node_list_ref = $doc->getElementsByTagName('foo');
+
+ # returns a list of XML::Xerces::DOMNode objects
+ my @node_list = $doc->getElementsByTagName('foo');
+ ]]></source>
+ </s3>
+ <p><anchor name="hash"/></p>
+ <s3 title="Hash I/O">
+ <p>Any function that in the C++ API returns a
+ <code>DOMNamedNodeMap</code> (<code>getEntities()</code> and
+ <code>getAttributes()</code>, for example) returns a different type
+ depending on whether it is called in a list context or a scalar context. In a scalar
+ context, these functions return a reference to an
+ <code>XML::Xerces::DOMNamedNodeMap</code>, just as in the C++ API. However,
+ in a list context they return a Perl hash. For example:
+ </p>
+ <source><![CDATA[
+ # returns a reference to an XML::Xerces::DOMNamedNodeMap
+ my $attr_map_ref = $element_node->getAttributes();
+
+ # returns a hash of the attributes
+ my %attrs = $element_node->getAttributes();
+ ]]></source>
+ </s3>
+ <p><anchor name="list-hash-io"/></p>
+ <s3 title="Combined List/Hash classes (XMLAttDefList)">
+ <p>Any function that in the C++ API returns an XMLAttDefList
+ (getAttDefList() for SchemaElementDecl and DTDElementDecl) will
+ always return an instance of XML::Xerces::XMLAttDefList. However,
+ there are two Perl-specific API methods that can be invoked on the
+ object: to_list() and to_hash().
+ </p>
+ <source><![CDATA[
+ # get the XML::Xerces::XMLAttDefList.
+ my $attr_list = $element_decl->getAttDefList();
- <s2 title="Platforms Coming Soon">
+ # returns a list of XML::Xerces::XMLAttDef instances
+ my @list = $attr_list->to_list();
+
+ # returns a hash of the attributes, where the keys are the
+ # result of calling getFullName() on the attributes, and the
+ # values are the XML::Xerces::XMLAttDef instances.
+ my %attrs = $attr_list->to_hash();
+ ]]></source>
+ </s3>
+ <p><anchor name="serialize"/></p>
+ <s3 title="Serialize API">
+ <p>The DOMWriter class is used for serializing DOM hierarchies. See
+ t/DOMWriter.t or <link idref="domprint">samples/DOMPrint.pl</link>
+ for details.
+ </p>
+ <p>For less complex usage, just use the serialize() method defined for all
+ DOMNode subclasses.
+ </p>
+ </s3>
+ <p><anchor name="handlers"/></p>
+ <s3 title="Implementing {Document,Content,Error}Handlers from Perl">
+ <p>Thanks to suggestions from Duncan Cameron, &XERCES_P_NAME; now has a
+ handler API that matches the currently used semantics of other Perl XML
+ APIs. There are three classes available for application writers:
+ </p>
+ <ul>
+ <li>PerlErrorHandler (SAX 1/2 and DOM 1)</li>
+ <li>PerlDocumentHandler (SAX 1)</li>
+ <li>PerlContentHandler (SAX 2)</li>
+ </ul>
+ <p>Using these classes is as simple as creating a Perl subclass of the
+ needed class and redefining any needed methods. For example, to override
+ the default fatal_error() method of the PerlErrorHandler class, we can
+ include this piece of code within our application:
+ </p>
+ <source><![CDATA[
+ package MyErrorHandler;
+ our @ISA = qw(XML::Xerces::PerlErrorHandler);
+ sub fatal_error {die "Oops, I got an error\n";}
+
+ package main;
+ my $dom = XML::Xerces::DOMParser->new();
+ $dom->setErrorHandler(MyErrorHandler->new());
+ ]]></source>
+ </s3>
+ <p><anchor name="exceptions"/></p>
+ <s3 title="Handling exceptions ({XML,DOM,SAX}Exception's)">
+ <p>Some errors occur outside parsing and are not caught by the parser's
+ ErrorHandler. &XERCES_P_NAME; provides a way for catching these errors
+ using the PerlExceptionHandler class.
Usually the following code
+ is enough for catching exceptions:
+ </p>
+ <source><![CDATA[
+ eval{$parser->parse($my_file)};
+ XML::Xerces::error($@) if $@;
+ ]]></source>
+ <p>Wrap any code that might throw an exception inside an eval{...} and
+ call XML::Xerces::error(), passing $@, if $@ is set.
+ </p>
+ <p>There is a default method that prints out an error message and calls
+ die(), but if more is needed, see the files t/XMLException.t,
+ t/SAXException.t, and t/DOMException.t for details on how to roll your own
+ handler.
+ </p>
+ </s3>
+ <p><anchor name="unicode-constants"/></p>
+ <s3 title="XML::Xerces::XMLUni unicode constants">
+ <p>XML::Xerces uses many constant values for setting features and
+ properties, such as for XML::Xerces::SAX2XMLReader::setFeature(). You can
+ hard-code the strings or integers into your programs, but this will make
+ them vulnerable to an API change. Instead, use the constants defined in
+ the XML::Xerces::XMLUni class. If the API changes, the constants will be
+ updated to reflect that change. See the file docs/UMLUni.txt for a
+ complete listing of the constant names and their values.
+ </p>
+ </s3>
+ </s2>
+ <s2 title="Sample Code">
+ <p>&XERCES_P_NAME; comes with a number of sample applications:
+ </p>
 <ul>
- <li>Linux (RedHat 6.0)</li>
+ <li><link idref="saxcount">SAXCount.pl</link>: Uses the SAX interface to
+ output a count of the number of elements in an XML document</li>
+ <li><link idref="sax2count">SAX2Count.pl</link>: Uses the SAX2 interface
+ to output a count of the number of elements in an XML document</li>
+ <li><link idref="domcount">DOMCount.pl</link>: Uses the DOM interface to
+ output a count of the number of elements in an XML document</li>
+ <li><link idref="domprint">DOMPrint.pl</link>: Uses the DOM interface to
+ output a pretty-printed version of an XML file to STDOUT</li>
+ <li><link idref="domcreate">DOMCreate.pl</link>: Creates a simple XML
+ document using the DOM interface and writes it to STDOUT</li>
+ <li><link idref="dom2hash">DOM2hash.pl</link>: Uses the DOM interface to
+ convert the file to a simple hash of lists representation</li>
+ <li><link idref="enumval">EnumVal.pl</link>: Parses an input XML document
+ and outputs the DTD information to STDOUT</li>
+ <li><link idref="senumval">SEnumVal.pl</link>: Parses an input XML document
+ and outputs the XML Schema information to STDOUT</li>
 </ul>
- </s2>
-
-</s1>
\ No newline at end of file
+ </s2>
+ <s2 title="Development Tools">
+ <note>These are only for internal &XERCES_P_NAME; development. If
+ your intention is solely to use &XERCES_P_NAME; to write XML
+ applications in Perl, you will <em>NOT</em> need these
+ tools.</note>
+ <s3 title="SWIG">
+ <p>
+<jump href="http://www.swig.org/">
+Simplified Wrapper and Interface Generator (SWIG)</jump> is an open-source
+tool by David Beazley of the University of Chicago for automatically
+generating Perl wrappers for C and C++ libraries (i.e. *.a or *.so for
+UNIX, *.dll for Windows). You can get the source from <jump href="http://www.swig.org/">
+the SWIG home page</jump> and then build it for your platform.
+ </p>
+ <p>
+You will only need this if the included Xerces.C and &XERCES_P_NAME;
+files do not work for your Perl distribution. The pre-generated files
+have been created by SWIG 1.3 and work under Perl-5.6 or later.
+ </p>
+ <p>
+This port will only work with SWIG 1.3.20 (which is currently only
+available via CVS).
+ </p>
+ <p>
+If you're planning to use SWIG, you can set the environment variable
+SWIG to the full path to the SWIG executable before running <code>perl
+Makefile.PL</code>. For example:
+ </p>
+ <source><![CDATA[
+ export SWIG=/usr/bin/swig
+ ]]></source>
+ <p>
+This is only necessary if SWIG isn't in your PATH or you have more than
+one version installed.
+ </p>
+ </s3>
+ </s2>
+</s1>