Stefan, congratulations. This is definitely useful.

Please talk a bit about the API and how it differs from cElementTree, or link to some examples: for instance, the node nesting and the use of a 'tail' for trailing text. I wonder whether lxml offers more DOM-compliant node nesting, or whether it follows the conventions and oddities of ElementTree.

Also, please show us how it differs from BeautifulSoup, which has extremely robust Unicode handling and completion of mangled XML/HTML tags, but may benchmark a bit slower.

Thanks again, and good job!

Gloria
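
To make the question about 'tail' concrete, here is a minimal sketch of the convention as ElementTree defines it. It uses the standard library's xml.etree.ElementTree rather than lxml itself, so whether lxml.etree matches it in every detail is an assumption to check against the lxml documentation. The sample document is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Mixed content: text before, inside, and after the <b> child.
root = ET.fromstring("<p>Hello <b>big</b> world</p>")
b = root[0]

leading = root.text   # text before the first child of <p>
inner = b.text        # text inside <b>
trailing = b.tail     # text after </b>, stored on the <b> element itself

# Reassemble the full text of <p> in document order.
full_text = "".join(root.itertext())

print(repr(leading), repr(inner), repr(trailing), repr(full_text))
```

In a DOM tree, " world" would be a separate text node that is a child of <p>. Here <p> has exactly one child, and the trailing text hangs off that child as its tail.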
> Hi all,
>
> I'm proudly announcing the first alpha release of lxml 2.0.
>
> http://codespeak.net/lxml/dev/
> http://pypi.python.org/pypi/lxml/2.0alpha1
>
> ** What is lxml?
>
> """
> In short: lxml is the most feature-rich and easy-to-use library for working
> with XML and HTML in the Python language.
>
> lxml is a Pythonic binding for the libxml2 and libxslt libraries. It is unique
> in that it combines the speed and feature completeness of these libraries with
> the simplicity of a native Python API.
> """
>
> This release features a major cleanup both behind the scenes and at the
> surface, that improves the XML tool integration and makes the API clearer and
> more consistent in many places. The major new addition, however, is the
> lxml.html package, a new toolkit for HTML handling.
>
> The web site for the pre-2.0 series is online at
>
> http://codespeak.net/lxml/dev/
>
> The "what's new" page has a description of the major changes:
>
> http://codespeak.net/lxml/dev/lxml2.html
>
> and the ChangeLog has a more detailed list, see below.
>
> This being an alpha release means that not everything is stable, both in terms
> of crashes and the API. There will be a small number of alpha releases to make
> the advancements publicly available, before the beta releases focus on
> improving the stability.
>
> I warmly invite everyone to contribute to the final release by discussing the
> API changes and the new features on the mailing list. There is always space
> for improvements!
>
> There is currently a known problem with Microsoft's compilers, so Windows
> builds may not become available for 2.0alpha1. The next alpha will hopefully
> come with prebuilt binaries for that platform. Building with the more
> standards compliant MinGW compilers should work.
>
> Note that working on the code now requires Cython (version 0.9.6.5), an
> enhanced fork of Pyrex. lxml therefore no longer ships with a copy of Pyrex
> or Cython, but as usual, building from the distribution sources does not
> require Cython. It can be installed with "easy_install Cython" or from here:
>
> http://www.cython.org/
>
> I hope that lxml 2.0 will become a straight continuation of the success story
> that lxml 1.x was already.
>
> Have fun,
> Stefan
>
>
> 2.0alpha1 (2007-09-02)
> Features added
>
>   * Reimplemented objectify.E for better performance and improved
>     integration with objectify. Provides extended type support based on
>     registered PyTypes.
>   * XSLT objects now support deep copying
>   * New makeSubElement() C-API function that allows creating a new
>     subelement straight with text, tail and attributes.
>   * XPath extension functions can now access the current context node
>     (context.context_node) and use a context dictionary
>     (context.eval_context) from the context provided in their first
>     parameter
>   * HTML tag soup parser based on BeautifulSoup in lxml.html.ElementSoup
>   * New module lxml.doctestcompare by Ian Bicking for writing simplified
>     doctests based on XML/HTML output. Use by importing lxml.usedoctest or
>     lxml.html.usedoctest from within a doctest.
>   * New module lxml.cssselect by Ian Bicking for selecting Elements with
>     CSS selectors.
>   * New package lxml.html written by Ian Bicking for advanced HTML
>     treatment.
>   * Namespace class setup is now local to the ElementNamespaceClassLookup
>     instance and no longer global.
>   * Schematron validation (incomplete in libxml2)
>   * Additional stringify argument to objectify.PyType() takes a conversion
>     function to strings to support setting text values from arbitrary types.
>   * Entity support through an Entity factory and element classes. XML
>     parsers now have a resolve_entities keyword argument that can be set to
>     False to keep entities in the document.
>   * column field on error log entries to accompany the line field
>   * Error specific messages in XPath parsing and evaluation
>     NOTE: for evaluation errors, you will now get an XPathEvalError instead
>     of an XPathSyntaxError. To catch both, you can except on XPathError.
>   * The regular expression functions in XPath now support passing a node-set
>     instead of a string
>   * Extended type annotation in objectify: new xsiannotate() function
>   * EXSLT RegExp support in standard XPath (not only XSLT)
>
> Bugs fixed
>
>   * lxml.etree did not check tag/attribute names
>   * The XML parser did not report undefined entities as error
>   * The text in exceptions raised by XML parsers, validators and XPath
>     evaluators now reports the first error that occurred instead of the last
>   * Passing '' as XPath namespace prefix did not raise an error
>   * Thread safety in XPath evaluators
>
> Other changes
>
>   * objectify.PyType for None is now called "NoneType"
>   * el.getiterator() renamed to el.iter(), following ElementTree 1.3 -
>     original name is still available as alias
>   * In the public C-API, findOrBuildNodeNs() was replaced by the more
>     generic findOrBuildNodeNsPrefix
>   * Major refactoring in XPath/XSLT extension function code
>   * Network access in parsers disabled by default
>
>
> _______________________________________________
> XML-SIG maillist - XML-SIG@python.org
> http://mail.python.org/mailman/listinfo/xml-sig