On Sun, Dec 23, 2007 at 06:08:44PM +0100, Bjorn Reese wrote:
> Now that the first working draft of EXI [1] has been released, I was
> wondering how people (especially Daniel) feel about adding it to libxml.
> Read the primer for a quick overview of EXI.

  Hi Bjorn,

I knew the question would land on the list at some point, and that I
would have a hard time answering it :-)

> By and large I think that EXI is a good format, where they have managed
> to address many different concerns with a simple design. The major

  Well, my main issue is to make clear that EXI IS NOT XML. As a member
of the W3C XML WG I have gone on record about the worries we have with
EXI. They should call it EBI, for something like "efficient binary
interface", but it's not XML, it's not markup, it's not text.
  I also have a huge issue with the 'pluggable codec' part:
    http://lists.w3.org/Archives/Public/public-xml-core-wg/2007Aug/0028.html
which sounds a lot like polluting a perfectly open-to-all standard with
the same kind of problems I see in my daily use of the Web (for example,
none of the Flash-infected web sites work correctly in my browser, as the
vendor of the proprietary solution didn't care to provide support for my
64-bit platform). To be perfectly frank, I really don't want the next
generation of web platform implementors to jump on the easy excuse of a
potential restriction in some edge case to control and put a toll on my
use of the web. This sounds just too much like common thinking nowadays
(see also the crap about DRM). I really hope the W3C membership, or
ultimately Tim Berners-Lee, will block something like pluggable codecs.
It simply doesn't have its place in something like a W3C specification
(cf. the motto about the full potential of the Web).

  Now that I have expressed my concerns about the content of the spec,
we can look separately at any libxml2 implementation.
I have a few more concerns there:
  - those are first working draft specifications. I know how long it
    takes to finish such a spec even when there is no controversy about
    it; for something like EXI it may take a couple of years before you
    get a finished version (if any), and being an early implementor
    usually just brings you more pain, e.g. XPointer, where I implemented
    the full early spec and only a tiny, near-useless fraction ended up
    as a REC.
  - who would use it? I mean, EXI targets very specialized domains like
    embedded or specific processing. Would those people actually use a
    libxml2 version, when the point of the library is generic usage and
    its size and portability designs probably don't match the specific
    requirements of those use cases?
An implementation just for the sake of being able to claim the existence
of a widely distributed early implementation doesn't sound to me like a
good reason to put EXI in libxml2.

  So now that I'm done expressing my doubts about it, let's see the
technical points which an implementation in libxml2 would raise :-)

> issues that I came across are:
>
> First, the EXI implementation should be an independent parser (and
> generator) front-end. It should emanate SAX events, so that we can
> seamlessly use SAX, DOM, and xmlReader for EXI documents. Hopefully
> this will also allow us to use all the other XML technologies (XPath,
> XML Schema, XSLT, etc.) that libxml supports. I do not know the details
> of libxml well enough to evaluate if this is indeed the case.

  Yes, as for the HTML parser, the right thing to do is to plug in at
the SAX level to allow a flow of events, possibly connecting to the tree
and reader APIs. Note also that a read-only interface is really not
sufficient: you want to be able to save. If you can't round-trip, it's
really of limited use, or an indication of a serious problem.

> Second, EXI has some built-in datatypes that are like the XML Schema
> datatypes. Obviously, some code should be reused here.

  Assuming the types are really compatible, yes. I just find it crazy to
mix layers like this, but again it's a spec concern, less of an
implementation one ... except for the fact that if you compile Schemas
support in, the library size grows a lot.

> Third, EXI supports a schema-informed grammar, which means that it can
> use information found in XML Schemas, RELAX NG schemas, or DTDs to
> create a more compact EXI document. Although the schema-informed grammar
> is independent on the various schemas (XML Schema, RELAX NG, DTD), it
> eventually has to be populated by those schemas, so it will create some
> kind of dependency to these parts of libxml.

  Yes. Also note that the validation parts of libxml2, and especially
the regexp/automata support, are really built for validation, far less
for introspection. This may present a challenge (but I'm not sure).

> Fourth, EXI allows (but does not mandate) the support of user-defined
> CODECS for encoding and decoding contents. As this is optional, I have
> not looked further into that, but obviously it should be considered if
> and how this should be supported by libxml.

  That is the part I have a big grievance with, as previously explained.
It's probably too early to look at this from a technical viewpoint, as I
think it will take time to settle down from a standard/political one ;-)

  I hope I don't sound too negative, but I have a hard time being
convinced by EXI myself. On the other hand, libxml2 development should be
driven by user demand, and to some extent my participation in the XML
Core group is itself as a representative of the libxml2 community. So if
others could chime in, it would be a good idea. Also we have IMHO plenty
of time: it's not like EXI is about to become a REC, this is just a first
draft with all the associated uncertainties about its content or
schedule.
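
  To make the SAX-level plugging discussed above a bit more concrete,
here is a minimal sketch. It is written in Python with the standard
xml.sax module rather than against libxml2's C API, and everything in it
is an assumption for illustration: the START/END/TEXT token codes and the
emit_events, Recorder, to_xml and round_trip names are hypothetical, and
the token list merely stands in for the output of a binary decoder. It is
not the EXI encoding.

```python
import io
from xml.sax.handler import ContentHandler
from xml.sax.saxutils import XMLGenerator
from xml.sax.xmlreader import AttributesImpl

# Hypothetical token codes for an already-decoded stream (NOT real EXI).
START, END, TEXT = 0, 1, 2


def emit_events(tokens, handler):
    """Front-end: replay decoded tokens as SAX events on any handler."""
    handler.startDocument()
    for kind, value in tokens:
        if kind == START:
            handler.startElement(value, AttributesImpl({}))
        elif kind == END:
            handler.endElement(value)
        elif kind == TEXT:
            handler.characters(value)
        else:
            raise ValueError("unknown token kind: %r" % (kind,))
    handler.endDocument()


class Recorder(ContentHandler):
    """Consumer that records events; stands in for a tree builder."""

    def __init__(self):
        super().__init__()
        self.tokens = []

    def startElement(self, name, attrs):
        self.tokens.append((START, name))

    def endElement(self, name):
        self.tokens.append((END, name))

    def characters(self, content):
        self.tokens.append((TEXT, content))


def to_xml(tokens):
    """Save path: feed the same events to a text XML serializer."""
    out = io.StringIO()
    emit_events(tokens, XMLGenerator(out, encoding="utf-8"))
    return out.getvalue()


def round_trip(tokens):
    """Check path: events in, the same tokens back out."""
    recorder = Recorder()
    emit_events(tokens, recorder)
    return recorder.tokens
```

  The point of the design is that the front-end only knows about the
event interface, so the same event flow can feed a serializer, a tree
builder or a reader without any of them knowing where the events came
from. Comparing round_trip(tokens) with the input is a crude version of
the round-trip check mentioned above.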

  Thanks Bjorn for raising the issue, even if this may not be a very
simple one :-)

Daniel

-- 
Red Hat Virtualization group http://redhat.com/virtualization/
Daniel Veillard      | virtualization library  http://libvirt.org/
[EMAIL PROTECTED]    | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine  http://rpmfind.net/
