On Thu, Oct 18, 2012 at 04:35:25PM +0200, Graham Leggett wrote:
> Hi all,
> 
> I am currently tasked with replacing the expat parser within an application 
> with the more lenient html parser found in libxml2.
> 
> I am using the parser to work out the location within the document of certain 
> elements (tags), and once I have found the element I am looking for, I need 
> to know the offset of the element from the start of the document, and the 
> length of the element. These two bits of information are provided by expat in 
> XML_GetCurrentByteIndex() and XML_GetCurrentByteCount() respectively.
> 
> I am struggling to find equivalents of these functions inside libxml2.
> 
> I can see inside the parser structures, but I cannot find a clear explanation 
> as to what the fields in those structures represent, and what kind of maths I 
> would need to do on them to derive the two bits of information I am looking 
> for.
> 
> Is there an API call that I should be using for this? Failing that, which 
> fields of the parser should I be looking at to calculate this information?

  See xmlByteConsumed(), but note that this is more complex for libxml2
than for expat, because we convert the initial byte stream to UTF-8 if it
was in a different encoding. See the xmlByteConsumed() code for the
details. I don't understand what "the length of the element" is supposed
to mean.

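To illustrate why the conversion matters: once non-ASCII characters appear,
a byte offset measured in the converted UTF-8 buffer no longer equals the
offset in the original document. The following is only a minimal sketch,
assuming the original input was ISO-8859-1. latin1_offset_from_utf8() is a
hypothetical helper written for this example; it is not a libxml2 function
and it is not how xmlByteConsumed() is implemented.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of libxml2).
 *
 * Given the first utf8_len bytes of a buffer that was converted from
 * ISO-8859-1 to UTF-8, return the number of bytes of the original
 * ISO-8859-1 input that those bytes correspond to.
 *
 * Assumption: every ISO-8859-1 character is exactly one byte in the
 * original, so the original offset equals the number of characters
 * that start within the UTF-8 prefix.
 */
size_t latin1_offset_from_utf8(const unsigned char *utf8, size_t utf8_len)
{
    size_t chars = 0;
    size_t i;

    assert(utf8 != NULL || utf8_len == 0);

    for (i = 0; i < utf8_len; i++) {
        /* Continuation bytes have the form 10xxxxxx; any other byte
         * starts a new character. */
        if ((utf8[i] & 0xC0) != 0x80)
            chars++;
    }
    return chars;
}
```

For the document <p>café</p><b>x</b>, the <b> tag starts at byte 12 of the
UTF-8 buffer, because é takes two bytes there. Passing that buffer and 12 to
the helper returns 11, the offset of <b> in the ISO-8859-1 original. Other
source encodings would need a different mapping.
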
Daniel

-- 
Daniel Veillard      | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
[email protected]  | Rpmfind RPM search engine http://rpmfind.net/
http://veillard.com/ | virtualization library  http://libvirt.org/
