Daniel Veillard wrote:
> On Tue, Dec 04, 2007 at 06:19:53PM +0100, Roland Mainz wrote:
> > I am currently working on SAX/xmlSAXParseFile libxml2 bindings for
> > ksh93/kash and have a few questions about the API:
> > - Is there a way to provide a "default encoding setting" which should be
> > used if the document itself doesn't define a character set ?
> 
>   I don't understand the question. The XML standard defines how things
> should be checked in the absence of information in the document. If you
> provide that information it overrides the normal processing (and if you
> guess wrong you get a fatal error).
> More doc
>   http://xmlsoft.org/encoding.html
>   http://www.w3.org/TR/REC-xml/#sec-guessing

Erm... the issue is a bit POSIX shell specific. A POSIX shell always
operates on "character boundaries in the current locale". If we allow
something like
$ echo "<?xml version="1.0..." | xmlsaxparse myfunctions -
then the input data will be in the current locale's encoding. There are
three options: either we assume that only ASCII data are passed to the
SAX parser, or the shell script code needs to look up the current
locale's encoding and pass it with the XML document, or the
"xmlsaxparse" shell builtin command needs to handle the situation
somehow (e.g. convert input data to the expected encoding).
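
A minimal sketch of the second option (looking up the current locale's encoding), assuming a POSIX system that provides nl_langinfo(). The helper name locale_codeset() is hypothetical; it is not part of libxml2 or ksh93.

```c
#include <langinfo.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy the current locale's codeset name into buf.
 * Returns 0 on success, -1 if the name is unavailable or does not fit. */
int locale_codeset(char *buf, size_t buflen)
{
    const char *cs;

    /* Adopt the character-type locale from the environment
     * (LC_ALL, LC_CTYPE, LANG). If that fails, the "C" locale stays. */
    setlocale(LC_CTYPE, "");

    cs = nl_langinfo(CODESET);
    if (cs == NULL || *cs == '\0' || strlen(cs) >= buflen)
        return -1;

    strcpy(buf, cs);
    return 0;
}
```

The resulting name could then be handed to the parser, for example through the encoding parameter of xmlReadFd(). Note that some platforms report names such as "ANSI_X3.4-1968" for plain ASCII, so a mapping to names the converter understands may still be needed.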

> > - How can I turn-off the libxml2 feature that it resolves all entities
> > (e.g. how can I do my own entity resolving) ?
> 
>   By default in SAX mode if you don't ask for entity replacement I
> think libxml lets you provide it (see the entity callback). NOTE:
> this is hairy, complex to get right in the general case, and one reason
> I recommend not using SAX at all.

What would you recommend using instead?

> > - How can I abort a SAX parser run from within a callback function ?
> 
>   xmlStopParser()

Thanks! :-)

> > - Is there a way to get |xmlSAXParseFile()| to accept stdin as input to
> > allow its use in pipe chains ?
> 
>   "-"

Somehow it seems it doesn't like the pipe input much... the libxml2 code
sometimes prints stuff like:
-- snip --
I/O error : Invalid seek
I/O error : Invalid seek
I/O error : Invalid seek
I/O error : Invalid seek
-- snip --

> > - Is |xmlSAXParseFile()| fully thread-safe (important since ksh93 will
> > get thread support soon) ?
> 
> doc here
>   http://xmlsoft.org/threads.html

Thanks! :-)

> > - Is |xmlSAXParseFile()| re-entrant, e.g. can I call |xmlSAXParseFile()|
> > from within a callback during another |xmlSAXParseFile()| ? For example
> 
>   That should work; the parsing contexts are different.

Ok...

> > for a simple RSS browser I would have to call |xmlSAXParseFile()| to
> > decode the RSS stream and then |xmlSAXParseFile()| a 2nd time (from
> > within a callback) to decode the XHTML data (I've already tried but
> > something weird is going on: somehow the '<' and '>' characters seem to
> > "disappear" from the character data stream).
> 
>   No idea, sounds weird.

Yes... even weirder, the error sometimes disappears and then comes
back - sounds like a job for "dbx -check access" or "valgrind" ...

----

Bye,
Roland

-- 
  __ .  . __
 (o.\ \/ /.o) [EMAIL PROTECTED]
  \__\/\/__/  MPEG specialist, C&&JAVA&&Sun&&Unix programmer
  /O /==\ O\  TEL +49 641 7950090
 (;O/ \/ \O;)
_______________________________________________
xml mailing list, project page  http://xmlsoft.org/
[email protected]
http://mail.gnome.org/mailman/listinfo/xml
