I'm a Python programmer, and your whole approach to strings/Unicode needs help.
The encoding issue you have isn't due to the library but rather to coder error.
If you want to jump on IRC I can talk you through the issues.
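
As a rough illustration of the text-versus-bytes distinction (a sketch only, not a diagnosis of the output in the thread; the helper name `describe` and the sample string are made up for illustration, and Python 3 is assumed), it helps to look at what is actually in the string and to decode bytes with the codec they were written in:

```python
import unicodedata


def describe(text):
    """Return (code point, Unicode name) pairs for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")) for ch in text]


# The tag name from the sample XML, written with an escape so the example
# does not depend on the encoding of this source file.
# U+021B is "t with comma below".
tag = "jude\u021b"

# Text (str) and its UTF-8 serialization (bytes) are different things.
utf8_bytes = tag.encode("utf-8")

# Decoding with the right codec round-trips; decoding with the wrong one
# silently produces mojibake instead of raising an error.
round_trip = utf8_bytes.decode("utf-8")
mojibake = utf8_bytes.decode("latin-1")

if __name__ == "__main__":
    for code_point, name in describe(tag):
        print(code_point, name)
    print(len(tag), "characters,", len(utf8_bytes), "bytes")
    print(round_trip == tag, mojibake == tag)
```

The general habit this points at: decode bytes to text once at the input boundary, work with text in between, and encode once on output.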

On Sun, Jul 14, 2013 at 4:23 PM, Strainu <[email protected]> wrote:

> 2013/7/14 MZMcBride <[email protected]>:
> > Strainu wrote:
> >>I'm trying to parse the following XML (abridged for brevity):
> >>
> >><?xml version="1.0" encoding="UTF-8"?>
> >><județ>
> >>  <siruta>47</siruta>
> >>  <nume>Județul Bacău</nume>
> >></județ>
> >>
> >>Every validator I've tried marks an error on the ț in the tag named
> >>județ.
> >
> > Hi.
> >
> > This list is a fine place to ask. :-)
>
> Hi,
>
> >
> > Are you having trouble with validation or parsing? Validators can simply
> > be wrong. Which validators are you using? And which parsers are you
> > using?
>
> I'm having trouble with both. I used the W3C validator [1], which
> wasn't designed for arbitrary XML files but can still find a good number
> of errors, and xmlvalidation.com [2]. On the parsing side, I tried
> Python's lxml; the output is available at [3].
> >
> > Can you be more specific about what you're trying to do (feel free to
> > link to or include sample code) and the tools you're trying to do it
> > with?
>
> Well, I have a PHP website which gathers public data about Romania's
> administrative units, which I then try to export in
> programming-friendly formats (CSV, JSON, XML). The workflow is:
> extract the data from the database, put it in a PHP array, then use
> this array to generate all the output formats. You have an example of
> such an array at [4] (since my initial email I've worked around the
> diacritics problem, but I'm still searching for a solution). For
> converting to XML I have a custom array_walk function [5].
>
> I know that some potential reusers are heavy XML fans, so I wanted to
> give them an easy way to reuse the data. Having the XML tags/JSON keys
> with diacritics is not a must-have, but is definitely a very nice
> feature, because those keys could be used directly as labels when
> printing the data somewhere.
>
> Regards,
>    Strainu
>
>
>
> [1] http://validator.w3.org/
> [2] http://www.xmlvalidation.com/
> [3] https://gist.github.com/mgax/f6a3edc5b4883b3377e8
> [4] https://github.com/strainu/despresate/blob/master/include/sat_functions.php#L278
> [5] https://github.com/strainu/despresate/blob/master/include/common.php#L57
>
> _______________________________________________
> Wikitech-l mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>