Nice example, it's true. But the data in my XML files shouldn't look like this.
My definition of "useless whitespace" is "leading and trailing whitespace".
I think I will keep the XSLT solution.

----- Original Message -----
From: "Georges-André SILBER" <gasil...@luxia.fr>
To: "spam spam spam spam" <spam.spam.spam.s...@free.fr>
Cc: xml@gnome.org
Sent: Thursday 16 February 2012 10:02:23
Subject: Re: [xml] Remove whitespaces from text nodes

OK, but in this case it really depends on your input XML format and what you 
consider "useless".

If you only have "locally" useless whitespaces like here:

<txt>     Some text </txt>

and you want to get <txt>Some text</txt> you can still use the function below 
and "strip" every text node with a C function (I don't think that a standard C 
function exists for that).
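
For this simple case, a minimal sketch of such a stripping function in plain C 
might look like the following. It assumes that "whitespace" means the four 
characters of the XML 1.0 S production (space, tab, line feed, carriage 
return). The names is_xml_space and strip_xml_space are made up for this 
illustration and are not part of libxml2.

```c
#include <stddef.h>
#include <string.h>

/* XML 1.0 whitespace (production S): space, tab, line feed, carriage return. */
static int is_xml_space(unsigned char c)
{
    return c == 0x20 || c == 0x09 || c == 0x0A || c == 0x0D;
}

/* Hypothetical helper (not a libxml2 function): removes leading and
   trailing XML whitespace in place and returns the same buffer. */
static char *strip_xml_space(char *s)
{
    size_t start = 0;
    size_t end;

    if (s == NULL)
        return NULL;

    end = strlen(s);

    /* Skip leading whitespace. */
    while (start < end && is_xml_space((unsigned char) s[start]))
        start++;

    /* Drop trailing whitespace. */
    while (end > start && is_xml_space((unsigned char) s[end - 1]))
        end--;

    /* Shift the kept part to the front; memmove handles the overlap. */
    memmove(s, s + start, end - start);
    s[end - start] = '\0';
    return s;
}
```

To apply this to a libxml2 text node, one would presumably copy the content 
with xmlNodeGetContent, strip the copy and write it back with 
xmlNodeSetContent. That part is not shown and is untested here.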

But, if you have mixed content like:

<p>  Some text<b><i> in bold italics </i></b>and continuing</p>

it is tricky to define "useless" whitespace in a recursive descent because the 
decision is not local (here, clearly, the whitespace inside the <i> element 
should not be removed).
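
To make the problem concrete, here is a small sketch in plain C that trims 
each of the three text nodes of the <p> example independently and joins the 
results. The helper names append_trimmed and join_trimmed are hypothetical, 
and the text nodes are passed as plain strings rather than read from a parsed 
document.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: appends src to dst with leading and trailing
   XML whitespace (space, tab, LF, CR) removed. dst must be large enough. */
static void append_trimmed(char *dst, const char *src)
{
    size_t len;

    src += strspn(src, " \t\n\r");          /* skip leading whitespace */
    len = strlen(src);
    while (len > 0 && strchr(" \t\n\r", src[len - 1]) != NULL)
        len--;                               /* drop trailing whitespace */
    strncat(dst, src, len);
}

/* Joins the given text nodes after trimming each one on its own. */
static void join_trimmed(char *dst, const char *const *nodes, size_t count)
{
    size_t i;

    dst[0] = '\0';
    for (i = 0; i < count; i++)
        append_trimmed(dst, nodes[i]);
}
```

Called with the strings "  Some text", " in bold italics " and 
"and continuing", this yields "Some textin bold italicsand continuing": the 
words run together because each node was trimmed without looking at its 
neighbours.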

So, depending on your definition of "useless", the difficulty of the answer can 
range from very simple to very complicated.

Best regards,

Georges-André SILBER
LUXIA

On 16 Feb 2012 at 09:32, spam.spam.spam.s...@free.fr wrote:

> I think this function removes blank nodes.
> That's not exactly what I want.
> I want to strip useless whitespace from text nodes.
> These nodes aren't considered blank nodes because they also contain 
> visible characters.
> 
> ----- Original Message -----
> From: "Georges-André SILBER" <gasil...@luxia.fr>
> To: "spam spam spam spam" <spam.spam.spam.s...@free.fr>
> Cc: xml@gnome.org
> Sent: Thursday 16 February 2012 09:14:17
> Subject: Re: [xml] Remove whitespaces from text nodes
> 
> Hi,
> 
> I wrote a small function for this purpose some time ago.
> I didn't test it with the latest versions of libxml2, nor did I ensure that 
> this code is correct, but it gives you the idea of a method that you can use 
> to remove blank nodes.
> 
> The usage is for instance:
> 
> doc = xmlReadFile (xmlfile, NULL, 0);
> if (doc == NULL)
>   {
>     /* Deal with error... */
>     return 1;
>   }
> glbRemoveBlankNodes (xmlDocGetRootElement (doc));
> 
> Hope this helps,
> 
> Best regards,
> 
> Georges-André SILBER
> LUXIA
> 
> int
> glbRemoveBlankNodes (xmlNodePtr n)
> {
>  xmlNodePtr cur;
>  xmlNodePtr next;
> 
>  if (n == NULL)
>    return 0;
> 
>  cur = n->children;
>  while (cur)
>    {
>      next = cur->next;      
>      if (xmlIsBlankNode (cur))
>       {
>         xmlUnlinkNode (cur);
>         xmlFreeNode (cur);
>       }
>      else
>       glbRemoveBlankNodes (cur);
>      cur = next;
>    }
> 
>  return 0;
> }
> 
> 
> On 16 Feb 2012 at 08:57, spam.spam.spam.s...@free.fr wrote:
> 
>> Yes, you are right.
>> But I am not sure my function will do a good job.
>> I know two whitespace characters, " " and "\t", but I am not sure that I 
>> know all of them.
>> My function will probably forget to strip some whitespace...
>> This is the reason why I would like to use an already defined function.
>> 
>> Is there a function which does this job?
>> 
>> ----- Original Message -----
>> From: "Liam R E Quin" <l...@holoweb.net>
>> To: "spam spam spam spam" <spam.spam.spam.s...@free.fr>
>> Cc: xml@gnome.org
>> Sent: Thursday 16 February 2012 08:40:31
>> Subject: Re: [xml] Remove whitespaces from text nodes
>> 
>> On Thu, 2012-02-16 at 08:28 +0100, spam.spam.spam.s...@free.fr wrote:
>>> [...].
>>> Anyway, there seems to be no other solution using libxml2 alone.
>> 
>> The spaces are part of the text of the document, so it's not likely that
>> a conformant XML parser will strip them for you.
>> 
>> You could of course remove the spaces in C after parsing, just as if you
>> decided to remove every occurrence of an upper-case "B" from the input.
>> 
>> That's just standard C string processing.
>> 
>> -- 
>> Liam Quin - XML Activity Lead, W3C, http://www.w3.org/People/Quin/
>> Pictures from old books: http://fromoldbooks.org/
>> 
>> _______________________________________________
>> xml mailing list, project page  http://xmlsoft.org/
>> xml@gnome.org
>> http://mail.gnome.org/mailman/listinfo/xml