Hi Robert,

> when you add the declaration with a dominsert, or 
> it is already in the text, witango seems to destroy or remove 
> it and disregard.

I discovered through lots of trial and error that some XML parsers have a habit 
of consuming the prolog instructions (such as the XML declaration) and not 
returning them on output. This problem was not limited to Tango/Witango.

This appears to be a behavior issue that was not clearly addressed in the early 
days of XML and DOM usage.

Newer parsers honor the instructions and make them available again on output.
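
For anyone who wants to see this behavior outside Witango, here is a minimal
sketch in Python using the standard library's xml.etree.ElementTree. That
library is my choice for illustration only and is not something used in this
thread; the sample element and the variable names are made up. It parses an
ISO-8859-1 document that carries a declaration, shows that the default
serialization does not write the declaration back, and shows that you get one
again only by asking for it explicitly.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: an ISO-8859-1 document with an explicit declaration.
text = '<?xml version="1.0" encoding="ISO-8859-1"?><name>moir\u00e9</name>'
source = text.encode("iso-8859-1")

# The parser uses the declaration to decode the bytes, then discards it.
root = ET.fromstring(source)

# Default serialization: the declaration consumed on input is not written back.
without_decl = ET.tostring(root)

# Asking for it explicitly (xml_declaration needs Python 3.8 or later) writes a
# fresh declaration and re-encodes the text as UTF-8.
with_decl = ET.tostring(root, encoding="UTF-8", xml_declaration=True)

print(without_decl)
print(with_decl.decode("utf-8"))
```

The point is only that keeping the declaration on output is a choice made by
the parser or serializer, so it is worth checking what your particular tool
does rather than assuming the declaration survives a round trip.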

 

> -----Original Message-----
> From: Robert Garcia [mailto:[EMAIL PROTECTED] 
> Sent: Monday, September 19, 2005 11:09 PM
> To: [email protected]
> Subject: Re: Witango-Talk: support for @DOM encoding
> 
> I remembered a while back, some issues I had with witango 
> generated xml blowing up the ingester. It happened because 
> the dataset from DB had this one word:
> 
> moiré
> 
> the problem was that the ingester was looking for UTF-8, and 
> witango was spitting out ISO-8859-1. So I found the old stuff 
> and tested with 5.5.
> 
> Witango 5.5.009 on my system is definitely defaulting to an 
> encoding of ISO-8859-1.
> 
> Also, it doesn't seem to allow a declaration like Scott 
> suggested. When you add the declaration with a dominsert, or 
> it is already in the text, Witango seems to destroy or remove 
> it and disregard it. And even if you set the encoding via the 
> Content-Type header like so:
> 
> Content-type: text/xml;charset=iso-8859-1
> 
> IE and Safari, and probably others, don't seem to understand 
> or ingest it correctly.
> 
> However, I use XSL transformation a lot in other languages, 
> and I thought maybe that would plug this gap, and it does.
> 
> So, let's say you have a data set: I hit the database with the 
> funky chars in it, so my data is in local$resultset. If you 
> convert to DOM, the XML that Witango produces looks like this, 
> with NO declaration, and is ISO-8859-1 encoded:
> 
> <resultSet>
> 
>   <Row id="1">
>     <uid>
>       
>       <![CDATA[2436]]>
>     </uid>
>     <typeUID>
>       
>       <![CDATA[2]]>
>     </typeUID>
>     <labUID>
>       
>       <![CDATA[hhcolor]]>
>     </labUID>
>     <groupUID>
>       
>       <![CDATA[74]]>
>     </groupUID>
>     <code>
>       
>       <![CDATA[53]]>
>     </code>
>     <labCode>
>       
>       <![CDATA[53]]>
>     </labCode>
>     <name>
>       
>       <![CDATA[Soften bustline wrinkles or moir ]]>
>     </name>
>     <description>
>       
>       <![CDATA[]]>
>     </description>
>     <quantity>
>       
>       <![CDATA[]]>
>     </quantity>
>     <cost>
>       
>       <![CDATA[]]>
>     </cost>
>     <price>
>       
>       <![CDATA[]]>
>     </price>
>     <sort>
>       
>       <![CDATA[53]]>
>     </sort>
>     <aspect>
>       
>       <![CDATA[]]>
>     </aspect>
>   </Row>
> 
> </resultSet>
> 
> You can see the broken moiré in the result.
> 
> Instead, I use XSLT. Here is the code, with an example of an 
> XSL style sheet that makes the XML nice and pretty, encodes to 
> UTF-8, removes CDATA, and encodes text nodes to HTML 
> encoding where necessary.
> 
> This is in the results page of a search action:
> 
> <@assign local$stylesheet '<xsl:stylesheet version="1.0" 
> xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
> <xsl:output method="xml" encoding="UTF-8" indent="yes"/>
> <xsl:strip-space elements="*"/>
> <xsl:template match="*">
> <xsl:copy>
> <xsl:copy-of select="@*" />
> <xsl:apply-templates />
> </xsl:copy>
> </xsl:template>
> </xsl:stylesheet>'>
> <@arraytodom array=local$resultSet>
> <@! output of xslt is string, so it applies xslt and outputs as text >
> <@assign local$result '<@xslt local$resultSet 
> stylesheet="<@var local$stylesheet>">'>
> <@assign local$cLen <@length "<@var local$result>">>
> <@assign local$httpResponse "false">
> 
> 
> Now I assign the httpHeader in the next action:
> 
> <@purgeresults><@assign local$httpHeader "HTTP/1.1 200 
> OK<@crlf>Server: WiTango 5.5.009<@crlf>MIME-Version: 
> 1.0<@crlf>Content-Type: 
> text/xml;charset=utf-8<@crlf>Content-length: <@var 
> local$cLen><@CRLF><@CRLF>">
> 
> And last, return the output string with the final action 
> before return.
> 
> <@var local$result encoding=none>
> 
> 
> I know I don't need to specify encoding=none, but it's an old habit.
> 
> Anyway, this returns the xml like so:
> 
> <?xml version="1.0" encoding="UTF-8"?>
> <resultSet>
> <Row id="1">
> <uid>2436</uid>
> <typeUID>2</typeUID>
> <labUID>hhcolor</labUID>
> <groupUID>74</groupUID>
> <code>53</code>
> <labCode>53</labCode>
> <name>Soften bustline wrinkles or moiré</name>
> <description/>
> <quantity/>
> <cost/>
> <price/>
> <sort>53</sort>
> <aspect/>
> </Row>
> </resultSet>
> 
> Clean, with the declaration, and the special chars understood 
> by the ingester. I still have more testing to do, but so far 
> this looks like the solution.
> 
> Another solution is to just HTML-encode all text nodes.
> 
> -- 
> 
> Robert Garcia
> President - BigHead Technology
> VP Application Development - eventpix.com
> 13653 West Park Dr
> Magalia, Ca 95954
> ph: 530.645.4040 x222 fax: 530.645.4040
> [EMAIL PROTECTED] - [EMAIL PROTECTED]
> http://bighead.net/ - http://eventpix.com/
> 
> On Sep 19, 2005, at 5:44 PM, Bill Conlon wrote:
> 
> 
>       I think Customer Support should weigh in here.
> 
>       Do Witango string manipulation functions (<@LEFT>, <@LOCATE>,
>       <@REGEX>, <@REPLACE>, <@RIGHT>, <@SUBSTRING>) use:
>       a) a fixed character encoding (e.g. LATIN-1) set at compilation
>       b) the character encoding of the environment (LANG=UTF-8, for
>          example, on Linux)
>       c) a user-selectable encoding, perhaps using iconv to do the
>          mapping
>       d) it doesn't matter for single-byte character sets. Witango
>          compares bytes. It's the developer's responsibility to use
>          the same encoding on both sides of a string comparison.
> 
>       If (c), how does one select the character set?
> 
>       bill
> 
>       On Monday, September 19, 2005, at 05:17 PM, Robert Garcia wrote:
> 
>               The question is, will Witango honor that encoding tag
>               and actually encode correctly? I am rewriting some web
>               services, and this was a major issue for me before,
>               because Witango would ONLY encode as ISO-8859-1. I am
>               writing an application to test this; I'll be done in a
>               bit and will post.
> 
> 
>               On Sep 19, 2005, at 10:50 AM, Scott Cadillac wrote:
> 
> 
> 
>                       Hi Bill,
> 
> 
> 
> 
>                               Doesn't xml require utf-8?
> 
> 
> 
> 
>                       No. That's what the prolog and instructions are
>                       for. Example:
> 
>                       <?xml version="1.0" encoding="ISO-8859-1" ?>
>                       <someXml />
> 
>                       There are several hundred different character
>                       sets that can be used with XML, I think. If
>                       specific instructions are not given, then
>                       typically this information is inherited from the
>                       platform where the XML originates.
> 
>                       And as we all know, Tango/Witango has a default
>                       character set of ISO-8859-1 (on Windows anyway).
> 
>                       Like I mentioned, I think one of the newer
>                       releases of Witango provides support for UTF-8,
>                       but I can't remember which version because I just
>                       don't spend time in Witango anymore.
> 
> 
> 
> 
> 
>                               Anyway, going to the issue of "umlauts,
>                               etc." from last week, if the problem is
>                               how the character is rendered on the
>                               client, it could be due to the encoding
>                               specified in the http header of the html
>                               <head>.
> 
> 
> 
> 
>                       It depends on the consuming application. When it
>                       comes to XML, the application may decide to honor
>                       the XML instructions, or it may only honor the
>                       HTTP instructions that delivered the XML.
> 
> 
> 
> 
> 
>                               But I suspect there may also be a
>                               lingering gotcha in the Witango string
>                               manipulation tags. I presume that Witango
>                               string manipulation uses the character set
>                               specified in its environment variable (for
>                               whatever user it's running as). But if you
>                               want to manipulate strings in a different
>                               character set, you're in trouble.
> 
> 
> 
> 
>                       Yes, this could be a factor. If the DOM variable
>                       is passing through some other bit of code and
>                       logic for some additional processing, the UTF-8
>                       encoding could get lost and revert the characters
>                       back to ISO-8859-1.
> 
>                       That is why entity encoding is often used, e.g.,
>                       &euro;, because it is less likely to be affected
>                       by conversion (accidental or on purpose).
> 
>                       Hope that helps.
> 
> 
> 
> 
> 
>                               bill
> 
>                               On Monday, September 19, 2005, at 08:07 AM,
>                               Scott Cadillac wrote:
> 
> 
> 
> 
>                                       Hi folks,
> 
>                                       By default Witango supports
>                                       ISO-8859-1 character sets (basic
>                                       Latin characters), but newer
>                                       versions apparently support UTF-8,
>                                       which is more extensive.
> 
>                                         Note, I don't remember which
>                                         version introduced the UTF-8
>                                         support.
> 
>                                       In theory you should be able to
>                                       just assign the encoding set when
>                                       your DOM variable is assigned,
>                                       something like:
> 
>                                       <@ASSIGN local$myVar value="<@dom
>                                       value='<?xml version="1.0" encoding="UTF-8" ?><MyXml
>                                       anAttribute="some characters" />'>">
> 
>                                       So your success may depend on what
>                                       version of Witango you are running.
> 
>                                       And as for encoding your character
>                                       as &euro; as your alternative? This
>                                       is standard XML practice - get used
>                                       to it.
> 
>                                       Have a nice day :-)
> 
>                                       ~ Scott Cadillac
>                                       ~ 403-254-5002
>                                       ~ [EMAIL PROTECTED]
> 
>                                       ~ Custom Software for Business
>                                         http://custom.softwarefor.net
> 
>                                       ~ The XML-Extranet Partnership
>                                       ~ P.O. Box 69006
>                                         RPO Bridlewood SW
>                                         Calgary, Alberta
>                                         Canada T2Y 4T9
> 
> 
> 
> 
> 
>                                               -----Original Message-----
>                                               From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
>                                               Sent: Monday, September 19, 2005 8:41 AM
>                                               To: [email protected]
>                                               Subject: Re: Witango-Talk: support for @DOM encoding
> 
>                                               I have had the same problem since I installed Witango 5.5
>                                               Server. In prior versions it worked fine.
>                                               One workaround:
>                                               <@REPLACE STR="<@elementvalue object=user$allstringsdom
>                                               element='root().id(Feld_@@local$Step)' encoding=none>"
>                                               FINDSTR="?" REPLACESTR="€">
> 
>                                               regards
> 
>                                               Daniel
> 
>                                                   ----- Original Message -----
>                                                   From: Mike Scally <mailto:[EMAIL PROTECTED]>
>                                                   To: [email protected]
>                                                   Sent: Monday, September 19, 2005 2:00 PM
>                                                   Subject: Witango-Talk: support for @DOM encoding
> 
> 
>                                                   Hi Folks,
> 
>                                                   I wonder would anyone be able to tell me what character
>                                                   set the @DOM tag supports?
> 
>                                                   I assign the Euro symbol (€) as part of the XML
>                                                   document using the @DOM tag, but when I read the value back
>                                                   out of the XML document it appears as a ? rather than the
>                                                   Euro symbol. This is causing me a bit of a problem and I am
>                                                   wondering if there's a way around it. Replacing the Euro
>                                                   symbol with the HTML equivalent &euro; appears to be too
>                                                   complicated in my scenario.
> 
> 
> 
>                                                   Thanks
> 
>                                                   Mike.
> 
> 
> 
> 
> 
> 
> 
> 
>                                                   ********************************************************************
> 
>                                                   This message is intended only for the use of the person(s)
>                                                   ("the intended recipient(s)") to whom it is addressed. It may
>                                                   contain information which is privileged and confidential within
>                                                   the meaning of applicable law. If you are not the intended
>                                                   recipient, please contact the sender as soon as possible. The
>                                                   views expressed in this communication may not necessarily be
>                                                   the views held by LGCSB (Local Government Computer Services
>                                                   Board).
> 
>                                                   Any attachments have been checked by a virus scanner and
>                                                   appear to be clean. Please ensure that you also scan all
>                                                   messages, as LGCSB does not accept any liability for
>                                                   contamination or damage to your systems.
> 
>                                                   ********************************************************************
> 
> 
> 
> 
> 
>                                               ________________________________________________________________________
>                                               TO UNSUBSCRIBE: Go to http://www.witango.com/developer/maillist.taf

