Hi Robert,

> when you add the declaration with a dominsert, or it is already in the text, witango seems to destroy or remove it and disregard.

I discovered through lots of trial and error that some XML parsers have a habit of consuming the instructions (the XML declaration) and not returning them on output. This problem was not limited to Tango/Witango. It appears to be a behavior that was not clearly addressed in the early days of XML and DOM usage. Newer parsers honor the instructions and make them available again on output.

> -----Original Message-----
> From: Robert Garcia [mailto:[EMAIL PROTECTED]
> Sent: Monday, September 19, 2005 11:09 PM
> To: [email protected]
> Subject: Re: Witango-Talk: support for @DOM encoding
>
> I remembered some issues I had a while back with Witango-generated XML blowing up the ingester. It happened because the dataset from the DB had this one word:
>
> moiré
>
> The problem was that the ingester was looking for UTF-8, and Witango was spitting out ISO-8859-1. So I found the old stuff and tested with 5.5.
>
> Witango 5.5.009 on my system is definitely defaulting to an encoding of ISO-8859-1.
>
> Also, it doesn't seem to allow a declaration, like Scott suggested: when you add the declaration with a dominsert, or it is already in the text, Witango seems to destroy or remove it and disregard it. And even if you set the encoding via the Content-Type header like so:
>
> Content-type: text/xml;charset=iso-8859-1
>
> IE and Safari, and probably others, don't seem to understand or ingest it correctly.
>
> However, I use XSL transformation a lot in other languages, and I thought maybe that would plug this gap, and it does.
>
> So, let's say you have a data set. I hit the database with the funky chars in it, so my data is in local$resultset.
> If you convert to DOM, the XML from Witango looks like this, with NO declaration, and is ISO-8859-1 encoded:
>
> <resultSet>
>
> <Row id="1">
> <uid>
>
> <![CDATA[2436]]>
> </uid>
> <typeUID>
>
> <![CDATA[2]]>
> </typeUID>
> <labUID>
>
> <![CDATA[hhcolor]]>
> </labUID>
> <groupUID>
>
> <![CDATA[74]]>
> </groupUID>
> <code>
>
> <![CDATA[53]]>
> </code>
> <labCode>
>
> <![CDATA[53]]>
> </labCode>
> <name>
>
> <![CDATA[Soften bustline wrinkles or moir ]]>
> </name>
> <description>
>
> <![CDATA[]]>
> </description>
> <quantity>
>
> <![CDATA[]]>
> </quantity>
> <cost>
>
> <![CDATA[]]>
> </cost>
> <price>
>
> <![CDATA[]]>
> </price>
> <sort>
>
> <![CDATA[53]]>
> </sort>
> <aspect>
>
> <![CDATA[]]>
> </aspect>
> </Row>
>
> </resultSet>
>
> You can see the broken moiré in the result.
>
> Instead, I use XSLT. Here is the code, with an example of an XSL style sheet that makes the XML nice and pretty, encodes to UTF-8, removes CDATA, and encodes text nodes to HTML encoding where necessary.
>
> This is in the results page of a search action:
>
> <@assign local$stylesheet '<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
> <xsl:output method="xml" encoding="UTF-8" indent="yes"/>
> <xsl:strip-space elements="*"/>
> <xsl:template match="*">
> <xsl:copy>
> <xsl:copy-of select="@*" />
> <xsl:apply-templates />
> </xsl:copy>
> </xsl:template>
> </xsl:stylesheet>'>
> <@arraytodom array=local$resultSet>
> <@! 
> output of xslt is string, so it applies xslt and outputs as text>
> <@assign local$result '<@xslt local$resultSet stylesheet="<@var local$stylesheet>">'>
> <@assign local$cLen <@length "<@var local$result>">>
> <@assign local$httpResponse "false">
>
> Now I assign the httpHeader in the next action:
>
> <@purgeresults><@assign local$httpHeader "HTTP/1.1 200 OK<@crlf>Server: WiTango 5.5.009<@crlf>MIME-Version: 1.0<@crlf>Content-Type: text/xml;charset=utf-8<@crlf>Content-length: <@var local$cLen><@CRLF><@CRLF>">
>
> And last, return the outputted string with the final action before return.
>
> <@var local$result encoding=none>
>
> I know I don't need to specify encoding=none, but it's an old habit.
>
> Anyway, this returns the XML like so:
>
> <?xml version="1.0" encoding="UTF-8"?>
> <resultSet>
>   <Row id="1">
>     <uid>2436</uid>
>     <typeUID>2</typeUID>
>     <labUID>hhcolor</labUID>
>     <groupUID>74</groupUID>
>     <code>53</code>
>     <labCode>53</labCode>
>     <name>Soften bustline wrinkles or moiré</name>
>     <description/>
>     <quantity/>
>     <cost/>
>     <price/>
>     <sort>53</sort>
>     <aspect/>
>   </Row>
> </resultSet>
>
> Clean, with the declaration, and the special chars understood by the ingester. I still have more testing to do, but so far this looks like the solution.
>
> Another solution is to just HTML-encode all text nodes.
>
> --
> Robert Garcia
> President - BigHead Technology
> VP Application Development - eventpix.com
> 13653 West Park Dr
> Magalia, Ca 95954
> ph: 530.645.4040 x222 fax: 530.645.4040
> [EMAIL PROTECTED] - [EMAIL PROTECTED]
> http://bighead.net/ - http://eventpix.com/
>
> On Sep 19, 2005, at 5:44 PM, Bill Conlon wrote:
>
> I think Customer Support should weigh in here.
>
> Do Witango string manipulation functions (<@LEFT>, <@LOCATE>, <@REGEX>, <@REPLACE>, <@RIGHT>, <@SUBSTRING>) use:
> a) a fixed character encoding (e.g. 
> LATIN-1) set at compilation
> b) the character encoding of the environment (LANG=UTF-8, for example on Linux)
> c) a user-selectable encoding, perhaps using iconv to do the mapping
> d) it doesn't matter for single-byte character sets. Witango compares bytes. It's the developer's responsibility to use the same encoding on both sides of a string comparison.
>
> If (c), how does one select the character set?
>
> bill
>
> On Monday, September 19, 2005, at 05:17 PM, Robert Garcia wrote:
>
> The question is, will Witango honor that encoding tag and actually encode correctly? I am rewriting some web services, and this was a major issue for me before, because Witango would ONLY encode as ISO-8859-1. I am writing an application to test this; I'll be done in a bit and will post.
>
> --
> Robert Garcia
>
> On Sep 19, 2005, at 10:50 AM, Scott Cadillac wrote:
>
> Hi Bill,
>
> > Doesn't xml require utf-8?
>
> No. That's what the prolog and instructions are for. Example:
>
> <?xml version="1.0" encoding="ISO-8859-1" ?>
> <someXml />
>
> There are several hundred different character sets that can be used with XML, I think. If specific instructions are not given, then typically this information is inherited from the platform where the XML originates.
>
> And as we all know, Tango/Witango has a default character set of ISO-8859-1 (on Windows anyway).
>
> Like I mentioned, I think one of the newer releases of Witango provides support for UTF-8, but I can't remember which version because I just don't spend time in Witango anymore.
>
> > Anyway, going to the issue of "umlauts, etc." 
> > from last week, if the problem is how the character is rendered on the client, it could be due to the encoding specified in the http header of the html <head>.
>
> It depends on the consuming application. When it comes to XML, the application may decide to honor the XML instructions, or it may only honor the HTTP instructions that delivered the XML.
>
> > But I suspect there may also be a lingering gotcha in the Witango string manipulation tags. I presume that Witango string manipulation uses the character set specified in its environment variable (for whatever user it's running as). But if you want to manipulate strings in a different character set, you're in trouble.
>
> Yes, this could be a factor. If the DOM variable is passing through some other bit of code and logic for some additional processing, the UTF-8 encoding could get lost and revert the characters back to ISO-8859-1.
>
> That is why entity encoding is often used, e.g., €, because it is less likely to be affected by conversion (accidental or on purpose).
>
> Hope that helps.
>
> > bill
>
> On Monday, September 19, 2005, at 08:07 AM, Scott Cadillac wrote:
>
> Hi folks,
>
> By default Witango supports ISO-8859-1 character sets (basic latin characters), but newer versions apparently support UTF-8, which is more extensive.
>
> Note, I don't remember which version introduced the UTF-8 support.
>
> In theory you should be able to just assign the encoding set when your DOM variable is assigned, something like:
>
> <@ASSIGN local$myVar value="<@dom value='<?xml version="1.0" encoding="UTF-8" ?><MyXml anAttribute="some characters" />'>">
>
> So your success may depend on what version of Witango you are running.
>
> And as for encoding your character as € as your alternative? This is standard XML practice - get used to it.
> Have a nice day :-)
>
> ~ Scott Cadillac
> ~ 403-254-5002
> ~ [EMAIL PROTECTED]
>
> ~ Custom Software for Business
> http://custom.softwarefor.net
>
> ~ The XML-Extranet Partnership
> ~ P.O. Box 69006
> RPO Bridlewood SW
> Calgary, Alberta
> Canada T2Y 4T9
>
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
> Sent: Monday, September 19, 2005 8:41 AM
> To: [email protected]
> Subject: Re: Witango-Talk: support for @DOM encoding
>
> I have had the same problem since I installed Witango 5.5 Server. In prior versions it worked fine.
> One workaround:
> <@REPLACE STR="<@elementvalue object=user$allstringsdom element='root().id(Feld_@@local$Step)' encoding=none>" FINDSTR="?" REPLACESTR="€">
>
> regards
>
> Daniel
>
> ----- Original Message -----
> From: Mike Scally <mailto:[EMAIL PROTECTED]>
> To: [email protected]
> Sent: Monday, September 19, 2005 2:00 PM
> Subject: Witango-Talk: support for @DOM encoding
>
> Hi Folks,
>
> I wonder, would anyone be able to tell me what character set the @DOM tag supports?
>
> I assign the Euro symbol (€) as part of the XML document using the @DOM tag, but when I read the value back out of the XML document it appears as a ? rather than the Euro symbol. This is causing me a bit of a problem and I am wondering if there's a way around it. Replacing the Euro symbol with the HTML equivalent € appears to be too complicated in my scenario.
>
> Thanks
>
> Mike.
>
> ********************************************************************
>
> This message is intended only for the use of the person(s) ("the intended recipient(s)") to whom it is addressed. It may contain information which is privileged and confidential within the meaning of applicable law. If you are not the intended recipient, please contact the sender as soon as possible.
> The views expressed in this communication may not necessarily be the views held by LGCSB (Local Government Computer Services Board).
>
> Any attachments have been checked by a virus scanner and appear to be clean. Please ensure that you also scan all messages, as LGCSB does not accept any liability for contamination or damage to your systems.
>
> ********************************************************************

________________________________________________________________________
TO UNSUBSCRIBE: Go to http://www.witango.com/developer/maillist.taf
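
The two symptoms discussed in this thread (moiré breaking an ingester that expects UTF-8, and the Euro symbol coming back as a ?) can be reproduced outside Witango. The following is a minimal sketch in plain Python, not Witango code. It assumes the "?" comes from a lossy conversion to ISO-8859-1, which the thread suggests but does not confirm, and all variable names are illustrative only.

```python
# Illustration only: plain Python standard library, not Witango code.
# Variable names are hypothetical and chosen for this example.

text = "moiré"

# The same word as bytes in the two encodings discussed in the thread.
latin1_bytes = text.encode("iso-8859-1")  # é becomes the single byte 0xE9
utf8_bytes = text.encode("utf-8")         # é becomes the two bytes 0xC3 0xA9

# A consumer that expects UTF-8 cannot decode the ISO-8859-1 bytes.
try:
    latin1_bytes.decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False

# The Euro sign does not exist in ISO-8859-1, so a lossy conversion
# substitutes a question mark (assumed to mirror the "?" seen in the thread).
euro_lossy = "€".encode("iso-8859-1", errors="replace")

# A numeric character reference is pure ASCII, so it survives a round trip
# through either encoding. This is the entity approach Scott recommends.
euro_ref = "€".encode("ascii", errors="xmlcharrefreplace")

print(latin1_bytes, utf8_bytes, decoded_ok, euro_lossy, euro_ref)
```

The last two lines show why the entity workaround is robust: the reference contains only ASCII characters, which are encoded identically in ISO-8859-1 and UTF-8.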
