----- Original Message -----
From: Mike Pogue <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Friday, January 28, 2000 8:33 PM
Subject: Re: xml encodings, java


> The code you have below is a clever workaround, but ultimately, you want
> to use a JVM that has the encoding support built-in.
>
> So, I'd suggest you try to use the IBM 1.1.8 JVM.  It's fairly reliable,
> scalable, and I think it has the encoding support you are looking for.
> (Of course, I am biased in this! :-)
>

OK. I just tried the IBM JDK; it behaves exactly like Blackdown in this case.
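
For reference, a quick way to check whether a given JVM has an encoding
built in is to ask it to convert a string. This is only a minimal sketch:
the class name EncodingCheck is made up, and the encoding names in main()
are my assumption about what the JVM calls them (older JDKs may use a
different spelling, e.g. KOI8_R).

```java
import java.io.UnsupportedEncodingException;

public class EncodingCheck {
    // Returns true if this JVM can convert text to the named encoding.
    static boolean supports(String encodingName) {
        try {
            "test".getBytes(encodingName);
            return true;
        } catch (UnsupportedEncodingException e) {
            // The JVM does not know this encoding name.
            return false;
        }
    }

    public static void main(String[] args) {
        // Encoding names here are assumptions; adjust them for your JVM.
        String[] names = { "KOI8-R", "UTF-8", "ISO-8859-1" };
        for (int i = 0; i < names.length; i++) {
            System.out.println(names[i] + ": " + supports(names[i]));
        }
    }
}
```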

But I want to know how Xerces (or maybe this is a Cocoon problem,
I don't know) is supposed to work with encodings. Why is the code that I
commented out there at all? Why not treat the XML content as raw data and
process only the tags? If I set an encoding in the XML document, how is it
supposed to work: is it the input encoding (i.e. what I have in the XML) or
the output encoding (i.e. what Cocoon sends to the browser)? I want to
understand how it works! :)
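
To make the input/output question concrete, here is a minimal sketch of
the distinction as I understand it. It uses only the standard Java charset
API, not Xerces or Cocoon, so it may not reflect what they do internally.
The class and method names are made up for illustration, and
java.nio.charset is newer than the 1.1 JDKs discussed in this thread.

```java
import java.nio.charset.Charset;

public class EncodingRoundTrip {
    // Input side: turn raw bytes into Unicode text using the input encoding.
    static String decode(byte[] raw, String inputEncoding) {
        return new String(raw, Charset.forName(inputEncoding));
    }

    // Output side: turn Unicode text into bytes using the output encoding.
    // Characters the output encoding cannot represent are replaced ('?').
    static byte[] encode(String text, String outputEncoding) {
        return text.getBytes(Charset.forName(outputEncoding));
    }

    // Helper that formats bytes as hex, e.g. "C4 C1".
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < bytes.length; i++) {
            if (i > 0) sb.append(' ');
            sb.append(String.format("%02X", bytes[i] & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Two Cyrillic letters in KOI8-R: 0xC4 is U+0434, 0xC1 is U+0430.
        byte[] koi8 = { (byte) 0xC4, (byte) 0xC1 };
        String text = decode(koi8, "KOI8-R");

        // ISO-8859-1 has no Cyrillic, so both characters become '?'.
        System.out.println(hex(encode(text, "ISO-8859-1"))); // 3F 3F
        // UTF-8 and KOI8-R can both represent the text.
        System.out.println(hex(encode(text, "UTF-8")));      // D0 B4 D0 B0
        System.out.println(hex(encode(text, "KOI8-R")));     // C4 C1
    }
}
```

If the output encoding cannot represent a character, plain Java replaces
it with '?', which looks like the ???? I see in the browser. Whether that
is what actually happens inside Cocoon is exactly my question.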

Dmitry Melekhov
http://www.aspec.ru/~dm
2:5050/[EMAIL PROTECTED]

> Mike
>
>
> Dmitry Melekhov wrote:
> >
> > Hello!
> >
> > I'm not sure that this list is the right place
> > for this question. If I am mistaken, I'm sorry!
> >
> > The question is Cocoon-related and is about how Xerces is
> > supposed to work with encodings.
> >
> > I write my XML documents in the koi8 encoding,
> > but whether I set the encoding or not, I always see ???? in the browser
> > instead of 8-bit characters.
> > Taras Shumeyko pointed out to me that this is a formatter problem and
> > that the problem is in org.apache.xml.serialize.BaseMarkupSerializer,
> > in the method    protected String escape( String source )
> >
> > I changed it by removing all the re-encoding from it, and now
> > Cocoon and Xerces work OK for me.
> > Here is my version of the function:
> >
> >   protected String escape( String source )
> >     {
> >         StringBuffer    result;
> >         int             i;
> >         char            ch;
> >         String          charRef;
> >
> >         result = new StringBuffer( source.length() );
> >         for ( i = 0 ; i < source.length() ; ++i )  {
> >             ch = source.charAt( i );
> >             // If the character is not printable, print as character reference.
> >             // Non printables are below ASCII space but not tab or line
> >             // terminator, ASCII delete, or above a certain Unicode threshold.
> > //          if ( ( ch < ' ' && ch != '\t' && ch != '\n' && ch != '\r' ) ||
> > //               ch > _lastPrintable || ch == 0xF7 )
> > //                  result.append( "&#" ).append( Integer.toString( ch ) ).append( ';' );
> > //          else {
> >                     // If there is a suitable entity reference for this
> >                     // character, print it. The list of available entity
> >                     // references is almost but not identical between
> >                     // XML and HTML.
> > //                  charRef = getEntityRef( ch );
> > //                  if ( charRef == null )
> >                         result.append( ch );
> > //                  else
> > //                      result.append( '&' ).append( charRef ).append( ';' );
> > //          }
> >         }
> >         return result.toString();
> >     }
> >
> > But this is a dirty hack.
> >
> > I want to understand how Xerces is supposed to treat encodings and why
> > it doesn't work now.
> >
> > --
> > Dmitry Melekhov
> > http://www.aspec.ru/~dm
> > 2:5050/[EMAIL PROTECTED]
> >
> > P.S.
> > My Java platform is Blackdown JDK 1.1.7 for Linux x86.
>
>
