Hello!
I'm not sure that this list is the right place
for this question. If I am mistaken, I'm sorry!
The question is Cocoon-related and is about how Xerces is
supposed to work with encodings.
I write my XML documents in the KOI8 encoding,
but whether I declare the encoding or not, I always see ???? in the
browser instead of the 8-bit characters.
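
For readers unfamiliar with the symptom: one common way for '?' to appear in Java output is that the text is encoded with a charset that has no mapping for the characters. The sketch below only illustrates that general mechanism; it is not a diagnosis of this particular Cocoon/Xerces setup. The class and method names are hypothetical, and it uses java.nio.charset (Java 1.4+) and StandardCharsets (Java 7+), which are not available on JDK 1.1.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Xerces or Cocoon.
final class QuestionMarkDemo {

    // Encode the text to bytes with the given charset, then decode it back.
    // Characters the charset cannot represent are replaced during encoding.
    static String roundTrip(String text, Charset charset) {
        byte[] bytes = text.getBytes(charset);
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // Six Cyrillic letters, written as escapes so the source file stays ASCII.
        String cyrillic = "\u041F\u0440\u0438\u0432\u0435\u0442";

        // ISO-8859-1 has no Cyrillic letters, so each one becomes '?'.
        System.out.println(roundTrip(cyrillic, StandardCharsets.ISO_8859_1)); // prints ??????

        // UTF-8 can represent every character, so the text survives unchanged.
        System.out.println(roundTrip(cyrillic, StandardCharsets.UTF_8).equals(cyrillic)); // prints true

        // KOI8-R is assumed to be installed; not every runtime ships it.
        if (Charset.isSupported("KOI8-R")) {
            Charset koi8 = Charset.forName("KOI8-R");
            System.out.println(roundTrip(cyrillic, koi8).equals(cyrillic));
        }
    }
}
```

If the output stream is opened with an encoding that lacks Cyrillic, the replacement happens when the characters are written, regardless of what the XML declaration says.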
Taras Shumeyko pointed out to me that this is a formatter problem and
that the problem is in org.apache.xml.serialize.BaseMarkupSerializer,
in the method protected String escape( String source ).
I changed it by removing all the re-encoding from it, and now
Cocoon and Xerces work OK for me.
Here is my variant of the method:
protected String escape( String source )
{
    StringBuffer result;
    int i;
    char ch;
    String charRef;

    result = new StringBuffer( source.length() );
    for ( i = 0 ; i < source.length() ; ++i ) {
        ch = source.charAt( i );
        // If the character is not printable, print as character reference.
        // Non printables are below ASCII space but not tab or line
        // terminator, ASCII delete, or above a certain Unicode threshold.
        // if ( ( ch < ' ' && ch != '\t' && ch != '\n' && ch != '\r' ) ||
        //      ch > _lastPrintable || ch == 0xF7 )
        //     result.append( "&#" ).append( Integer.toString( ch ) ).append( ';' );
        // else {
        //     If there is a suitable entity reference for this
        //     character, print it. The list of available entity
        //     references is almost but not identical between
        //     XML and HTML.
        //     charRef = getEntityRef( ch );
        //     if ( charRef == null )
        result.append( ch );
        //     else
        //         result.append( '&' ).append( charRef ).append( ';' );
        // }
    }
    return result.toString();
}
But this is a dirty hack.
I want to understand how Xerces is supposed to treat encodings and why
it doesn't work now.
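
One reason the variant above is only a workaround: with the getEntityRef call commented out, markup characters such as < and & are passed through unescaped as well. A less drastic approach, sketched below purely as an illustration, is to keep the escaping but decide per character whether the output encoding can represent it, and fall back to a numeric character reference only when it cannot. This is not the actual Xerces implementation; the class name EncodingAwareEscaper and its constructor argument are hypothetical, and it relies on java.nio.charset.CharsetEncoder (Java 1.4+), so it would not compile on JDK 1.1.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

// Hypothetical helper, not part of Xerces: escapes text for XML output
// while leaving alone any character the target encoding can represent.
final class EncodingAwareEscaper {

    private final CharsetEncoder encoder;

    // encodingName is assumed to be a charset the runtime supports,
    // for example "KOI8-R", "UTF-8" or "US-ASCII".
    EncodingAwareEscaper(String encodingName) {
        this.encoder = Charset.forName(encodingName).newEncoder();
    }

    String escape(String source) {
        StringBuilder result = new StringBuilder(source.length());
        for (int i = 0; i < source.length(); ) {
            // Work on whole code points so surrogate pairs are not split.
            int cp = source.codePointAt(i);
            i += Character.charCount(cp);
            switch (cp) {
                case '<': result.append("&lt;"); break;
                case '>': result.append("&gt;"); break;
                case '&': result.append("&amp;"); break;
                case '"': result.append("&quot;"); break;
                default:
                    // Same control-character rule as the original method.
                    boolean control =
                        cp < ' ' && cp != '\t' && cp != '\n' && cp != '\r';
                    String asString = new String(Character.toChars(cp));
                    if (control || !encoder.canEncode(asString)) {
                        // Not representable: emit a numeric character reference.
                        result.append("&#").append(cp).append(';');
                    } else {
                        result.appendCodePoint(cp);
                    }
            }
        }
        return result.toString();
    }
}
```

For example, new EncodingAwareEscaper("US-ASCII").escape("a<\u041F") returns "a&lt;&#1055;", while the same call with "UTF-8" keeps the Cyrillic letter as is. The design choice is that the decision depends on the encoding actually used for output rather than on a fixed threshold.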
--
Dmitry Melekhov
http://www.aspec.ru/~dm
2:5050/[EMAIL PROTECTED]
P.S.
My Java platform is Blackdown JDK 1.1.7 for Linux x86.