Andy Clark wrote:
> Without direct hooks into the decoder used to transcode the
> source bytes into the Unicode characters that the parser scans,
> the parser cannot know the absolute byte location in the stream.
> And, as I said before, writing a stream that only reads one
> character at a time might work but is very inefficient.

i know this, and i see all the inherent difficulties of doing it now, when Xerces 2 is close to finishing (maybe in Xerces 3 ;-)), as it seems that nobody raised this particular problem before (?!).

as i think the common case for me is a lot of small SOAP messages, i should get good enough performance by letting the parser decode the input into UTF-16 chars and then manually encoding it back into something like UTF-8 when forwarding the message. if necessary, i can always hack one or two simple decoders such as US-ASCII, ISO-LATIN-X and UTF-8 in the future...

> The code that you submitted does not provide the application
> with a mapping from the absolute byte position that corresponds
> to handler callbacks. It only provides a mapping from the char
> position within the buffer holding the transcoded characters.

you are right: it is a pointer into the transcoded current entity. as i am wrapping the input with my own reader, i can retrieve any part of the transcoded original input (but not of the original input stream). that seems like a very reasonable first step though :-)

> If this is what you want to accomplish, that's fine but we would

i would really like to see it added - the patch is very simple and has no effect on performance (just one += operation in the load(...) call).

> still need a feature that tells the scanner not to re-use the
> input buffers (something we do for performance).

i was thinking that it could be accomplished by adding a new XNI feature such as buffer-reuse and setting it to false (?!). however, i am still not sure how all those System.arraycopy(fCurrentEntity.ch, offset,tmp, 0, length); calls will be affected...

thanks,

alek
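
p.s. to make the reader-wrapping idea above concrete, here is a minimal sketch. it assumes the application hands the parser a Reader (so decoding happens before the parser sees the data) and that the parser can report char offsets into the current entity. RecordingReader, charsRead, slice and sliceAsUtf8 are hypothetical names for illustration, not part of Xerces or XNI.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not a Xerces class): remembers every char the
// parser pulls through it, so char offsets can be mapped back to text.
class RecordingReader extends FilterReader {
    private final StringBuilder recorded = new StringBuilder();

    RecordingReader(Reader in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int c = super.read();
        if (c != -1) {
            recorded.append((char) c);
        }
        return c;
    }

    @Override
    public int read(char[] cbuf, int off, int len) throws IOException {
        int n = super.read(cbuf, off, len);
        if (n > 0) {
            recorded.append(cbuf, off, n);
        }
        return n;
    }

    // Note: chars passed over with skip() are not recorded in this sketch.

    // Total number of transcoded chars consumed so far.
    int charsRead() {
        return recorded.length();
    }

    // Transcoded text between two char offsets (end exclusive).
    String slice(int start, int end) {
        return recorded.substring(start, end);
    }

    // Same range, manually re-encoded as UTF-8 for forwarding.
    byte[] sliceAsUtf8(int start, int end) {
        return slice(start, end).getBytes(StandardCharsets.UTF_8);
    }
}
```

this only gives access to the transcoded chars, never the original bytes, and it keeps the whole entity in memory, which should be acceptable for small messages but not for large documents.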
