At 15.36 24/04/2002 -0400, you wrote:
>Right. And it's also caveat emptor that getSrcOffset isn't necessarily
>supported by the transcoder.

..and, if I remember right, MemBufInputSource and LocalFileInputSource return different locations (one returns the location of the character of the next token, the other the location of the last character of the recognized token).

Alberto

>BTW, what is the transcoder system that ICU stands for?
>-ted
>
>----- Original Message -----
>From: "Dean Roddey" <[EMAIL PROTECTED]>
>To: <[EMAIL PROTECTED]>
>Sent: Wednesday, April 24, 2002 3:01 PM
>Subject: Re: how to access the raw text that generated a sax event
>
> > Keep in mind that it's getting you the offset just past the *end* of the
> > thing it just handed you. So you have to keep a previous offset around in
> > order to get the data from there to the new end minus one byte, which will
> > get you the content of the thing it just handed you.
> >
> > --------------------------
> > Dean Roddey
> > The Charmed Quark Controller
> > Charmed Quark Software
> > [EMAIL PROTECTED]
> > http://www.charmedquark.com
> >
> > "If it don't have a control port, don't buy it!"
> >
> > ----- Original Message -----
> > From: "ted sandler" <[EMAIL PROTECTED]>
> > To: <[EMAIL PROTECTED]>
> > Sent: Wednesday, April 24, 2002 11:56 AM
> > Subject: Re: how to access the raw text that generated a sax event
> >
> > > Hi Jason,
> > >
> > > Would it be possible for me to get a copy of your SWIG files so I can
> > > access the scanner's functionality from within Perl?
> > >
> > > Thanks so much.
> > > -ted
> > >
> > > ----- Original Message -----
> > > From: "Jason E. Stewart" <[EMAIL PROTECTED]>
> > > To: <[EMAIL PROTECTED]>
> > > Sent: Wednesday, April 24, 2002 11:06 AM
> > > Subject: Re: how to access the raw text that generated a sax event
> > >
> > > > "Dean Roddey" <[EMAIL PROTECTED]> writes:
> > > >
> > > > > The getSrcOffset() method of XMLScanner should return you the
> > > > > information you want. However, it can only do that if the source
> > > > > offset stuff is supported by the transcoding system being used.
> > > > > For ICU and the internal transcoders that is true. I just looked
> > > > > and in the latest repository files, the Win32 and ICU transcoders
> > > > > are supporting this functionality.
> > > > >
> > > > > So if you get the scanner, and call getSrcOffset() it should return
> > > > > you the position where it stopped transcoding the element it just
> > > > > passed to you. This should be in terms of the raw content buffer it
> > > > > is parsing from, i.e. pre-transcoded input. If it's not returning
> > > > > the correct info, then perhaps it has become broken over time since
> > > > > hardly anyone ever uses it. But it used to work because we had to
> > > > > make it so for an internal IBM customer at the time.
> > > >
> > > > Hey Dean,
> > > >
> > > > Thanks for the info! That was exactly what I wanted.
> > > >
> > > > Phew! After making the XMLScanner available to Perl I can now access
> > > > getSrcOffset():
> > > >
> > > > print Found element contributors at 53 offset
> > > > print Found element person at 78 offset
> > > > print Found element name at 87 offset
> > > > print Found element email at 114 offset
> > > > print Found element person at 177 offset
> > > > print Found element name at 186 offset
> > > > print Found element email at 213 offset
> > > > print Found element person at 280 offset
> > > > print Found element name at 289 offset
> > > > print Found element email at 323 offset
> > > >
> > > > So it seems to be working.
> > > >
> > > > Thanks again for your insight into the internals,
> > > > jas.
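
For what it's worth, the bookkeeping Dean describes (keep the previous offset, then take everything from there up to the new end minus one byte) could be sketched roughly as below. This is only an illustration under assumptions: RawSpanTracker and take() are hypothetical names, not part of Xerces-C, and the sketch assumes you still hold the raw, pre-transcoded input as a byte buffer and feed it the value getSrcOffset() reported after each event.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical helper, not part of Xerces-C: remembers the previous end
// offset so that each newly reported end offset can be turned into a slice
// of the raw (pre-transcoded) input.
class RawSpanTracker {
public:
    explicit RawSpanTracker(std::string rawBuffer)
        : raw_(std::move(rawBuffer)), prevEnd_(0) {}

    // endOffset is assumed to be the value reported after an event: the
    // position just past the last raw byte of the thing just handed over.
    // Returns the bytes in [previous end, endOffset), i.e. up to and
    // including the byte at endOffset - 1.
    std::string take(std::size_t endOffset) {
        assert(endOffset >= prevEnd_ && endOffset <= raw_.size());
        std::string span = raw_.substr(prevEnd_, endOffset - prevEnd_);
        prevEnd_ = endOffset;  // the next slice starts where this one ended
        return span;
    }

private:
    std::string raw_;
    std::size_t prevEnd_;
};
```

With the raw buffer "<a><b>hi</b></a>" and made-up end offsets 3, 6 and 8, successive take() calls would hand back "<a>", "<b>" and "hi". Whether the scanner reports exactly those boundaries depends on the transcoder and, per the note above, possibly on the input source, so it is worth checking against your own input first.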

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
