Thanks, David, for all the replies.
I am using the following code:
          XMLByte *s1;
          const XMLCh *s2 = (args[0]->str()).c_str();
          unsigned int charsEaten;
          XMLTransService::Codes code;
          try {
            CdxNote << "(args[0]->str()=" << (args[0]->str()) << CdxEndl;
            XMLTranscoder* t =
                XMLPlatformUtils::fgTransService->makeNewTranscoderFor(
                    (const XMLCh*) "UTF-32", code, (unsigned int) 16*1024);
            t->transcodeTo(s2, (unsigned int)(args[0]->str()).length(), s1,
                           200, charsEaten, XMLTranscoder::UnRep_Throw);
            //char *s1 = XMLString::transcode(((args[0]->str()).c_str()));
            CdxNote << "s1 = " << s1 << CdxEndl;
          }
          catch(XalanDOMException xExcp)
          {
            CdxNote << "Exception caught =" << xExcp.getExceptionCode() << CdxEndl;
          }

args[0]->str() is already UTF-16 encoded. So, when I try to encode
it to UTF-32, I get a segmentation fault on the line
t->transcodeTo(s2, (u....)
The stack trace is below:
#0  0x08f45bea in FunctionPerlRegexFindMatch::execute (this=0x28d64140,
[EMAIL PROTECTED],
    context=0x26e7a6c0, [EMAIL PROTECTED], locator=0x28d354b0)
at ... xalanRegex.hpp
#1  0x098c50f1 in xalanc_1_10::XPathEnvSupportDefault::extFunction ()
#2  0x09843db9 in xalanc_1_10::XSLTProcessorEnvSupportDefault::extFunction ()
#3  0x097e811e in xalanc_1_10::StylesheetExecutionContextDefault::extFunction ()
#4  0x098baa85 in xalanc_1_10::XPath::runExtFunction ()
#5  0x098bb5c2 in xalanc_1_10::XPath::executeMore ()
#6  0x0993fb03 in xalanc_1_10::ElemWithParam::startElement ()
#7  0x0993929c in xalanc_1_10::ElemTemplateElement::execute ()
#8  0x0981c712 in xalanc_1_10::StylesheetRoot::process ()
#9  0x09833bf3 in xalanc_1_10::XSLTEngineImpl::process ()
#10 0x0976a924 in xalanc_1_10::XalanTransformer::doTransform ()
#11 0x0976ac60 in xalanc_1_10::XalanTransformer::transform ()

What am I doing wrong here?
It would be a great help if you could give me a code snippet that uses the
transcodeTo() method.
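
Two things in the code above look like plausible causes of the crash. First,
s1 is never pointed at any storage, yet transcodeTo() is told it may write up
to 200 bytes through it. Second, the cast (const XMLCh*) "UTF-32" only
reinterprets the bytes of a narrow string literal; it does not produce a
UTF-16 encoding name, so makeNewTranscoderFor() may well fail to find a
transcoder, and neither t nor code is checked before t is used. The sketch
below does not use the Xerces-C transcoder at all. It is a hand-written,
standard C++11 stand-in that shows the two points that matter: the output
buffer is owned and sized by the caller side, and UTF-16 input is turned into
a UTF-8 std::string, which is the encoding suggested in the quoted reply. The
helper name utf16ToUtf8 is hypothetical and is not part of Xerces-C or
Xalan-C.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper (not a Xerces-C or Xalan-C API): converts `length`
// UTF-16 code units starting at `src` into a UTF-8 encoded std::string.
// CodeUnit is assumed to be a 16-bit type holding UTF-16 code units.
template <typename CodeUnit>
std::string utf16ToUtf8(const CodeUnit* src, std::size_t length)
{
    std::string out;
    // The output storage is owned here. One UTF-16 code unit expands to at
    // most 3 UTF-8 bytes (a surrogate pair is 2 units and yields 4 bytes).
    out.reserve(length * 3);

    for (std::size_t i = 0; i < length; ++i) {
        char32_t cp = static_cast<char32_t>(src[i]);

        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: combine it with the low surrogate that follows.
            if (i + 1 >= length)
                throw std::runtime_error("unpaired high surrogate");
            const char32_t low = static_cast<char32_t>(src[i + 1]);
            if (low < 0xDC00 || low > 0xDFFF)
                throw std::runtime_error("unpaired high surrogate");
            cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
            ++i;
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            throw std::runtime_error("unpaired low surrogate");
        }

        if (cp < 0x80) {
            // 1 byte: plain ASCII
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            // 2 bytes
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            // 3 bytes
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            // 4 bytes: code points above the Basic Multilingual Plane
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

With Xalan-C this would presumably be called as
utf16ToUtf8(args[0]->str().c_str(), args[0]->str().length()), assuming the
string's character type is a 16-bit UTF-16 code unit on the platform in
question. If the Xerces-C transcoder is used instead, the same two rules
should apply: pass a real buffer of known size as the output argument, and
check the result of makeNewTranscoderFor() before calling transcodeTo().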




On Fri, Oct 10, 2008 at 11:13 PM, David Bertoni <[EMAIL PROTECTED]> wrote:

> souri datta wrote:
>
>> Hi,
>> I have an external function which passes 3 arguments
>> (line, pattern, vector_of_matched_subexpression).
>> When 'line' contains Unicode characters, the transcode method
>> throws exception 202 (transcoding error).
>> The code looks like this:
>> CharVectorType inputVector;
>> (args[0]->str()).transcode(inputVector);
>>  //transcode appends one terminating NULL('\0') char which
>>  // is not part of the original string
>>  inputString.assign(inputVector.begin(),inputVector.end()-1);
>> I have removed the try..catch block here.
>>
>> How can I convert args[0]->str()  to std::string ?
>>
> You need to decide what encoding to use.  Clearly, the local code page will
> not support all of the characters you need.
>
>> (I need to pass this string to the boost::regex_search method to search
>> for the pattern.)
>>
> If boost::regex supports UTF-8, then you can transcode to UTF-8.  To get a
> UTF-8 transcoder, you can either use the Xalan-C function
> XalanTranscodingServices::makeNewTranscoder(), or the Xerces-C function
> XMLTransService::makeNewTranscoderFor().  Search the Xerces-C and Xalan-C
> code bases for examples of how to use the transcoders.
>
> If boost::regex doesn't support UTF-8, then you will need to decide what
> code page will support the characters you need, and create a transcoder for
> that code page.
>
> The larger problem of figuring out how the data you are searching with the
> regex is encoded is not something anyone can help you with.  You may need to
> transcode all of that data to a common encoding to make sure your regular
> expressions work correctly.
>
> Dave
>
