Thanks, David, for all the replies.

I am using the following code:

    XMLByte *s1;
    const XMLCh *s2 = (args[0]->str()).c_str();
    unsigned int charsEaten;
    XMLTransService::Codes code;
    try {
        CdxNote << "(args[0]->str()=" << (args[0]->str()) << CdxEndl;
        XMLTranscoder* t = XMLPlatformUtils::fgTransService->makeNewTranscoderFor(
            (const XMLCh*) "UTF-32", code, (unsigned int) 16*1024);
        t->transcodeTo(s2, (unsigned int)(args[0]->str()).length(), s1, 200,
                       charsEaten, XMLTranscoder::UnRep_Throw);
        //char *s1 = XMLString::transcode(((args[0]->str()).c_str()));
        CdxNote << "s1 = " << s1 << CdxEndl;
    }
    catch(XalanDOMException xExcp) {
        CdxNote << "Exception caught =" << xExcp.getExceptionCode() << CdxEndl;
    }

The string returned by args[0]->str() is already UTF-16 encoded. When I try
to transcode it to UTF-32, I get a segmentation fault on the line
t->transcodeTo(s2, (u....)

The stack trace is below:

    #0  0x08f45bea in FunctionPerlRegexFindMatch::execute (this=0x28d64140,
        [EMAIL PROTECTED], context=0x26e7a6c0, [EMAIL PROTECTED],
        locator=0x28d354b0) at ... xalanRegex.hpp
    #1  0x098c50f1 in xalanc_1_10::XPathEnvSupportDefault::extFunction ()
    #2  0x09843db9 in xalanc_1_10::XSLTProcessorEnvSupportDefault::extFunction ()
    #3  0x097e811e in xalanc_1_10::StylesheetExecutionContextDefault::extFunction ()
    #4  0x098baa85 in xalanc_1_10::XPath::runExtFunction ()
    #5  0x098bb5c2 in xalanc_1_10::XPath::executeMore ()
    #6  0x0993fb03 in xalanc_1_10::ElemWithParam::startElement ()
    #7  0x0993929c in xalanc_1_10::ElemTemplateElement::execute ()
    #8  0x0981c712 in xalanc_1_10::StylesheetRoot::process ()
    #9  0x09833bf3 in xalanc_1_10::XSLTEngineImpl::process ()
    #10 0x0976a924 in xalanc_1_10::XalanTransformer::doTransform ()
    #11 0x0976ac60 in xalanc_1_10::XalanTransformer::transform ()

What am I doing wrong here? It would be a great help if you could give me a
code snippet that uses the transcodeTo() method.

On Fri, Oct 10, 2008 at 11:13 PM, David Bertoni <[EMAIL PROTECTED]> wrote:

> souri datta wrote:
>
>> Hi,
>> I have an external function which takes 3 arguments
>> (line, pattern, vector_of_matched_subexpression).
>> When this 'line' contains a Unicode character, the transcode method
>> throws exception 202 (transcoding error).
>> The code looks like:
>>     CharVectorType inputVector;
>>     (args[0]->str()).transcode(inputVector);
>>     //transcode appends one terminating NULL('\0') char which
>>     // is not part of the original string
>>     inputString.assign(inputVector.begin(),inputVector.end()-1);
>> I have removed the try..catch block here.
>>
>> How can I convert args[0]->str() to std::string?
>>
> You need to decide what encoding to use. Clearly, the local code page will
> not support all of the characters you need.
>
>> (I need this string to be passed to the boost::regex_search method to
>> search for the pattern)
>>
> If boost::regex supports UTF-8, then you can transcode to UTF-8. To get a
> UTF-8 transcoder, you can either use the Xalan-C function
> XalanTranscodingServices::makeNewTranscoder(), or the Xerces-C function
> XMLTransService::makeNewTranscoderFor(). Search the Xerces-C and Xalan-C
> code bases for examples of how to use the transcoders.
>
> If boost::regex doesn't support UTF-8, then you will need to decide what
> code page will support the characters you need, and create a transcoder for
> that code page.
>
> The larger problem of figuring out how the data you are searching with the
> regex is encoded is not something anyone can help you with. You may need to
> transcode all of that data to a common encoding to make sure your regular
> expressions work correctly.
>
> Dave
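
For reference, here is a minimal sketch of the UTF-16 to UTF-8 step suggested above, written in plain standard C++ rather than with the Xerces-C or Xalan-C transcoders, so it does not show the transcodeTo() API itself. The helper name utf16ToUtf8 is hypothetical and is not part of either library. It assumes the UTF-16 code units from args[0]->str() have already been copied into a std::u16string. It throws on unpaired surrogates, roughly in the spirit of UnRep_Throw. Unlike a caller-supplied raw buffer, the output std::string here owns and grows its own storage.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper (NOT a Xalan-C or Xerces-C API): converts a sequence of
// UTF-16 code units into a UTF-8 encoded std::string.
std::string utf16ToUtf8(const std::u16string& in)
{
    std::string out;
    out.reserve(in.size() * 3);  // upper bound: at most 3 bytes per code unit

    for (std::size_t i = 0; i < in.size(); ++i) {
        std::uint32_t cp = in[i];

        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: it must be followed by a low surrogate.
            if (i + 1 >= in.size() || in[i + 1] < 0xDC00 || in[i + 1] > 0xDFFF)
                throw std::runtime_error("unpaired high surrogate");
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            ++i;  // the low surrogate has been consumed as well
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            throw std::runtime_error("unpaired low surrogate");
        }

        // Emit 1 to 4 bytes depending on the size of the code point.
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

The result is an ordinary byte string. Whether byte-wise UTF-8 matching is good enough for boost::regex_search depends on the patterns and on how the searched data is encoded, as discussed in the quoted reply.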