From the desk of Steven J. Hathaway: (some markup removed for posting...)
I hope these notes provide useful insight into XPath extended function programming in Apache Xalan-C/C++ version 1.11. This document collects some of my working notes from building libraries of XPath extended functions for Xalan-C/C++ based applications. My primary application interface in commercial products is XalanCAPI.h, together with custom headers that reference libraries of extended XPath function installers.

Using C in a C++ World

The 'C' programming language does not know about 'C++' namespaces and classes. System implementations may even have different memory management architectures for the 'C' malloc/free functions versus the 'C++' new/delete operators. Xalan-C/C++ (XSLT) and Xerces-C/C++ (XML) also share an application-specific plug-in memory management architecture.

In the 'C' programming domain, I use XalanCAPI.h as the primary interface. I create our own C-API XPath installers and uninstallers that are usable with XalanCAPI.h and XalanTransformer instances. These are declared in a 'C' language (.h) header file.

When creating the C-API install and uninstall functions, declare the 'C' wrapper functions inside an extern "C" block. The bodies of the 'C' wrapper functions can access classes in the various C++ namespaces.

    #if defined(__cplusplus)
    extern "C" {
    #endif

    /*
     * MemLocal determines if XPath functions are installed into the
     * XalanTransformer with process-global or thread-local context.
     */
    typedef enum {global=0, local=1} MemLocal;

    /*
     * In the XalanCAPI, the XalanHandle (void *) is actually a pointer to
     * an instance of the C++ XalanTransformer class.
     */

    c-value MyInstallXpathLibrary(...)
    {
        ... Global XPath Library Installer
        ... Thread-local XPath Library Installer
        void OSPXpathInstall(XalanHandle theHandle, MemLocal insMem);
    }

    c-value MyUninstallXpathLibrary(...)
    {
        ... Global XPath Library Uninstaller
        ... Thread-local XPath Library Uninstaller
        void OSPXpathUninstall(XalanHandle theHandle, MemLocal insMem);
    }

    #if defined(__cplusplus)
    }
    #endif

The above interfaces provide a C-API bridge into C++ classes and methods. The installer declarations go into a C language (.h) header file. Install and Uninstall are 'C' functions implemented in the C++ program environment. Therefore the declaration is in a C (.h) file and the definition is in a C++ (.cpp) program file. As an example, see the Xalan-C/C++ files XalanCAPI.h and XalanCAPI.cpp.

The XPath custom functions are extensions of the XALAN_CPP_NAMESPACE::Function class. I put these in a C++ language (.hpp) header file.

I currently extend the XALAN_CPP_NAMESPACE with the OSP extensions library. I will probably redefine these library functions later within their own C++ namespace, but then keeping track of namespace and class inheritance becomes a significant issue. On second thought, I may just keep them here and document them well.

This is a sample expanded XPath class definition from my OspXpathFunctionLib.hpp file.

    #include <xalanc/XPath/XPathDefinitions.hpp>
    #include <cfloat>
    #include <xalanc/Include/XalanMemoryManagement.hpp>
    #include <xalanc/XPath/Function.hpp>
    #include <xalanc/PlatformSupport/DoubleSupport.hpp>

The <cfloat> header supplies the limits and characteristics of the system float and double types. These are needed for the creation and conversion of XObjects with XML number content.

    XALAN_CPP_NAMESPACE_BEGIN

    class OSPXpathConvertDate : public Function
    {
    public:

        OSPXpathConvertDate();

        virtual
        ~OSPXpathConvertDate();

    #if !defined(XALAN_NO_USING_DECLARATION)
        using Function::execute;
    #endif

        // Dispatcher: receives the arguments as a vector
        virtual XObjectPtr
        execute(
                XPathExecutionContext&       executionContext,
                XalanNode*                   context,
                const XObjectArgVectorType&  args,
                const LocatorType*           locator) const;

        // One argument
        virtual XObjectPtr
        execute(
                XPathExecutionContext&  executionContext,
                XalanNode*              context,
                const XObjectPtr        arg1,
                const LocatorType*      locator) const;

        // Two arguments
        virtual XObjectPtr
        execute(
                XPathExecutionContext&  executionContext,
                XalanNode*              context,
                const XObjectPtr        arg1,
                const XObjectPtr        arg2,
                const LocatorType*      locator) const;

        // Three arguments
        virtual XObjectPtr
        execute(
                XPathExecutionContext&  executionContext,
                XalanNode*              context,
                const XObjectPtr        arg1,
                const XObjectPtr        arg2,
                const XObjectPtr        arg3,
                const LocatorType*      locator) const;

The clone() method creates a new thread-specific instance of this class, using the current instance (*this) as a template. Memory management is handled by "theManager", whose type is inherited from Xerces-C plug-in memory management.

        // Return the base type only when the compiler lacks
        // covariant return types; this must match the .cpp definition.
    #if defined(XALAN_NO_COVARIANT_RETURN_TYPE)
        virtual Function*
    #else
        virtual OSPXpathConvertDate*
    #endif
        clone(MemoryManagerType& theManager) const;

    protected:

        virtual const XalanDOMString&
        getError(XalanDOMString& theResult) const;

    private:

        // Not implemented...
        OSPXpathConvertDate&
        operator=(const OSPXpathConvertDate&);

        bool
        operator==(const OSPXpathConvertDate&) const;

        // static const XObjectPtr s_nullXObjectPtr;
    };

    XALAN_CPP_NAMESPACE_END

I use the same format as above to declare my other extended XPath functions.

Define some unicode strings for the URI namespaces and XPath function names. I am currently using "http://www.oregon.gov/OSP/CJIS/xml/xpath" as the namespace_uri for custom XPath functions.
(The U'...' notation below is posting shorthand, not literal C++ syntax. Each array holds the characters of the named string as XalanDOMChar values, ending with a zero terminator.)

    using namespace XALAN_CPP_NAMESPACE;

    static const XalanDOMChar s_OSPXpathNamespace[]        = { U'namespace_uri' };
    static const XalanDOMChar s_OSPXpathConvertDate[]      = { U'cvdate' };
    static const XalanDOMChar s_OSPXpathInitTableDB[]      = { U'initTableDb' };
    static const XalanDOMChar s_OSPXpathDropTableDB[]      = { U'dropTableDb' };
    static const XalanDOMChar s_OSPXpathSetTableKeyValue[] = { U'...' };
    static const XalanDOMChar s_OSPXpathGetTableKeyValue[] = { U'...' };
    static const XalanDOMChar s_OSPXpathGetTableRowValue[] = { U'...' };
    static const XalanDOMChar s_OSPXpathSortTableKeys[]    = { U'...' };
    static const XalanDOMChar s_OSPXpathSortTableValues[]  = { U'...' };
    static const XalanDOMChar s_OSPXpathInvertTable[]      = { U'...' };
    static const XalanDOMChar s_OSPXpathDropTableName[]    = { U'...' };
    static const XalanDOMChar s_OSPXpathDropTableKey[]     = { U'...' };
    static const XalanDOMChar s_OSPXpathDropTableRow[]     = { U'...' };
    static const XalanDOMChar s_OSPXpathGetTableToXml[]    = { U'...' };

Initialize the static XPath Function classes.

    static const OSPXpathConvertDate      s_OSPXpathConvertDateFunction;
    static const OSPXpathInitTableDB      s_OSPXpathInitTableDBFunction;
    static const OSPXpathDropTableDB      s_OSPXpathDropTableDBFunction;
    static const OSPXpathGetTableRowCount s_OSPXpathGetTableRowCountFunction;
    static const OSPXpathSetTableKeyValue s_OSPXpathSetTableKeyValueFunction;
    static const OSPXpathGetTableKeyValue s_OSPXpathGetTableKeyValueFunction;
    static const OSPXpathGetTableKeyRow   s_OSPXpathGetTableKeyRowFunction;
    static const OSPXpathGetTableRowKey   s_OSPXpathGetTableRowKeyFunction;
    static const OSPXpathGetTableRowValue s_OSPXpathGetTableRowValueFunction;
    static const OSPXpathSortTableKeys    s_OSPXpathSortTableKeysFunction;
    // "Function" suffix keeps these two distinct from the
    // XalanDOMChar name arrays of the same base name.
    static const OSPXpathSortTableValues  s_OSPXpathSortTableValuesFunction;
    static const OSPXpathInvertTable      s_OSPXpathInvertTableFunction;
    static const OSPXpathDropTableName    s_OSPXpathDropTableNameFunction;
    static const OSPXpathDropTableKey     s_OSPXpathDropTableKeyFunction;
    static const OSPXpathDropTableRow     s_OSPXpathDropTableRowFunction;
    static const OSPXpathGetTableToXml    s_OSPXpathGetTableToXmlFunction;

XPath Extended Functions Programming

The actual ORcvdate(...) function that performs date string conversion is a 'C' function, not a C++ method. The XPath function is a C++ class, implemented as a function-specific class extending public Function. The execute methods call the 'C' language ORcvdate(...) to perform the actual date string conversion. Likewise, our XPath table emulation functions call a linked-list library to implement a virtual table (tablename/key/value) architecture.

This annotated code sample is based on our OspXpathConvertDate.cpp file.

    #include <cstdlib>      // atoi, free
    #include <cstring>      // strlen

    #include "orcvdate.h"
    #include "OspXpathFunctionLib.hpp"

    #include <xalanc/XPath/XPathEnvSupportDefault.hpp>
    #include <xalanc/PlatformSupport/DoubleSupport.hpp>
    #include <xalanc/PlatformSupport/XalanMessageLoader.hpp>
    #include <xalanc/XPath/XObjectFactory.hpp>

    // For conditional compile - xalan version dependencies
    // (i.e. error message reporting changes)
    #include <xalanc/Include/XalanVersion.hpp>

    OSPXpathConvertDate::OSPXpathConvertDate():
        Function()
    {
    }

    OSPXpathConvertDate::~OSPXpathConvertDate()
    {
    }

    // The clone() method.
    #if defined(XALAN_NO_COVARIANT_RETURN_TYPE)
    Function*
    #else
    OSPXpathConvertDate*
    #endif
    OSPXpathConvertDate::clone(MemoryManagerType& theManager) const
    {
        return XalanCopyConstruct(theManager, *this);
    }

    // The Empty String
    static const XalanDOMString theEmptyString(XalanMemMgrs::getDummyMemMgr());

    // The createEmptyString() method
    inline XObjectPtr
    createEmptyString(XPathExecutionContext& executionContext)
    {
        return executionContext.getXObjectFactory().createStringReference(theEmptyString);
    }

The XPath interpreter always calls the Function::execute() method that takes XObjectArgVectorType& args. The standard Xalan Function::execute() overloads support up to three arguments. If more than three arguments are required, then the vector form of Function::execute() must be overridden in the child class to handle them.

    XObjectPtr
    OSPXpathConvertDate::execute(
            XPathExecutionContext&       executionContext,
            XalanNode*                   context,
            const XObjectArgVectorType&  args,
            const LocatorType*           locator) const
    {
        const XObjectArgVectorType::size_type theArgCount = args.size();

        if (theArgCount == 0)
        {
            return execute(executionContext, context, locator);
        }
        else if (theArgCount == 1)
        {
            return execute(executionContext, context, args[0], locator);
        }
        else if (theArgCount == 2)
        {
            return execute(executionContext, context, args[0], args[1], locator);
        }
        else if (theArgCount == 3)
        {
            return execute(executionContext, context, args[0], args[1], args[2], locator);
        }
        else
        {
            XalanDOMString theBuffer(executionContext.getMemoryManager());

    #if (_XALAN_VERSION == 11000)
            // xalanc-1_10 error handler
            executionContext.error(getError(theBuffer), context, locator);
    #endif

    #if (_XALAN_VERSION >= 11100)
            // xalanc-1_11 error handler
            executionContext.problem(
                XPathExecutionContext::eXPath,
                XPathExecutionContext::eError,
                getError(theBuffer),
                locator,
                context);
    #endif

    #if (_XALAN_VERSION < 11000)
    #error "XALAN Library Version must be 1_10 or newer"
    #endif

            return XObjectPtr(0);
        }
    }

The one- and two-argument overloads of OSPXpathConvertDate::execute(...) call the three-argument overload, with the missing trailing arguments given default values.

    XObjectPtr
    OSPXpathConvertDate::execute(
            XPathExecutionContext&  executionContext,
            XalanNode*,             /* context - not used */
            const XObjectPtr arg1,  /* unconverted source date */
            const XObjectPtr arg2,  /* the conversion format */
            const XObjectPtr arg3,  /* the +- floating year window */
            const LocatorType*      /* locator - not used */
            ) const
    {
        XObjectPtr xobjReturnPtr;

        if (arg1.null() == true)
        {
            return XObjectPtr(0);   // not a useful XObject type
        }

        CharVectorType  theSourceDate;
        char *          charSourceDate;
        CharVectorType  theConvFormat;
        const char *    charConvFormat;  // const: may point at a string literal
        CharVectorType  theYearWindow;
        char *          charYearWindow;
        int             iYearWindow;

        MemoryManagerType & theManager = executionContext.getMemoryManager();

        // How to get the arguments into 'C' string format:

        theSourceDate = TranscodeToLocalCodePage(arg1->str());
        charSourceDate = theSourceDate.begin();

        if (arg2.null() == true)
        {
            charConvFormat = "default format string";
        }
        else
        {
            theConvFormat = TranscodeToLocalCodePage(arg2->str());
            charConvFormat = theConvFormat.begin();
            if (strlen(charConvFormat) == 0)
                charConvFormat = "default format string";
        }

        if (arg3.null() == true)
        {
            iYearWindow = 10;   // (today + window) = previous century
        }
        else
        {
            theYearWindow = TranscodeToLocalCodePage(arg3->str());
            charYearWindow = theYearWindow.begin();
            if (strlen(charYearWindow) > 0)
                iYearWindow = atoi(charYearWindow);
            else
                iYearWindow = 10;
        }

        // Call the 'C' language function to convert dates

        const char * charResult = ORcvdate(
            charSourceDate,          // the unconverted date string
            strlen(charSourceDate),  // length of unconverted date string
            charConvFormat,          // the conversion format string
            iYearWindow);            // the floating year window

        if (charResult == charSourceDate)
        {
            // no format conversion took place, so we just return arg1.
            return arg1;    // the XObjects are reference counted.
        }

        // Create a string structure from the 'C' string returned
        // from ORcvdate().

        XalanDOMString theResult(charResult);

        // Create the return object for the XPath interpreter

        xobjReturnPtr = executionContext.getXObjectFactory().createString(theResult);

        // Free the charResult if necessary (free() takes a non-const pointer)

        if (charResult && (charResult != charSourceDate))
            free(const_cast<char *>(charResult));

        // The allocation for xobjReturnPtr is inherited by the caller
        // of the Function::execute() method.

        return xobjReturnPtr;
    }

The following method supplies a specific error if the function cannot be executed. This example uses the XalanMessageLoader, which creates an XalanDOMString. You could instead return your own XalanDOMString without using the message loader.

    const XalanDOMString&
    OSPXpathConvertDate::getError(XalanDOMString& theResult) const
    {
        return XalanMessageLoader::getMessage(
            theResult,
            XalanMessages::FunctionTakesTwoOrThreeArguments_1Param,
            "local:cvdate(anydate,fmt,ywin)");
    }

    XALAN_CPP_NAMESPACE_END

End of annotated example for an XPath function that calls a 'C' language library routine.

Error Reporting Interface

The executionContext.error() method was the interface to the ProblemListener as documented in the sample XPath functions in the Xalan 1.10 release. The interface has changed to use the more complete executionContext.problem() method. The sample code provided above works with the error reporting of both versions of Xalan-C/C++ (1.10 and 1.11).

Xalan/Xerces Plug-in Memory Management

The sample code above works with Xerces plug-in memory management. I also show how to convert node content into 'C' strings for use with 'C' language library functions.

Most of the memory management issues revolve around which context has ownership of objects versus pointers to objects owned by other contexts. The ownership and transfers must be well coordinated.
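
The ownership rule used in the conversion code above (free the 'C' result only when the library allocated it, never when it hands back the caller's own buffer) can be sketched without Xalan. This is a minimal sketch under stated assumptions: demo_convert() is a hypothetical stand-in for a routine like ORcvdate(), and CResultGuard and convertDate() are illustrative names of my own, not part of Xalan or of the OSP library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>

// Hypothetical stand-in for a 'C' routine such as ORcvdate(). It returns
// the input pointer when nothing is converted, otherwise a malloc'd
// buffer that the caller owns and must free.
extern "C" const char* demo_convert(const char* src, std::size_t len)
{
    if (len != 8 || std::strspn(src, "0123456789") != 8)
        return src;                         // unchanged: caller must NOT free

    char* out = static_cast<char*>(std::malloc(11));
    if (out == 0)
        return src;                         // allocation failed: no conversion

    // YYYYMMDD -> YYYY-MM-DD
    std::memcpy(out, src, 4);         out[4] = '-';
    std::memcpy(out + 5, src + 4, 2); out[7] = '-';
    std::memcpy(out + 8, src + 6, 2); out[10] = '\0';
    return out;
}

// Guard that frees the result only when the callee allocated it.
class CResultGuard
{
public:
    CResultGuard(const char* result, const char* input) :
        m_result(result),
        m_owned(result != 0 && result != input)
    {
    }

    ~CResultGuard()
    {
        if (m_owned)
            std::free(const_cast<char*>(m_result));
    }

    const char* get() const { return m_result; }

private:
    // Not implemented...
    CResultGuard(const CResultGuard&);
    CResultGuard& operator=(const CResultGuard&);

    const char* m_result;
    bool        m_owned;
};

// Copies the converted text into a C++ string before the guard releases it.
std::string convertDate(const std::string& source)
{
    CResultGuard guard(demo_convert(source.c_str(), source.size()),
                       source.c_str());
    return std::string(guard.get());
}
```

Calling convertDate("20240131") goes through the allocated path and convertDate("n/a") through the pass-through path. In both cases the 'C' buffer never escapes the function, so the C++ side owns only C++ objects.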

The Apache memory management libraries also have constructs known as janitors, which are used with shared string pools. They allow objects to have multiple ownership (incremental reference counts), such that objects remain allocated while the reference count is greater than zero. This provides some simple garbage collection in a C++ world.

Memory management issues can arise if care is not taken to isolate Java memory management, Microsoft CRT memory management, Xerces-C plug-in memory management, and Microsoft framework memory management.

Microsoft programs use the Microsoft framework and what they call "managed" code for garbage collection. Microsoft assemblies are executable program modules that work within the Microsoft framework. The Apache libraries do not use the Microsoft framework and its assemblies.

In the Microsoft world, the Apache libraries must be treated as "unmanaged" code because they are not designed to work within the Microsoft framework with its managed garbage-collection heap. Static Apache libraries cannot be linked with Microsoft assemblies. Apache DLLs can be referenced from within Microsoft assemblies through the Microsoft data marshals.

Using C in a Java World

I try to avoid the Java Native Interface (JNI) for C. But when integrating complex systems, where components are designed by separate enterprises, the JNI glue becomes necessary. Each version of Sun Java has its own implementation of JNI, but the interface definitions are reasonably stable. For each new Java upgrade, you may need to relink your 'C' language application with the new JNI interface library.

Java and CPP with XSLT

In my experience, Java and C++ XSLT architectures diverge in how easily the XSLT implementations can be extended. C++ with XALAN can readily add new XPath functions for stylesheets. Java appears better at creating new XSLT extension elements for stylesheets.
The W3 recommendations specify the markup and XML namespace requirements for adding custom XPath functions and custom XSLT elements.
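
As a closing illustration of the C-API bridge described under "Using C in a C++ World", here is a minimal sketch of the pattern: a 'C' header section that exposes an opaque handle, and a C++ section that casts the handle back to a class instance. Every name here (DemoHandle, DemoMemLocal, the demo* functions, demo::Transformer) is hypothetical and only stands in for XalanHandle, MemLocal, the OSP installers and XalanTransformer. It does not call any Xalan API.

```cpp
#include <cassert>
#include <cstddef>

/* ---- what would go in the 'C' header (.h) ---- */
#if defined(__cplusplus)
extern "C" {
#endif

typedef void* DemoHandle;   /* opaque pointer seen by 'C' callers */
typedef enum { demoGlobal = 0, demoLocal = 1 } DemoMemLocal;

DemoHandle demoCreate(void);
int        demoInstall(DemoHandle theHandle, DemoMemLocal insMem);
int        demoInstalledCount(DemoHandle theHandle);
void       demoDestroy(DemoHandle theHandle);

#if defined(__cplusplus)
}
#endif

// ---- what would go in the C++ source (.cpp) ----
namespace demo {

// Hypothetical stand-in for a transformer that keeps per-instance
// (local) and process-wide (global) function registrations.
class Transformer
{
public:
    Transformer() : m_local(0) {}

    void installLocal()         { ++m_local; }
    static void installGlobal() { ++s_global; }
    int count() const           { return m_local + s_global; }

private:
    int        m_local;
    static int s_global;
};

int Transformer::s_global = 0;

} // namespace demo

// The wrappers have 'C' linkage but their bodies are ordinary C++.
extern "C" DemoHandle demoCreate(void)
{
    return new demo::Transformer;
}

extern "C" int demoInstall(DemoHandle theHandle, DemoMemLocal insMem)
{
    if (theHandle == 0)
        return -1;                      // report bad handle to the 'C' caller

    demo::Transformer* t = static_cast<demo::Transformer*>(theHandle);

    if (insMem == demoLocal)
        t->installLocal();
    else
        demo::Transformer::installGlobal();

    return 0;
}

extern "C" int demoInstalledCount(DemoHandle theHandle)
{
    if (theHandle == 0)
        return -1;

    return static_cast<demo::Transformer*>(theHandle)->count();
}

extern "C" void demoDestroy(DemoHandle theHandle)
{
    delete static_cast<demo::Transformer*>(theHandle);
}
```

A 'C' caller sees only the handle, the enum and four functions. Objects created with new on the C++ side are also destroyed on the C++ side, so the malloc/free and new/delete domains never mix.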