David,
Here are my comments on how to import XMLCh into 'C' language
products supported by XALAN libraries (i.e. XalanCAPI.h).
The XERCES-C build creates the <xercesc/util/Xerces_autoconf_config.hpp>
include file, which contains a variety of platform-specific information.
This include file is normally accessed by <xercesc/util/XercesDefs.hpp>.
XERCES-C for UNIX configures Xerces_autoconf_config.hpp to contain
#define SIZEOF_WCHAR_T sizeof(wchar_t)
#define XERCES_XMLCH_T wchar_t
typedef XERCES_XMLCH_T XMLCh;
The wchar_t may be 16-bit or 32-bit depending upon the system libraries
installed on any given platform. XERCES-C only uses wchar_t to store
UTF-16 data.
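The width of wchar_t on a given platform can be checked directly from
'C'. This is only a minimal sketch; the helper names wchar_t_bits and
wchar_t_is_16_bit are made up for illustration and are not part of
XERCES-C or XALAN.

```c
#include <limits.h>  /* CHAR_BIT */
#include <stddef.h>  /* wchar_t */

/* Hypothetical helper: width of wchar_t in bits on this platform. */
static int wchar_t_bits(void)
{
    return (int)(sizeof(wchar_t) * CHAR_BIT);
}

/* Nonzero when wchar_t is exactly 16 bits wide, i.e. one element
   holds one UTF-16 code unit with no spare bits. */
static int wchar_t_is_16_bit(void)
{
    return wchar_t_bits() == 16;
}
```

Calling wchar_t_bits() from a small test program on each target
platform shows which of the two cases applies there.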
XERCES-C for Windows GNU configures Xerces_autoconf_config.hpp to contain
if wchar_t is found
#define XERCES_XMLCH_T wchar_t
else
#define XERCES_XMLCH_T unsigned short
typedef XERCES_XMLCH_T XMLCh;
XERCES-C for Visual Studio replaces Xerces_autoconf_config.hpp with
a copy of Xerces_autoconf_config.msvc.hpp which contains
#ifdef _NATIVE_WCHAR_T_DEFINED
# define XERCES_XMLCH_T wchar_t
#else
# define XERCES_XMLCH_T unsigned short
#endif
typedef XERCES_XMLCH_T XMLCh;
-------------
On GNU systems, international Unicode support is built around
a 32-bit wchar_t data type.
Microsoft Windows uses a 16-bit wchar_t to store UTF-16 Unicode;
its own MBCS multi-byte character sets are stored in char instead.
-------------
Support of XMLCh import into XalanCAPI.h
Reference the autoconfig in <xalanc/Include/PlatformDefinitions.hpp>
to resolve XMLCh when a 'C' compiler is used. The autoconfig file
has no C++ language dependencies whereas XercesDefs.hpp has C++
namespace scope declarations.
#if defined(__cplusplus)
#include <xercesc/util/XercesDefs.hpp>
#else
#include <xercesc/util/Xerces_autoconf_config.hpp>
#endif
Note: XalanCAPI.h has the following typedef that requires XMLCh.
typedef XMLCh XalanUTF16Char;
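To illustrate what a 'C' translation unit ends up with, here is a small
sketch. It does not include the real XERCES header: the XERCES_XMLCH_T
definition below is a local stand-in (an assumption for illustration)
for what Xerces_autoconf_config.hpp would provide, and the two helper
functions are hypothetical names.

```c
#include <limits.h>  /* CHAR_BIT */

/* Stand-in for <xercesc/util/Xerces_autoconf_config.hpp>.
   Assumption: in a real build this macro comes from that header. */
#ifndef XERCES_XMLCH_T
#define XERCES_XMLCH_T unsigned short
#endif
typedef XERCES_XMLCH_T XMLCh;

/* The typedef from XalanCAPI.h that needs XMLCh. */
typedef XMLCh XalanUTF16Char;

/* Hypothetical helper: width of XalanUTF16Char in bits. */
static int xalan_utf16_char_bits(void)
{
    return (int)(sizeof(XalanUTF16Char) * CHAR_BIT);
}

/* Hypothetical helper: round-trip the largest 16-bit code unit
   through XalanUTF16Char; returns 1 if the value survives. */
static int holds_utf16_code_unit(void)
{
    XalanUTF16Char c = (XalanUTF16Char)0xFFFFu;
    return (unsigned long)c == 0xFFFFul;
}
```

The same checks should pass unchanged if XERCES_XMLCH_T is wchar_t,
whether that type is 16 or 32 bits wide.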
The XMLCh is capable of holding at least 16 bits of information and is
usually defined with the underlying wchar_t base type. If wchar_t is
unusable and the platform is Windows, then unsigned short may be used.
Using unsigned short may be a problem on some UNIX/GNU systems.
None of the UNIX and GNU configure routines for Xerces create C++
specific entries in the <xercesc/util/Xerces_autoconf_config.hpp> file.
The XERCES windows replacements for Borland and MSVisualC platforms
also have no C++ specific entries. It therefore appears safe to
include this file in the XALAN system for 'C' language products.
This will make sure that XMLCh is defined properly for the
supported UNIX and Windows system libraries. Note that XERCES is
based on UTF-16 internal to its libraries. The underlying wchar_t
is 32 bits wide on many UNIX systems (but not all UNIX systems).
The UNIX configure does not substitute a 16-bit data type for a
32-bit wchar_t type. Such a replacement could break the UNIX/GNU
locale libraries.
The proper place to do the include is <xalanc/Include/PlatformDefinitions.hpp>.
#if defined(__cplusplus)
#include <xercesc/util/XercesDefs.hpp>
#else
#include <xercesc/util/Xerces_autoconf_config.hpp>
#endif
Steven J. Hathaway
=== REFERENCE ===
The following defines XMLCh as a 16-bit value unless Xerces-C has defined it as
a 32-bit value for Linux using GNU glibc.
#ifndef XMLCh
#define XMLCh unsigned short
#endif
Xerces-C will _never_ define XMLCh as a 32-bit value. It is intended to
store a UTF-16 code unit, so it must be a 16-bit value.
In addition, your use of #ifndef XMLCh will not work, because XMLCh is a
typedef, not a #define.
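A small 'C' sketch of why the test fails. The typedef below is a
stand-in for the one Xerces-C provides, and the macro and function
names are hypothetical, chosen only for this illustration.

```c
/* Stand-in for the Xerces-C typedef (assumption for illustration). */
typedef unsigned short XMLCh;

/* The preprocessor only knows about macros. A typedef name is
   invisible to #ifdef / #ifndef, so this test sees XMLCh as
   "not defined" even though the type exists. */
#ifndef XMLCh
#define XMLCH_SEEN_BY_PREPROCESSOR 0
#else
#define XMLCH_SEEN_BY_PREPROCESSOR 1
#endif

/* Hypothetical helper: reports which branch was taken above. */
static int xmlch_seen_by_preprocessor(void)
{
    return XMLCH_SEEN_BY_PREPROCESSOR;
}
```

The function returns 0, so the #ifndef branch is always taken and the
fallback #define would be applied whether or not Xerces-C has already
supplied XMLCh.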
I will need to figure out another way of getting at the definition of
XMLCh from Xerces-C.
Dave