Hello all,

We recently switched from Xerces-C2.2 to Xerces-C2.3 for some of our products' XML parsing and are having problems with the memory management features.
When running a COM object that uses Xerces-C2.3 (DOM or SAX - we use them both in different objects with the same results) on an IIS server with five clients interrogating the objects, each of the objects starts failing after a few hours with problems related to the XML parsing. Rolling back to 2.2 helped: there were no more crashes.

I've been looking into the memory manager's code in the 2.3 codebase and in current CVS, and have read through the discussions surrounding the memory manager, where I found some interesting remarks from back in May.

The XMemory class uses a global memory manager that defaults to MemoryManagerImpl. The latter is in charge of allocating and freeing memory and returns a pointer to that memory to XMemory. XMemory does some management of its own, however, "aligning" the pointer it returns by prepending a "header" containing a pointer to the memory manager - which is always the same, because it is a global and not as configurable as one might want to believe.

A few things bother me in the current setup, first among which is this alignment. The memory is aligned to sizeof(void*) or sizeof(double), whichever is greater. On MSVC (the compiler I'm using at the moment) this results in an alignment to 8 bytes. Looking through the code of the debug versions of malloc and realloc in MSVC, it looks like that pair aligns to 16 bytes - the size of a "paragraph". We will be testing with a 16-byte alignment shortly.

Another thing that bothers me is that we have no choice whether or not we want to use this setup: I would personally have preferred a setup in which the memory manager is stored in a map keyed by the block's address, or something similar (i.e. keep the housekeeping outside of the block).
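
To make the "housekeeping outside of the block" idea concrete, here is a rough sketch of a side table that maps block addresses to their managers. It is not Xerces code: Manager, MallocManager and ManagerTable are hypothetical names, and a plain std::mutex stands in for whatever non-blocking scheme one would really want. The point is only that the pointer coming back from the manager is handed out unmodified, so no header is prepended and the manager's own alignment is preserved.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <mutex>
#include <unordered_map>

// Hypothetical stand-in for the Xerces MemoryManager interface.
struct Manager {
    virtual ~Manager() = default;
    virtual void* allocate(std::size_t size) = 0;
    virtual void deallocate(void* block) = 0;
};

// Trivial manager that forwards to malloc/free.
struct MallocManager : Manager {
    void* allocate(std::size_t size) override { return std::malloc(size); }
    void deallocate(void* block) override { std::free(block); }
};

// Side table: block address -> manager that owns the block.
class ManagerTable {
public:
    void* allocate(Manager& manager, std::size_t size) {
        void* block = manager.allocate(size);  // returned as-is, no header
        if (block == nullptr) return nullptr;
        try {
            std::lock_guard<std::mutex> lock(mutex_);
            owners_[block] = &manager;
        } catch (...) {
            manager.deallocate(block);         // do not leak if the table cannot grow
            throw;
        }
        return block;
    }

    void deallocate(void* block) {
        if (block == nullptr) return;
        Manager* owner = nullptr;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            auto it = owners_.find(block);
            assert(it != owners_.end());       // block must come from allocate()
            if (it == owners_.end()) return;
            owner = it->second;
            owners_.erase(it);
        }
        owner->deallocate(block);              // freed by the manager that made it
    }

    Manager* ownerOf(void* block) const {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = owners_.find(block);
        return it == owners_.end() ? nullptr : it->second;
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return owners_.size();
    }

private:
    mutable std::mutex mutex_;
    std::unordered_map<void*, Manager*> owners_;
};
```

The cost is one table lookup per allocation and deallocation, in exchange for blocks that carry no in-band bookkeeping.
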
Keeping the housekeeping outside of the block not only guards you against corruption in case of buffer over- or underrun (which, as suggested in the discussion in May, is not something one should be too concerned about) but also gets rid of any alignment problem that might come up because the hypothesis that (sizeof(void*) > sizeof(double) ? sizeof(void*) : sizeof(double)) is always the right alignment does not hold. Of course, you'd have to deal with the hypothesis that pointers can be compared with each other, but that particular hypothesis is already made in the Xerces codebase (most notably in DOMCasts, which includes some pointer arithmetic whose semantics are undefined). There are various non-blocking thread-safe algorithms Out There that would allow one to map against a pointer in a fast and thread-safe manner.

Another thing that bothers me in this setup is that one might be led to believe that running Initialize from a thread with a different memory manager will actually change the memory manager per thread. The reason for this is that the API documentation at http://xml.apache.org/xerces-c/apiDocs/classXMLPlatformUtils.html#z553_0 makes no mention of the memory manager, but the source code changes the memory manager *before* checking whether Initialize has already been invoked. It bothers me that the memory manager being changed is global and is therefore changed from under the very noses of all the other threads. There is of course a little thing called thread-local storage that could have been used in this case, but that wouldn't fit the portability goal of the project, so I'd propose to move the assignment of the memory manager to after the check for whether initialization has already occurred.

Note that if you don't do this, you have a race condition in XMemory, as it reads the pointer to the memory manager twice: once to call it and once to put it in the buffer. Between those two reads, the memory manager can be changed.
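
The proposed reordering of Initialize could look roughly like the sketch below. This is not the Xerces source: the sketch namespace, initialize(), manager(), gInitCount and gManager are names made up for illustration, the std::mutex is my own addition to keep the example well-defined, and the matching Terminate is left out. What it shows is only the ordering: the "already initialised?" check comes first, so a later call with a different manager cannot swap the global pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <mutex>

// Hypothetical stand-in for the Xerces MemoryManager interface.
struct Manager {
    virtual ~Manager() = default;
    virtual void* allocate(std::size_t size) = 0;
    virtual void deallocate(void* block) = 0;
};

// Trivial manager that forwards to malloc/free.
struct MallocManager : Manager {
    void* allocate(std::size_t size) override { return std::malloc(size); }
    void deallocate(void* block) override { std::free(block); }
};

namespace sketch {

std::mutex gInitMutex;        // guards the two variables below during initialize()
long gInitCount = 0;          // how many times initialize() has been called
Manager* gManager = nullptr;  // process-wide manager, written exactly once

// Stand-in for an Initialize that checks before it assigns.
void initialize(Manager* requested = nullptr) {
    static MallocManager defaultManager;
    std::lock_guard<std::mutex> lock(gInitMutex);
    if (gInitCount++ > 0) {
        return;               // already initialised: `requested` is ignored
    }
    gManager = requested ? requested : &defaultManager;
}

// Accessor for allocation code; only valid after initialize().
Manager& manager() {
    assert(gManager != nullptr && "initialize() has not been called");
    return *gManager;
}

} // namespace sketch
```

Once the first call has installed the manager, the pointer is only ever read, so reading it twice during an allocation can no longer yield two different values.
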

For reference, this is the operator in question (READ ONE and READ TWO mark the two reads):

    void* XMemory::operator new(size_t size)
    {
        size_t headerSize = XMLPlatformUtils::alignPointerForNewBlockAllocation(
            sizeof(MemoryManager*));
        // READ ONE
        void* const block = XMLPlatformUtils::fgMemoryManager->allocate
        (
            headerSize + size
        );
        // READ TWO
        *(MemoryManager**)block = XMLPlatformUtils::fgMemoryManager;
        return (char*)block + headerSize;
    }

Of course, the documentation says it's a per-process initialisation, but I don't like inviting disaster.

My questions:

* Will patches aiming to
    a. make XMemory optional at compile-time
    b. make XMemory not try to align the memory and always use the same memory manager
    c. make XMemory optionally (compile-time option) align or keep a separate managers table
  ... be accepted or refused beforehand? (Pick one of the three goals, please; I personally prefer either b or c.)

Rationale for b: the memory manager is per-process and should be dealt with as such. Once it is installed with the first Initialize (which needs a patch to make it so) it won't be changed. The pointer to it will henceforth only be read.

Rationale for c: if the memory manager is to be optional on a per-allocation basis, the alignment may severely screw up assumptions made by the compiler or restrictions imposed by the architecture. In that case, a table that maps between pointers and their managers is much safer than aligning anyway.

Rationale for a: if b and c are unacceptable, at least leave the user a choice.

IMHO, pluggable memory management is a Good Thing, but the current implementation leaves something to be desired and is currently suspected of being the cause of our crashes. It needs work.

If there is a resounding "yes" to one of the three options from this list, I can talk with management to allocate time for the development. If not, we'll probably fork off an in-house XML parser.

rlc

--
Having the fewest wants, I am nearest to the gods.
-- Socrates