> The XMemory class uses a global memory manager class thet defaults to
> MemoryManagerImpl. This latter is in charge of allocating and freeing
memory
> and will return a pointer to that memory to XMemory. XMemory does some
> management of its own, however, "aligning" the pointer it returns by
> pre-pending a "header" containing a pointer to the memory manager - which
is
> always the same because it is a global and not as configurable as one
might
> want to believe.

Not true.  For example, you can create parser instances with their own
memory managers.  The global memory manager is used by default if you don't
provide one, or if Xerces needs memory for global data.

I believe the goal was to have:

   1. a memory manager instance for data which the system considers global
   and that persists for as long as the system is running.

   2. the option to use a separate memory manager instance for other
   services.  For example, you can provide a memory manager to a parser
   instance, separating that instances memory allocations from any global
   allocations.

There was never any intention to allow for thread-specific global memory
managers.

> A few things bother me in the current setup, first among which is this
> alignment. The memory is aligned to sizeof(void*) or sizeof(double),
whichever
> is greater. On MSVC (the compiler I'm using at the moment) this results
in
> an alignment to 8 bytes.
>
> Looking through the code of the debug versions of malloc and realloc in
MSVC,
> it looks like this couple aligns to 16 bytes - the size of a "paragraph".
We
> will be testing with a 16-byte alignment shortly.

This has been a point of great debate, but you can switch the alignment at
compile time by defining XML_PLATFORM_NEW_BLOCK_ALIGNMENT.

> Another thing that bothers me is that we have no choice wether or not we
want
> to use this setup: I would personally have preferred a setup in which the
> memory manager were stored in a map referenced by the block, or something
> similar (i.e. keep the housekeeping outside of the block). This doesn't
only
> guard you against corruption in case of buffer over- or underrun (which,
as
> suggested in the discussion in may, is not something one should be too
> concerned about) but also gets rid of any alignment problem that might
come up
> because the hypothesis that (sizeof(void*) > sizeof(double) ?
sizeof(void*) :
> sizeof(double)) is always the right alignment.

This is just a bogus argument now as it was then.  Buffer over-runs and
under-runs have the potential to corrupt something, regardless of whether
or not you store some critical control information at the beginning of a
block of memory.  Yes, in this case one might corrupt the pointer to the
memory manager, but if that weren't there, it could just as easily corrupt
data used by the C/C++ run-time heap memory manager, or even your
application's data.  Can you really make the case that one is any better
than the other?  Personally, I'd rather see my application crash than have
potentially important data silently corrupted.

Using an association table for memory managers and blocks has the potential
to be a performance bottleneck, because access to the table has to be
synchronized.

> Another thing that bothers me in this setup is the fact that one might be
led
> to believe that running Initialize from a thread with a different memory
> manager will actually change the memory manager per thread. The reason
for
> this is that the API documentation at
>
http://xml.apache.org/xerces-c/apiDocs/classXMLPlatformUtils.html#z553_0
> makes no mention of the memory manager, but the source code changes the
> memory manager *before* looking whether it has already been invoked. That
the
> memory manager being changed is global and will therefore be changed from
under
> the very nose of all the other threads is bothering to me.

OK, this is a really bad bug.  There is no way the global memory manager
should ever change until the whole system has shut down.  It looks like the
memory manager initialization should be moved to a pointer after the
initialization count is incremented.

However, multithreaded applications are on shifty ground here, because the
initialization count is not incremented in a thread-safe manner.  This
lulls users into the belief that per-thread initialization is OK, and has
been a long-standing criticism of mine.  I would prefer applications that
need per-thread initialization handle the count themselves.

> There is of course a little thing called thread-local data storage that
could
> have been used in this case, but that wouldn't stick with the portability

> goal of the project, so I'd propose to move the assignment of the memory
> manager to after checking whether initialization has already occured.

Yes, thread-local storage is great, but not yet completely portable.

> Note that if you don't do this, you have a race condition in XMemory, as
it
> reads the pointer to the memory manager twice: once to call it and once
to
> put it in the buffer. Between those two reads, the memory manager can be
> changed.
...
> Of course, the documentation says it's a per-process initialisation, but
I
> don't like inviting disaster.

Actually, there's no guarantee you will get a valid pointer at either read,
of one thread is reading while another is writing.  This has to be solved
by making sure the global memory manager instance never changes once the
system has been initialized.  I believe that was always the intent.

> My questions:
> * will patches aiming to
>   a. make XMemory optional at compile-time
>   b. make XMemory not try to align the memory and always use the same
memory
>      manager
>   c. make XMemory optionally (compile-time option) align or keep a
separate
>      managers table

"a" will like create lots of ugly ifdefs, because I think you'll find
XMemory is pervasive.  You can already get compile-time alignment through a
#define, which you could also define to be 0, which would give you half of
"b".  The other half of b is an egregious bug which should be fixed ASAP.

> If there is a resounding "yes" to one of the three options from this
list, I
> can talk with management to allocate time for the development. If not,
we'll
> probably fork off an in-house XML parser.

I think you'll find the overhead necessary to create a fully-functional XML
parser is at least several orders of magnitude greater than that of getting
these issues addressed in Xerces.

Thanks!

Dave


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to