> On Fri, Oct 17, 2003 at 10:11:48AM -0700, [EMAIL PROTECTED]
wrote:
> > > The XMemory class uses a global memory manager class thet defaults to
> > > MemoryManagerImpl. This latter is in charge of allocating and freeing
> > > memory and will return a pointer to that memory to XMemory. XMemory
> > > does some management of its own, however, "aligning" the pointer it
> > > returns by pre-pending a "header" containing a pointer to the memory
> > > manager - which is always the same because it is a global and not as
> > > configurable as one might want to believe.
> > Not true.  For example, you can create parser instances with their own
> > memory managers.  The global memory manager is used by default if you
don't
> > provide one, or if Xerces needs memory for global data.
> >
> > I believe the goal was to have:
> >
> >    1. a memory manager instance for data which the system considers
global
> >    and that persists for as long as the system is running.
> >
> >    2. the option to use a separate memory manager instance for other
> >    services.  For example, you can provide a memory manager to a parser
> >    instance, separating that instances memory allocations from any
global
> >    allocations.
> >
> > There was never any intention to allow for thread-specific global
memory
> > managers.
> I know that from the code, but my problem is not specifically there: I
think
> the code is misleading and the documentation is too silent on the issue.
The
> first of those two problems may be fixed with the patch I posted in bug
number
> #23930. For the second I don't have time to fix it myself, and I think
that the
> people who made the memory manager should do the documenting..

Yes, this _is_ a documentation problem, and it should be easy to fix.

> > > A few things bother me in the current setup, first among which is
this
> > > alignment. The memory is aligned to sizeof(void*) or sizeof(double),
> > > whichever is greater. On MSVC (the compiler I'm using at the moment)
> > > this results in an alignment to 8 bytes.
> > >
> > > Looking through the code of the debug versions of malloc and realloc
in
> > > MSVC, it looks like this couple aligns to 16 bytes - the size of a
> > > "paragraph". We will be testing with a 16-byte alignment shortly.
> > This has been a point of great debate, but you can switch the alignment
at
> > compile time by defining XML_PLATFORM_NEW_BLOCK_ALIGNMENT.
> Which I have set to 16 in my tests - yes. I will probably get the results
of
> those tests shortly and will report them back here - providing patches if
need
> be.
>
> > > Another thing that bothers me is that we have no choice wether or not
we
> > > want to use this setup: I would personally have preferred a setup in
which
> > > the memory manager were stored in a map referenced by the block, or
> > > something similar (i.e. keep the housekeeping outside of the block).
> > > This doesn't only guard you against corruption in case of buffer
over-
> > > or underrun (which, as suggested in the discussion in may, is not
> > > something one should be too concerned about) but also gets rid of any
> > alignment problem that might come up because the hypothesis that
> > > (sizeof(void*) > sizeof(double) ? sizeof(void*) : sizeof(double))
> > is always the right alignment.
> > This is just a bogus argument now as it was then.
> Please read both arguments before answering the first one:
> "... (which, as suggested in the discussion in may, is not something one
>  should be too concerned about) ..."
> and more importantly:
> "also gets rid of any alignment problem that might come up because the
>  hypothesis that (sizeof(void*) > sizeof(double) ? sizeof(void*) :
>  sizeof(double)) is always the right alignment."
> This was the main argument then and it is still valid AFAICS..

Yes, I was not addressing the alignment issue -- I was addressing the
notion that housekeeping information before or after the block is not
necessarily such an unusual thing.

> > Buffer over-runs and under-runs have the potential to corrupt
something,
> > regardless of whether or not you store some critical control
information
> > at the beginning of a block of memory.  Yes, in this case one might
corrupt
> > the pointer to the memory manager, but if that weren't there, it could
just
> > as easily corrupt data used by the C/C++ run-time heap memory manager,
or
> > even your application's data.  Can you really make the case that one is
any
> > better than the other?  Personally, I'd rather see my application crash
than
> have potentially important data silently corrupted.
> I agree, which is why I said you shouldn't worry about it too much. The
second
> argument is more important: playing with the alignment of pointers can
> seriously mess up the architecture's expectations and - more often - the
> compiler's expectations when optimizing: I would not be surpised at all
if
> MSVC assumes for some undocumented & dark reason that the pointers
returned by
> new are always aligned to what they call a paragraph. As the MS people
wrote
> both the new operator and the compiler, they have every reason to assume
that
> their code hasn't changed and no reason to assume that someone would
write
> something that didn't behave along the same lines. Hence my testing the
16-byte
> alignment - I'll let you know the results.

I'm sure the Xerces-C team will appreciate the research into this.

> > Using an association table for memory managers and blocks has the
potential
> > to be a performance bottleneck, because access to the table has to be
> > synchronized.
> There are many non-blocking thread-safe list and hash algorithms out
there,
> some of them are Real Fast too. The code in Xerces already relies on the
> compare-and-exchange capability introduced (on x86) in the i486. It is
(IMHO)
> right to do so and should use it more often.

Xerces-C runs on many architectures and OSes, including some that do not
have compare-and-exchange, which, by the way, on x86, still causes
synchronization, because it disables interrupts.

> There is no need to synchronize access to make a reference container
thread-
> safe.

If you want to update it, I don't see how you can guarantee updates will
not collide, but I'm willing to be educated, if you can provide a link to a
web resource.

> > > Another thing that bothers me in this setup is the fact that one
might be
> > > led to believe that running Initialize from a thread with a different
> > > memory manager will actually change the memory manager per thread.
The
> > > reason for this is that the API documentation at
> > >
http://xml.apache.org/xerces-c/apiDocs/classXMLPlatformUtils.html#z553_0
> > > makes no mention of the memory manager, but the source code changes
the
> > > memory manager *before* looking whether it has already been invoked.
That
> > > the memory manager being changed is global and will therefore be
changed
> > > from under the very nose of all the other threads is bothering to me.
> > OK, this is a really bad bug.  There is no way the global memory
manager
> > should ever change until the whole system has shut down.  It looks like
the
> > memory manager initialization should be moved to a pointer after the
> > initialization count is incremented.
> Bug #23930 contains the patch :)
>
> > However, multithreaded applications are on shifty ground here, because
the
> > initialization count is not incremented in a thread-safe manner.  This
> > lulls users into the belief that per-thread initialization is OK, and
has
> > been a long-standing criticism of mine.  I would prefer applications
that
> > need per-thread initialization handle the count themselves.
> Or increment the count in a thread-safe manner, which IMHO is not all
that
> difficult. (For most intents and purposes, a simple ++ on an integet is
> quite safe, BTW).

Actually, it's quite difficult -- it's a chicken-and-egg problem. ++ on
integers is not safe at all -- there are many architectures where this
operation can and will be interrupted.  Atomic integer increments are not
available on all of the platforms Xerces-C supports, so some sort of
synchronization primitive is needed.

> > > There is of course a little thing called thread-local data storage
that
> > > could have been used in this case, but that wouldn't stick with the
> > > portability goal of the project, so I'd propose to move the
assignment
> > > of the memory manager to after checking whether initialization has
> > > already occured.
> > Yes, thread-local storage is great, but not yet completely portable.
> >
> > > Note that if you don't do this, you have a race condition in XMemory,
as
> > > it reads the pointer to the memory manager twice: once to call it and
once
> > > to put it in the buffer. Between those two reads, the memory manager
can be
> > > changed.
> > ...
> > > Of course, the documentation says it's a per-process initialisation,
but
> > > I don't like inviting disaster.
> > Actually, there's no guarantee you will get a valid pointer at either
read,
> > of one thread is reading while another is writing.  This has to be
solved
> > by making sure the global memory manager instance never changes once
the
> > system has been initialized.  I believe that was always the intent.
> In that case, the patch in bug #23930 should do nicely :)

yup!

> BTW: a read from a pointer being written to is almost certain to yield
> something valid: either the old or the new value. The problem is that
neither
> operation is atomic because the compare-and-exchange mechanism isn't used
for
> the write operation, but even so the value will only change between the
old
> and the new value, so the first read operation should always returns
something
> valid..

"valid" is in the eye of the beholder.  If I'm expecting the consistency,
getting the old value is not valid.  At any rate, I agree this code is not
good if one expects the global memory manager can change.  But then, I
think we've agreed that it should not.

> > > My questions:
> > > * will patches aiming to
> > >   a. make XMemory optional at compile-time
> > >   b. make XMemory not try to align the memory and always use the same
> > >      memory manager
> > >   c. make XMemory optionally (compile-time option) align or keep a
> > >      separate managers table
> > "a" will like create lots of ugly ifdefs, because I think you'll find
> > XMemory is pervasive.
> Not necessarilly: I can simply #define XMemory to be either an empty base
class
> or the current XMemory class, which will result in the current XMemory
not
> being used unless we want it.

I was actually referring to the default arguments that provide an
XMemoryManager instance by referring to the global one.  It should also be
possible to no-op this class through some clever #define magic.

> > You can already get compile-time alignment through a #define, which you

> > could also define to be 0, which would give you half of "b".
> Please don't recommend something that will give me unspecified results:
> the delete operator reads at p-(headersize) to get the pointer to the
> memory manager. If (headersize == 0) it will read at p, which will
contain
> user data, which may be anything.

Actually, I meant 1, not 0 -- I replied too early in the morning!

> If you want, I can supply a patch that will make Xerces use the default
> memory manager (always) if (headersize == 0) (or use the passed-in
manager
> in the case of the second delete operator).
>

Someone on the Xerces team should address this...

> > The other half of b is an egregious bug which should be fixed ASAP.
> >
> > > If there is a resounding "yes" to one of the three options from this
> > > list, I can talk with management to allocate time for the
development.
> > > If not, we'll probably fork off an in-house XML parser.
> > I think you'll find the overhead necessary to create a fully-functional
XML
> > parser is at least several orders of magnitude greater than that of
getting
> > these issues addressed in Xerces.
> I think so too, which is why I would like to address these issues in
Xerces :)

yup...

> I propse the patch described above for (a); the patch described above for
the
> delete operator and the patch in bug #23930 to fix the first of the
problems.
> I will also report back on the tests with the 16-byte alignment and
propose a
> patch that makes the change default for Windows if it works.

Sounds good!

> In any case, I will be auditing the code to find any bugs and will report
them
> as I find them.

That's what we like to hear! ;-)

Dave


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to