[EMAIL PROTECTED] wrote:
>
> I have glibc-2.1.2 also. That is the version I have seen mentioned in
> connection with this. RH 7.0 has a later glibc, does it not?
> I guess I should start looking for a patch path forward from 2.1.2.
I think I've found the problem. It seems to be a bug in glibc 2.1.2.
If a dlopen() call fails, it sets an internal variable to point to a
string describing the error (for later retrieval with dlerror()). If
you then call dlopen() again, without calling dlerror() in between to
clear the error status, dlopen() will free the string. Then it tries to
load the library, and sets the internal variable again depending on
whether the open succeeded or failed. However, during the time that its
trying to load the library, it keeps this internal variable pointing to
a now-freed block of memory. As we call dlopen() again before the
loading of the library finishes, it tries to free the same memory twice.
Here is a bit of the code from glibc 2.1.2:
---
if (result->errstring != NULL)
/* Free the error string from the last failed command. This can
happen if `dlerror' was not run after an error was found. */
free (result->errstring);
result->errcode = _dl_catch_error (&result->errstring, operate, args);
---
_dl_catch_error ends up eventually recursively invoking this same
function, and, result->errstring hasn't been reset yet. The code from
glibc 2.2 looks like:
---
if (result->errstring != NULL)
{
/* Free the error string from the last failed command. This can
happen if `dlerror' was not run after an error was found. */
if (strcmp (result->errstring, "out of memory") != 0)
free ((char *) result->errstring);
result->errstring = NULL;
}
result->errcode = _dl_catch_error (&result->objname, &result->errstring,
operate, args);
---
Where they seemed to have addressed this issue.
Short of forcing everyone to upgrade to glibc 2.2, we can work around
this issue in Wine. We can make sure that we call dlerror() after every
(failed) dlopen() call, even if we throw away the return. This will
clear the error state, and everything should work well. Another option
is to simply call dlerror() immediately before any dlopen() call to make
sure that the error state is clear before going in. The first option
seems cleaner. However, its complicated by the fact that there are
dlopen() calls in multiple places, and sometimes the callers assume that
dlerror() will return meaningful results after the failure of a
wine-internal routine that calls dlopen(). We have ELFDLL_dlopen() and
wine_dll_load() where callers can use dlerror() after return. There is
also wine_dll_load_main_exe() where callers apparently don't call
dlerror(). And then there is a dlopen() call off by its lonesome in
dlls/odbc32/proxyodbc.c which doesn't call dlerror(), and which seems to
be missing the HAVE_DL_API protection (unless this isn't compiled at all
in that case?).
Is there any planned direction for this code? For instance, is the
stuff in loader/elfdll.c going to go away now? Perhaps if we could
isolate all the calls to dlopen() in one place, we could use a better
error reporting mechanism that using dlerror(). Doing it the way we do
now requires HAVE_DL_API checks around to see if we can safely call
dlerror(). And its prone to problems as putting dlerror() calls in
lower level routines can break routines that are higher up. We could
save the return from dlerror() and then give it out to anyone who asked
for it. Of course, this would have to be made thread safe, which is a
pain. Or we could make the functions that call dlerror() take an extra
parameter, char **error, and put if (error) *error = dlerror(); inside.
Thoughts?
(BTW, inserting a dlerror() call right before
if (!ret) ret = dlopen( buffer + dll_path_maxlen + 1, RTLD_NOW );
in library/loader.c hacks around the problem for me for now).
--
James Abbatiello