First, the disclaimers: I'm not a libxml2 maintainer or even a contributor, and I've only given this a cursory glance.

Here are my reactions.

First, the routine in question is declared static, so it has file (module) scope.  I believe that this means that any exploitation of it would have to be indirect, coming through callers inside libxml2.

Second, the code appears to protect properly against buffer overflow, so the only exploitation that I see is the one that you illustrated: leaking memory position information.  However, the function in question does nothing that its caller could not do itself.  It calls malloc() and (in the example you provided) returns a string containing what might be the address of a stack location (although, really, it could be any random junk, I think).  By definition, the caller already knows the stack location, and the caller certainly has its own access to malloc().
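
To make the bounds argument concrete, here is a minimal sketch of the same pattern.  It is not the libxml2 source: it assumes plain malloc() in place of xmlMalloc(), and build_message() is a hypothetical variadic wrapper added only so that the va_list routine can be called directly.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MSG_BUF_SIZE 1000  /* same buffer size as the quoted snippet */

/* Stand-in for the quoted routine; malloc() replaces xmlMalloc() (assumption). */
static char *build_message_v(const char *fmt, va_list ap)
{
    char *str = malloc(MSG_BUF_SIZE);
    if (str == NULL)
        return NULL;

    /* With a size argument of MSG_BUF_SIZE - 1, vsnprintf() stores at most
       MSG_BUF_SIZE - 2 characters plus a terminating NUL, so the write
       always stays inside the allocation. */
    int chars = vsnprintf(str, MSG_BUF_SIZE - 1, fmt, ap);
    if (chars >= MSG_BUF_SIZE - 2)
        str[MSG_BUF_SIZE - 1] = 0;  /* last byte of the buffer, still in bounds */
    return str;
}

/* Hypothetical wrapper, for illustration only. */
char *build_message(const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    char *str = build_message_v(fmt, ap);
    va_end(ap);
    return str;
}
```

Passing a 5000-character argument through build_message("%s", ...) yields a truncated 998-character string rather than an overflow, which is the behaviour described above.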

So I, for one, don't see any weakness exposed by the code which you provided.  This doesn't mean that there is no weakness in libxml2, but, to find it if it's there, you'll have to examine the code which calls this routine.
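
As a rough illustration of what to look for in the callers, the sketch below contrasts the risky call shape with the safe one.  The names report() and report_untrusted() are hypothetical and do not come from libxml2; whether any real caller passes attacker-controlled text as the format string is exactly what would have to be checked.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical printf-style helper that formats into a caller-supplied buffer. */
int report(char *out, size_t outsz, const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    int n = vsnprintf(out, outsz, fmt, ap);
    va_end(ap);
    return n;
}

/* Risky shape to search for in callers (shown as a comment on purpose):
 *
 *     report(out, outsz, untrusted);
 *
 * Here the untrusted text IS the format string, so a "%x" inside it makes
 * vsnprintf() fetch an argument that was never passed (undefined behavior,
 * typically printing whatever value happens to be there). */

/* Safe shape: the format is a literal and the untrusted text is only data. */
int report_untrusted(char *out, size_t outsz, const char *untrusted)
{
    return report(out, outsz, "%s", untrusted);
}
```

With the safe shape, the input "%xtest" comes back as the literal text "%xtest".  GCC and Clang can flag the risky shape at compile time with -Wformat-security, which may be a reasonable first pass before reading the callers by hand.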


On 10/29/19 9:30 AM, Raphael de Carvalho Muniz wrote:
Dear libxml2 owners,

I am researching weaknesses in open-source C programs. As part of this research, I am studying weaknesses that may be vulnerabilities in the libxml2 project.

In the commit history of libxml2 (commit 9acef28), I found the following code snippet in the libxml.c file (lines 1597-1612). I believe that this commit contains a weakness that can be exploited if an attacker can influence the format string. The CWE project classifies this weakness as CWE-134: Use of Externally-Controlled Format String. When an attacker can modify an externally controlled format string, this can lead to buffer overflows, denial of service, or data representation problems.

Moreover, I ran a test to confirm the vulnerability. I passed the value "%xtest" as the char *msg argument, and the function libxml_buildMessage returned the value "fc0c748ex", exposing a memory position.

This is the GitHub link to the commit:


Code snippet:
static char *
libxml_buildMessage(const char *msg, va_list ap){
  int chars;
  char *str;
  str = (char *) xmlMalloc(1000);
  if (str == NULL)
    return NULL;

  chars = vsnprintf(str, 999, msg, ap);
  if (chars >= 998)
    str[999] = 0;
  return str;
}

Looking at this code snippet, could you answer the following brief questions?
1. We believe that this code has a weakness. Do you agree?
2. How do you detect weaknesses? Do you use any tools to detect them?

We would be very grateful if you could tell us whether you agree and whether you are going to fix it.

Raphael de Carvalho Muniz, M.Sc.
Lattes: http://lattes.cnpq.br/1454914002384966
e-Mail: raphaeld...@gmail.com <mailto:raphaeld...@gmail.com> / raph...@copin.ufcg.edu.br <mailto:raph...@copin.ufcg.edu.br>
Phone: +55 84 98801 1218

xml mailing list, project page  http://xmlsoft.org/


Webb Scales
Principal Software Architect
www.ursasecure.com <https://www.ursasecure.com>
w...@ursasecure.com <mailto:w...@ursasecure.com>
