Did you try using valgrind? That might help reproduce it.
Qian Ye wrote:
Hi Mahadev:
I have created a jira for this issue
https://issues.apache.org/jira/browse/ZOOKEEPER-624.
So far, I haven't found a reliable way to reproduce the segmentation fault. I
repeated the same operations about 10 times and got a core dump only once.
I will attach the steps to the jira if I can find them.
Thx
On Tue, Dec 15, 2009 at 1:53 AM, Mahadev Konar <maha...@yahoo-inc.com> wrote:
Hi Qian,
The code that you mention still exists in the trunk and does not check
the len before calling memcpy. Please open a jira on this.
The interesting thing though is that the len is -1. Do you have a test
case or a test scenario where it can be reproduced? It would be interesting
to see why this is happening. We should not be getting a -1 len value from
the server.
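To make the missing check concrete, here is a minimal sketch of what a guarded copy could look like. It is not the trunk code: read_string_checked is a hypothetical helper, buff_struct is reduced to the three fields used in the quoted function, and the length is passed in already decoded. Returning -EINVAL for a negative length is an assumption; a real fix might handle that value differently.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the client's buffer state; only the fields
 * used by the quoted function are modeled here. */
struct buff_struct {
    int32_t len;    /* total number of bytes in buffer */
    int32_t off;    /* current read offset */
    char *buffer;
};

/* Sketch: copy a length-prefixed string out of the buffer, rejecting a
 * negative length before it can reach malloc or memcpy. */
int read_string_checked(struct buff_struct *priv, int32_t len, char **s)
{
    assert(priv != NULL && s != NULL);
    if (len < 0)
        return -EINVAL;     /* assumption: treat as malformed input */
    if ((priv->len - priv->off) < len)
        return -E2BIG;      /* not enough bytes left in the buffer */
    *s = malloc((size_t)len + 1);
    if (!*s)
        return -ENOMEM;
    memcpy(*s, priv->buffer + priv->off, (size_t)len);
    (*s)[len] = '\0';
    priv->off += len;
    return 0;
}
```

With this guard, a len of -1 returns an error to the caller instead of reaching memcpy, and the read offset is left untouched.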
Thanks
mahadev
On 12/14/09 6:19 AM, "Qian Ye" <yeqian....@gmail.com> wrote:
Hi guys:
I encountered a problem today: the Zookeeper C client (version 3.2.0)
dumped core when it reconnected and performed some operations on a zookeeper
server that had just restarted. The gdb information is:
(gdb) bt
#0 0x000000302af71900 in memcpy () from /lib64/tls/libc.so.6
#1 0x000000000047bfe4 in ia_deserialize_string (ia=Variable "ia" is not available.) at src/recordio.c:270
#2 0x000000000047ed20 in deserialize_CreateResponse (in=0x9cd870, tag=0x50a74e "reply", v=0x409ffe70) at generated/zookeeper.jute.c:679
#3 0x000000000047a1d0 in zookeeper_process (zh=0x9c8c70, events=Variable "events" is not available.) at src/zookeeper.c:1895
#4 0x00000000004815e6 in do_io (v=Variable "v" is not available.) at src/mt_adaptor.c:310
#5 0x000000302b80610a in start_thread () from /lib64/tls/libpthread.so.0
#6 0x000000302afc6003 in clone () from /lib64/tls/libc.so.6
#7 0x0000000000000000 in ?? ()
(gdb) f 1
#1 0x000000000047bfe4 in ia_deserialize_string (ia=Variable "ia" is not available.) at src/recordio.c:270
270 in src/recordio.c
(gdb) info locals
priv = (struct buff_struct *) 0x9cd8d0
*len = -1*
rc = Variable "rc" is not available.
According to the source code,
int ia_deserialize_string(struct iarchive *ia, const char *name, char **s)
{
struct buff_struct *priv = ia->priv;
int32_t len;
*int rc = ia_deserialize_int(ia, "len", &len);*
if (rc < 0)
return rc;
if ((priv->len - priv->off) < len) {
return -E2BIG;
}
*s = malloc(len+1);
if (!*s) {
return -ENOMEM;
}
memcpy(*s, priv->buffer+priv->off, len);
(*s)[len] = '\0';
priv->off += len;
return 0;
}
The variable len is set by ia_deserialize_int, and the returned len is never
checked for a negative value, so the client hits a segmentation fault when it
tries to memcpy -1 bytes of data.
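As a rough illustration of why -1 slips through, the sketch below models the two steps separately. The helper names bounds_check_rejects and as_copy_size are hypothetical and exist only for this example. The existing bounds check compares a non-negative remaining byte count against len, so a negative len is never rejected, and memcpy takes a size_t, so -1 is converted to the largest possible size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors the quoted check "(priv->len - priv->off) < len".
 * Returns 1 if the check would reject len, 0 if it lets len through. */
int bounds_check_rejects(int32_t buf_len, int32_t off, int32_t len)
{
    assert(off >= 0 && off <= buf_len);
    return (buf_len - off) < len;
}

/* The byte count memcpy actually receives: its third parameter is a
 * size_t, so a negative int32_t is converted to a huge unsigned value. */
size_t as_copy_size(int32_t len)
{
    return (size_t)len;
}
```

For len = -1, the check lets the value through, malloc(len+1) becomes malloc(0), and the copy size becomes SIZE_MAX, which is consistent with the crash inside memcpy in frame #0.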
I'm not sure why the client got a len of -1 when deserializing the
response from the server, and I'm also not sure whether this is a known issue.
Could anyone give me some information about this problem?