As expected, SUE (Stupid User Error). It was a structure alignment issue. In the memcpy(&pReply->findReply.service,pServiceEntry,pServiceEntry->serviceLen) statement, the address of "service" was padded by an extra 4 bytes for alignment. The part of the message in front of it is only 12 bytes long, so instead of sitting at +12 from the beginning of the message, "service" resolved to +16, and the copy overwrote 4 bytes past the end of the message buffer.
I apologize for the noise on the list.

ty

On Dec 31, 2012, at 4:04 PM, Brad Taylor <[email protected]> wrote:

> I am having a problem with what looks like a double free in zmq. I assume it
> is something I am doing wrong, however I have looked fairly carefully and
> cannot find the problem. I will document this from the results back to the
> source code.
>
> During execution I get:
>
> ==============> error during executing <==============================================
>
> *** glibc detected *** /home/brad/mavenSDK/lz_serviceRegistry/Debug/lz_serviceRegistry: double free or corruption (!prev): 0x00000000006096e0 ***
> ======= Backtrace: =========
> /lib64/libc.so.6[0x36b287c80e]
> /home/brad/Downloads/zeromq-3.2.1/src/.libs/libzmq.so.3(+0x180b3)[0x7ffff7bcd0b3]
> /home/brad/Downloads/zeromq-3.2.1/src/.libs/libzmq.so.3(+0x2f9c3)[0x7ffff7be49c3]
> /home/brad/Downloads/zeromq-3.2.1/src/.libs/libzmq.so.3(+0x2fe7b)[0x7ffff7be4e7b]
> /home/brad/Downloads/zeromq-3.2.1/src/.libs/libzmq.so.3(+0x2637e)[0x7ffff7bdb37e]
> /home/brad/Downloads/zeromq-3.2.1/src/.libs/libzmq.so.3(+0x21512)[0x7ffff7bd6512]
> /home/brad/Downloads/zeromq-3.2.1/src/.libs/libzmq.so.3(+0x15194)[0x7ffff7bca194]
> /home/brad/Downloads/zeromq-3.2.1/src/.libs/libzmq.so.3(+0x1429e)[0x7ffff7bc929e]
> /home/brad/Downloads/zeromq-3.2.1/src/.libs/libzmq.so.3(+0x2aba6)[0x7ffff7bdfba6]
>
> ================>Environment<======================
>
> The service registry is a "daemon" that listens for beacons from service
> providers and records those services. It also listens for "find" requests
> that look up the endpoint for those services.
>
> It is during this request processing, while constructing and sending the
> reply, that the problem occurs. The following two routines are where the
> problem occurs. The second routine is the driving routine; it receives
> control when a "find" message is received in an upper polling loop.
>
> The sendResponse routine is invoked because the requested service was
> found. The only odd thing I can think of that I am doing is that the
> subroutine is the one that does the zmq_msg_init_size(), and the
> zmq_msg_send and zmq_msg_close are done after it returns.
>
> I have numbered the statements in the order they are actually executed. Any
> thoughts on how to debug this problem would be appreciated.
>
> ============>Source code<==============
>
> typedef struct MSGREPLY {
>     zmq_msg_t *pPriorMessage;
>     zmq_msg_t sPriorMessage;
> } MSGREPLY;
>
> /**
>  * This routine will send a response to the find message. This is a "delayed" response because
>  * we do not know if there will be any "more" responses to be sent.
>  * We utilize the multi-part message facility of zmq to do this. Each part of the message
>  * is a service that matches the requestor's specification, that is, for a find request
>  * there may be multiple service providers for the requested service. For a list request, multiple services may exist.
>  */
> static int sendResponse(void *context, void *pReplySocket,
>                         LZSERVICE *pServiceEntry,
>                         MSGREPLY *pMsgReply,
>                         LZSERVICEMSGS msgID) {
>
>          int msgSize;
>          LZSERVICEREPLY *pReply;
>
> ====>8   if (pMsgReply->pPriorMessage) {
>              zmq_msg_send(pMsgReply->pPriorMessage,pReplySocket,ZMQ_SNDMORE);
>              zmq_msg_close(pMsgReply->pPriorMessage);
>          }
> ====>9   pMsgReply->pPriorMessage = &pMsgReply->sPriorMessage;
> ====>10  msgSize = sizeof(LZSERVICEREPLYHDR)+pServiceEntry->serviceLen;
>
> ====>11  zmq_msg_init_size (pMsgReply->pPriorMessage, msgSize);
> ====>12  pReply = zmq_msg_data(pMsgReply->pPriorMessage);
> ====>13  LZSERVICE_SET_REPLY(pReply);
> ====>14  pReply->replyHdr.rc = 0;
> ====>15  pReply->replyHdr.msghdr.msgID = msgID;
> ====>16  memcpy(&pReply->findReply.service,pServiceEntry,pServiceEntry->serviceLen);
> ====>17  pReply->findReply.service.pServiceName = NULL;
> ====>18  pReply->findReply.service.pServiceURI = NULL;
>
> ====>19  return 0;
> }
>
> /**
>  * This routine will find the requested service
>  * This is a "request" message sent on a request socket. That is it requires a response
>  * Since there may be multiple instances of a given service running on multiple blades or multiple processes on a blade
>  * duplicate service name entries may exist.
>  * We return all of the instances that match the requested name
>  * When there is more than one, we send subsequent service responses in separate message frames
>  */
> static int processFindService(void *context, LZSERVICEMESSAGE *pFindMessage,
>                               void *pReplySocket) {
>
>          int msgSize;
>          LZSERVICEREPLY *pFindReply;
>          LZSERVICE *pServiceEntry;
>          char *pServiceName;
>          GList *iterator = NULL;
>          MSGREPLY sReply;
>          zmq_msg_t sReplyMessage;
>          char routineName[] = "lzServiceRegistry:processFindService";
>
> ===>1    pServiceName = pFindMessage->findService.aServiceName;
> ===>2    sReply.pPriorMessage = NULL;
>
>          /*--------------------------------------------------------------------------+
>           * Look for matching service names in the global service table              |
>           * for each found entry, construct a response and send it                   |
>           *-------------------------------------------------------------------------*/
> ====>3   for (iterator = pGlobalServices; iterator; iterator = iterator->next) {
> ====>4       pServiceEntry = iterator->data;
> ====>5       if (pServiceEntry->serviceNameLen == pFindMessage->findService.serviceNameLen) {
> ====>6           if (!memcmp(pServiceEntry->pServiceName,pServiceName,pServiceEntry->serviceNameLen)) {
> ====>7               sendResponse(context,pReplySocket,pServiceEntry,&sReply,
>                                   LZSERVICEMSGS_FINDSERVICE);
>                      if (options.debugMode) {
>                          fprintf(stderr,"%s: service %s was found\n",routineName,pServiceName);
>                      }
>                  }
>              }
>          }
> ====>20  if (sReply.pPriorMessage) {
> ====>21      zmq_msg_send(sReply.pPriorMessage,pReplySocket,0);
> =============This is where I get the exception ===================
>              zmq_msg_close(sReply.pPriorMessage);
>          }
>          else {
>              msgSize = sizeof(LZSERVICEREPLY);
>              zmq_msg_init_size (&sReplyMessage, msgSize);
>              pFindReply = zmq_msg_data(&sReplyMessage);
>              LZSERVICE_SET_REPLY(pFindReply);
>              pFindReply->replyHdr.msghdr.msgID = LZSERVICEMSGS_FINDSERVICE;
>              pFindReply->replyHdr.rc = LZSERVICERC_SERVICENOTFOUND;
>              zmq_msg_send(&sReplyMessage,pReplySocket,0);
>              zmq_msg_close(&sReplyMessage);
>              if (options.debugMode) {
>                  fprintf(stderr,"%s: service name, %s, not found\n",routineName,pServiceName);
>              }
>          }
>
>          return 0;
> }
_______________________________________________
zeromq-dev mailing list
[email protected]
http://lists.zeromq.org/mailman/listinfo/zeromq-dev
