I am having a problem with what looks like a double free in zmq.  I assume it
is something I am doing wrong, but I have looked fairly carefully and cannot
find the problem.  I will document this working back from the results to the
source code.

During execution I get:

================>Error during execution<======================

*** glibc detected *** /home/brad/mavenSDK/lz_serviceRegistry/Debug/lz_serviceRegistry: double free or corruption (!prev): 0x00000000006096e0 ***
======= Backtrace: =========
/lib64/libc.so.6[0x36b287c80e]
/home/brad/Downloads/zeromq-3.2.1/src/.libs/libzmq.so.3(+0x180b3)[0x7ffff7bcd0b3]
/home/brad/Downloads/zeromq-3.2.1/src/.libs/libzmq.so.3(+0x2f9c3)[0x7ffff7be49c3]
/home/brad/Downloads/zeromq-3.2.1/src/.libs/libzmq.so.3(+0x2fe7b)[0x7ffff7be4e7b]
/home/brad/Downloads/zeromq-3.2.1/src/.libs/libzmq.so.3(+0x2637e)[0x7ffff7bdb37e]
/home/brad/Downloads/zeromq-3.2.1/src/.libs/libzmq.so.3(+0x21512)[0x7ffff7bd6512]
/home/brad/Downloads/zeromq-3.2.1/src/.libs/libzmq.so.3(+0x15194)[0x7ffff7bca194]
/home/brad/Downloads/zeromq-3.2.1/src/.libs/libzmq.so.3(+0x1429e)[0x7ffff7bc929e]
/home/brad/Downloads/zeromq-3.2.1/src/.libs/libzmq.so.3(+0x2aba6)[0x7ffff7bdfba6]

================>Environment<======================

The service registry is a "daemon" that listens for beacons from service
providers and records those services.  It also listens for "find" requests
that look up the endpoint for those services.

It is during this request processing, while constructing and sending the
reply, that the problem occurs.  The following two routines are where the
problem occurs.  The second routine is the driving routine; it receives control
when a "find" message is received in an upper polling loop.

The sendResponse routine is invoked because the requested service was found.
The only odd thing I can think of that I am doing is that the subroutine is the
one that does the zmq_msg_init_size(), and the zmq_msg_send and zmq_msg_close
are done after it returns.
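
To make that flow concrete, here is a minimal sketch of the same
hold-one-frame-back pattern with the zmq calls replaced by a hypothetical
stub.  FakeSocket, send_frame, Deferred, queue_frame and flush_last are names
invented for this illustration; they are not part of zmq or of the code in
this post.  The only point is that a frame goes out with the "more" flag once
a successor is known, and the last held frame goes out without it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the reply socket: records every frame "sent". */
#define MAX_FRAMES 8
typedef struct {
    char frames[MAX_FRAMES][32];
    int  more[MAX_FRAMES];      /* 1 if the frame was sent with a "more" flag */
    int  count;
} FakeSocket;

/* Hypothetical stand-in for zmq_msg_send(): copies the frame and its flag. */
static void send_frame(FakeSocket *s, const char *data, int more) {
    assert(s->count < MAX_FRAMES);
    strncpy(s->frames[s->count], data, sizeof s->frames[0] - 1);
    s->frames[s->count][sizeof s->frames[0] - 1] = '\0';
    s->more[s->count] = more;
    s->count++;
}

/* Holds the single frame whose flag is not known yet. */
typedef struct {
    const char *pending;        /* NULL when nothing is waiting */
} Deferred;

/* Queue a new frame.  The frame held so far now has a successor,
 * so it is sent flagged "more"; the new frame becomes the held one. */
static void queue_frame(FakeSocket *s, Deferred *d, const char *data) {
    if (d->pending) {
        send_frame(s, d->pending, 1);
    }
    d->pending = data;
}

/* Send the held frame as the final part (no "more" flag).
 * Returns 1 if a frame was sent, 0 if nothing was pending. */
static int flush_last(FakeSocket *s, Deferred *d) {
    if (!d->pending) {
        return 0;
    }
    send_frame(s, d->pending, 0);
    d->pending = NULL;
    return 1;
}
```

A caller would zero-initialize a FakeSocket and a Deferred, call queue_frame
once per match, and call flush_last after the loop; a return of 0 from
flush_last corresponds to the "not found" branch.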

In the source code below I have numbered the statements in the order they are
actually executed.  Any thoughts on how to debug this problem would be
appreciated.

============>Source code<==============


typedef struct MSGREPLY {
        zmq_msg_t               *pPriorMessage;
        zmq_msg_t               sPriorMessage;
}MSGREPLY;

/**
 * This routine will send a response to the find message.  This is a "delayed"
 * response because we do not know if there will be any "more" responses to be
 * sent.  We utilize the multi-part message facility of zmq to do this.  Each
 * part of the message is a service that matches the requestor's specification;
 * that is, for a find request there may be multiple service providers for the
 * requested service.  For a list request, multiple services may exist.
 */
static int sendResponse(void * context, void *pReplySocket,
                                                LZSERVICE *pServiceEntry,
                                                MSGREPLY *pMsgReply,
                                                LZSERVICEMSGS msgID) {

        int                                     msgSize;
        LZSERVICEREPLY          *pReply;

====>8  if (pMsgReply->pPriorMessage) {
                        zmq_msg_send(pMsgReply->pPriorMessage,pReplySocket,ZMQ_SNDMORE);
                        zmq_msg_close(pMsgReply->pPriorMessage);
                }
====>9  pMsgReply->pPriorMessage = &pMsgReply->sPriorMessage;
====>10 msgSize = sizeof(LZSERVICEREPLYHDR)+pServiceEntry->serviceLen;

====>11    zmq_msg_init_size (pMsgReply->pPriorMessage, msgSize);
====>12    pReply = zmq_msg_data(pMsgReply->pPriorMessage);
====>13    LZSERVICE_SET_REPLY(pReply);
====>14    pReply->replyHdr.rc = 0;
====>15    pReply->replyHdr.msghdr.msgID = msgID;
====>16    memcpy(&pReply->findReply.service,pServiceEntry,pServiceEntry->serviceLen);
====>17    pReply->findReply.service.pServiceName = NULL;
====>18    pReply->findReply.service.pServiceURI = NULL;

====>19 return 0;
}

/**
 * This routine will find the requested service.
 * This is a "request" message sent on a request socket; that is, it requires
 * a response.  Since there may be multiple instances of a given service
 * running on multiple blades, or multiple processes on a blade, duplicate
 * service name entries may exist.  We return all of the instances that match
 * the requested name.  When there is more than one, we send subsequent
 * service responses in separate message frames.
 */
static int processFindService(void * context, LZSERVICEMESSAGE *pFindMessage,
                                                void *pReplySocket) {

        int                                     msgSize;
        LZSERVICEREPLY          *pFindReply;
        LZSERVICE                       *pServiceEntry;
        char                            *pServiceName;
        GList                           *iterator = NULL;
        MSGREPLY                        sReply;
        zmq_msg_t                       sReplyMessage;
        char                            routineName[] = "lzServiceRegistry:processFindService";

===>1   pServiceName = pFindMessage->findService.aServiceName;
===>2   sReply.pPriorMessage = NULL;

        
        /*--------------------------------------------------------------------------+
         * Look for matching service names in the global service table             |
         * for each found entry, construct a response and send it                  |
         *-------------------------------------------------------------------------*/
====>3  for (iterator = pGlobalServices; iterator; iterator = iterator->next) {
====>4          pServiceEntry = iterator->data;
====>5          if (pServiceEntry->serviceNameLen == pFindMessage->findService.serviceNameLen) {
====>6                  if (!memcmp(pServiceEntry->pServiceName,pServiceName,pServiceEntry->serviceNameLen)) {
====>7                          sendResponse(context,pReplySocket,pServiceEntry,&sReply,LZSERVICEMSGS_FINDSERVICE);
                                if (options.debugMode) {
                                        fprintf(stderr,"%s: service %s was found\n",routineName,pServiceName);
                                }
                        }
                }
        }
====>20 if (sReply.pPriorMessage) {
====>21         zmq_msg_send(sReply.pPriorMessage,pReplySocket,0);      /* <== This is where I get the exception */
                zmq_msg_close(sReply.pPriorMessage);
        }
        else {
                msgSize = sizeof(LZSERVICEREPLY);
                zmq_msg_init_size (&sReplyMessage, msgSize);
                pFindReply = zmq_msg_data(&sReplyMessage);
                LZSERVICE_SET_REPLY(pFindReply);
                pFindReply->replyHdr.msghdr.msgID = LZSERVICEMSGS_FINDSERVICE;
                pFindReply->replyHdr.rc = LZSERVICERC_SERVICENOTFOUND;
                zmq_msg_send(&sReplyMessage,pReplySocket,0);
                zmq_msg_close(&sReplyMessage);
                if (options.debugMode) {
                        fprintf(stderr,"%s: service name, %s, not found\n",routineName,pServiceName);
                }
        }


        return 0;
}
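
One possible line of investigation, offered as a sketch rather than a
diagnosis: glibc can print "double free or corruption" when heap bookkeeping
has been damaged by a write past the end of an allocated block, not only on a
genuine double free.  So it may be worth checking that the writes in steps 16
to 18 stay inside the msgSize bytes allocated in step 11.  The struct layouts
below (HypoMsgHdr, HypoReplyHdr, HypoService, HypoFindReply, HypoReply) and
the helpers bytes_needed, allocation_is_safe and require_safe are all
assumptions made up for illustration; the real LZSERVICE and LZSERVICEREPLY
definitions are not shown in this post.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layouts standing in for the real (unshown) definitions. */
typedef struct { int msgID; } HypoMsgHdr;
typedef struct { HypoMsgHdr msghdr; int rc; } HypoReplyHdr;
typedef struct {
    char *pServiceName;
    char *pServiceURI;
    int   serviceLen;
    int   serviceNameLen;
} HypoService;
typedef struct { HypoService service; } HypoFindReply;
typedef struct {
    HypoReplyHdr  replyHdr;
    HypoFindReply findReply;
} HypoReply;

/* Bytes the buffer must hold so that copying serviceLen bytes into
 * findReply.service, and then storing the fixed fields of the service
 * struct, both stay in bounds.  The start offset comes from offsetof, so
 * any padding the compiler inserts after the header is included. */
static size_t bytes_needed(size_t serviceLen) {
    size_t start = offsetof(HypoReply, findReply)
                 + offsetof(HypoFindReply, service);
    size_t fixed = sizeof(HypoService);
    size_t body  = serviceLen > fixed ? serviceLen : fixed;
    return start + body;
}

/* Returns 1 if an allocation of `allocated` bytes is large enough. */
static int allocation_is_safe(size_t allocated, size_t serviceLen) {
    return allocated >= bytes_needed(serviceLen);
}

/* Debug-build guard that could be called just before the memcpy. */
static void require_safe(size_t allocated, size_t serviceLen) {
    assert(allocation_is_safe(allocated, serviceLen));
}
```

With the real types substituted, comparing sizeof(LZSERVICEREPLYHDR) against
the offsetof value would show whether padding sits between the header and the
service struct, and whether serviceLen can ever be smaller than the fixed part
of LZSERVICE.  Running the daemon under a memory checker such as valgrind is
another way to find the first bad write.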