I’m using clang’s Thread Sanitizer for a similar purpose, and just happened to 
notice that the TSAN docs use ZeroMQ as one of the example suppressions:  
https://github.com/google/sanitizers/wiki/ThreadSanitizerSuppressions 

I assume the reason for suppressing libzmq.so is that (legacy) sockets are not 
thread-safe, so the code may exhibit race conditions that are irrelevant as 
long as a socket is not used from multiple threads.
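
For reference, the suppression in that case can be a single line. A minimal 
TSan suppressions file might look like this (the file name is my choice, not 
something from the docs; called_from_lib silences every report whose stack 
passes through the named shared library):

    # tsan-zmq.supp (name is arbitrary)
    # Silence every report whose stack passes through libzmq.so,
    # e.g. its lock-free ypipe/atomic_ptr machinery.
    called_from_lib:libzmq.so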

FWIW, you may want to check out the clang sanitizers: they have some 
advantages over valgrind (faster, better suited to multi-threaded code, etc.), 
provided you are able to instrument the code at build time.
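
For example, assuming you build libzmq from source with its autotools setup 
(the test binary name below is made up), instrumenting at build time and 
pointing TSan at a suppressions file looks roughly like:

    # build libzmq and the app with TSan instrumentation
    # (-fsanitize=thread is needed for compiling *and* linking)
    CC=clang CXX=clang++ \
      CFLAGS="-fsanitize=thread -g -O1" \
      CXXFLAGS="-fsanitize=thread -g -O1" \
      LDFLAGS="-fsanitize=thread" \
      ./configure && make

    # run with the suppressions file from above
    TSAN_OPTIONS="suppressions=$PWD/tsan-zmq.supp" ./my_test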


> On Feb 23, 2018, at 6:52 AM, Luca Boccassi <luca.bocca...@gmail.com> wrote:
> 
> On Fri, 2018-02-23 at 12:22 +0100, Francesco wrote:
>> Hi all,
>> I'm trying to further debug the problem I described in my earlier
>> mail (https://lists.zeromq.org/pipermail/zeromq-dev/2018-February/032303.html)
>> so I decided to use Helgrind to find race conditions in my code.
>> 
>> My problem is that apparently Helgrind 3.12.0 is reporting race
>> conditions against the zmq::atomic_ptr_t<> implementation.
>> Now I know that Helgrind has trouble with C++11 atomics, but looking
>> at the code I see that ZMQ is not using them (note: I do have
>> ZMQ_ATOMIC_PTR_CXX11 defined, but I also have ZMQ_ATOMIC_PTR_INTRINSIC
>> defined, so the latter wins!).
>> 
>> In particular Helgrind 3.12.0 tells me that:
>> 
>> 
>> ==00:00:00:11.885 29399==
>> ==00:00:00:11.885 29399== Possible data race during read of size 8 at 0xB373BF0 by thread #4
>> ==00:00:00:11.885 29399== Locks held: none
>> ==00:00:00:11.885 29399==    at 0x6BD79AB: zmq::atomic_ptr_t<zmq::command_t>::cas(zmq::command_t*, zmq::command_t*) (atomic_ptr.hpp:150)
>> ==00:00:00:11.885 29399==    by 0x6BD7874: zmq::ypipe_t<zmq::command_t, 16>::check_read() (ypipe.hpp:147)
>> ==00:00:00:11.885 29399==    by 0x6BD7288: zmq::ypipe_t<zmq::command_t, 16>::read(zmq::command_t*) (ypipe.hpp:165)
>> ==00:00:00:11.885 29399==    by 0x6BD6FE7: zmq::mailbox_t::recv(zmq::command_t*, int) (mailbox.cpp:98)
>> ==00:00:00:11.885 29399==    by 0x6BD29FC: zmq::io_thread_t::in_event() (io_thread.cpp:81)
>> ==00:00:00:11.885 29399==    by 0x6BD05C1: zmq::epoll_t::loop() (epoll.cpp:188)
>> ==00:00:00:11.885 29399==    by 0x6BD06C3: zmq::epoll_t::worker_routine(void*) (epoll.cpp:203)
>> ==00:00:00:11.885 29399==    by 0x6C18BA5: thread_routine (thread.cpp:109)
>> ==00:00:00:11.885 29399==    by 0x4C2F837: mythread_wrapper (hg_intercepts.c:389)
>> ==00:00:00:11.885 29399==    by 0x6E72463: start_thread (pthread_create.c:334)
>> ==00:00:00:11.885 29399==    by 0x92F901C: clone (clone.S:109)
>> ==00:00:00:11.885 29399==
>> ==00:00:00:11.885 29399== This conflicts with a previous write of size 8 by thread #2
>> ==00:00:00:11.885 29399== Locks held: 1, at address 0xB373C08
>> ==00:00:00:11.885 29399==    at 0x6BD77F4: zmq::atomic_ptr_t<zmq::command_t>::set(zmq::command_t*) (atomic_ptr.hpp:90)
>> ==00:00:00:11.885 29399==    by 0x6BD7422: zmq::ypipe_t<zmq::command_t, 16>::flush() (ypipe.hpp:125)
>> ==00:00:00:11.885 29399==    by 0x6BD6DF5: zmq::mailbox_t::send(zmq::command_t const&) (mailbox.cpp:63)
>> ==00:00:00:11.885 29399==    by 0x6BB9128: zmq::ctx_t::send_command(unsigned int, zmq::command_t const&) (ctx.cpp:438)
>> ==00:00:00:11.885 29399==    by 0x6BE34CE: zmq::object_t::send_command(zmq::command_t&) (object.cpp:474)
>> ==00:00:00:11.885 29399==    by 0x6BE26F8: zmq::object_t::send_plug(zmq::own_t*, bool) (object.cpp:220)
>> ==00:00:00:11.885 29399==    by 0x6BE68E2: zmq::own_t::launch_child(zmq::own_t*) (own.cpp:87)
>> ==00:00:00:11.885 29399==    by 0x6C03D6C: zmq::socket_base_t::add_endpoint(char const*, zmq::own_t*, zmq::pipe_t*) (socket_base.cpp:1006)
>> ==00:00:00:11.885 29399==  Address 0xb373bf0 is 128 bytes inside a block of size 224 alloc'd
>> ==00:00:00:11.885 29399==    at 0x4C2A6FD: operator new(unsigned long, std::nothrow_t const&) (vg_replace_malloc.c:376)
>> ==00:00:00:11.885 29399==    by 0x6BB8B8D: zmq::ctx_t::create_socket(int) (ctx.cpp:351)
>> ==00:00:00:11.885 29399==    by 0x6C284D5: zmq_socket (zmq.cpp:267)
>> ==00:00:00:11.885 29399==    by 0x6143809: ZmqClientSocket::Config(PubSubSocketConfig const&) (ZmqRequestReply.cpp:303)
>> ==00:00:00:11.885 29399==    by 0x6144069: ZmqClientMultiSocket::Config(PubSubSocketConfig const&) (ZmqRequestReply.cpp:407)
>> ==00:00:00:11.885 29399==    by 0x61684EF: client_thread_main(void*) (ZmqRequestReplyUnitTests.cpp:132)
>> ==00:00:00:11.886 29399==    by 0x4C2F837: mythread_wrapper (hg_intercepts.c:389)
>> ==00:00:00:11.886 29399==    by 0x6E72463: start_thread (pthread_create.c:334)
>> ==00:00:00:11.886 29399==    by 0x92F901C: clone (clone.S:109)
>> ==00:00:00:11.886 29399==  Block was alloc'd by thread #2
>> 
>> 
>> Is this a known (and ignorable) issue with zmq::atomic_ptr_t<>?
>> 
>> Thanks,
>> Francesco
> 
> Yeah I started trying to put together a suppression file but never
> finished it:
> 
> https://github.com/bluca/libzmq/commit/fb9ee9da7631f9506cbfcd6db29a284ae6e9651e
> 
> Hope to have time to finish working on it eventually (feel free to
> contribute!). Right now it's very noisy, since Helgrind can't know
> about our lock-free queue implementation without a custom suppression
> file.
> 
> -- 
> Kind regards,
> Luca Boccassi
_______________________________________________
zeromq-dev mailing list
zeromq-dev@lists.zeromq.org
https://lists.zeromq.org/mailman/listinfo/zeromq-dev
