Hi Bill, thanks for your answer.
2018-02-24 21:49 GMT+01:00 Bill Torpey <[email protected]>: > > ... > If T2 and T4 were application code, this would be a clear violation of > ZMQ’s threading rules (assuming “legacy”, non-thread-safe socket types). > Right. > For instance, one technique would be to perform initialization (e.g., > bind’s) in the main thread, and only after that is done spin up other > threads to do processing. In this case, TSAN wouldn't have any way to know > that the application guarantees that accesses cannot result in a race, so > TSAN would flag it. I’ve gotten in the habit of using mutexes to protect > code like this even though it should not strictly be needed, just to keep > tools like TSAN happy, and also because I don’t know the ZMQ code well > enough to be 100% certain that the mutexes are not necessary — better safe > than sorry! > Yeah, that's a possibility but it results in a lot of "clutter" that decrease code readability and makes it harder to maintain in the long run... This situation is different, though, since T4 is not an application thread > — it’s an internal ZMQ worker thread. So, I think in this case we kind of > have to accept that ZMQ is doing the right thing here. > At least, that’s the approach I’ve been taking. When I instrument my apps > and libraries with TSAN I specifically do NOT instrument ZMQ, and I also > use the “called_from_lib:libzmq.so” suppression (which is listed as an > example for TSAN: https://github.com/google/sanitizers/wiki/ > ThreadSanitizerSuppressions). > Understood. I have a question though: if you use “called_from_lib:libzmq.so” suppression, are you able to spot the race condition due to T2 and T4 being application threads (instead of being 1 application and 1 zmq internal)? I wonder if TSAN, detecting that one of the 2 threads generating the data race is inside ZMQ, entirely suppress the race warning or instead will suppress only race conditions involving 2 internal zmq threads.. Instrumenting libzmq and/or omitting the suppression causes a LOT of > warnings, esp. in the ZMQ worker threads. So, unless I'm willing to commit > the time and effort to go through and investigate each of these warnings, I > feel I have little choice but to accept that at this point in its lifetime > ZMQ should be race-free for all practical purposes. > > FWIW, I’ve done fairly extensive testing, and specifically stress testing, > and have yet to find anything that looks like an honest-to-goodness bug in > ZMQ. (Which is not to say that the docs are always clear about what to > expect in certain situations ;-) I did have one problem which appears to > have been a bug in epoll, and which was resolved by upgrading Linux, but > that’s it. > I agree, I was just surprised to see so many warnings... > BTW, there’s an excellent overview of how this all works at > http://zeromq.org/whitepapers:architecture — although it’s somewhat old, > it appears to still be relatively accurate. > thanks for the link, I read that and it's quite interesting. However it does not mention how multi-thread safety is achieved. Just out of curiosity I will take a look at ypipe implementation. > I tried forcing use of C++11 internal atomics and all the warnings > about zmq::atomic_ptr_t<> disappear (I still get tons about ypipe and > "pipe").. > maybe it's best to prefer C++11 atomics when available? > > > I would actually suggest using plain old pthread for synchronization > primitives — TSAN’s support for pthread is pretty mature. Support for C++11 > is a bit less mature, and has certain constraints ( > https://clang.llvm.org/docs/ThreadSanitizer.html): > However using old pthread for synchronization you probably loose a lot in performances... if C++11 atomics are supported by TSAN just fine, why not preferring them over the GCC intrinsics? > *ThreadSanitizer is in beta stage. It is known to work on large C++ > programs using pthreads, but we do not promise anything (yet). C++11 > threading is supported with llvm libc++. The test suite is integrated into > CMake build and can be run with make check-tsan command.* > > > In particular, to use C++11 you also need to use llvm’s libcxx, rather > than gnu’s libstdc++. > I think that the quoted text is referring to "C++11 threading" that is to http://en.cppreference.com/w/cpp/thread As I mentioned from a simple test I do see that at least gcc 5.3.0 TSAN is handling C++11 atomics fine (without forcing you to link LLVM libc++) > Btw I will try to write a suppression file for ThreadSanitizer and ZMQ... > I just start to doubt: is all ZMQ code really race-free? :) > > > As mentioned above, that is likely to be a huge undertaking. At this > point, I think the only confirmation we have that ZMQ is race-free is based > on “black-box” testing. The good news is that ZMQ appears to be used > widely enough that the code is well-tested. (Of course, failure to find a > bug doesn’t prove that there isn’t one, so having a rigorous analysis of > ZMQ’s threading behavior would be A Good Thing — but a big job). > Yeah, I agree! Thanks, Francesco
_______________________________________________ zeromq-dev mailing list [email protected] https://lists.zeromq.org/mailman/listinfo/zeromq-dev
