[ https://issues.apache.org/jira/browse/YARN-2264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062551#comment-14062551 ]
Jian He commented on YARN-2264:
-------------------------------

Patch looks good.

> Race in DrainDispatcher can cause random test failures
> ------------------------------------------------------
>
>                 Key: YARN-2264
>                 URL: https://issues.apache.org/jira/browse/YARN-2264
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Siddharth Seth
>            Assignee: Li Lu
>         Attachments: YARN-2264-070814.patch
>
>
> This is the potential race.
> DrainDispatcher is started via serviceStart(). As a last step, this starts
> the actual dispatcher thread (eventHandlingThread.start()) and returns
> immediately, which means the thread may or may not have started up by the
> time start returns.
> Event sequence:
> UserThread: calls dispatcher.getEventHandler().handle()
> This sets drained = false, and a context switch happens.
> DispatcherThread: starts running
> DispatcherThread: drained = queue.isEmpty(); -> This sets drained to true,
> since UserThread yielded before putting anything into the queue.
> UserThread: actual.handle(event) - which puts the event in the queue for the
> dispatcher thread to process, and returns control.
> UserThread: dispatcher.await() - Since drained is true, this returns
> immediately - even though there is a pending event to process.
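
The event sequence in the issue description can be replayed step by step. The following is a minimal sketch, not the DrainDispatcher source: the class DrainRaceSketch, the Model type, and both replay methods are hypothetical names, and the two threads are simulated as ordered statements so that the interleaving is deterministic. The second method shows one possible mitigation (treating the flag update and the enqueue as one indivisible step, for example under a common lock); that is an assumption and not necessarily what the attached patch does.

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class DrainRaceSketch {

    // Hypothetical stand-in for the dispatcher state named in the issue.
    private static final class Model {
        final Queue<String> queue = new ArrayDeque<>();
        boolean drained = true;
    }

    /**
     * Replays the racy interleaving from the issue, one step per line.
     * Returns true if await() would return while an event is still pending.
     */
    public static boolean replayRacyInterleaving() {
        Model m = new Model();
        m.drained = false;              // UserThread: handle() clears the flag
        m.drained = m.queue.isEmpty();  // DispatcherThread: queue is empty, flag becomes true
        m.queue.add("event");           // UserThread: actual.handle(event) enqueues
        // UserThread: await() returns immediately whenever drained is true.
        return m.drained && !m.queue.isEmpty();
    }

    /**
     * Same steps, assuming the flag update and the enqueue cannot be
     * interleaved with the dispatcher's check (assumed mitigation).
     * The check may then run only before or after the pair.
     */
    public static boolean replayWithAtomicHandle(boolean dispatcherFirst) {
        Model m = new Model();
        if (dispatcherFirst) {
            m.drained = m.queue.isEmpty();  // DispatcherThread runs first
        }
        m.drained = false;                  // UserThread: indivisible pair
        m.queue.add("event");
        if (!dispatcherFirst) {
            m.drained = m.queue.isEmpty();  // DispatcherThread runs afterwards
        }
        return m.drained && !m.queue.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println("racy: await returns early = " + replayRacyInterleaving());
        System.out.println("atomic (dispatcher first): await returns early = "
                + replayWithAtomicHandle(true));
        System.out.println("atomic (user first): await returns early = "
                + replayWithAtomicHandle(false));
    }
}
```

In this model the racy replay ends with drained set to true while one event is still queued, which is the condition under which await() returns too early. With the pair made indivisible, neither ordering leaves drained set to true with a pending event.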