[ 
https://issues.apache.org/jira/browse/YARN-5436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15397378#comment-15397378
 ] 

Rohith Sharma K S commented on YARN-5436:
-----------------------------------------

Thanks, Zhiyuan, for providing the patch! As far as I can see, the patch is 
essentially reverting YARN-2991. 

A couple of questions: is this small race what is causing the Tez test 
failures? If so, would it be better to fix it in AsyncDispatcher rather than 
adding fully duplicated code? How about adding an additional check before 
adding to the event queue, to avoid the race?
{code}
diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/event/AsyncDispatcher.java b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/event/AsyncDispatcher.java
index f5361c8..a162690 100644
--- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/event/AsyncDispatcher.java
+++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/event/AsyncDispatcher.java
@@ -247,6 +247,12 @@ public void handle(Event event) {
         LOG.warn("Very low remaining capacity in the event-queue: "
             + remCapacity);
       }
+
+      if (blockNewEvents) {
+        drained = eventQueue.isEmpty();
+        return;
+      }
+
       try {
         eventQueue.put(event);
       } catch (InterruptedException e) {
{code}
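
To make the idea easier to discuss, here is a minimal, self-contained sketch of the proposed guard. It is not the real AsyncDispatcher: the class name MiniDispatcher, the String event type, and the betweenChecks hook are all hypothetical, and the hook exists only to simulate another thread starting shutdown between the first and second check. Only the field names (eventQueue, blockNewEvents, drained) mirror the diff above.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical, stripped-down stand-in for AsyncDispatcher (illustration only). */
class MiniDispatcher {
  private final BlockingQueue<String> eventQueue = new LinkedBlockingQueue<>();
  private volatile boolean blockNewEvents = false;
  private volatile boolean drained = true;

  /** Simulates the start of shutdown: new events are rejected from now on. */
  public void startBlocking() {
    blockNewEvents = true;
  }

  public void handle(String event) {
    handle(event, () -> { });
  }

  /**
   * betweenChecks is a hypothetical test hook that runs in the window
   * between the early check and the enqueue.
   */
  public void handle(String event, Runnable betweenChecks) {
    if (blockNewEvents) {
      return;                          // early check
    }
    drained = false;

    betweenChecks.run();               // another thread may act here

    if (blockNewEvents) {              // the additional check from the diff
      drained = eventQueue.isEmpty();  // recompute instead of leaving it false
      return;
    }

    try {
      eventQueue.put(event);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  public boolean isDrained() {
    return drained;
  }

  public int queueSize() {
    return eventQueue.size();
  }

  public static void main(String[] args) {
    MiniDispatcher racy = new MiniDispatcher();
    // Shutdown begins after the early check but before the enqueue.
    racy.handle("late-event", racy::startBlocking);
    System.out.println("queued=" + racy.queueSize()
        + " drained=" + racy.isDrained());
  }
}
```

In this sketch the late event is dropped and drained is recomputed from the queue state. Since nothing here is done under a lock, a thread could still flip blockNewEvents between the second check and put(), so the check narrows the window rather than closing it.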

> Race in AsyncDispatcher can cause random test failures in Tez (probably YARN 
> also)
> ----------------------------------------------------------------------------------
>
>                 Key: YARN-5436
>                 URL: https://issues.apache.org/jira/browse/YARN-5436
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Zhiyuan Yang
>            Assignee: Zhiyuan Yang
>         Attachments: YARN-5436.1.patch, YARN-5436.2.patch, YARN-5436.3.patch, 
> YARN-5436.4.patch
>
>
> In YARN-2264, a race in DrainDispatcher was fixed. Unfortunately, the same 
> race also exists in AsyncDispatcher (this was found and ignored in YARN-3878 
> but never documented). In YARN-2991, another DrainDispatcher bug was fixed by 
> letting DrainDispatcher reuse some AsyncDispatcher methods, because 
> AsyncDispatcher does not have that issue. However, this shadows the YARN-2264 
> fix, and now a similar race reappears in Tez unit tests (and probably in YARN 
> unit tests as well).



