Naganarasimha G R commented on YARN-3995:

bq. Are you thinking of cases where the AM crashes? If the app finishes 
normally, this sequence does not happen, right?
It was only a hunch: suppose the AM finishes before its containers do. For 
example, the AM records a container as done once the container reports 
completion through the umbilical protocol, but the container itself may still 
be running, possibly because the Timeline client has not yet finished flushing 
its ATS events or because some other cleanup is still in progress.
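
To make the ordering concrete, here is a minimal single-threaded sketch of the 
sequence in the issue description: the collector for an app is removed when the 
AM container finishes, and an event that arrives afterwards has nowhere to go. 
All names (CollectorRaceSketch, CollectorRegistry, the event strings) are 
hypothetical and only illustrate the ordering; this is not the actual NM or 
timeline collector code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class CollectorRaceSketch {

    /** Hypothetical stand-in for a per-application collector map on the NM. */
    static class CollectorRegistry {
        private final Map<String, List<String>> collectors = new ConcurrentHashMap<>();
        private int dropped = 0;

        /** Called when the AM container starts on this node (assumption). */
        void register(String appId) {
            collectors.put(appId, new ArrayList<>());
        }

        /** Called when the AM container finishes (assumption). */
        void remove(String appId) {
            collectors.remove(appId);
        }

        /** Returns true if the event reached a collector, false if it was dropped. */
        synchronized boolean publish(String appId, String event) {
            List<String> sink = collectors.get(appId);
            if (sink == null) {
                dropped++;          // collector already removed: the event is lost
                return false;
            }
            sink.add(event);
            return true;
        }

        synchronized int dropped() {
            return dropped;
        }
    }

    public static void main(String[] args) {
        CollectorRegistry registry = new CollectorRegistry();
        String appId = "application_0001";   // hypothetical application id

        registry.register(appId);
        boolean early = registry.publish(appId, "CONTAINER_STARTED");

        // The AM container finishes first and its collector is removed ...
        registry.remove(appId);

        // ... while another container's event is still in the pipeline.
        boolean late = registry.publish(appId, "CONTAINER_FINISHED");

        System.out.println("early accepted=" + early
            + ", late accepted=" + late
            + ", dropped=" + registry.dropped());
    }
}
```

In the real system the removal and the late publish happen on different 
threads or nodes, so whether the event is lost depends on timing; the sketch 
fixes the unlucky order to show the effect.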

> Some of the NM events are not getting published due race condition when AM 
> container finishes in NM 
> ----------------------------------------------------------------------------------------------------
>                 Key: YARN-3995
>                 URL: https://issues.apache.org/jira/browse/YARN-3995
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager, timelineserver
>    Affects Versions: YARN-2928
>            Reporter: Naganarasimha G R
>            Assignee: Naganarasimha G R
>              Labels: yarn-2928-1st-milestone
>         Attachments: YARN-3995-feature-YARN-2928.v1.001.patch
> As discussed in YARN-3045: while testing with TestDistributedShell, it was 
> found that a few of the container metrics events were failing because of a 
> race condition. When the AM container finishes and removes the collector for 
> the app, it is still possible that events published for the app by the 
> current NM and other NMs are still in the pipeline, 

This message was sent by Atlassian JIRA
