[
https://issues.apache.org/jira/browse/YARN-7835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16357939#comment-16357939
]
Rohith Sharma K S commented on YARN-7835:
-----------------------------------------
bq. so I guess the code is meant to be thread safe.
Since these methods are event driven, ideally they should not need to be thread
safe. But in our code, stopContainer schedules a thread that runs after some
time to remove the application. This is why I added a synchronized block only
for the masterContainers set. I will change it as per your comment so that it is
always thread safe.
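
For illustration only, a minimal sketch of the pattern described above: a set
that is updated on the event thread and also by a delayed removal task, with
every access guarded by the same lock. The class name MasterContainerTracker,
its method names, and the String container IDs are hypothetical and are not
taken from the patch.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for illustration; not the NodeManager code from the patch.
class MasterContainerTracker {
  // Shared between the event thread and the delayed removal thread.
  private final Set<String> masterContainers = new HashSet<>();
  private final ScheduledExecutorService scheduler =
      Executors.newSingleThreadScheduledExecutor();

  // Called on the event thread when a master container starts.
  void containerStarted(String containerId) {
    synchronized (masterContainers) {
      masterContainers.add(containerId);
    }
  }

  // Called on the event thread; the actual removal runs later on the
  // scheduler thread, which is why the set needs a lock at all.
  void containerStopped(String containerId, long delayMillis) {
    scheduler.schedule(() -> {
      synchronized (masterContainers) {
        masterContainers.remove(containerId);
      }
    }, delayMillis, TimeUnit.MILLISECONDS);
  }

  // Reads take the same lock so they never observe a half-applied update.
  boolean isTracked(String containerId) {
    synchronized (masterContainers) {
      return masterContainers.contains(containerId);
    }
  }

  // Stops the scheduler thread so the JVM can exit.
  void shutdown() {
    scheduler.shutdownNow();
  }
}
```

Guarding only the set keeps the lock scope small, but any other state touched
by both threads would need the same treatment.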
bq. Am I missing something?
stopContainer schedules a thread (the default delay is 1 sec) that removes the
application from the collectors. Without the current fix, the test checks for
the application in a loop and exits the loop once the application is removed
from the collector. It then fails at the next assert because we expect the
application to be present. I guess waiting for 2 sec in the loop should be
fine; otherwise we can reduce it to 1.5 seconds.
bq. I think we could reuse GenericTestUtils.waitFor().
It appears the whole class would need to be refactored for this change. Let me
check the feasibility.
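
For illustration only, a self-contained sketch of the kind of bounded polling
that a waitFor-style helper provides, so a test can wait for a condition with
a fixed check interval and an overall timeout instead of a hand-written loop.
PollingUtil is a hypothetical stand-in written for this example; it is not
Hadoop's GenericTestUtils, and its parameter types are assumptions.

```java
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

// Hypothetical stand-in for illustration; not Hadoop's GenericTestUtils.
class PollingUtil {
  // Polls check every checkEveryMillis until it returns true, or throws
  // TimeoutException once waitForMillis has elapsed.
  static void waitFor(Supplier<Boolean> check, long checkEveryMillis,
      long waitForMillis) throws TimeoutException, InterruptedException {
    final long start = System.nanoTime();
    final long timeoutNanos = waitForMillis * 1_000_000L;
    while (!check.get()) {
      if (System.nanoTime() - start >= timeoutNanos) {
        throw new TimeoutException(
            "Condition not met within " + waitForMillis + " ms");
      }
      Thread.sleep(checkEveryMillis);
    }
  }
}
```

A test could then wait up to 2 seconds for the delayed removal with a single
call such as PollingUtil.waitFor(() -> appRemoved(), 100, 2000), where
appRemoved() is a placeholder for the actual check.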
> [Atsv2] Race condition in NM while publishing events if second attempt
> launched on same node
> --------------------------------------------------------------------------------------------
>
> Key: YARN-7835
> URL: https://issues.apache.org/jira/browse/YARN-7835
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Rohith Sharma K S
> Assignee: Rohith Sharma K S
> Priority: Critical
> Attachments: YARN-7835.001.patch
>
>
> A race condition is observed: if the master container is killed for some
> reason and relaunched on the same node, NMTimelinePublisher doesn't add the
> timelineClient. But once the completed container for the 1st attempt arrives,
> NMTimelinePublisher removes the timelineClient.
> This causes all subsequent event publishing from a different client to fail
> with the exception "Application is not found".