[
https://issues.apache.org/jira/browse/YARN-8655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16620735#comment-16620735
]
Antal Bálint Steinbach commented on YARN-8655:
----------------------------------------------
Hi [~uranus],
Thanks for uploading the patch.
If I understand correctly, you replaced take() with poll() because you don't
want to synchronize around a blocking call.
I have some suggestions about it:
1) The method name FSStarvedApps.take() is no longer accurate, since it now
polls rather than takes.
2) The comments in FSStarvedApps still describe take() as a blocking method.
3) The behavior of the class has changed: with your solution the thread will
effectively busy-loop and acquire a costly lock on every iteration.
It might be worth adding a sleep to the loop, just like before.
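To illustrate point 3, here is a rough sketch of one way to keep take()
blocking while still updating appBeingProcessed atomically. It assumes a
ReentrantLock with a Condition guards both fields. The class name
StarvedAppsSketch, the type parameter standing in for FSAppAttempt, and the
boolean return value of addStarvedApp are made up for illustration; this is not
what the patch does.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for FSStarvedApps; T stands in for FSAppAttempt.
class StarvedAppsSketch<T> {
  private final ReentrantLock lock = new ReentrantLock();
  private final Condition notEmpty = lock.newCondition();
  // Both fields below are guarded by lock.
  private final Queue<T> appsToProcess = new ArrayDeque<>();
  private T appBeingProcessed;

  /** Queues app unless it is being processed or is already queued. */
  boolean addStarvedApp(T app) {
    lock.lock();
    try {
      if (app.equals(appBeingProcessed) || appsToProcess.contains(app)) {
        return false;
      }
      appsToProcess.add(app);
      notEmpty.signal();
      return true;
    } finally {
      lock.unlock();
    }
  }

  /** Blocks until an app is available. */
  T take() throws InterruptedException {
    lock.lock();
    try {
      appBeingProcessed = null;
      while (appsToProcess.isEmpty()) {
        // await() releases the lock while waiting, so the thread neither
        // spins nor blocks callers of addStarvedApp.
        notEmpty.await();
      }
      // Dequeue and publish under the same lock, so addStarvedApp never
      // sees the app as neither queued nor being processed.
      appBeingProcessed = appsToProcess.remove();
      return appBeingProcessed;
    } finally {
      lock.unlock();
    }
  }

  public static void main(String[] args) throws InterruptedException {
    StarvedAppsSketch<String> apps = new StarvedAppsSketch<>();
    apps.addStarvedApp("app1");
    String taken = apps.take();
    // The app being processed is rejected rather than queued a second time.
    System.out.println(taken + " re-added: " + apps.addStarvedApp(taken));
  }
}
```

With this shape the method name and the existing comments about blocking would
stay accurate, and no sleep is needed because the waiting thread is parked
until addStarvedApp signals it.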
> FairScheduler: FSStarvedApps is not thread safe
> -----------------------------------------------
>
> Key: YARN-8655
> URL: https://issues.apache.org/jira/browse/YARN-8655
> Project: Hadoop YARN
> Issue Type: Bug
> Components: yarn
> Affects Versions: 3.0.0
> Reporter: Zhaohui Xin
> Assignee: Zhaohui Xin
> Priority: Major
> Attachments: YARN-8655.patch
>
>
> *FSStarvedApps is not thread safe, so a starved app may be processed twice in
> a row.*
> For example, when app1 is fair share starved, it is added to
> appsToProcess. After that, app1 is taken, but appBeingProcessed has not yet
> been updated to app1. At that moment, app1 is also starved by min share, so it
> is added to appsToProcess again, because appBeingProcessed is null and
> appsToProcess no longer contains it.
> {code:java}
> void addStarvedApp(FSAppAttempt app) {
>   if (!app.equals(appBeingProcessed) && !appsToProcess.contains(app)) {
>     appsToProcess.add(app);
>   }
> }
>
> FSAppAttempt take() throws InterruptedException {
>   // Reset appBeingProcessed before the blocking call
>   appBeingProcessed = null;
>   // Blocking call to fetch the next starved application
>   FSAppAttempt app = appsToProcess.take();
>   appBeingProcessed = app;
>   return app;
> }
> {code}
>
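The interleaving described in the quoted report can be replayed
deterministically in a single thread. The sketch below is only an illustration:
the class name RacyStarvedAppsReplay and the replay() helper are hypothetical,
a String stands in for FSAppAttempt, and the two halves of take() are written
out inline so the second addStarvedApp call can land between them.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical replay of the race; String stands in for FSAppAttempt.
class RacyStarvedAppsReplay {
  final BlockingQueue<String> appsToProcess = new LinkedBlockingQueue<>();
  String appBeingProcessed;

  // Same check as the quoted addStarvedApp, with no synchronization.
  void addStarvedApp(String app) {
    if (!app.equals(appBeingProcessed) && !appsToProcess.contains(app)) {
      appsToProcess.add(app);
    }
  }

  /** Returns how many entries are queued while app1 is already being processed. */
  static int replay() throws InterruptedException {
    RacyStarvedAppsReplay apps = new RacyStarvedAppsReplay();
    apps.addStarvedApp("app1");               // fair share starvation

    // First half of take(): reset, then dequeue (does not block here).
    apps.appBeingProcessed = null;
    String app = apps.appsToProcess.take();

    // Another thread runs here: app1 is neither queued nor marked as being
    // processed, so the min share starvation queues it again.
    apps.addStarvedApp("app1");

    // Second half of take(): the marker is set too late.
    apps.appBeingProcessed = app;
    return apps.appsToProcess.size();
  }

  public static void main(String[] args) throws InterruptedException {
    System.out.println("queued while app1 is being processed: " + replay());
  }
}
```

A non-zero result means app1 will be handed out again right after its current
processing finishes, which is the double processing the report describes.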
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)