[
https://issues.apache.org/jira/browse/YARN-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15494440#comment-15494440
]
Daniel Templeton commented on YARN-3141:
----------------------------------------
Continuing with more comments on v2. Sorry, I started these before you
uploaded v3. These comments are a little more speculative. I'm not 100%
certain that everything I'm recommending is safe. :)
* {{SchedulerApplicationAttempt.getLiveContainersMap()}} should be default
visibility and {{@VisibleForTesting}}
* {{SchedulerApplicationAttempt.addRMContainer()}},
{{SchedulerApplicationAttempt.removeRMContainer()}},
{{SchedulerApplicationAttempt.updateResourceRequests()}},
{{SchedulerApplicationAttempt.recoverResourceRequestsForContainer()}},
{{SchedulerApplicationAttempt.reserve()}}, and
{{SchedulerApplicationAttempt.updateBlacklist()}} should have the write locks
pushed down to inside the _if_
* {{SchedulerApplicationAttempt.getHeadroom()}} and
{{SchedulerApplicationAttempt.getResourceLimit()}} are identical.
{{SchedulerApplicationAttempt.getResourceLimit()}} is not used outside
{{SchedulerApplicationAttempt}}
* In {{SchedulerApplicationAttempt.resetSchedulingOpportunities()}}, is the
write lock needed?
* In {{SchedulerApplicationAttempt.getLiveContainers()}} is the read lock
needed?
* In {{SchedulerApplicationAttempt.stop()}} the {{isStopped}} update can happen
before the lock
* In {{SchedulerApplicationAttempt.getReservedContainers()}} the lock should
only cover the _for_ loop
* In {{FSAppAttempt.getAllowedLocalityLevel()}},
{{FSAppAttempt.getAllowedLocalityLevelByTime()}},
{{FSAppAttempt.setReservation()}}, and {{FSAppAttempt.clearReservation()}} the
write lock acquisition can be delayed until after the arg validation
* There's an unused import in {{FaCiSchedulerApp}}
* In {{FaCiSchedulerApp.containerCompleted()}} the write lock acquisition can
be delayed until after removing from {{liveContainers}}
* In {{FaCiSchedulerApp.allocate()}} the write lock acquisition can be delayed
until after the stop check, and maybe after the sanity check
* In {{FaCiSchedulerApp.unreserve()}} the write lock acquisition can be delayed
until after canceling the increase request
* In {{FaCiSchedulerApp.markContainerForPreemption()}} the write lock
acquisition can be push down inside the _if_
> Improve locks in SchedulerApplicationAttempt/FSAppAttempt/FiCaSchedulerApp
> --------------------------------------------------------------------------
>
> Key: YARN-3141
> URL: https://issues.apache.org/jira/browse/YARN-3141
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: resourcemanager, scheduler
> Reporter: Wangda Tan
> Assignee: Wangda Tan
> Attachments: YARN-3141.1.patch, YARN-3141.2.patch, YARN-3141.3.patch
>
>
> Enhance locks in SchedulerApplicationAttempt/FSAppAttempt/FiCaSchedulerApp,
> as mentioned in YARN-3091, a possible solution is using read/write lock.
> Other fine-graind locks for specific purposes / bugs should be addressed in
> separated tickets.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]