[
https://issues.apache.org/jira/browse/YARN-6840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16166839#comment-16166839
]
Wangda Tan commented on YARN-6840:
----------------------------------
Thanks [~jhung] for updating the patch,
bq. For leveldb and zk, it will ignore it and use the scheduler configuration
persisted in the store.
Then I suggest adding this as Javadoc on the base class's method, so that it is
respected by all future implementations. Otherwise the behavior will change
depending on which store is configured.
bq. Not sure about this, then we are doing the reservation system
reinitialization inside scheduler, so every time scheduler#reinitialize is
called, the reservation system is also initialized, not sure if this is the
desired behavior. Also we would need to duplicate the reservation system
reinitialization for all schedulers, or make ResourceScheduler an abstract
class and add it there. ...
I just checked the code; I should probably describe the problem in a different
way:
There are two different code paths to refresh the scheduler config:
Path #1 (When mutation disabled) : Client -> AdminService ->
Scheduler/ReservationSystem#reinitialize
Path #2 (When mutation enabled) : Client -> RMWebService -> Scheduler ->
ConfProvider (persist the mutation log) -> AdminService ->
Scheduler/ReservationSystem#reinitialize -> ConfProvider (confirm or discard
mutation).
Please note that the ordering of the scheduler and AdminService is inverted
between the two code paths. This is confusing and could cause problems such as
deadlock.
Here's my proposal:
1) Change MutableConfigurationProvider#mutateConfiguration to
log-scheduler-config-mutation, which will do the following:
a. Merge the mutations into the existing configs.
b. Call confStore.logMutation to persist the result.
2) Add two new methods to MutableConfigurationProvider:
a. Confirm last mutation - confirm last logged mutation. (Just call
YarnConfigurationStore#confirmMutation(valid = true))
b. Discard last mutation - discard last logged mutation. (Just call
YarnConfigurationStore#confirmMutation(valid = false))
Also, is it possible to remove the id field in the confirmMutation method?
Should we allow at most one pending mutation?
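To make the proposal concrete, here is a minimal sketch of the provider side,
assuming at most one pending mutation and no id argument on confirmMutation.
All type and method names below (SketchConfStore, InMemorySketchStore,
SketchMutableConfProvider, logSchedulerConfigMutation, confirmLastMutation,
discardLastMutation) are hypothetical stand-ins and not the actual classes in
the patch; the in-memory map only illustrates the log/confirm/discard contract.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for YarnConfigurationStore: only the two calls
// mentioned in the proposal, with the id argument dropped.
interface SketchConfStore {
  void logMutation(Map<String, String> mutation);
  void confirmMutation(boolean isValid);
}

// In-memory store that allows at most one pending mutation (an assumption).
class InMemorySketchStore implements SketchConfStore {
  private final Map<String, String> committed = new HashMap<>();
  private Map<String, String> pending; // null when nothing is pending

  @Override
  public void logMutation(Map<String, String> mutation) {
    if (pending != null) {
      throw new IllegalStateException("a mutation is already pending");
    }
    pending = new HashMap<>(mutation);
  }

  @Override
  public void confirmMutation(boolean isValid) {
    if (pending == null) {
      throw new IllegalStateException("no pending mutation");
    }
    if (isValid) {
      committed.putAll(pending); // valid: apply the logged mutation
    }
    pending = null;              // in both cases the log entry is cleared
  }

  Map<String, String> getCommitted() {
    return new HashMap<>(committed);
  }

  boolean hasPending() {
    return pending != null;
  }
}

// Hypothetical provider exposing the three operations from the proposal.
class SketchMutableConfProvider {
  private final InMemorySketchStore store;

  SketchMutableConfProvider(InMemorySketchStore store) {
    this.store = store;
  }

  // Step 1: (a) merge mutations into existing configs, (b) log them.
  Map<String, String> logSchedulerConfigMutation(Map<String, String> update) {
    Map<String, String> merged = store.getCommitted();
    merged.putAll(update);
    store.logMutation(update);
    return merged; // candidate config the scheduler would be refreshed with
  }

  // Step 2a: confirm the last logged mutation.
  void confirmLastMutation() {
    store.confirmMutation(true);
  }

  // Step 2b: discard the last logged mutation.
  void discardLastMutation() {
    store.confirmMutation(false);
  }
}
```

With a single pending slot, the confirm and discard calls need no id: they
always refer to the one mutation that was logged last.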
Once we have the above, call path #2 becomes:
(1) Client -> RMWebService#updateSchedulerConfiguration ->
MutableConfigurationProvider#log-scheduler-config-mutation
(2) ... RMWebService#updateSchedulerConfiguration -> AdminService#refreshQueues
-> Scheduler/ReservationSystem#reinitialize.
If reinitialize succeeded:
(3) ... RMWebService#updateSchedulerConfiguration ->
MutableConfigurationProvider#confirmLastChange
If reinitialize failed:
(4) RMWebService#updateSchedulerConfiguration ->
MutableConfigurationProvider#discardLastChange
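The control flow of steps (1) to (4) can be sketched as follows. This is only
an illustration of the ordering: MutationLog, QueueRefresher, UpdateFlowSketch
and RecordingLog are hypothetical names, the String argument stands in for the
real mutation object, and a thrown exception is assumed to be how a failed
reinitialize is reported.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical view of MutableConfigurationProvider for this flow.
interface MutationLog {
  void logMutation(String change);
  void confirmLastChange();
  void discardLastChange();
}

// Hypothetical view of AdminService#refreshQueues, which in turn would call
// Scheduler/ReservationSystem#reinitialize.
interface QueueRefresher {
  void refreshQueues() throws Exception;
}

class UpdateFlowSketch {
  // Stand-in for RMWebService#updateSchedulerConfiguration.
  static boolean updateSchedulerConfiguration(
      MutationLog provider, QueueRefresher admin, String change) {
    provider.logMutation(change);      // (1) log the mutation first
    try {
      admin.refreshQueues();           // (2) same refresh path as path #1
    } catch (Exception e) {
      provider.discardLastChange();    // (4) reinitialize failed
      return false;
    }
    provider.confirmLastChange();      // (3) reinitialize succeeded
    return true;
  }
}

// Records the calls it receives so the ordering can be inspected.
class RecordingLog implements MutationLog {
  final List<String> events = new ArrayList<>();

  @Override
  public void logMutation(String change) {
    events.add("log:" + change);
  }

  @Override
  public void confirmLastChange() {
    events.add("confirm");
  }

  @Override
  public void discardLastChange() {
    events.add("discard");
  }
}
```

In this shape the web service drives both the provider and AdminService, so
the scheduler no longer calls back into AdminService and the refresh ordering
matches path #1.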
> Implement zookeeper based store for scheduler configuration updates
> -------------------------------------------------------------------
>
> Key: YARN-6840
> URL: https://issues.apache.org/jira/browse/YARN-6840
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Wangda Tan
> Assignee: Jonathan Hung
> Attachments: YARN-6840-YARN-5734.001.patch,
> YARN-6840-YARN-5734.002.patch, YARN-6840-YARN-5734.003.patch,
> YARN-6840-YARN-5734.004.patch, YARN-6840-YARN-5734.005.patch
>
>
> Right now there is only in-memory and leveldb based configuration store
> supported. Need one which supports RM HA.