[ 
https://issues.apache.org/jira/browse/YARN-6840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16166839#comment-16166839
 ] 

Wangda Tan commented on YARN-6840:
----------------------------------

Thanks [~jhung] for updating the patch,

bq. For leveldb and zk, it will ignore it and use the scheduler configuration 
persisted in the store.
Then I suggest adding this as Javadoc on the base class's method, since all 
future implementations should respect it. Otherwise the behavior will change 
when a different store is configured. 

bq. Not sure about this, then we are doing the reservation system 
reinitialization inside scheduler, so every time scheduler#reinitialize is 
called, the reservation system is also initialized, not sure if this is the 
desired behavior. Also we would need to duplicate the reservation system 
reinitialization for all schedulers, or make ResourceScheduler an abstract 
class and add it there. ... 
I just checked the code; I should probably describe the problem differently. 
There are two different code paths to refresh the scheduler config:
Path #1 (when mutation is disabled): Client -> AdminService -> 
Scheduler/ReservationSystem#reinitialize
Path #2 (when mutation is enabled): Client -> RMWebService -> Scheduler -> 
ConfProvider (log and persist the mutation) -> AdminService -> 
Scheduler/ReservationSystem#reinitialize -> ConfProvider (confirm or discard 
mutation).

Please note that the ordering of the scheduler and AdminService is inverted 
between the two code paths. This is confusing and could cause problems such 
as deadlock.

Here's my proposal:
1) Change MutableConfigurationProvider#mutateConfiguration to 
log-scheduler-config-mutation, which will do the following:
a. Merge the mutations into the existing configs.
b. Call confStore.logMutation to persist it.

2) Add two new methods to MutableConfigurationProvider:
a. Confirm last mutation - confirm the last logged mutation. (Just call 
YarnConfigurationStore#confirmMutation(valid = true))
b. Discard last mutation - discard the last logged mutation. (Just call 
YarnConfigurationStore#confirmMutation(valid = false))
Also, is it possible to remove the id field from the confirmMutation method? 
Should we allow at most one pending mutation?
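
To make points 1) and 2) concrete, here is a minimal sketch of what I have in 
mind. It is not the actual YARN code: ConfStore is a simplified stand-in for 
YarnConfigurationStore, and the names InMemoryConfStore, 
MutableConfProviderSketch, logSchedulerConfigMutation, confirmLastMutation and 
discardLastMutation are hypothetical. It also assumes the answer to the last 
question is yes, i.e. at most one mutation is pending, so confirmMutation 
needs no id.

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified stand-in for a config store; NOT the real YarnConfigurationStore API. */
interface ConfStore {
  void logMutation(Map<String, String> updates);
  void confirmMutation(boolean valid);
  Map<String, String> retrieve();
}

/** Hypothetical in-memory store that allows at most one pending mutation. */
class InMemoryConfStore implements ConfStore {
  private final Map<String, String> committed = new HashMap<>();
  private Map<String, String> pending; // null when nothing is pending

  @Override
  public void logMutation(Map<String, String> updates) {
    if (pending != null) {
      throw new IllegalStateException("a mutation is already pending");
    }
    pending = new HashMap<>(updates);
  }

  @Override
  public void confirmMutation(boolean valid) {
    if (pending == null) {
      throw new IllegalStateException("no pending mutation");
    }
    if (valid) {
      committed.putAll(pending); // apply the logged mutation
    }
    pending = null;              // either way, nothing is pending afterwards
  }

  @Override
  public Map<String, String> retrieve() {
    return new HashMap<>(committed); // defensive copy of the committed config
  }
}

/** Hypothetical provider exposing the three operations proposed above. */
class MutableConfProviderSketch {
  private final ConfStore store;

  MutableConfProviderSketch(ConfStore store) {
    this.store = store;
  }

  /** 1a. merge the mutation into the existing config, 1b. log it to the store. */
  Map<String, String> logSchedulerConfigMutation(Map<String, String> updates) {
    Map<String, String> merged = store.retrieve();
    merged.putAll(updates);
    store.logMutation(updates);
    return merged; // candidate config for the scheduler to reinitialize with
  }

  /** 2a. confirm the last logged mutation. */
  void confirmLastMutation() {
    store.confirmMutation(true);
  }

  /** 2b. discard the last logged mutation. */
  void discardLastMutation() {
    store.confirmMutation(false);
  }
}
```

With a single pending slot, a second log call before confirm/discard fails 
fast instead of silently queueing, which is why the id becomes unnecessary in 
this sketch.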

Once we have the above, call path #2 becomes:
(1) Client -> RMWebService#updateSchedulerConfiguration -> 
MutableConfigurationProvider#log-scheduler-config-mutation 
(2) ... RMWebService#updateSchedulerConfiguration -> AdminService#refreshQueues 
-> Scheduler/ReservationSystem#reinitialize.

If reinitialize succeeds:
(3) ... RMWebService#updateSchedulerConfiguration -> 
MutableConfigurationProvider#confirmLastChange

If reinitialize fails:
(4) RMWebService#updateSchedulerConfiguration -> 
MutableConfigurationProvider#discardLastChange
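
Steps (1) to (4) could look roughly like the sketch below. This only 
illustrates the ordering, not the real RMWebService code: the Provider and 
Refresher interfaces, the UpdateFlowSketch class and RecordingProvider are 
hypothetical stand-ins for MutableConfigurationProvider and 
AdminService#refreshQueues, and I assume a failed reinitialize surfaces as an 
exception.

```java
import java.util.ArrayList;
import java.util.List;

class UpdateFlowSketch {

  /** Hypothetical stand-in for the proposed MutableConfigurationProvider methods. */
  interface Provider {
    void logMutation(String mutation);
    void confirmLastChange();
    void discardLastChange();
  }

  /** Hypothetical stand-in for AdminService#refreshQueues. */
  interface Refresher {
    void refreshQueues() throws Exception;
  }

  /**
   * The web service drives every step itself, so the scheduler never has to
   * call back into AdminService.
   */
  static boolean updateSchedulerConfiguration(
      Provider provider, Refresher admin, String mutation) {
    provider.logMutation(mutation);   // (1) log the mutation
    try {
      admin.refreshQueues();          // (2) Scheduler/ReservationSystem#reinitialize
    } catch (Exception e) {
      provider.discardLastChange();   // (4) reinitialize failed
      return false;
    }
    provider.confirmLastChange();     // (3) reinitialize succeeded
    return true;
  }

  /** Test double that records the order of provider calls. */
  static class RecordingProvider implements Provider {
    final List<String> calls = new ArrayList<>();

    @Override
    public void logMutation(String mutation) {
      calls.add("log:" + mutation);
    }

    @Override
    public void confirmLastChange() {
      calls.add("confirm");
    }

    @Override
    public void discardLastChange() {
      calls.add("discard");
    }
  }
}
```

The point of this shape is that both code paths then go Client -> 
(RMWebService ->) AdminService -> Scheduler, so the ordering of AdminService 
and the scheduler is the same whether or not mutation is enabled.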


> Implement zookeeper based store for scheduler configuration updates
> -------------------------------------------------------------------
>
>                 Key: YARN-6840
>                 URL: https://issues.apache.org/jira/browse/YARN-6840
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Wangda Tan
>            Assignee: Jonathan Hung
>         Attachments: YARN-6840-YARN-5734.001.patch, 
> YARN-6840-YARN-5734.002.patch, YARN-6840-YARN-5734.003.patch, 
> YARN-6840-YARN-5734.004.patch, YARN-6840-YARN-5734.005.patch
>
>
> Right now there is only in-memory and leveldb based configuration store 
> supported. Need one which supports RM HA.



