[jira] [Commented] (YARN-333) Schedulers cannot control the queue-name of an application

Alejandro Abdelnur (JIRA) Thu, 18 Apr 2013 13:02:15 -0700

    [ 
https://issues.apache.org/jira/browse/YARN-333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13635606#comment-13635606
 ]


Alejandro Abdelnur commented on YARN-333:
-----------------------------------------

Stepping back a bit.

In Hadoop 1, using fair-scheduler, the following is possible:

*1 On job submission, the scheduler resolves/reassigns the queue of the job. 
For example, your jobconf states 'default' as the queue, and the fair-scheduler 
converts the queue to the user's primary group.
*2 For a running job, via the fair-scheduler UI, it is possible to change the 
queue of the job (enforcing queue ACLs).

While there are a few minor quirks (the JobInProgress does not get updated to 
the right queue), this works fine and you can see the new queue in most of the 
UI (the scheduler owns most of the UI that shows queues).

In Hadoop 2, using fair-scheduler:

*1, this is still doable, but all the feedback to the user is provided by the 
RM, which does not know about the queue name change. Thus there is a 
split-brain perception, and it can lead to wrong decisions by an admin trying 
to troubleshoot cluster utilization.

*2, this is not currently possible.


I see 2 ways to go about this:

*A make the necessary changes for the fair-scheduler to be able to provide #1 
and #2 as a fair-scheduler only solution

*B make the necessary changes so all schedulers support this capability

To do #A, the fair-scheduler should be able, via the RMContext, to change the 
queue of a job in the RM. This would take care of #1 and #2. It is still the 
responsibility of the fair-scheduler to provide an end-point for the user/admin 
to perform #2.

To do #B, the ClientRMProtocol would have a new 'changeQueue(String queueName)' 
method, and the YarnScheduler would have a new 'String getActualQueue(String 
queueName)' method. On job submission the RM would call the scheduler's 
'getActualQueue()' method to resolve the actual queue for the job (this takes 
care of #1). On a 'changeQueue()' call, the RM also calls the scheduler's 
'getActualQueue()' (this takes care of #2).
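
A rough sketch of how #B could hang together, as toy stand-ins rather than 
real YARN classes. Everything here is hypothetical: QueueResolver, 
PrimaryGroupResolver and ToyResourceManager are made-up names, and the 'user' 
and 'appId' parameters are assumptions added so the example can run (the 
signatures proposed above take only the queue name). ACL enforcement is left 
out.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical scheduler hook from approach #B. The proposal above is
// getActualQueue(String queueName); the 'user' parameter is an assumption
// added here so the toy scheduler has something to resolve against.
interface QueueResolver {
    String getActualQueue(String queueName, String user);
}

// Toy fair-scheduler-like policy: 'default' becomes the user's primary
// group; any other queue name is kept as requested.
class PrimaryGroupResolver implements QueueResolver {
    private final Map<String, String> primaryGroupByUser;

    PrimaryGroupResolver(Map<String, String> primaryGroupByUser) {
        this.primaryGroupByUser = primaryGroupByUser;
    }

    @Override
    public String getActualQueue(String queueName, String user) {
        if ("default".equals(queueName) && primaryGroupByUser.containsKey(user)) {
            return primaryGroupByUser.get(user);
        }
        return queueName;
    }
}

// Toy stand-in for the RM. It stores the resolved queue, so what it reports
// to users matches what the scheduler actually uses.
class ToyResourceManager {
    private final QueueResolver scheduler;
    private final Map<String, String> queueByApp = new HashMap<>();
    private final Map<String, String> userByApp = new HashMap<>();

    ToyResourceManager(QueueResolver scheduler) {
        this.scheduler = scheduler;
    }

    // Submission path (#1): resolve the queue before recording it.
    String submit(String appId, String user, String requestedQueue) {
        String actual = scheduler.getActualQueue(requestedQueue, user);
        userByApp.put(appId, user);
        queueByApp.put(appId, actual);
        return actual;
    }

    // changeQueue path (#2): 'appId' is an assumption; the proposed
    // changeQueue(String queueName) does not say how the app is identified.
    String changeQueue(String appId, String queueName) {
        String user = userByApp.get(appId);
        if (user == null) {
            throw new IllegalArgumentException("Unknown application: " + appId);
        }
        String actual = scheduler.getActualQueue(queueName, user);
        queueByApp.put(appId, actual);
        return actual;
    }

    String getQueue(String appId) {
        return queueByApp.get(appId);
    }
}
```

The point of the sketch is only that both paths go through the same 
scheduler call, so the RM never holds a queue name the scheduler has not 
resolved.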

Note that in both approaches, the scheduler is still responsible for enforcing 
queue ACLs, as it is done today.

The benefit I see in #B is that it will be consistent across the different 
scheduler implementations. It will also leverage the existing client RPC, so 
there is no need to worry about a different endpoint and the security 
considerations that would entail.




                
> Schedulers cannot control the queue-name of an application
> ----------------------------------------------------------
>
>                 Key: YARN-333
>                 URL: https://issues.apache.org/jira/browse/YARN-333
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.0.2-alpha
>            Reporter: Sandy Ryza
>            Assignee: Sandy Ryza
>         Attachments: YARN-333.patch
>
>
> Currently, if an app is submitted without a queue, RMAppManager sets the 
> RMApp's queue to "default".
> A scheduler may wish to make its own decision on which queue to place an app 
> in if none is specified. For example, when the fair scheduler 
> user-as-default-queue config option is set to true, and an app is submitted 
> with no queue specified, the fair scheduler should assign the app to a queue 
> with the user's name.

