[ 
https://issues.apache.org/jira/browse/YARN-4000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14737113#comment-14737113
 ] 

Varun Saxena commented on YARN-4000:
------------------------------------

Attached a patch with the following changes:
# If fail fast is false, the app is killed both when its queue has been removed 
and when the queue has become a parent queue on restart.
# If fail fast is true, an exception is thrown in both cases.
# Renamed QueueNotFoundException to QueueException so that one class covers 
both cases instead of adding a new exception class for each.
# Added a new RMAppKillEvent class for sending kill events to RMAppImpl. It 
carries a specific diagnostic message stating why the application was killed. 
Currently, when an app is killed, the diagnostic message is always 
"Application killed by user.", which does not fit this case.
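
For readers who want to see the behaviour described in the list above, here is a rough, self-contained sketch in plain Java. It is not the patch itself and does not use the real YARN classes: QueueRecoverySketch, AppKillEvent, QueueType and recoverApp are hypothetical stand-ins, and the diagnostic strings are made up for illustration. Only the name QueueException and the fail-fast / kill-with-diagnostic split come from the comment.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Illustrative only: every type here is a hypothetical stand-in, not YARN code. */
class QueueRecoverySketch {

    /** One exception type for both "queue removed" and "queue became parent". */
    static class QueueException extends RuntimeException {
        QueueException(String message) {
            super(message);
        }
    }

    /** Kill event that carries its own diagnostic message (assumed shape). */
    static class AppKillEvent {
        final String appId;
        final String diagnostics;

        AppKillEvent(String appId, String diagnostics) {
            this.appId = appId;
            this.diagnostics = diagnostics;
        }
    }

    enum QueueType { LEAF, PARENT }

    /**
     * Decides what happens to an app being recovered after a restart.
     * Returns empty if the queue is still a usable leaf queue, a kill event
     * if the queue is unusable and fail fast is off, and throws
     * QueueException if the queue is unusable and fail fast is on.
     */
    static Optional<AppKillEvent> recoverApp(String appId, String queueName,
            Map<String, QueueType> queues, boolean failFast) {
        QueueType type = queues.get(queueName);
        String problem;
        if (type == null) {
            problem = "queue " + queueName + " no longer exists";
        } else if (type == QueueType.PARENT) {
            problem = "queue " + queueName + " is no longer a leaf queue";
        } else {
            return Optional.empty(); // queue is fine, recover normally
        }
        String message = "Application " + appId + " cannot be recovered: " + problem;
        if (failFast) {
            throw new QueueException(message);
        }
        return Optional.of(new AppKillEvent(appId, message));
    }

    public static void main(String[] args) {
        // Queue A was a leaf before the restart and is now a parent queue.
        Map<String, QueueType> queues = new HashMap<>();
        queues.put("A", QueueType.PARENT);
        queues.put("A.child", QueueType.LEAF);

        // Fail fast off: the app is killed with a specific diagnostic.
        recoverApp("app_1", "A", queues, false)
                .ifPresent(e -> System.out.println(e.diagnostics));

        // Fail fast on: the same situation raises an exception instead.
        try {
            recoverApp("app_1", "A", queues, true);
        } catch (QueueException e) {
            System.out.println("fail fast: " + e.getMessage());
        }
    }
}
```

The point of routing both cases through one check is that the caller handles only two outcomes, a kill event or a QueueException, regardless of whether the queue was removed or converted to a parent.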

> RM crashes with NPE if leaf queue becomes parent queue during restart
> ---------------------------------------------------------------------
>
>                 Key: YARN-4000
>                 URL: https://issues.apache.org/jira/browse/YARN-4000
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler, resourcemanager
>    Affects Versions: 2.6.0
>            Reporter: Jason Lowe
>            Assignee: Varun Saxena
>         Attachments: YARN-4000.01.patch
>
>
> This is a similar situation to YARN-2308.  If an application is active in 
> queue A and then the RM restarts with a changed capacity scheduler 
> configuration where queue A becomes a parent queue to other subqueues then 
> the RM will crash with a NullPointerException.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
