[
https://issues.apache.org/jira/browse/YARN-3940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14648686#comment-14648686
]
Bibin A Chundatt commented on YARN-3940:
----------------------------------------
Hi [~leftnoteasy]
Thank you for the review comments.
{quote}
We should check usage as I mentioned at:
https://issues.apache.org/jira/browse/YARN-3940?focusedCommentId=14633876&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14633876.
{quote}
I will check how to handle this too.
{quote}
We may need to consider how to deal with node label updates: currently, if we
change the labels on a node, all containers running on that node will be killed. I
suggest thinking both problems through clearly before moving forward.
{quote}
As I understand it, containers shouldn't be killed in the following cases:
# Running containers of applications submitted to the default partition that
are running on a labeled partition with exclusivity=false
# When the queue has access to the new label / node
Are there any other cases?
Can we move the second part to a separate JIRA for discussion?
Thoughts? Please correct me if I am wrong.
> Application moveToQueue should check NodeLabel permission
> ----------------------------------------------------------
>
> Key: YARN-3940
> URL: https://issues.apache.org/jira/browse/YARN-3940
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: resourcemanager
> Reporter: Bibin A Chundatt
> Assignee: Bibin A Chundatt
> Attachments: 0001-YARN-3940.patch, 0002-YARN-3940.patch
>
>
> Configure the capacity scheduler
> Configure a node label and submit an application with {{queue=A Label=X}}
> Move the application to queue {{B}}, which does not have access to label x
> {code}
> 2015-07-20 19:46:19,626 INFO
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler:
> Application attempt appattempt_1437385548409_0005_000001 released container
> container_e08_1437385548409_0005_01_000002 on node: host:
> host-10-19-92-117:64318 #containers=1 available=<memory:2560, vCores:15>
> used=<memory:512, vCores:1> with event: KILL
> 2015-07-20 19:46:20,970 WARN
> org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService:
> Invalid resource ask by application appattempt_1437385548409_0005_000001
> org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid
> resource request, queue=b1 doesn't have permission to access all labels in
> resource request. labelExpression of resource request=x. Queue labels=y
> at
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.validateResourceRequest(SchedulerUtils.java:304)
> at
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:234)
> at
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndvalidateRequest(SchedulerUtils.java:250)
> at
> org.apache.hadoop.yarn.server.resourcemanager.RMServerUtils.normalizeAndValidateRequests(RMServerUtils.java:106)
> at
> org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:515)
> at
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.allocate(ApplicationMasterProtocolPBServiceImpl.java:60)
> at
> org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:99)
> at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:636)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:976)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2174)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2170)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1666)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2168)
> {code}
> The same exception is thrown repeatedly until the *heartbeat timeout*,
> after which the application state is updated to *FAILED*.
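
The check that the issue title asks for can be illustrated with a minimal, self-contained sketch. This is not the attached patch and does not use the real YARN classes: the class {{MoveQueueLabelCheck}}, its methods, and the {{*}} wildcard constant are hypothetical names chosen for illustration, and the rule that an empty label expression (the default partition) is accessible to every queue is an assumption of the sketch. The idea is to reject the move up front when the target queue cannot access the application's label expression, instead of letting every later allocate call fail.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch only; not the YARN-3940 patch and not a real YARN API. */
public class MoveQueueLabelCheck {

    /** Assumed wildcard meaning "this queue may access every label". */
    static final String ANY_LABEL = "*";

    /**
     * Returns true if a queue with the given accessible labels may serve
     * requests carrying the given (single-label) expression.
     */
    static boolean queueCanAccess(Set<String> queueLabels, String labelExpression) {
        // Assumption: an empty expression targets the default partition,
        // which every queue is allowed to use.
        if (labelExpression == null || labelExpression.trim().isEmpty()) {
            return true;
        }
        if (queueLabels == null || queueLabels.isEmpty()) {
            return false;
        }
        return queueLabels.contains(ANY_LABEL)
            || queueLabels.contains(labelExpression.trim());
    }

    /** Rejects the move before it happens if the target queue lacks access. */
    static void checkMove(String targetQueue, Set<String> targetQueueLabels,
                          String appLabelExpression) {
        if (!queueCanAccess(targetQueueLabels, appLabelExpression)) {
            throw new IllegalArgumentException(
                "Cannot move application to queue=" + targetQueue
                + ": labelExpression=" + appLabelExpression
                + " is not in queue labels=" + targetQueueLabels);
        }
    }

    public static void main(String[] args) {
        // Mirrors the scenario in the log: queue b1 has label y, the app asks for x.
        Set<String> b1Labels = new HashSet<>(Arrays.asList("y"));
        try {
            checkMove("b1", b1Labels, "x");
            System.out.println("Move allowed");
        } catch (IllegalArgumentException e) {
            System.out.println("Move rejected: " + e.getMessage());
        }
    }
}
```

With a check of this kind in the move path, the scenario from the log (application label x, target queue labels y) would be refused at move time rather than surfacing as a repeated {{InvalidResourceRequestException}}.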