[ https://issues.apache.org/jira/browse/YARN-6207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15879964#comment-15879964 ]
Sunil G commented on YARN-6207:
-------------------------------

[~bibinchundatt], thanks for the patch. I had an offline discussion with [~rohithsharma], and we were expecting something like the code below.

{code}
FiCaSchedulerApp app = application.getCurrentAppAttempt();
if (app != null) {
  // Move all live containers even when stopped.
  // Required for transferStateFromPreviousAttempt.
  for (RMContainer rmContainer : app.getLiveContainers()) {
    source.detachContainer(getClusterResource(), app, rmContainer);
    // attach the Container to another queue
    dest.attachContainer(getClusterResource(), app, rmContainer);
  }
  if (!app.isStopped()) {
    source.finishApplicationAttempt(app, sourceQueueName);
    // Submit to a new queue
    dest.submitApplicationAttempt(app, user);
    // Finish app & update metrics
    app.move(dest);
  }
  source.appFinished();
  source.getParent().finishApplication(appId, user);
}
application.setQueue(dest);
LOG.info("App: " + appId + " successfully moved from " + sourceQueueName
    + " to: " + destQueueName);
return targetQueueName;
{code}

Reasons behind this proposal:
# {{source.finishApplication(appId, user);}} is not needed, because {{AppSchedulingInfo.move}} already invokes {{abstractUsersManager.deactivateApplication(user, applicationId);}}. So we just need to invoke {{appFinished}} and inform the parent. Hence those two lines are added.
# {{app.move}} needs to be inside the {{!app.isStopped()}} check, because if the app is stopped, we ensure that the completedContainer call is invoked for all running and reserved containers.

Apart from this, a failed {{app != null}} check need not throw an exception. The app is done anyway, so do we need to fail the client's request?
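To make the control flow of the proposal easier to follow, here is a minimal, self-contained toy model of it. This is only a sketch: {{MoveSketch}}, {{Queue}}, {{Attempt}}, and {{App}} are hypothetical stand-ins invented for illustration, not the real CapacityScheduler classes, and the "events" are just strings recording which step ran. It shows the three cases discussed above: no current attempt (no exception, the app is simply re-pointed at the destination queue), a stopped attempt (containers move, but the attempt is not resubmitted), and a running attempt (the full move).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of the proposed move flow. All types here are hypothetical. */
public class MoveSketch {

  /** Stand-in for a leaf queue; records containers and the steps applied to it. */
  static class Queue {
    final String name;
    final List<String> containers = new ArrayList<>();
    final List<String> events = new ArrayList<>();
    Queue(String name) { this.name = name; }
  }

  /** Stand-in for an application attempt. */
  static class Attempt {
    final List<String> liveContainers;
    final boolean stopped;
    Queue queue;
    Attempt(List<String> liveContainers, boolean stopped, Queue queue) {
      this.liveContainers = liveContainers;
      this.stopped = stopped;
      this.queue = queue;
    }
  }

  /** Stand-in for the scheduler's view of an application. */
  static class App {
    Queue queue;
    Attempt currentAttempt; // null while the attempt-add event is still delayed
    App(Queue queue, Attempt currentAttempt) {
      this.queue = queue;
      this.currentAttempt = currentAttempt;
    }
  }

  /** Mirrors the shape of the proposed code: a null attempt is skipped, not an error. */
  static String move(App app, Queue dest) {
    Queue source = app.queue;
    Attempt attempt = app.currentAttempt;
    if (attempt != null) {
      // Live containers are transferred even when the attempt is stopped.
      for (String container : attempt.liveContainers) {
        source.containers.remove(container);
        dest.containers.add(container);
      }
      if (!attempt.stopped) {
        source.events.add("finishApplicationAttempt");
        dest.events.add("submitApplicationAttempt");
        attempt.queue = dest; // models app.move(dest)
      }
      source.events.add("appFinished");
    }
    app.queue = dest; // always re-point the application at the destination
    return dest.name;
  }

  public static void main(String[] args) {
    Queue a = new Queue("a");
    Queue b = new Queue("b");

    // Case 1: attempt not yet added to the scheduler.
    App pending = new App(a, null);
    System.out.println("moved to " + move(pending, b));

    // Case 2: running attempt with two live containers.
    a.containers.addAll(Arrays.asList("c1", "c2"));
    Attempt running = new Attempt(new ArrayList<>(Arrays.asList("c1", "c2")), false, a);
    App live = new App(a, running);
    System.out.println("moved to " + move(live, b));
    System.out.println("dest containers: " + b.containers);
    System.out.println("source events: " + a.events);
  }
}
```

The point of the sketch is only the branching: the null check guards the container and attempt handling, while re-pointing the application at the destination queue happens unconditionally.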
> Move application can fail when attempt add event is delayed
> ------------------------------------------------------------
>
>                 Key: YARN-6207
>                 URL: https://issues.apache.org/jira/browse/YARN-6207
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler
>            Reporter: Bibin A Chundatt
>            Assignee: Bibin A Chundatt
>         Attachments: YARN-6207.001.patch, YARN-6207.002.patch, YARN-6207.003.patch, YARN-6207.004.patch
>
>
> *Steps to reproduce*
> 1. Submit application and delay attempt add to Scheduler
> (Simulate using debug at EventDispatcher for SchedulerEventDispatcher)
> 2. Call move application to destination queue.
> {noformat}
> Caused by: org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): java.lang.NullPointerException
> 	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.preValidateMoveApplication(CapacityScheduler.java:2086)
> 	at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.moveApplicationAcrossQueue(RMAppManager.java:669)
> 	at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.moveApplicationAcrossQueues(ClientRMService.java:1231)
> 	at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.moveApplicationAcrossQueues(ApplicationClientProtocolPBServiceImpl.java:388)
> 	at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:537)
> 	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:522)
> 	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
> 	at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:867)
> 	at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:813)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:422)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1892)
> 	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2659)
> 	at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1483)
> 	at org.apache.hadoop.ipc.Client.call(Client.java:1429)
> 	at org.apache.hadoop.ipc.Client.call(Client.java:1339)
> 	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:227)
> 	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:115)
> 	at com.sun.proxy.$Proxy7.moveApplicationAcrossQueues(Unknown Source)
> 	at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.moveApplicationAcrossQueues(ApplicationClientProtocolPBClientImpl.java:398)
> 	... 16 more
> {noformat}