[ 
https://issues.apache.org/jira/browse/YARN-370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13571832#comment-13571832
 ] 

Thomas Graves commented on YARN-370:
------------------------------------

Ok, I figured out what the difference is. The submission context's AM container 
spec is being updated by the call in RMAppAttemptImpl to 
appAttempt.scheduler.allocate.  In 0.23, normalizeRequests modifies the 
resource directly:
    ask.getCapability().setMemory(


Whereas in 2.0 it sets the capability, so it doesn't modify the original 
resource that was passed in from RMAppAttemptImpl:
    ask.setCapability(normalized);

Hacking in a quick fix that changes 2.X to call 
ask.getCapability().setMemory(normalized.getMemory()) instead of 
setCapability in normalizeRequest fixes the issue. 
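
To illustrate the difference described above, here is a minimal, self-contained 
sketch. The Resource and Ask classes and the roundUp helper are hypothetical 
stand-ins written for this example, not the real YARN classes, and the rounding 
rule (round up to the next multiple of the minimum allocation) is an assumption 
based on the numbers in the error message (1536 requested, 2048 expected with a 
1024 minimum).

```java
// Hypothetical stand-ins for illustration only; NOT the real YARN classes.
final class AliasingSketch {

    /** Minimal mutable resource holding memory in MB. */
    static final class Resource {
        private int memory;
        Resource(int memory) { this.memory = memory; }
        int getMemory() { return memory; }
        void setMemory(int memory) { this.memory = memory; }
    }

    /** Minimal request that holds a reference to a Resource. */
    static final class Ask {
        private Resource capability;
        Ask(Resource capability) { this.capability = capability; }
        Resource getCapability() { return capability; }
        void setCapability(Resource capability) { this.capability = capability; }
    }

    /** Assumed rule: round up to the next multiple of minMemory, at least minMemory. */
    static int roundUp(int memory, int minMemory) {
        int atLeastMin = Math.max(memory, minMemory);
        return ((atLeastMin + minMemory - 1) / minMemory) * minMemory;
    }

    /** 0.23-style: mutate the shared Resource, so every holder of it sees the change. */
    static void normalizeInPlace(Ask ask, int minMemory) {
        Resource shared = ask.getCapability();
        shared.setMemory(roundUp(shared.getMemory(), minMemory));
    }

    /** 2.0-style: point the ask at a new Resource; the original object is untouched. */
    static void normalizeByReplacing(Ask ask, int minMemory) {
        Resource normalized = new Resource(roundUp(ask.getCapability().getMemory(), minMemory));
        ask.setCapability(normalized);
    }

    public static void main(String[] args) {
        // The AM container spec and the ask share one Resource object.
        Resource amSpecA = new Resource(1536);
        normalizeInPlace(new Ask(amSpecA), 1024);
        System.out.println("in place:  AM spec memory = " + amSpecA.getMemory());

        Resource amSpecB = new Resource(1536);
        normalizeByReplacing(new Ask(amSpecB), 1024);
        System.out.println("replacing: AM spec memory = " + amSpecB.getMemory());
    }
}
```

With the in-place variant the AM spec ends up at 2048, while with the replacing 
variant it stays at 1536, which matches the "Expected ... 2048 ... but found ... 
1536" mismatch in the report.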


                
> CapacityScheduler app submission fails when min alloc size not multiple of AM 
> size
> ----------------------------------------------------------------------------------
>
>                 Key: YARN-370
>                 URL: https://issues.apache.org/jira/browse/YARN-370
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler
>    Affects Versions: 2.0.3-alpha
>            Reporter: Thomas Graves
>            Assignee: Zhijie Shen
>            Priority: Blocker
>         Attachments: YARN-370-branch-2.patch
>
>
> I was running 2.0.3-SNAPSHOT with the capacity scheduler configured with a 
> minimum allocation size of 1G. The AM size was set to 1.5G. I didn't specify a 
> resource calculator, so it was using DefaultResourceCalculator.  The AM launch 
> failed with the error below:
> Application application_1359688216672_0001 failed 1 times due to Error 
> launching appattempt_1359688216672_0001_000001. Got exception: RemoteTrace: 
> at LocalTrace: 
> org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl: 
> RemoteTrace: at LocalTrace: 
> org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl: 
> Unauthorized request to start container. Expected resource <memory:2048, 
> vCores:1> but found <memory:1536, vCores:1>
>     at org.apache.hadoop.yarn.factories.impl.pb.YarnRemoteExceptionFactoryPBImpl.createYarnRemoteException(YarnRemoteExceptionFactoryPBImpl.java:39)
>     at org.apache.hadoop.yarn.ipc.RPCUtil.getRemoteException(RPCUtil.java:47)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.authorizeRequest(ContainerManagerImpl.java:383)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.startContainer(ContainerManagerImpl.java:400)
>     at org.apache.hadoop.yarn.api.impl.pb.service.ContainerManagerPBServiceImpl.startContainer(ContainerManagerPBServiceImpl.java:68)
>     at org.apache.hadoop.yarn.proto.ContainerManager$ContainerManagerService$2.callBlockingMethod(ContainerManager.java:83)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:454)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1014)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1735)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1731)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1441)
>     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1729)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>     at java.lang.reflect.Constructor.newInstance(Constructor.java:525)
>     at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:90)
>     at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:57)
>     at org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl.unwrapAndThrowException(YarnRemoteExceptionPBImpl.java:123)
>     at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagerPBClientImpl.startContainer(ContainerManagerPBClientImpl.java:109)
>     at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.launch(AMLauncher.java:111)
>     at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.run(AMLauncher.java:255)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>     at java.lang.Thread.run(Thread.java:722)
> Failing the application. 
> It looks like the launch context for the app didn't have the resources rounded 
> up.
