[ https://issues.apache.org/jira/browse/YARN-3017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14576519#comment-14576519 ]
zhihai xu commented on YARN-3017:
---------------------------------

Hi [~rohithsharma],
It looks like the rolling-upgrade situation described above may not be an issue. I looked at the code:
The NM reports running containers to the RM using either {{NMContainerStatus}} in {{registerNodeManager}} or {{ContainerStatus}} in {{nodeHeartbeat}}.
{{NMContainerStatus}} is built from {{ContainerImpl#getNMContainerStatus}} and {{ContainerStatus}} is built from {{ContainerImpl#cloneAndGetContainerStatus}}.
I didn't find that either of these is affected by {{ContainerId#toString}}, except through {{ContainerImpl#diagnostics}}. But {{ContainerImpl#diagnostics}} is only used for debugging purposes and won't cause any problem in the RM.
The container id is originally generated at createContainer in the RM:
{code}
ContainerId containerId = BuilderUtils.newContainerId(getApplicationAttemptId(), getNewContainerId());
{code}
It is passed to the NM in ContainerTokenIdentifier, which is decoded by the NM using
{{BuilderUtils.newContainerTokenIdentifier(request.getContainerToken());}}
It looks like this patch won't affect ContainerTokenIdentifier.
This is my understanding; please correct me if I am wrong.

> ContainerID in ResourceManager Log Has Slightly Different Format From AppAttemptID
> ----------------------------------------------------------------------------------
>
>                 Key: YARN-3017
>                 URL: https://issues.apache.org/jira/browse/YARN-3017
>             Project: Hadoop YARN
>          Issue Type: Improvement
>    Affects Versions: 2.8.0
>            Reporter: MUFEED USMAN
>            Priority: Minor
>              Labels: PatchAvailable
>         Attachments: YARN-3017.patch, YARN-3017_1.patch, YARN-3017_2.patch
>
>
> Not sure if this should be filed as a bug or not.
> In the ResourceManager log in the events surrounding the creation of a new
> application attempt,
> ...
> ...
> 2014-11-14 17:45:37,258 INFO
> org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Launching
> masterappattempt_1412150883650_0001_000002
> ...
> ...
> The application attempt has the ID format "_1412150883650_0001_000002".
> Whereas the associated ContainerID goes by "_1412150883650_0001_02_".
> ...
> ...
> 2014-11-14 17:45:37,260 INFO
> org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Setting
> up
> container Container: [ContainerId: container_1412150883650_0001_02_000001,
> NodeId: n67:55933, NodeHttpAddress: n67:8042, Resource: <memory:2048,
> vCores:1,
> disks:0.0>, Priority: 0, Token: Token { kind: ContainerToken, service:
> 10.10.70.67:55933 }, ] for AM appattempt_1412150883650_0001_000002
> ...
> ...
> Curious to know if this is kept like that for a reason. If not, then when
> using filtering tools to, say, grep events surrounding a specific attempt by
> the numeric ID part, information may slip out during troubleshooting.
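
To make the mismatch in the description concrete, here is a minimal standalone sketch of the two ID layouts. It does not use the Hadoop classes: the class {{IdFormatSketch}} and its methods are hypothetical, and the padding widths (4 digits for the application, 6 for the attempt in the attempt ID, 2 for the attempt in the container ID, 6 for the container) are only inferred from the log lines quoted in the issue.

```java
import java.util.Locale;

// Hypothetical helper, not part of Hadoop. Padding widths are inferred
// from the log lines quoted in the issue, not taken from the YARN source.
class IdFormatSketch {

    // Layout seen in the log: appattempt_<clusterTimestamp>_<app:4>_<attempt:6>
    static String appAttemptId(long clusterTimestamp, int appId, int attemptId) {
        return String.format(Locale.ROOT, "appattempt_%d_%04d_%06d",
                clusterTimestamp, appId, attemptId);
    }

    // Layout seen in the log: container_<clusterTimestamp>_<app:4>_<attempt:2>_<container:6>
    static String containerId(long clusterTimestamp, int appId, int attemptId, long containerId) {
        return String.format(Locale.ROOT, "container_%d_%04d_%02d_%06d",
                clusterTimestamp, appId, attemptId, containerId);
    }

    // Numeric part of an ID: everything after the first underscore.
    static String numericPart(String id) {
        return id.substring(id.indexOf('_') + 1);
    }

    public static void main(String[] args) {
        String attempt = appAttemptId(1412150883650L, 1, 2);
        String container = containerId(1412150883650L, 1, 2, 1L);
        System.out.println(attempt);
        System.out.println(container);
        // A search for the attempt's numeric part does not match the container line,
        // because the attempt number is padded to a different width.
        System.out.println(container.contains(numericPart(attempt)));
    }
}
```

With the values from the quoted log, the attempt's numeric part is not a substring of the container ID, which is why a grep on that part would miss the container line.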