[ https://issues.apache.org/jira/browse/YARN-2052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996675#comment-13996675 ]
Bikas Saha commented on YARN-2052:
----------------------------------

The RM identifier is effectively the epoch for the RM. We already use it in the NM to differentiate between allocations made by old RM vs the new RM. Using the appId in the container id prevents us from using this epoch number since the appId cannot change across restarts for containers belonging to the same app. That will be backwards incompatible.

Another alternative would be to replace the monotonically increasing sequence number with a unique identifier like a UUID. But that is also incompatible.

Another alternative is to create another epoch number for the RM in addition to the cluster timestamp. The monotonically increasing sequence could be a combination (concatenation) of the new epoch number and the sequence number. e.g. container_XXX_1000 after epoch 1. When the epoch number is 0 then we can drop the epoch number and things look the same as today. e.g. container_XXX_000.

> ContainerId creation after work preserving restart is broken
> ------------------------------------------------------------
>
>                 Key: YARN-2052
>                 URL: https://issues.apache.org/jira/browse/YARN-2052
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Tsuyoshi OZAWA
>
> Container ids are made unique by using the app identifier and appending a
> monotonically increasing sequence number to it. Since container creation is a
> high churn activity the RM does not store the sequence number per app. So
> after restart it does not know what the new sequence number should be for new
> allocations.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
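
The epoch-plus-sequence alternative from the comment above can be sketched roughly as follows. This is only an illustration of the concatenation idea, not YARN's implementation: the function names `container_suffix` and `container_id` are hypothetical, `XXX` stands for the elided application part exactly as in the comment, and the fixed sequence width of 3 is assumed purely from the `container_XXX_000` example.

```python
def container_suffix(epoch: int, seq: int, width: int = 3) -> str:
    """Concatenate an RM epoch and a per-app sequence number (hypothetical helper).

    The width of 3 is an assumption taken from the example in the comment.
    """
    if epoch < 0 or seq < 0:
        raise ValueError("epoch and seq must be non-negative")
    if seq >= 10 ** width:
        # A wider sequence would be indistinguishable from an epoch prefix.
        raise ValueError("sequence number does not fit in the assumed width")
    padded = str(seq).zfill(width)
    # Epoch 0 drops the prefix, so ids look the same as they do today.
    return padded if epoch == 0 else f"{epoch}{padded}"


def container_id(app_part: str, epoch: int, seq: int, width: int = 3) -> str:
    """Build an id of the form container_<app_part>_<suffix> (hypothetical helper)."""
    return f"container_{app_part}_{container_suffix(epoch, seq, width)}"


if __name__ == "__main__":
    print(container_id("XXX", 0, 0))  # before any restart
    print(container_id("XXX", 1, 0))  # first allocation after epoch 1
```

Because the RM would bump the epoch on every restart and begin the sequence again from zero, ids issued after a restart cannot collide with ids issued before it, even though the per-app sequence number is never persisted. The sketch rejects sequence numbers wider than the assumed width, since otherwise an overflowing sequence at epoch 0 would look like a later epoch.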