[jira] [Commented] (YARN-2052) ContainerId creation after work preserving restart is broken

Bikas Saha (JIRA) Tue, 13 May 2014 10:43:26 -0700

    [ 
https://issues.apache.org/jira/browse/YARN-2052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996675#comment-13996675
 ]


Bikas Saha commented on YARN-2052:
----------------------------------

The RM identifier is effectively the epoch for the RM. We already use it in the 
NM to differentiate between allocations made by the old RM and those made by 
the new RM. Because the container id is built from the appId, we cannot use 
this epoch number in it directly: the appId cannot change across restarts for 
containers belonging to the same app, and changing it would be backwards 
incompatible.
One alternative would be to replace the monotonically increasing sequence 
number with a unique identifier such as a UUID, but that is also incompatible.
Another alternative is to create a separate epoch number for the RM in addition 
to the cluster timestamp. The monotonically increasing sequence could then be a 
concatenation of the new epoch number and the sequence number, e.g. 
container_XXX_1000 after epoch 1. When the epoch number is 0 we can drop it and 
things look the same as today, e.g. container_XXX_000.
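
The epoch-plus-sequence idea above can be sketched roughly as follows. This is 
only an illustration, not YARN code: the function name, the class, and the 
three-digit sequence width (taken from the container_XXX_000 example) are all 
assumptions.

```python
def format_container_id(prefix: str, epoch: int, sequence: int, width: int = 3) -> str:
    """Hypothetical helper: concatenate an RM epoch and a sequence number.

    `width` is an assumed zero-padding width, chosen to match the
    container_XXX_000 example; it is not taken from YARN itself.
    """
    if epoch < 0 or sequence < 0:
        raise ValueError("epoch and sequence must be non-negative")
    seq_part = f"{sequence:0{width}d}"
    if epoch == 0:
        # Epoch 0: drop the epoch so ids look the same as they do today.
        return f"{prefix}_{seq_part}"
    # Epoch > 0: prepend the epoch to the zero-padded sequence number.
    return f"{prefix}_{epoch}{seq_part}"


class ContainerIdAllocator:
    """Hypothetical allocator whose sequence restarts at 0 in every epoch."""

    def __init__(self, prefix: str, epoch: int) -> None:
        self.prefix = prefix
        self.epoch = epoch
        self.next_sequence = 0  # not persisted, so it is lost on restart

    def next_id(self) -> str:
        container_id = format_container_id(self.prefix, self.epoch, self.next_sequence)
        self.next_sequence += 1
        return container_id


# Before restart (epoch 0) and after restart (epoch 1): the sequence number
# restarts from 0 in both cases, but the epoch keeps the ids distinct.
before = ContainerIdAllocator("container_XXX", epoch=0)
after = ContainerIdAllocator("container_XXX", epoch=1)
first_before = before.next_id()
first_after = after.next_id()
```

In this sketch the RM would only need to persist and bump the epoch on each 
restart, rather than storing a sequence number per app. A plain string 
concatenation like this can become ambiguous once the sequence outgrows the 
padding width, so a real scheme would need to bound or separate the two fields.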

> ContainerId creation after work preserving restart is broken
> ------------------------------------------------------------
>
>                 Key: YARN-2052
>                 URL: https://issues.apache.org/jira/browse/YARN-2052
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Tsuyoshi OZAWA
>
> Container ids are made unique by using the app identifier and appending a 
> monotonically increasing sequence number to it. Since container creation is a 
> high-churn activity, the RM does not store the sequence number per app. So 
> after a restart it does not know what the new sequence number should be for 
> new allocations.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
