[jira] [Commented] (YARN-2052) ContainerId creation after work preserving restart is broken

Jian He (JIRA) Wed, 25 Jun 2014 14:28:19 -0700

    [ https://issues.apache.org/jira/browse/YARN-2052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14044059#comment-14044059 ]


Jian He commented on YARN-2052:
-------------------------------

- We can also update the structure graph in ZKRMStateStore to reflect the new 
epoch node.
- The following can be replaced with RMEpoch.newInstance(), and getProto can 
be promoted to the parent class, as ApplicationAttemptStateData does.
{code}
    RMEpochPBImpl pb = new RMEpochPBImpl();
    pb.setEpoch(epoch);
{code}
- This was only there as a temporary fix and can be removed given the change 
made in this patch. After this patch, new containers allocated by the new RM 
will no longer collide with previous containers.
{code}
    // ContainerId is refreshed with epoch after RM restart.
    this.containerIdCounter.incrementAndGet();
{code}
- What will ContainerId.toString() print after this patch? Would it be more 
intuitive to parse out the epoch number and print epoch + id? Comments 
describing this new format could be added on the “getId” method.
- Can you add comments on the “public abstract int getId();” method 
explaining that the first 10 bits are reserved for the number of RM restarts?
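
To make the last two points concrete, here is a rough sketch of how an epoch 
could share a 32-bit id with the per-app sequence number. It is not the code 
from the patch: the class name, method names, and the "e<epoch>_<sequence>" 
print format are all hypothetical, and it assumes "first 10 bits" means the 
10 high-order bits, leaving the low 22 bits for the sequence number.

```java
// Hypothetical sketch, not the YARN-2052 patch: packs an RM epoch and a
// container sequence number into one int.
final class EpochIdSketch {
    // Assumption: the 10 high-order bits hold the number of RM restarts.
    static final int EPOCH_BITS = 10;
    static final int SEQUENCE_BITS = Integer.SIZE - EPOCH_BITS; // 22
    static final int EPOCH_MAX = (1 << EPOCH_BITS) - 1;         // 1023
    static final int SEQUENCE_MASK = (1 << SEQUENCE_BITS) - 1;  // 0x3FFFFF

    private EpochIdSketch() {
    }

    /** Combines epoch (high bits) and sequence number (low bits). */
    static int pack(int epoch, int sequence) {
        if (epoch < 0 || epoch > EPOCH_MAX) {
            throw new IllegalArgumentException("epoch out of range: " + epoch);
        }
        if (sequence < 0 || sequence > SEQUENCE_MASK) {
            throw new IllegalArgumentException(
                "sequence out of range: " + sequence);
        }
        return (epoch << SEQUENCE_BITS) | sequence;
    }

    /** Extracts the epoch; unsigned shift keeps large epochs positive. */
    static int epochOf(int id) {
        return id >>> SEQUENCE_BITS;
    }

    /** Extracts the sequence number from the low bits. */
    static int sequenceOf(int id) {
        return id & SEQUENCE_MASK;
    }

    /** Hypothetical readable form that prints epoch and id separately. */
    static String describe(int id) {
        return "e" + epochOf(id) + "_" + sequenceOf(id);
    }
}
```

With this layout, ids issued after a restart differ in the high bits from 
those issued before it, so the counter can restart from zero without 
colliding. Printing the two parts separately, as describe() does, avoids 
showing users a large and possibly negative raw integer.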

> ContainerId creation after work preserving restart is broken
> ------------------------------------------------------------
>
>                 Key: YARN-2052
>                 URL: https://issues.apache.org/jira/browse/YARN-2052
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Tsuyoshi OZAWA
>            Assignee: Tsuyoshi OZAWA
>         Attachments: YARN-2052.1.patch, YARN-2052.2.patch, YARN-2052.3.patch, 
> YARN-2052.4.patch, YARN-2052.5.patch, YARN-2052.6.patch
>
>
> Container ids are made unique by using the app identifier and appending a 
> monotonically increasing sequence number to it. Since container creation is a 
> high churn activity the RM does not store the sequence number per app. So 
> after restart it does not know what the new sequence number should be for new 
> allocations.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
