Tsuyoshi OZAWA commented on YARN-2052:

We should make it a long in the same release as the epoch number addition so 
that we dont have to worry about that.

+1 to do this in the same release. We'll plan to do the improvement on another 
JIRA. It's OK, but I think it's important for us that we decide the behavior 
when the overflow happens. We have 2 options: just aborting RM for now or 
starting apps from a clean state after the restart. We're planning to make id 
long just after this JIRA, so we can take aborting approach to prevent 
unexpected behavior for the simplicity. [~bikassaha], [~jianhe], what do you 
think about this?

> ContainerId creation after work preserving restart is broken
> ------------------------------------------------------------
>                 Key: YARN-2052
>                 URL: https://issues.apache.org/jira/browse/YARN-2052
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Tsuyoshi OZAWA
>            Assignee: Tsuyoshi OZAWA
>         Attachments: YARN-2052.1.patch, YARN-2052.2.patch, YARN-2052.3.patch
> Container ids are made unique by using the app identifier and appending a 
> monotonically increasing sequence number to it. Since container creation is a 
> high churn activity the RM does not store the sequence number per app. So 
> after restart it does not know what the new sequence number should be for new 
> allocations.

This message was sent by Atlassian JIRA

Reply via email to