Tsuyoshi OZAWA commented on YARN-2229:

Thanks for the comments, Jian, Zhijie and Sid.

For example, ContainerTokenIdentifier serializes a long (getContainerId()) at 
the RM side, but deserializes an int (getId()) at the NM side. In this case, 
I'm afraid it's going to be wrong.
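
A minimal sketch of that mismatch, assuming the identifier is written and read with java.io.DataOutput / DataInput style calls. The class and method names are hypothetical and are not YARN APIs; the point is only that a value written as 8 bytes and read back as 4 bytes comes out as the upper half of the long.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical illustration, not part of YARN.
final class LongIntMismatch {

    // Writes the value as a long (as the RM side would in this scenario),
    // then reads it back as an int (as an old NM side would).
    static int writeLongReadInt(long value) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(buffer)) {
                out.writeLong(value); // 8 bytes, big-endian
            }
            try (DataInputStream in = new DataInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()))) {
                return in.readInt(); // only the first 4 bytes: the upper half
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // A small id has all-zero upper bytes, so the reader sees 0.
        System.out.println(writeLongReadInt(42L));
        // With bits set above bit 31, the reader sees those bits instead.
        System.out.println(writeLongReadInt((7L << 32) | 42L));
    }
}
```

In this sketch the reader also leaves 4 unread bytes in the stream, so any field that follows would be misparsed as well.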

If we treat backward compatibility as the first priority, we can choose the 
first design I proposed, as Sid mentioned. This design choice looks reasonable 
to me. [~jianhe], what do you think? We discussed that we should avoid 
introducing a new field to the ContainerId class. In my opinion, that reason is 
weaker than backward compatibility.

ConverterUtils is a separate consideration. It is marked as @private - but is 
used in MapReduce for example (and also in Tez). Looks like the toString method 
isn't being changed either, which means the ConverterUtils method would continue 
to work.

I'm thinking of suffixing the epoch to the end of the container id. I'll work 
with an old jar which includes the old {{ConverterUtils#toContainerId}}. 
YARN-2182 is the JIRA to address the change of {{ConverterUtils#toContainerId}}.

> ContainerId can overflow with RM restart
> ----------------------------------------
>                 Key: YARN-2229
>                 URL: https://issues.apache.org/jira/browse/YARN-2229
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Tsuyoshi OZAWA
>            Assignee: Tsuyoshi OZAWA
>         Attachments: YARN-2229.1.patch, YARN-2229.10.patch, 
> YARN-2229.10.patch, YARN-2229.2.patch, YARN-2229.2.patch, YARN-2229.3.patch, 
> YARN-2229.4.patch, YARN-2229.5.patch, YARN-2229.6.patch, YARN-2229.7.patch, 
> YARN-2229.8.patch, YARN-2229.9.patch
> On YARN-2052, we changed containerId format: upper 10 bits are for epoch, 
> lower 22 bits are for sequence number of Ids. This is for preserving 
> semantics of {{ContainerId#getId()}}, {{ContainerId#toString()}}, 
> {{ContainerId#compareTo()}}, {{ContainerId#equals}}, and 
> {{ConverterUtils#toContainerId}}. One concern is epoch can overflow after RM 
> restarts 1024 times.
> To avoid the problem, it's better to make containerId long. We need to define 
> the new format of container Id while preserving backward compatibility on this
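
The 32-bit layout described in the issue (upper 10 bits for the epoch, lower 22 bits for the sequence number) can be sketched as below. This is an illustration only: the class, constants, and method names are hypothetical and are not taken from the YARN code base. It shows why the epoch wraps around after 1024 values.

```java
// Hypothetical illustration of the layout from YARN-2052, not YARN code.
final class ContainerIdLayout {
    static final int SEQUENCE_BITS = 22; // lower 22 bits: sequence number
    static final int EPOCH_BITS = 10;    // upper 10 bits: epoch
    static final int SEQUENCE_MASK = (1 << SEQUENCE_BITS) - 1;
    static final int EPOCH_MASK = (1 << EPOCH_BITS) - 1;

    // Packs epoch and sequence into one int; bits that do not fit are dropped.
    static int pack(int epoch, int sequence) {
        return ((epoch & EPOCH_MASK) << SEQUENCE_BITS) | (sequence & SEQUENCE_MASK);
    }

    // Unsigned shift, so an epoch that sets the sign bit still reads back correctly.
    static int epochOf(int id) {
        return id >>> SEQUENCE_BITS;
    }

    static int sequenceOf(int id) {
        return id & SEQUENCE_MASK;
    }

    public static void main(String[] args) {
        int id = pack(3, 42);
        System.out.println(epochOf(id) + " " + sequenceOf(id));
        // Epoch 1024 needs an 11th bit, so it collides with epoch 0.
        System.out.println(pack(1024, 42) == pack(0, 42));
    }
}
```

With only 10 bits, epoch values 0 through 1023 are distinguishable; widening the id to a long, as the issue proposes, leaves room for a larger epoch field.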

This message was sent by Atlassian JIRA