[
https://issues.apache.org/jira/browse/YARN-5601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15453991#comment-15453991
]
Subru Krishnan edited comment on YARN-5601 at 9/1/16 1:43 AM:
--------------------------------------------------------------
[~jianhe], to answer your question, let me start with why we need the epoch in
a federated cluster:
Currently only a single RM generates containerIDs (applicationID + a sequence
number), but in a federated cluster multiple RMs generate them concurrently.
So there will be conflicts if an application spans multiple sub-clusters. To
avoid this, we use the epoch in a federated cluster, similar to how it is used
to prevent conflicts across work-preserving restarts.
The idea is to set the epoch to 0 for the first sub-cluster RM, 10000 for the
second sub-cluster RM, 20000 for the third sub-cluster RM, etc. This should be
sufficient, as epochs are represented as a 20-bit integer, which gives us
about 1M of them.
With this, there will be a conflict of containerIDs only if *all* of the below
conditions are satisfied:
# The RM of sub-cluster 1 is rebooted over 10000 times
# There is an app that is still running (throughout the 10k+ reboots of one
of the RMs)
# The app runs across sub-cluster 1 and sub-cluster 2
# The app is still holding onto containers from sub-cluster 2 issued from
the first reboot of that sub-cluster
# Those containers have IDs low enough that newly issued containers from
RM1 clash with them
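To make the arithmetic concrete, here is a rough sketch of the scheme in Python. It is only an illustration, not the patch: the function names are made up, a container ID is modelled as a plain (application, epoch, sequence) tuple rather than the real packed ID, the 20-bit width is taken from the comment above, and it assumes each RM restart bumps the epoch by exactly one.

```python
# Illustrative model only; names and layout are assumptions, not YARN code.
EPOCH_STRIDE = 10_000               # gap between sub-cluster base epochs (from the comment)
EPOCH_BITS = 20                     # epoch width as stated in the comment
MAX_EPOCH = (1 << EPOCH_BITS) - 1   # largest epoch that fits in that width


def base_epoch(sub_cluster_index: int) -> int:
    """Base epoch for the n-th sub-cluster RM (0-based): 0, 10000, 20000, ..."""
    base = sub_cluster_index * EPOCH_STRIDE
    if base > MAX_EPOCH:
        raise ValueError("base epoch does not fit in the epoch field")
    return base


def current_epoch(sub_cluster_index: int, restarts: int) -> int:
    """Epoch of an RM after `restarts` restarts (assumed: +1 per restart)."""
    return base_epoch(sub_cluster_index) + restarts


def container_id(app_id: str, sub_cluster_index: int, restarts: int, sequence: int):
    """Hypothetical container ID: (application, epoch, sequence number)."""
    return (app_id, current_epoch(sub_cluster_index, restarts), sequence)


# Same app, same sequence number, different sub-clusters: no clash,
# because the epochs differ (0 vs 10000).
fresh_rm1 = container_id("app_1", 0, 0, 7)
fresh_rm2 = container_id("app_1", 1, 0, 7)
ids_differ = fresh_rm1 != fresh_rm2

# The clash needs RM1 to have restarted 10001 times while the app still
# holds a low-numbered container issued by RM2 after its first restart.
old_rm2 = container_id("app_1", 1, 1, 7)
late_rm1 = container_id("app_1", 0, 10_001, 7)
ids_clash = old_rm2 == late_rm1
```

Under these assumptions `ids_differ` and `ids_clash` both come out true, which is conditions 1 through 5 in miniature.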
Makes sense?
> Make the RM epoch base value configurable
> -----------------------------------------
>
> Key: YARN-5601
> URL: https://issues.apache.org/jira/browse/YARN-5601
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: nodemanager, resourcemanager
> Reporter: Subru Krishnan
> Assignee: Subru Krishnan
> Attachments: YARN-5601-YARN-2915-v1.patch
>
>
> Currently the epoch always starts from zero. This can cause container ids to
> conflict for an application under Federation that spans multiple RMs
> concurrently. This JIRA proposes to make the RM epoch base value configurable
> which will allow us to avoid conflicts by setting different values for each
> RM.