[ 
https://issues.apache.org/jira/browse/YARN-10663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17293751#comment-17293751
 ] 

VADAGA ANANYO RAO commented on YARN-10663:
------------------------------------------

Recap of how the actual implementation handles running and finished apps on 
each node:
 # Each RMNode has a list of `runningApplications`: the active apps that have 
run at least one container on that node.
 # Each RMAppImpl maintains a collection `ranNodes` of all the nodes on which 
the app has run containers.
 # When the app reaches its `FinalTransition`, it iterates over all the 
ranNodes and triggers an `RMNodeCleanupAppEvent` for each node.
 # RMNodeImpl handles the RMNodeCleanupAppEvent by moving the app from the 
`runningApplications` list to the `finishedApplications` list.
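
The flow above can be sketched roughly as follows. This is a simplified model 
for illustration only, not the Hadoop code: the classes `Node` and `App` and 
the methods `containerStarted`, `cleanupApp`, `runContainerOn` and 
`finalTransition` are hypothetical stand-ins for RMNodeImpl, RMAppImpl and the 
event dispatch between them.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Simplified model of the recap above. All names are hypothetical
// stand-ins, NOT the real Hadoop classes or signatures.
class RmFlowSketch {

  /** Stand-in for RMNodeImpl: tracks running and finished app ids. */
  static class Node {
    final String id;
    final Set<String> runningApplications = new LinkedHashSet<>();
    final List<String> finishedApplications = new ArrayList<>();

    Node(String id) {
      this.id = id;
    }

    // An app becomes "running" on the node once it runs a container there.
    void containerStarted(String appId) {
      runningApplications.add(appId);
    }

    // Stand-in for handling RMNodeCleanupAppEvent: move the app from the
    // running list to the finished list (no-op if it was not running here).
    void cleanupApp(String appId) {
      if (runningApplications.remove(appId)) {
        finishedApplications.add(appId);
      }
    }
  }

  /** Stand-in for RMAppImpl: remembers every node it ran containers on. */
  static class App {
    final String id;
    final Set<Node> ranNodes = new LinkedHashSet<>();

    App(String id) {
      this.id = id;
    }

    void runContainerOn(Node node) {
      ranNodes.add(node);
      node.containerStarted(id);
    }

    // Stand-in for FinalTransition: notify every node the app ran on.
    void finalTransition() {
      for (Node node : ranNodes) {
        node.cleanupApp(id);
      }
    }
  }

  public static void main(String[] args) {
    Node n1 = new Node("node1");
    Node n2 = new Node("node2");
    App app = new App("app_1");

    app.runContainerOn(n1);
    app.runContainerOn(n1); // a second container on the same node
    System.out.println("running on node1: " + n1.runningApplications);

    app.finalTransition();
    System.out.println("finished on node1: " + n1.finishedApplications);
    System.out.println("finished on node2: " + n2.finishedApplications);
  }
}
```

In this model a node that never ran a container of the app receives no cleanup 
call, so its lists stay untouched.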

Based on this flow, I plan to:
 # Add a `ranNodes` list in AMSimulator.
 # Each time AMSimulator starts a container on a node (NMSimulator), we will:
 ## update the runningApps in the NMSimulator and,
 ## update the ranNodes in the AMSimulator
 # When the app is finishing, for each node in the ranNodes list of the 
AMSimulator, we will remove the app from the runningApps list of that node.
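
A minimal sketch of this plan, under stated assumptions: `NmSim` and `AmSim` 
are hypothetical stand-ins for NMSimulator and AMSimulator, and the methods 
`startContainer` and `finish` are invented here for illustration. The real 
simulator classes have different fields and entry points, and the attached 
patch may do this differently.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Illustrative only: NmSim/AmSim are hypothetical stand-ins, not the
// actual SLS NMSimulator/AMSimulator classes.
class SlsRunningAppsSketch {

  /** Stand-in for NMSimulator: remembers which apps are running on it. */
  static class NmSim {
    final String nodeId;
    final Set<String> runningApps = new LinkedHashSet<>();

    NmSim(String nodeId) {
      this.nodeId = nodeId;
    }
  }

  /** Stand-in for AMSimulator: remembers which nodes it used. */
  static class AmSim {
    final String appId;
    final Set<NmSim> ranNodes = new LinkedHashSet<>();

    AmSim(String appId) {
      this.appId = appId;
    }

    // Step 2 of the plan: on container start, update both sides.
    void startContainer(NmSim node) {
      node.runningApps.add(appId);
      ranNodes.add(node);
    }

    // Step 3 of the plan: on finish, remove the app from every node it used.
    void finish() {
      for (NmSim node : ranNodes) {
        node.runningApps.remove(appId);
      }
      ranNodes.clear();
    }
  }

  public static void main(String[] args) {
    NmSim nm1 = new NmSim("nm1");
    NmSim nm2 = new NmSim("nm2");
    AmSim am1 = new AmSim("app_1");
    AmSim am2 = new AmSim("app_2");

    am1.startContainer(nm1);
    am1.startContainer(nm2);
    am2.startContainer(nm1);
    System.out.println("nm1 before finish: " + nm1.runningApps);

    am1.finish();
    System.out.println("nm1 after app_1 finishes: " + nm1.runningApps);
    System.out.println("nm2 after app_1 finishes: " + nm2.runningApps);
  }
}
```

The sketch uses plain collections; if the simulators run on different threads, 
thread-safe collections would be needed instead.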

> Add runningApps stats in SLS
> ----------------------------
>
>                 Key: YARN-10663
>                 URL: https://issues.apache.org/jira/browse/YARN-10663
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: yarn
>            Reporter: VADAGA ANANYO RAO
>            Assignee: VADAGA ANANYO RAO
>            Priority: Major
>         Attachments: YARN-10663.0001.patch
>
>
> RMNodes in SLS don't keep track of the runningApps on each node. Because of 
> this, the graceful decommissioning logic is inaccurate: a node is 
> decommissioned as soon as it has no running containers, even if shuffle data 
> is still present on it.
> In this Jira, we will add runningApps functionality to SLS to improve the 
> decommissioning logic of each node. This will help with autoscaling 
> simulations on SLS.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
