[ https://issues.apache.org/jira/browse/YARN-4091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402932#comment-15402932 ]
Eric Payne commented on YARN-4091:
----------------------------------
Thanks, [~leftnoteasy], for the helpful explanation. I see now that the
expensive parts of the code are done inside {{ActivitiesManager}}, and that
those are protected by the {{shouldRecordThis*}} checks. Even so, this feature
still adds several new calls per second to the node heartbeat and container
allocation paths, regardless of whether anything is being recorded. These calls
are all normally very fast, but they sit on critical paths, which is why I am
concerned about performance. It sounds like you have done due diligence on the
code surrounding these calls, so I expect (and hope) the impact will be small.
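
To make the concern concrete, here is a minimal sketch of the guard pattern as I understand it. It is not the actual {{ActivitiesManager}} code: the class name, the {{shouldRecordThisNode}} method, and the fields are hypothetical stand-ins for the {{shouldRecordThis*}} checks. The cheap check runs on every heartbeat, and the more expensive bookkeeping runs only when recording has been requested.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only; names do not come from the YARN-4091 patch.
class ActivityGuardSketch {
    // Read on every heartbeat, so the check is a single volatile read.
    private volatile boolean recordingEnabled = false;
    private final List<String> recorded = new ArrayList<>();

    void setRecordingEnabled(boolean enabled) {
        recordingEnabled = enabled;
    }

    // Stand-in for the shouldRecordThis* checks: cheap, no locking.
    boolean shouldRecordThisNode() {
        return recordingEnabled;
    }

    // Called on the hot path for each node heartbeat.
    void onNodeHeartbeat(String nodeId) {
        if (!shouldRecordThisNode()) {
            return; // fast path: only the cheap check was paid
        }
        // Slow path: string building and locking happen only when recording.
        synchronized (recorded) {
            recorded.add("heartbeat from " + nodeId);
        }
    }

    int recordedCount() {
        synchronized (recorded) {
            return recorded.size();
        }
    }
}
```

Even in this sketch, the call to {{onNodeHeartbeat}} and the flag read are paid on every heartbeat, which is the residual cost I am asking about.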
[~ChenGe], I do have one other comment. I notice that the "{{priority}}" key in
the output is kind of ambiguous. It may be difficult for some to differentiate
between app priority and container priority. For example:
{code}
{
  "nodeId":"hostname.company.com:45454",
  "queueName":"default",
  "priority":"0",
  ...
  "allocationAttempt":
  [
    {
      "priority":"0",
      "allocationState":"SKIPPED",
      "diagnostic":"priority skipped"
    },
    {
      "name":"container_e03_1470083952204_0001_01_000103",
      "priority":"20",
      "allocationState":"ALLOCATED"
    }
  ]
},
{code}
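One possible way to remove the ambiguity would be to emit distinct keys for the two kinds of priority. The sketch below is purely illustrative: the key names {{appPriority}} and {{containerPriority}}, the class, and its methods are hypothetical and are not taken from the patch. It also does no JSON escaping, which is only acceptable here because the values are plain numbers and state names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; key names are suggestions, not the patch's output format.
class PriorityKeysSketch {
    // Builds one allocation entry with separately named priorities.
    static Map<String, String> allocationEntry(int appPriority,
                                               int containerPriority,
                                               String allocationState) {
        Map<String, String> entry = new LinkedHashMap<>(); // keeps insertion order
        entry.put("appPriority", Integer.toString(appPriority));
        entry.put("containerPriority", Integer.toString(containerPriority));
        entry.put("allocationState", allocationState);
        return entry;
    }

    // Renders the entry as a flat JSON object with string values (no escaping).
    static String toJson(Map<String, String> entry) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, String> e : entry.entrySet()) {
            if (!first) {
                sb.append(",");
            }
            sb.append('"').append(e.getKey()).append("\":\"")
              .append(e.getValue()).append('"');
            first = false;
        }
        return sb.append("}").toString();
    }
}
```

With names like these, a reader would not have to infer from nesting which priority a given value refers to.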
Thanks!
> Add REST API to retrieve scheduler activity
> -------------------------------------------
>
> Key: YARN-4091
> URL: https://issues.apache.org/jira/browse/YARN-4091
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: capacity scheduler, resourcemanager
> Affects Versions: 2.7.0
> Reporter: Sunil G
> Assignee: Chen Ge
> Attachments: Improvement on debugdiagnostic information - YARN.pdf,
> SchedulerActivityManager-TestReport v2.pdf,
> SchedulerActivityManager-TestReport.pdf, YARN-4091-design-doc-v1.pdf,
> YARN-4091.1.patch, YARN-4091.2.patch, YARN-4091.3.patch, YARN-4091.4.patch,
> YARN-4091.5.patch, YARN-4091.5.patch, YARN-4091.preliminary.1.patch,
> app_activities.json, node_activities.json
>
>
> As schedulers gain new capabilities, more of the configurations that tune
> them cause actions such as limiting container assignment to an application
> or delaying container allocation. No clear information is passed from the
> scheduler to the outside world in these scenarios, which makes debugging
> much harder.
> This ticket is an effort to introduce better-defined states at the points in
> the scheduler where it skips or rejects container assignment, activates an
> application, etc. Such information will help users know what is happening in
> the scheduler.
> Attaching a short proposal for initial discussion. We would like to improve
> on this as we discuss.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)