Sunil G commented on YARN-4091:

Thank you [~leftnoteasy] for sharing your thoughts.

Yes, the REST framework looks fine. But after the first response comes back as 
"pending fetching", a second REST query has to be made to see the real result. 
Alternatively, we could dump this information to the logs. I feel that getting 
the information back as REST output is better, and we can use this framework 
in the new UI. Hence the timing of the second REST query is important: the 
intended node heartbeat has to happen first (or, by the time the query comes, 
more heartbeats from the same node may already have arrived). Showing 
aggregated debug information until the second query is good, but I am worried 
about the load on the RM and the amount of data produced. A time limit (or a 
minimum count of heartbeats to debug) could help in this case. Thoughts?
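
To make the idea concrete, here is a rough sketch of how recording could be 
bounded between the two REST queries. It is only an illustration: the class 
NodeActivityRecorder, its methods and its State values are hypothetical names 
and not existing YARN code, and I am treating the heartbeat count as an upper 
bound on what gets recorded, which is an assumption. Time is passed in 
explicitly so the behaviour is easy to check.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical sketch (not part of YARN): records scheduler diagnostics for
 * one node between two REST queries, bounded by a heartbeat count and a
 * time limit so that the RM does not accumulate unbounded data.
 */
public class NodeActivityRecorder {
    public enum State { IDLE, PENDING, COMPLETE }

    private final int maxHeartbeats;      // assumed cap on recorded heartbeats
    private final long timeLimitMillis;   // assumed cap on recording duration
    private final List<String> activities = new ArrayList<>();
    private State state = State.IDLE;
    private long startedAtMillis;
    private int heartbeatsSeen;

    public NodeActivityRecorder(int maxHeartbeats, long timeLimitMillis) {
        this.maxHeartbeats = maxHeartbeats;
        this.timeLimitMillis = timeLimitMillis;
    }

    /** First REST query: start recording; the caller only sees PENDING. */
    public synchronized State start(long nowMillis) {
        activities.clear();
        heartbeatsSeen = 0;
        startedAtMillis = nowMillis;
        state = State.PENDING;
        return state;
    }

    /** Invoked for each heartbeat of the node while a request is pending. */
    public synchronized void onHeartbeat(long nowMillis, String diagnostic) {
        if (state != State.PENDING) {
            return; // nothing requested, or a limit was already reached
        }
        if (nowMillis - startedAtMillis > timeLimitMillis) {
            state = State.COMPLETE; // time limit exceeded, stop recording
            return;
        }
        activities.add(diagnostic);
        heartbeatsSeen++;
        if (heartbeatsSeen >= maxHeartbeats) {
            state = State.COMPLETE; // heartbeat limit reached
        }
    }

    /** Second REST query: return a copy of whatever was recorded so far. */
    public synchronized List<String> snapshot(long nowMillis) {
        if (state == State.PENDING
                && nowMillis - startedAtMillis > timeLimitMillis) {
            state = State.COMPLETE;
        }
        return new ArrayList<>(activities);
    }

    public synchronized State getState() {
        return state;
    }
}
```

With limits like these, a late second query still gets a bounded result, and 
heartbeats arriving after either limit are simply not recorded.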

> Improvement: Introduce more debug/diagnostics information to detail out 
> scheduler activity
> ------------------------------------------------------------------------------------------
>                 Key: YARN-4091
>                 URL: https://issues.apache.org/jira/browse/YARN-4091
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: capacity scheduler, resourcemanager
>    Affects Versions: 2.7.0
>            Reporter: Sunil G
>            Assignee: Sunil G
>         Attachments: Improvement on debugdiagnostic information - YARN.pdf
> As schedulers are improved with various new capabilities, more configurations 
> that tune the schedulers start to take actions such as limiting the assignment 
> of containers to an application, or delaying container allocation. 
> No clear information is passed down from the scheduler to the outside world 
> under these scenarios. This makes debugging much harder.
> This ticket is an effort to introduce more clearly defined states at the 
> various points in the scheduler where it skips/rejects container assignment, 
> activates an application, etc. Such information will help users to know what 
> is happening in the scheduler.
> Attaching a short proposal for initial discussion. We would like to improve 
> on this as we discuss.

This message was sent by Atlassian JIRA