[ https://issues.apache.org/jira/browse/YARN-8995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16911501#comment-16911501 ]
Weiwei Yang edited comment on YARN-8995 at 8/20/19 4:08 PM:
------------------------------------------------------------
Hi [~zhuqi]/[~Tao Yang]
Thanks for working on this. The patch LGTM. I might be a little picky about the
configuration name, but right now it is not straightforward to me.
"The interval of queue size (in thousands) for printing the boom queue event
type details."
How about something like the following for the description, if I understand
this correctly:
"The threshold used to trigger the logging of event types and counts in RM's
main event dispatcher. The default is 5000, which means RM will print event
info every time the queue size cumulatively reaches 5000. Such info can reveal
which kinds of events RM is mostly stuck processing, which can help to narrow
down certain performance issues."
Also, it would be better to name the config something like
{{yarn.dispatcher.print-events-info.threshold}}. You don't need to use
in-thousands here, as several thousand is still human-readable.
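To illustrate the behavior the proposed description is trying to convey, here is a minimal sketch. It is not the code from the patch. It assumes that "cumulatively reaches 5000" means the queue size hitting a non-zero multiple of the threshold; the class {{EventQueueMonitor}} and its methods are hypothetical, and only the config key and the default of 5000 come from the suggestion above.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch only; not the AsyncDispatcher code from the YARN-8995 patch. */
class EventQueueMonitor {
  // Config key and default as proposed in this comment.
  static final String THRESHOLD_KEY = "yarn.dispatcher.print-events-info.threshold";
  static final int DEFAULT_THRESHOLD = 5000;

  private final int threshold;
  // Pending event types, standing in for the dispatcher's event queue.
  private final Deque<String> queue = new ArrayDeque<>();

  EventQueueMonitor(int threshold) {
    if (threshold <= 0) {
      throw new IllegalArgumentException(THRESHOLD_KEY + " must be positive");
    }
    this.threshold = threshold;
  }

  /**
   * Enqueues one event type. Returns the per-type counts of all queued events
   * when the queue size is a multiple of the threshold (assumed reading of the
   * description), otherwise null.
   */
  Map<String, Integer> offer(String eventType) {
    queue.add(eventType);
    if (queue.size() % threshold != 0) {
      return null;
    }
    Map<String, Integer> counts = new TreeMap<>(); // sorted for stable output
    for (String type : queue) {
      counts.merge(type, 1, Integer::sum);
    }
    return counts;
  }

  /** Removes the oldest queued event, as a consumer thread would. */
  void poll() {
    queue.poll();
  }

  public static void main(String[] args) {
    // A tiny threshold so the report triggers after three events.
    EventQueueMonitor monitor = new EventQueueMonitor(3);
    String[] events = {"NODE_UPDATE", "NODE_UPDATE", "APP_ATTEMPT_ADDED", "NODE_UPDATE"};
    for (String event : events) {
      Map<String, Integer> report = monitor.offer(event);
      if (report != null) {
        System.out.println("Event queue details: " + report);
      }
    }
  }
}
```

With a threshold of 3, the report is produced once, after the third event, and lists how many events of each type are waiting. That is the kind of output the description should promise to the operator.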
Does that make sense?
Thanks
> Log the event type of the too big AsyncDispatcher event queue size, and add
> the information to the metrics.
> ------------------------------------------------------------------------------------------------------------
>
> Key: YARN-8995
> URL: https://issues.apache.org/jira/browse/YARN-8995
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: metrics, nodemanager, resourcemanager
> Affects Versions: 3.2.0, 3.3.0
> Reporter: zhuqi
> Assignee: zhuqi
> Priority: Major
> Attachments: TestStreamPerf.java, YARN-8995.001.patch,
> YARN-8995.002.patch, YARN-8995.003.patch, YARN-8995.004.patch,
> YARN-8995.005.patch, YARN-8995.006.patch, YARN-8995.007.patch,
> YARN-8995.008.patch
>
>
> In our growing cluster, there are unexpected situations in which some event
> queues block the performance of the cluster, such as the bug in
> https://issues.apache.org/jira/browse/YARN-5262 . I think it's necessary to
> log the event types when the event queue size is too big, and add the
> information to the metrics. The threshold of queue size is a parameter which
> can be changed.