[
https://issues.apache.org/jira/browse/YARN-8995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16861741#comment-16861741
]
Tao Yang commented on YARN-8995:
--------------------------------
Thanks [~zhuqi] for updating the patch.
Comments about the new patch:
* For the latest event, I didn't mean that it should be controlled separately
from the counter info. We can add a boolean flag that defaults to false, is set
to true when printing of the details is triggered (for example, when the queue
size has reached N*5000), and is reset to false after the latest event has been
printed.
* Configuration reading logic should be moved to serviceStart() for better
performance.
* The printEventQueueDetails method can be simplified via the stream API;
moreover, the value type of counterMap should be Long instead of long[].
* The new configuration entry should have a clear name, for example
"yarn.dispatcher.print-events-debug-info.interval-in-thousands". That is just
off the top of my head; you can give it a better name. I suppose we should take
thousands as the unit, since printing is already gated by another condition
(qSize % 1000 == 0).
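
To make the suggestions above concrete, here is a rough sketch of how the flag,
the Long-valued counter map, and the thousands-based interval could fit
together. It is not taken from the patch: the class EventQueueDetailsSketch,
its methods, and the constructor argument are hypothetical stand-ins, and in
the real dispatcher the interval would be read from the configuration once in
serviceStart().

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

/** Hypothetical sketch only; not the YARN-8995 patch or AsyncDispatcher itself. */
class EventQueueDetailsSketch {

  // Assumed to come from a setting such as
  // "yarn.dispatcher.print-events-debug-info.interval-in-thousands",
  // read once at start-up rather than on every event.
  private final int intervalInThousands;

  // Defaults to false; set to true when the details are printed and reset
  // to false once the latest event has been printed.
  private boolean printLatestEvent = false;

  EventQueueDetailsSketch(int intervalInThousands) {
    this.intervalInThousands = intervalInThousands;
  }

  /** Counts queued events per type via the stream API (Long values, not long[]). */
  static Map<String, Long> countByType(List<String> queuedEventTypes) {
    return queuedEventTypes.stream()
        .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
  }

  /** True when qSize is a non-zero multiple of intervalInThousands * 1000. */
  boolean shouldPrintDetails(int qSize) {
    return qSize != 0 && qSize % (intervalInThousands * 1000) == 0;
  }

  /** Returns the details line to log on enqueue, or null if nothing should be printed. */
  String onEnqueue(int qSize, List<String> queuedEventTypes) {
    if (!shouldPrintDetails(qSize)) {
      return null;
    }
    printLatestEvent = true; // ask the dispatch path to print the latest event once
    return countByType(queuedEventTypes).entrySet().stream()
        .sorted(Map.Entry.comparingByKey())
        .map(e -> e.getKey() + "=" + e.getValue())
        .collect(Collectors.joining(", "));
  }

  /** Returns the latest-event line to log on dispatch, or null if the flag is not set. */
  String onDispatch(String eventType) {
    if (!printLatestEvent) {
      return null;
    }
    printLatestEvent = false; // printed once, switch back off
    return "latest event: " + eventType;
  }
}
```

With an interval of 5 (that is, every 5000 queued events), onEnqueue returns a
summary only when qSize is a multiple of 5000, and the following onDispatch
call returns the latest event exactly once before the flag is cleared.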
> Log the event type of the too big AsyncDispatcher event queue size, and add
> the information to the metrics.
> ------------------------------------------------------------------------------------------------------------
>
> Key: YARN-8995
> URL: https://issues.apache.org/jira/browse/YARN-8995
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: metrics, nodemanager, resourcemanager
> Affects Versions: 3.2.0
> Reporter: zhuqi
> Assignee: zhuqi
> Priority: Major
> Attachments: YARN-8995.001.patch, YARN-8995.002.patch
>
>
> In our growing cluster, there are unexpected situations in which some event
> queues block the performance of the cluster, such as the bug in
> https://issues.apache.org/jira/browse/YARN-5262 . I think it's necessary to
> log the event type when the event queue size is too big, and add the
> information to the metrics; the queue size threshold is a parameter which can
> be changed.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)