[ 
https://issues.apache.org/jira/browse/YARN-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Vasudev updated YARN-2901:
--------------------------------
    Attachment: apache-yarn-2901.2.patch

Thanks for the review [~leftnoteasy]!

{quote}
1.1 Better to place in yarn-server-common?
1.2 If you agree above, how about put into package o.a.h.y.server.metrics (or 
utils)?
{quote}

I'd prefer not to move it. All the common web UI classes (for the existing web 
UI) are in hadoop-yarn-common, so I'd have to move everything over to 
hadoop-yarn-server-common.

bq. 1.3 Rename it to Log4jWarnErrorMetricsAppender?

Fixed.

{quote}
1.4 Comments about implementation:
I think the cleanup implementation can be improved. The cutoff process for 
messages/counts currently loops over all stored items, which could be 
inefficient (imagine the number of stored messages exceeding the threshold). 
The existing logic in the patch could also leave a lot of stored messages 
around (tons of messages could be generated in 5 min, which is the run 
interval of the purge task).
{quote}

Changed the purge implementation. I maintain a purge information structure that 
makes purging more efficient.
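
To illustrate the general idea (this is only a sketch, not the code in the 
patch), one way to avoid looping over every stored message is to keep a 
time-ordered index next to the counts, so a purge only touches the oldest 
entries. The class name MessagePurgeSketch, its methods, and the 
maxUniqueMessages cap are all hypothetical names chosen for this example.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical sketch of a purge index; not the class from the patch. */
class MessagePurgeSketch {
    // Assumed cap on the number of distinct messages kept in memory.
    private final int maxUniqueMessages;
    // message -> number of times it has been logged
    private final Map<String, Integer> counts = new HashMap<>();
    // message -> timestamp (seconds) at which it was last logged
    private final Map<String, Long> lastSeen = new HashMap<>();
    // purge index: timestamp -> messages last logged then, oldest key first
    private final TreeMap<Long, Set<String>> purgeIndex = new TreeMap<>();

    MessagePurgeSketch(int maxUniqueMessages) {
        this.maxUniqueMessages = maxUniqueMessages;
    }

    /** Records one occurrence and moves the message to its new time bucket. */
    synchronized void record(String message, long nowSeconds) {
        Long previous = lastSeen.put(message, nowSeconds);
        if (previous != null) {
            unindex(previous, message);
        }
        purgeIndex.computeIfAbsent(nowSeconds, t -> new HashSet<>()).add(message);
        counts.merge(message, 1, Integer::sum);
        // Enforce the cap right away instead of waiting for the periodic purge.
        while (counts.size() > maxUniqueMessages) {
            evictOldest();
        }
    }

    /** Drops messages last seen before the cutoff; visits only those entries. */
    synchronized void purgeOlderThan(long cutoffSeconds) {
        while (!purgeIndex.isEmpty() && purgeIndex.firstKey() < cutoffSeconds) {
            evictOldest();
        }
    }

    synchronized int count(String message) {
        return counts.getOrDefault(message, 0);
    }

    synchronized int uniqueMessages() {
        return counts.size();
    }

    // Removes a message from the bucket for its previous timestamp.
    private void unindex(long timestamp, String message) {
        Set<String> bucket = purgeIndex.get(timestamp);
        bucket.remove(message);
        if (bucket.isEmpty()) {
            purgeIndex.remove(timestamp);
        }
    }

    // Removes one message from the oldest bucket, plus its count and timestamp.
    private void evictOldest() {
        Map.Entry<Long, Set<String>> oldest = purgeIndex.firstEntry();
        Iterator<String> it = oldest.getValue().iterator();
        String victim = it.next();
        it.remove();
        if (oldest.getValue().isEmpty()) {
            purgeIndex.remove(oldest.getKey());
        }
        counts.remove(victim);
        lastSeen.remove(victim);
    }
}
```

With this layout the periodic purge costs time proportional to the number of 
expired messages rather than the number stored, and the cap check in record() 
keeps the store bounded between purge runs.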

I've also added the appender information to log4j.properties so that the 
appender can be enabled/disabled using YARN_ROOT_LOGGER.

> Add errors and warning stats to RM, NM web UI
> ---------------------------------------------
>
>                 Key: YARN-2901
>                 URL: https://issues.apache.org/jira/browse/YARN-2901
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: nodemanager, resourcemanager
>            Reporter: Varun Vasudev
>            Assignee: Varun Vasudev
>         Attachments: Exception collapsed.png, Exception expanded.jpg, Screen 
> Shot 2015-03-19 at 7.40.02 PM.png, apache-yarn-2901.0.patch, 
> apache-yarn-2901.1.patch, apache-yarn-2901.2.patch
>
>
> It would be really useful to have statistics on the number of errors and 
> warnings in the RM and NM web UI. I'm thinking about -
> 1. The number of errors and warnings in the past 5 min/1 hour/12 hours/day
> 2. The top 'n'(20?) most common exceptions in the past 5 min/1 hour/12 
> hours/day
> By errors and warnings I'm referring to the log level.
> I suspect we can probably achieve this by writing a custom appender? (I'm open 
> to suggestions on alternate mechanisms for implementing this.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
