Allen Wittenauer commented on YARN-3819:

Yes, I recognize that you are building on an already established framework 
that previously collected only YARN-specific metrics.  But now that network 
and disk are being added, the collection details are such that all of the 
sub-projects could benefit.  It's shortsighted to build something YARN-only 
when it could very easily be used by all.

That said, the data collection code should be written in a generic way such 
that, in the future, HDFS could plug into the same collection classes so that 
it too may make block scheduling decisions.  (This has been a discussion point 
in the HDFS community for a while.)  YARN could then call the methods that 
gather the data from its own framework to do whatever it needs to do.

So while the framework is obviously different, the actual work (e.g. "how do 
I know the IO stats on file system X") should be in common.
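
A rough sketch of that split for the network case follows.  All names here are 
hypothetical (NetworkUsageCollector and ProcNetDevCollector are not existing 
Hadoop classes), and it assumes Linux, where /proc/net/dev reports cumulative 
receive and transmit byte counters per interface.  The collector knows nothing 
about YARN or HDFS; each project would wrap it in its own framework.

```java
import java.util.List;

/** Hypothetical shared interface; it would live in common, not in YARN. */
interface NetworkUsageCollector {
    /** Cumulative bytes received on all non-loopback interfaces. */
    long getBytesRead();

    /** Cumulative bytes transmitted on all non-loopback interfaces. */
    long getBytesWritten();
}

/** Sums the per-interface counters from lines in Linux /proc/net/dev format. */
final class ProcNetDevCollector implements NetworkUsageCollector {
    private final long bytesRead;
    private final long bytesWritten;

    ProcNetDevCollector(List<String> lines) {
        long rx = 0L;
        long tx = 0L;
        for (String line : lines) {
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue; // the two header lines have no "iface:" prefix
            }
            String iface = line.substring(0, colon).trim();
            if (iface.equals("lo")) {
                continue; // assumption: loopback traffic is not of interest
            }
            String[] fields = line.substring(colon + 1).trim().split("\\s+");
            if (fields.length < 9) {
                continue; // malformed line: skip it rather than fail
            }
            rx += Long.parseLong(fields[0]); // first receive column: bytes
            tx += Long.parseLong(fields[8]); // first transmit column: bytes
        }
        this.bytesRead = rx;
        this.bytesWritten = tx;
    }

    @Override
    public long getBytesRead() {
        return bytesRead;
    }

    @Override
    public long getBytesWritten() {
        return bytesWritten;
    }
}
```

On a real node the lines could come from something like 
Files.readAllLines(Paths.get("/proc/net/dev")).  Taking the lines as a 
parameter keeps the parsing testable on any platform and leaves the choice of 
polling interval to the caller.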

It could be argued that the previous bits that are also being collected should 
be in common, but that's already shipped.  Let's not repeat past mistakes.

> Collect network usage on the node
> ---------------------------------
>                 Key: YARN-3819
>                 URL: https://issues.apache.org/jira/browse/YARN-3819
>             Project: Hadoop YARN
>          Issue Type: New Feature
>    Affects Versions: 3.0.0
>            Reporter: Robert Grandl
>            Assignee: Robert Grandl
>              Labels: yarn-common, yarn-util
>         Attachments: YARN-3819-1.patch, YARN-3819-2.patch, YARN-3819-3.patch, 
> YARN-3819-4.patch, YARN-3819-5.patch
>
> In this JIRA we propose to collect the network usage on a node. This JIRA is 
> part of a larger effort of monitoring resource usages on the nodes. 

This message was sent by Atlassian JIRA
