[
https://issues.apache.org/jira/browse/YARN-3819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14605879#comment-14605879
]
Allen Wittenauer edited comment on YARN-3819 at 6/29/15 4:58 PM:
-----------------------------------------------------------------
Yes, I recognize that you are building on an already established framework
that previously collected only YARN-specific metrics. But now, with network
and disk, the collection details are such that all of the sub-projects could
benefit. It's shortsighted to build something that could not very easily be
used by all.
That said, the data collection code should be done in a generic way such that,
in the future, HDFS could plug into the same collection classes so that it too
may make block scheduling decisions. (This has been a discussion point in
the HDFS community for a while.) YARN could then call those methods that
gather the data into its own framework to do whatever it needs to do.
So while the framework is obviously different, the actual work, e.g. "how do
I know the IO stats on file system X", should be in common.
It could be argued that the previous bits that are also being collected should
be in common, but that's already shipped. Let's not repeat past mistakes
though.
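As a rough illustration of the layering suggested above, here is a minimal
Java sketch. It is not taken from any of the attached patches, and every name
in it (NodeIoStatsSource, InMemoryIoStatsSource, YarnNodeUsageReporter,
HdfsPlacementHint) is hypothetical rather than an existing Hadoop class. The
in-memory source merely stands in for real OS-level collection. The point is
only that the "how do I know the IO stats on file system X" question lives
behind one shared interface, and each sub-project wraps it in its own
framework.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical shared collection interface that would live in common. */
interface NodeIoStatsSource {
    long bytesRead(String fileSystem);
    long bytesWritten(String fileSystem);
}

/** Toy in-memory source; a real one would read counters from the OS. */
class InMemoryIoStatsSource implements NodeIoStatsSource {
    // Per file system: index 0 holds bytes read, index 1 holds bytes written.
    private final Map<String, long[]> counters = new HashMap<>();

    void record(String fileSystem, long read, long written) {
        long[] c = counters.computeIfAbsent(fileSystem, k -> new long[2]);
        c[0] += read;
        c[1] += written;
    }

    @Override
    public long bytesRead(String fileSystem) {
        long[] c = counters.get(fileSystem);
        return c == null ? 0L : c[0];
    }

    @Override
    public long bytesWritten(String fileSystem) {
        long[] c = counters.get(fileSystem);
        return c == null ? 0L : c[1];
    }
}

/** YARN-side consumer: pulls the shared data into its own reporting. */
class YarnNodeUsageReporter {
    private final NodeIoStatsSource source;

    YarnNodeUsageReporter(NodeIoStatsSource source) {
        this.source = source;
    }

    long totalIoBytes(String fileSystem) {
        return source.bytesRead(fileSystem) + source.bytesWritten(fileSystem);
    }
}

/** HDFS-side consumer: uses the same source for a placement decision. */
class HdfsPlacementHint {
    private final NodeIoStatsSource source;

    HdfsPlacementHint(NodeIoStatsSource source) {
        this.source = source;
    }

    boolean isWriteHeavy(String fileSystem, long thresholdBytes) {
        return source.bytesWritten(fileSystem) > thresholdBytes;
    }
}
```

Both consumers take the same NodeIoStatsSource instance, so swapping in a
different collection mechanism would not touch either the YARN or the HDFS
side.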
was (Author: aw):
Yes, I recognize that you are building on an already established framework
that, prior, only collected metrics that were specific to YARN. But now with
network and disk, the collection details are such that all of the sub-projects
could benefit. It's shortsighted to build something that could very easily
be used by all.
That said, the data collection code should be done in a generic way such that,
in the future, HDFS could plug into the same collection classes so that it too
may make block scheduling decisions. (This has been a discussion point around
the HDFS community for a while). YARN could then call those methods that
gather the data into its own framework to do whatever it needs to do.
So while the framework is obviously different the actual work, of e.g. "how do
I know the IO stats on file system X", should be in common.
It could be argued that the previous bits that are also being collected should
be in common, but that's already shipped. Let's not repeat past mistakes
though.
> Collect network usage on the node
> ---------------------------------
>
> Key: YARN-3819
> URL: https://issues.apache.org/jira/browse/YARN-3819
> Project: Hadoop YARN
> Issue Type: New Feature
> Affects Versions: 3.0.0
> Reporter: Robert Grandl
> Assignee: Robert Grandl
> Labels: yarn-common, yarn-util
> Attachments: YARN-3819-1.patch, YARN-3819-2.patch, YARN-3819-3.patch,
> YARN-3819-4.patch, YARN-3819-5.patch
>
>
> In this JIRA we propose to collect the network usage on a node. This JIRA is
> part of a larger effort of monitoring resource usages on the nodes.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)