[ 
https://issues.apache.org/jira/browse/YARN-3332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356418#comment-14356418
 ] 

Karthik Kambatla commented on YARN-3332:
----------------------------------------

bq. the machine level big picture is fragmented between YARN and HDFS (and 
HBase etc)
What constitutes the machine-level big picture? Isn't this just the node's 
overall resource usage? YARN, at least as of today, doesn't need to know about 
the usage stats of HDFS or HBase. 

I have nothing against going the server route, except that it may mean running 
an additional daemon.

bq. I anyways needed a service to expose an API for both admins/users as well 
as external systems beyond HDFS too - I can imagine tools being built on top of 
this.
It is not as clear to me. Let us say an admin and a user want usage stats about 
their YARN containers. The service can only provide the usage stats, while YARN 
will be able to provide other container metadata. Also, we should consider 
privacy of usage information. Will auth against this new service be additional 
overhead? 

bq. That said, it doesn't need to be service or library. I can think of a 
library that wires into the exposed API, though I haven't found uses for that 
yet.
Sorry, I didn't get that. Can you clarify or elaborate? 

> [Umbrella] Unified Resource Statistics Collection per node
> ----------------------------------------------------------
>
>                 Key: YARN-3332
>                 URL: https://issues.apache.org/jira/browse/YARN-3332
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Vinod Kumar Vavilapalli
>            Assignee: Vinod Kumar Vavilapalli
>         Attachments: Design - UnifiedResourceStatisticsCollection.pdf
>
>
> Today in YARN, NodeManager collects statistics like per container resource 
> usage and overall physical resources available on the machine. Currently this 
> is used internally in YARN by the NodeManager for only a limited usage: 
> automatically determining the capacity of resources on node and enforcing 
> memory usage to what is reserved per container.
> This proposal is to extend the existing architecture and collect statistics 
> for usage beyond the existing use cases.
> Proposal attached in comments.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
