Sangjin Lee commented on YARN-3981:

Some of us had an offline discussion on this. There are some major challenges 
in supporting this in the v.2 design. First, these clients may lack an 
application-specific context, since they can span multiple YARN apps. Second, even 
if we solved the problem of the context, these clients are likely off-cluster, 
and they need a way to write to the cluster. Ideas such as a separate dedicated 
timeline writer just for these have been discussed, but their scalability is 
problematic at best.

One idea that was suggested involves creating a specialized YARN application 
that can act as a proxy for these off-cluster clients. For example, suppose you 
started a Tez client that can start multiple YARN apps. It can also start a 
special dedicated "(flow-level) timeline client". This client would launch a 
special YARN app under the covers whose app master and its associated timeline 
writer can serve as the proxy for timeline data the client may write. When this 
special timeline client shuts down, it would tear down the associated YARN app.

If we go this route, we would write the YARN app itself so that the app master 
listens for requests coming from the client and proxies them to the timeline 
writer. We would also write the timeline client piece so that it manages the 
YARN app and sends the write requests to the app master.
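
To make the shape of this concrete, here is a minimal sketch in Java. Every 
name in it (TimelineWriter, ProxyAppMaster, FlowTimelineClient) is hypothetical 
and none of it is existing YARN API. The app master is constructed in-process 
purely for illustration; a real version would submit a YARN app and talk to its 
app master over RPC.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the timeline writer owned by the proxy app master.
interface TimelineWriter {
    void write(String flowName, String entity);
}

// In-memory writer so the sketch runs without a cluster.
class InMemoryTimelineWriter implements TimelineWriter {
    private final List<String> written = new ArrayList<>();

    @Override
    public void write(String flowName, String entity) {
        written.add(flowName + ":" + entity);
    }

    List<String> written() {
        return written;
    }
}

// Hypothetical app master of the proxy YARN app: it accepts write requests
// from the off-cluster client and forwards them to its timeline writer.
class ProxyAppMaster {
    private final TimelineWriter writer;
    private boolean running = true;

    ProxyAppMaster(TimelineWriter writer) {
        this.writer = writer;
    }

    void handleWrite(String flowName, String entity) {
        if (!running) {
            throw new IllegalStateException("proxy app has been torn down");
        }
        writer.write(flowName, entity);
    }

    void stop() {
        running = false;
    }
}

// Hypothetical flow-level timeline client: it owns the lifecycle of the proxy
// app and sends every write through the proxy app master.
class FlowTimelineClient implements AutoCloseable {
    private final String flowName;
    private final ProxyAppMaster appMaster;

    FlowTimelineClient(String flowName, TimelineWriter writer) {
        this.flowName = flowName;
        // Assumption: a real client would launch the YARN app here and wait
        // for its app master to come up; the sketch builds it directly.
        this.appMaster = new ProxyAppMaster(writer);
    }

    void putEntity(String entity) {
        appMaster.handleWrite(flowName, entity);
    }

    @Override
    public void close() {
        // Shutting down the client tears down the associated proxy app.
        appMaster.stop();
    }
}

class FlowTimelineSketch {
    public static void main(String[] args) {
        InMemoryTimelineWriter writer = new InMemoryTimelineWriter();
        try (FlowTimelineClient client = new FlowTimelineClient("example-flow", writer)) {
            client.putEntity("flow-started");
            client.putEntity("flow-finished");
        }
        System.out.println(writer.written());
    }
}
```

The point of the sketch is the split in responsibilities: the client side only 
manages the proxy app and sends requests, while the context and the actual 
write stay with the app master and its writer on the cluster.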

> support timeline clients not associated with an application
> -----------------------------------------------------------
>                 Key: YARN-3981
>                 URL: https://issues.apache.org/jira/browse/YARN-3981
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>    Affects Versions: YARN-2928
>            Reporter: Sangjin Lee
>
> In the current v.2 design, all timeline writes must belong in a 
> flow/application context (cluster + user + flow + flow run + application).
> But there are use cases that require writing data outside the context of an 
> application. One such example is a higher level client (e.g. tez client or 
> hive/oozie/cascading client) writing flow-level data that spans multiple 
> applications. We need to find a way to support them.
