[ 
https://issues.apache.org/jira/browse/YARN-1390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13819800#comment-13819800
 ] 

Karthik Kambatla commented on YARN-1390:
----------------------------------------

[~zjshen], [~hitesh]: thanks again for your input. I believe we are all mostly 
in agreement. 

In the longer term, I envision having tags and realizing other fields like 
applicationType and lineage information. Further, it would be nice to index the 
apps by these tags, so we don't have to iterate through all the applications 
and filter every time we query the RM.
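
To illustrate the indexing idea, here is a minimal sketch of a tag-to-application 
index. AppTagIndex and its methods are hypothetical names made up for this 
example and are not existing YARN classes; application ids are plain strings 
for simplicity.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not part of YARN: maps each tag to the ids of the
// applications carrying it, so a query does not scan every application.
final class AppTagIndex {
    // tag -> ids of applications that have this tag
    private final Map<String, Set<String>> appsByTag = new HashMap<>();

    // Register an application under each of its tags.
    void add(String appId, Set<String> tags) {
        for (String tag : tags) {
            appsByTag.computeIfAbsent(tag, t -> new HashSet<>()).add(appId);
        }
    }

    // Look up the applications for one tag; empty if the tag is unknown.
    Set<String> appsWithTag(String tag) {
        return Collections.unmodifiableSet(
            appsByTag.getOrDefault(tag, Collections.emptySet()));
    }
}
```

A lookup such as appsWithTag("MAPREDUCE") then costs one map access instead of 
a pass over all applications. A real implementation would also need to remove 
entries when applications are evicted and to handle concurrent access, both of 
which this sketch leaves out.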

bq. How do you expect someone to search for all mapreduce jobs? Do a substring 
search?
Representing applicationType as a set should suffice. To check if an app is an 
MR job, one should be able to just do applicationType.contains("MAPREDUCE"). 
All components - JHS, AHS, Java / REST APIs - should be able to make this small 
adjustment to continue working the way they do today. 
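
As a rough sketch of that adjustment, assuming applicationType were exposed as 
a Set<String> (today it is a single string), the check could look like the 
following. AppTypeCheck and isMapReduce are hypothetical names used only for 
illustration.

```java
import java.util.Set;

// Hypothetical helper, not part of YARN: assumes applicationType is
// represented as a set of strings rather than a single string.
final class AppTypeCheck {
    // True if the set of types includes MAPREDUCE (exact, case-sensitive match).
    static boolean isMapReduce(Set<String> applicationType) {
        return applicationType.contains("MAPREDUCE");
    }
}
```

Note that contains() is an exact, case-sensitive match, so components would 
have to agree on the canonical spelling of each type.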

However, I do agree that enforcing that the applicationType of a YARN 
application contains *exactly one* of {Tez, MAPREDUCE, Storm, Spark} might lead 
to a slight, albeit unnecessary, complication. Given that, do we have consensus 
that we need applicationSource / applicationLineage / tags? If so, what is the 
preferred name 
for this new field? [~hitesh], [~zjshen], [~vinodkv] - thoughts?




> Provide a way to capture source of an application to be queried through REST 
> or Java Client APIs
> ------------------------------------------------------------------------------------------------
>
>                 Key: YARN-1390
>                 URL: https://issues.apache.org/jira/browse/YARN-1390
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: api
>    Affects Versions: 2.2.0
>            Reporter: Karthik Kambatla
>            Assignee: Karthik Kambatla
>
> In addition to other fields like application-type (added in YARN-563), it is 
> useful to have an applicationSource field to track the source of an 
> application. The application source can be useful in (1) fetching only those 
> applications a user is interested in, (2) potentially adding source-specific 
> optimizations in the future. 
> Examples of sources are: User-defined project names, Pig, Hive, Oozie, Sqoop 
> etc.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
