Junping Du commented on YARN-3045:

bq. Well was aware that priority was not to differentiate the containers but 
for the events of it, but i thought you mentioned for the purpose of better 
querying rather than the purpose of writing it.
Better querying is one purpose, but writing events under different policies
could also be a consideration here. We may not be able to afford flushing every
event in a large-scale cluster, so we may choose to ignore or cache some
unimportant ones.
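
To make the idea concrete, here is a minimal sketch of a writer that caches
low-priority events and flushes as soon as a high-priority one arrives. All
names here (PriorityAwareWriter, Priority, maxBuffered, the event strings) are
made up for illustration and are not the actual timeline writer API; the
"flushed" list just stands in for the backing store.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch only; not the real timeline writer API. */
class PriorityAwareWriter {
    enum Priority { LOW, HIGH }

    private final List<String> buffer = new ArrayList<>();   // events cached, not yet persisted
    private final List<String> flushed = new ArrayList<>();  // stand-in for the backing store
    private final int maxBuffered;                            // assumed cap on cached events

    PriorityAwareWriter(int maxBuffered) {
        this.maxBuffered = maxBuffered;
    }

    /** Caches the event; flushes right away if it is HIGH priority or the cache is full. */
    void write(String event, Priority priority) {
        buffer.add(event);
        if (priority == Priority.HIGH || buffer.size() >= maxBuffered) {
            flush();
        }
    }

    /** Moves everything cached so far to the (simulated) store, preserving order. */
    void flush() {
        flushed.addAll(buffer);
        buffer.clear();
    }

    int pending() {
        return buffer.size();
    }

    List<String> flushed() {
        return flushed;
    }

    public static void main(String[] args) {
        PriorityAwareWriter writer = new PriorityAwareWriter(100);
        writer.write("low-priority event", Priority.LOW);
        System.out.println("pending after LOW: " + writer.pending());
        writer.write("high-priority event", Priority.HIGH);
        System.out.println("flushed after HIGH: " + writer.flushed().size());
    }
}
```

A policy that drops unimportant events instead of caching them would only
change the LOW branch of write(); the point is that the writer needs to be told
the priority in the first place.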

bq. I have not gone through the writer code completely but is there any caching 
which you want to flush if the event priority is high ? Also was thinking 
whether we need to change the Writer/Collector API to mention the criticality 
of the event being published?
We already have a new flush() API for the writer, checked in with YARN-3949.
Please refer to the discussions there for details. You are right that we lack
an API to respect this priority/policy across the whole data flow for writing.
I will file another JIRA to track this.

bq. So from NM side we want to publish events for ApplicationEntity and 
ContainerEntity, but based on the title of this jira i thought scope of this 
jira is to handle only ContainerEntities from NM side, is it better to handle 
events related Application entities specific to a given NM in another Jira? but 
i can try to ensure required foundation is done in NM side in this JIRA as part 
of your other comments, Thoughts?
I am fine with moving events other than container events to a separate JIRA if
it is really necessary. In general, a JIRA title shouldn't bound the
implementation: at the time a JIRA is proposed, the goal is not as concrete as
when it is being implemented, so we can fix/adjust it later. Anyway, I would
support the scope (container events + foundation work) you proposed here, if
you are comfortable with it.

bq. Also event has just id but NM related Application events will have the same 
event ID in different NM's so would it be something like 
That's a good question. My initial thinking is that we may need something like
a NodemanagerEntity to store application events, resource localization events,
log aggregation handling events, configuration, etc. However, I would like to
hear your and others' ideas on this as well.

bq. +1 for this thought, had the same initial hitch as in future if we add more 
events than unnecessary create event and methods in publisher, but for the 
initial version thought will have approach similar to RM and ATSV1. But i feel 
better to handle now than refactor later on. But i can think of couple of 
approaches here.
Yes. All three approaches seem to work here. IMO, the 2nd approach (hooking
into the existing event dispatcher) looks simpler and more straightforward.
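
As a rough illustration of what hooking into an existing dispatcher could look
like: the publisher registers as one more handler for the container event
type, so no per-event publish methods are needed. This is a self-contained
stand-in, not the NM code; Dispatcher, ContainerEvent, TimelinePublisher and
the "INIT" event type below are all hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Hypothetical sketch only; a toy dispatcher, not the NM's real one. */
class DispatcherHookSketch {

    /** Toy container event: just an id and an event type name. */
    static final class ContainerEvent {
        final String containerId;
        final String type;

        ContainerEvent(String containerId, String type) {
            this.containerId = containerId;
            this.type = type;
        }
    }

    /** Toy dispatcher that allows several handlers per event class. */
    static final class Dispatcher {
        private final Map<Class<?>, List<Consumer<Object>>> handlers = new HashMap<>();

        <T> void register(Class<T> eventType, Consumer<T> handler) {
            handlers.computeIfAbsent(eventType, k -> new ArrayList<>())
                    .add(e -> handler.accept(eventType.cast(e)));
        }

        /** Delivers the event to every handler registered for its exact class. */
        void dispatch(Object event) {
            List<Consumer<Object>> registered =
                    handlers.getOrDefault(event.getClass(), Collections.emptyList());
            for (Consumer<Object> h : registered) {
                h.accept(event);
            }
        }
    }

    /** Publisher that records what it would send to the timeline service. */
    static final class TimelinePublisher {
        final List<String> published = new ArrayList<>();

        void onContainerEvent(ContainerEvent e) {
            published.add(e.containerId + ":" + e.type);
        }
    }

    public static void main(String[] args) {
        Dispatcher dispatcher = new Dispatcher();
        List<String> existingHandlerLog = new ArrayList<>();
        // Existing handler (e.g. the container state machine) stays untouched.
        dispatcher.register(ContainerEvent.class, e -> existingHandlerLog.add(e.type));
        // The publisher is simply hooked in as an additional handler.
        TimelinePublisher publisher = new TimelinePublisher();
        dispatcher.register(ContainerEvent.class, publisher::onContainerEvent);

        dispatcher.dispatch(new ContainerEvent("container_1", "INIT"));
        System.out.println(publisher.published);
    }
}
```

With this shape, adding a new event later means the publisher decides whether
to forward it; the dispatcher and the existing handlers do not change.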

bq. Was not clear about the comment, IIRC Zhijie in the meeting also mentioned
that i am handling removing threaded model of publishing container metrics
statistics as part of this jira. May be i am missing some other jira which you
are already working on, may be can you enlighten me about it?
I thought you were encapsulating metrics in TimelineEvent, but actually you are
not. So don't worry about my previous comments on this.

> [Event producers] Implement NM writing container lifecycle events to ATS
> ------------------------------------------------------------------------
>                 Key: YARN-3045
>                 URL: https://issues.apache.org/jira/browse/YARN-3045
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Sangjin Lee
>            Assignee: Naganarasimha G R
>         Attachments: YARN-3045-YARN-2928.002.patch, 
> YARN-3045-YARN-2928.003.patch, YARN-3045-YARN-2928.004.patch, 
> YARN-3045-YARN-2928.005.patch, YARN-3045-YARN-2928.006.patch, 
> YARN-3045.20150420-1.patch
> Per design in YARN-2928, implement NM writing container lifecycle events and 
> container system metrics to ATS.

This message was sent by Atlassian JIRA
