[
https://issues.apache.org/jira/browse/YARN-3045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14642150#comment-14642150
]
Naganarasimha G R commented on YARN-3045:
-----------------------------------------
Hi [~djp]
bq. 1.what we want to differentiate here is what kind of events are critical
(so writer client in TimelineCollector could flush to backend storage after
written them) and what kinds of events are not so critical
Well, I was aware that the priority is meant to differentiate a container's
events rather than the containers themselves, but I thought you had mentioned
it for the purpose of better querying rather than for writing. I have not gone
through the writer code completely; is there any caching which you want to
flush if the event priority is high? I was also wondering whether we need to
change the Writer/Collector API to indicate the criticality of the event being
published.
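To make the question concrete, here is a minimal sketch of what passing the criticality through the writer API could look like. {{EventPriority}} and {{BufferingEventWriter}} are hypothetical names used only for illustration; they are not existing classes of the timeline writer, and the in-memory list merely stands in for the backend storage.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical priority levels; not part of any existing YARN API. */
enum EventPriority { CRITICAL, NORMAL }

/** Hypothetical writer: NORMAL events wait in a buffer, CRITICAL ones force a flush. */
class BufferingEventWriter {
  private final List<String> buffer = new ArrayList<>();
  // Stands in for the backend storage in this sketch.
  private final List<String> backend = new ArrayList<>();

  /** Buffers the event, then flushes at once if the caller marked it CRITICAL. */
  synchronized void write(String eventId, EventPriority priority) {
    buffer.add(eventId);
    if (priority == EventPriority.CRITICAL) {
      flush();
    }
  }

  /** Moves everything buffered so far to the backend, preserving order. */
  synchronized void flush() {
    backend.addAll(buffer);
    buffer.clear();
  }

  /** Returns a copy of what has reached the backend. */
  synchronized List<String> persisted() {
    return new ArrayList<>(backend);
  }

  /** Returns the number of events still waiting in the buffer. */
  synchronized int pending() {
    return buffer.size();
  }
}
```

With this shape, writing APPLICATION_INITED as NORMAL leaves it buffered, and a later FINISH_APPLICATION written as CRITICAL pushes both to the backend in order.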
bq. From an initiative thinking, some important app/container events include:
INIT_APPLICATION, INIT_CONTAINER, FINISH_APPLICATION,
APPLICATION_CONTAINER_FINISHED, APPLICATION_LOG_HANDLING_FAILED, while
unimportant events could include: APPLICATION_INITED,
APPLICATION_RESOURCES_CLEANEDUP, APPLICATION_LOG_HANDLING_INITED,
APPLICATION_LOG_HANDLING_FINISHED, etc.
So from the NM side we want to publish events for both ApplicationEntity and
ContainerEntity. Based on the title of this jira, though, I thought its scope
was to handle only ContainerEntities from the NM side. Would it be better to
handle the Application entity events specific to a given NM in another jira? I
can still try to ensure that the required foundation is in place on the NM
side in this jira as part of your other comments. Thoughts?
Also, an event has just an id, but NM-related Application events will have the
same event id on different NMs, so would it be something like
{{INIT_APPLICATION_<NODE_ID>}}?
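A small sketch of the naming suggested above, assuming the node id is available as a string. {{NodeScopedEventId}} is a hypothetical helper, and the "<EVENT>_<NODE_ID>" format is only the proposal from this comment, not an agreed convention.

```java
/** Hypothetical helper that builds an event id of the form "<EVENT>_<NODE_ID>". */
final class NodeScopedEventId {
  private NodeScopedEventId() {
    // Static helper only; not meant to be instantiated.
  }

  /** Appends the node id so the same event type on two NMs gets distinct ids. */
  static String of(String eventType, String nodeId) {
    return eventType + "_" + nodeId;
  }
}
```

For example, {{NodeScopedEventId.of("INIT_APPLICATION", "host1:8041")}} and the same call with "host2:8041" give two different ids for the same application.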
bq. 2. We should have some handy method to turn these app/container events to
TimelineEvent and publish these events in a consensus way rather than publish
one type of event with one method.
bq. 3. We don't need to create new container events but should log existing
YARN app/container events that happen in NM. If we really think some important
events are missing in YARN, we can have further discussions later after timeline
service v2 in good shape.
+1 for this thought. I had the same initial hesitation: if we add more events
in future, we would unnecessarily create an event and a method in the
publisher for each of them. For the initial version I thought we would take an
approach similar to the RM and ATS v1, but I feel it is better to handle this
now than to refactor later on. I can think of a couple of approaches here:
# The approach you mentioned: inside the app/container transitions on the NM
side, publish the event containing the container/app information. In some
cases, such as creation of an app or container, the caller can publish the
events (like
Container created so as to capture the creation time rather than )
# In ContainerEventDispatcher, ApplicationEventDispatcher and
rsrcLocalizationSrvc, after handling an event, call by default the
corresponding handlers (inner classes) of NMTimelinePublisher. The specific
events required can be handled there and the others simply ignored.
# The source itself can create the entity and the event object, and
NMTimelinePublisher can expose a method that takes timeline objects and adds
them to the AsyncDispatcher; the event handler will then just call the client
to publish the event/entity.
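A rough sketch of the third approach, to show the single generic publish method. All class names here are hypothetical stand-ins: {{SketchEntity}} for a timeline entity carrying one event, {{SketchTimelineClient}} for the timeline client, and a single-threaded executor in place of the AsyncDispatcher. It is not the actual NMTimelinePublisher code.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a timeline entity carrying one event. */
final class SketchEntity {
  final String entityId;
  final String eventId;
  final long timestamp;

  SketchEntity(String entityId, String eventId, long timestamp) {
    this.entityId = entityId;
    this.eventId = eventId;
    this.timestamp = timestamp;
  }
}

/** Hypothetical stand-in for the timeline client. */
interface SketchTimelineClient {
  void putEntity(SketchEntity entity);
}

/** Sketch of approach 3: one generic publish method, no per-event-type methods. */
class SketchNMTimelinePublisher {
  // A single worker thread plays the role of the async dispatcher and keeps order.
  private final ExecutorService dispatcher = Executors.newSingleThreadExecutor();
  private final SketchTimelineClient client;

  SketchNMTimelinePublisher(SketchTimelineClient client) {
    this.client = client;
  }

  /** Called by the source (container, application, localizer) with a ready-made entity. */
  void publish(SketchEntity entity) {
    dispatcher.execute(() -> client.putEntity(entity));
  }

  /** Stops accepting entities and waits for the queued ones to reach the client. */
  boolean stop(long timeoutMillis) {
    dispatcher.shutdown();
    try {
      return dispatcher.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }
}
```

The source decides the timestamp and the event id, so the creation time is captured where the event happens, and the publisher needs no change when a new event type is added.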
bq. 4. It looks like NMTimelinePublisher should be used by ContainerManager,
Container, ResourceLocalizationService and Log Handler. Move it to NMContext
should be convenient to use for other components.
Will take care of this based on the approach we choose in the previous step.
bq. 5. Container Resource Usage event may not be necessary given we already
have metrics update and will do aggregation according to metrics update.
I was not clear about this comment. IIRC, Zhijie also mentioned in the meeting
that I am handling the removal of the threaded model of publishing container
metrics statistics as part of this jira. Maybe I am missing some other jira
which you are already working on; could you enlighten me about it?
> [Event producers] Implement NM writing container lifecycle events to ATS
> ------------------------------------------------------------------------
>
> Key: YARN-3045
> URL: https://issues.apache.org/jira/browse/YARN-3045
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: timelineserver
> Reporter: Sangjin Lee
> Assignee: Naganarasimha G R
> Attachments: YARN-3045-YARN-2928.002.patch,
> YARN-3045-YARN-2928.003.patch, YARN-3045-YARN-2928.004.patch,
> YARN-3045-YARN-2928.005.patch, YARN-3045-YARN-2928.006.patch,
> YARN-3045.20150420-1.patch
>
>
> Per design in YARN-2928, implement NM writing container lifecycle events and
> container system metrics to ATS.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)