Junping Du commented on YARN-3046:
Thanks [~zjshen] for review and comments!
bq. I'm not sure if we should have a MR config to determine whether the new or
old timeline service is used. If this MR config is set to true but the YARN
cluster is still set up with the old timeline service, it still doesn't work.
Theoretically, the most elegant solution is for applications (MR, DS, etc.)
to be unaware of the timeline service version. However, we already decided to
go with different methods/structures between v1 and v2 for TimelineClient, so
applications have to be aware of which version of the timeline service is in
use.
The next option is to let the application get timeline-related info from
YARN/RM. This can be done through registerApplicationMaster() in
ApplicationMasterProtocol with return value for service "off", "v1_on", or
The last option is, as the v1 patch shows, to follow the existing v1 approach
and enable the timeline service through a separate configuration:
Personally, I would prefer the 2nd option. The reason is just as you
mentioned: the application owner doesn't have to be aware of RM/YARN
infrastructure details. However, this needs a change to the YARN AM protocol,
changes to different applications (distributed shell, etc.), and marking the
existing MR configuration as deprecated (otherwise it would conflict in
principle with similar configurations). I would prefer to file a separate JIRA
to track this more carefully, as it is important but outside the scope of this
JIRA. What do
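To make the 2nd option concrete, here is a minimal sketch of how an AM could act on a timeline state reported by the RM. It is only an illustration under assumptions: the enum, its parse helper, and describePublisher are hypothetical names, not part of ApplicationMasterProtocol or any patch on this JIRA, and V2_ON is an assumed label for the third value.

```java
import java.util.Locale;

public class TimelineStateSketch {

    /** Hypothetical states the RM could report when the AM registers. */
    enum TimelineServiceState {
        OFF, V1_ON, V2_ON; // V2_ON is an assumed name, used for illustration only

        /** Parses a reported value such as "off" or "v1_on"; missing values mean OFF. */
        static TimelineServiceState parse(String value) {
            if (value == null || value.trim().isEmpty()) {
                return OFF;
            }
            // Throws IllegalArgumentException for values outside the enum.
            return valueOf(value.trim().toUpperCase(Locale.ROOT));
        }
    }

    /** Hypothetical AM-side decision: which publishing path to take. */
    static String describePublisher(TimelineServiceState state) {
        switch (state) {
            case V1_ON:
                return "publish with v1 client";
            case V2_ON:
                return "publish with v2 client";
            default:
                return "timeline publishing disabled";
        }
    }

    public static void main(String[] args) {
        TimelineServiceState state = TimelineServiceState.parse("v1_on");
        System.out.println(describePublisher(state));
    }
}
```

With something along these lines, the application would only branch on what the RM reports, and would not need its own per-framework configuration flag.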
bq. No need to have JobHistoryEventUtils, you can move the util method to
JobHistoryUtils if you want.
I tried to do so before I created JobHistoryEventUtils. However, I found we
cannot, because JobHistoryUtils is in the hadoop-mapreduce-client-common
component while some consumers of the method are in the
hadoop-mapreduce-client-core component (e.g. ReduceAttemptFinishedEvent,
TaskAttemptFinishedEvent, etc.). Currently hadoop-mapreduce-client-common
depends on hadoop-mapreduce-client-core, so these events under
hadoop-mapreduce-client-core cannot depend on JobHistoryUtils without causing
a bidirectional dependency. The bad news is that we cannot move
JobHistoryUtils to hadoop-mapreduce-client-core either, because it references
other classes (e.g. MRApps) that are still in hadoop-mapreduce-client-common.
That's why I created JobHistoryEventUtils for the shared methods.
bq. In the current way of shutting down the threadpool, is it guaranteed that
the pending entity is going to be published before shutting down?
There is a delay (60 secs) to wait for pending entities to be posted. This
delay is typically much larger than the service discovery time (typically the
heartbeat interval, not counting the collector failover case) and the timeline
entity REST posting time. It is also larger than the posting time of any
single entity in the failure case with maximum retries (30 * 1 sec). So I
think it could be safe to do
I will address the other comments in a new patch.
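For reference, the bounded wait on shutdown discussed above can be sketched with the standard java.util.concurrent pattern. This is a generic illustration and not the code in the patch: the class and method names are hypothetical, and the 60-second value is taken from the comment above.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class DrainOnShutdownSketch {

    /**
     * Stops accepting new tasks, then waits up to the given timeout for
     * already-submitted tasks to finish. Returns true if everything drained.
     */
    static boolean shutdownAndDrain(ExecutorService pool, long timeout, TimeUnit unit) {
        pool.shutdown(); // queued tasks still run; new submissions are rejected
        try {
            if (pool.awaitTermination(timeout, unit)) {
                return true;
            }
            pool.shutdownNow(); // timeout hit: give up on whatever is still pending
            return false;
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        AtomicInteger posted = new AtomicInteger();
        // Stand-in for pending entity posts (hypothetical workload).
        for (int i = 0; i < 3; i++) {
            pool.execute(posted::incrementAndGet);
        }
        boolean drained = shutdownAndDrain(pool, 60, TimeUnit.SECONDS);
        System.out.println("drained=" + drained + ", posted=" + posted.get());
    }
}
```

The guarantee here is only as strong as the timeout: pending work is published if it completes within the wait, which is why the wait needs to exceed the worst-case posting time with retries.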
> [Event producers] Implement MapReduce AM writing some MR metrics to ATS
> Key: YARN-3046
> URL: https://issues.apache.org/jira/browse/YARN-3046
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: timelineserver
> Reporter: Sangjin Lee
> Assignee: Junping Du
> Attachments: YARN-3046-no-test-v2.patch, YARN-3046-no-test.patch,
> YARN-3046-v1-rebase.patch, YARN-3046-v1.patch
> Per design in YARN-2928, select a handful of MR metrics (e.g. HDFS bytes
> written) and have the MR AM write the framework-specific metrics to ATS.