Junping Du commented on YARN-2033:

[~zjshen], thanks for comments above which sounds good to me. I just go through 
your latest patch, a couple of comments:
+  public static final String 
+      RM_PREFIX + "metrics-publisher.multi-threaded-dispatcher.pool-size";
+  public static final int 
+      10;
The name of config looks like too long. May be we can rename it to something 
shorter, i.e. RM_PREFIX + "metrics-publisher.dispatcher.pool-size"?

-  optional string diagnostics = 5 [default = "N/A"];
-  optional YarnApplicationAttemptStateProto yarn_application_attempt_state = 6;
-  optional ContainerIdProto am_container_id = 7;
+  optional string original_tracking_url = 5;
+  optional string diagnostics = 6 [default = "N/A"];
+  optional YarnApplicationAttemptStateProto yarn_application_attempt_state = 7;
+  optional ContainerIdProto am_container_id = 8;
We shouldn't insert a new field as it will change the order of existing fields. 
In PB, the encoded messages only include field type and number and will be map 
to field name when decoding. Thus, change the field number here will break 
compatibility which is unnecessary. Add original_tracking_url with field number 
of 8 should be fine.

-    if (conf.getBoolean(YarnConfiguration.APPLICATION_HISTORY_ENABLED,
-      historyServiceEnabled = true;
+    if (conf.get(YarnConfiguration.APPLICATION_HISTORY_STORE) == null
+        && conf.getBoolean(YarnConfiguration.RM_METRICS_PUBLISHER_ENABLED,
+            YarnConfiguration.DEFAULT_RM_METRICS_PUBLISHER_ENABLED) ||
+        conf.get(YarnConfiguration.APPLICATION_HISTORY_STORE) != null
+        && conf.getBoolean(YarnConfiguration.APPLICATION_HISTORY_ENABLED,
+            YarnConfiguration.DEFAULT_APPLICATION_HISTORY_ENABLED)) {
+      yarnMetricsEnabled = true;
If user's config is slightly wrong (let's assume: 
YarnConfiguration.APPLICATION_HISTORY_STORE != null, 
YarnConfiguration.RM_METRICS_PUBLISHER_ENABLED=true), then here we disable 
yarnMetricsEnabled sliently which make trouble-shooting effort a little harder. 
Suggest to log warn messages when user's wrong configuration happens. Better to 
move logic operations inside of if() to a separated method and log the error 
for wrong configuration. 

+  <property>
+       <description>The setting that controls whether yarn metrics is 
published on
+    the timeline server or not by RM.</description>
+       <name>yarn.resourcemanager.metrics-publisher.enabled</name>
+       <value>false</value>
+  </property>
Indentation should be 2 white spaces instead of tab.

In ApplicationHistoryManagerOnTimelineStore.java,
} catch (YarnException e) {
+      throw new IOException(e);
+    }
This kind of exception translate is unnecessary to me. We can remove it as 
YarnException get throw here. If we decide to throw IOException only (please 
see my comments later), we can extend the block to cover more code that could 
throw YarnException and translate to IOException.

The method of convertToApplicationReport seems a little too sophisticated in 
creating applicationReport. Another option is to wrapper it as Builder pattern 
(plz refer in MiniDFSCluster) should be better. The same comments on 
convertToApplicationAttemptReport and convertToContainerReport. This is only 
optional comments, see if you want to address here or some separate JIRA in 

+  public ApplicationAttemptReport getApplicationAttempt(
+      ApplicationAttemptId appAttemptId) throws YarnException, IOException {
+    getApplication(appAttemptId.getApplicationId(), 
+    TimelineEntity entity = null;
Why do we need getApplication(appAttemptId.getApplicationId(), 
ApplicationReportField.NONE) here? IMO, the only work here is to check if 
applicationId is valid, but we have check on appAttemptId later. so we may 
consider to remove it if unnecessary. In addition, may be 
ApplicationReportField.NONE is not useful?

-        new Path(conf.get(YarnConfiguration.FS_APPLICATION_HISTORY_STORE_URI));
+        new Path(conf.get(YarnConfiguration.FS_APPLICATION_HISTORY_STORE_URI,
+            conf.get("hadoop.tmp.dir") + "/yarn/timeline/generic-history"));
We should replace "hadoop.tmp.dir" and "/yarn/timeline/generic-history" with 
constant string in YarnConfiguration. BTW, "hadoop.tmp.dir" may not be 

In ApplicationContext.java,
    * @return {@link ApplicationReport} for the ApplicationId.
+   * @throws YarnException
    * @throws IOException
-  ApplicationReport getApplication(ApplicationId appId) throws IOException;
+  ApplicationReport getApplication(ApplicationId appId)
+      throws YarnException, IOException;
For public API (although marked as unstable), adding a new exception will break 
compatibility of RPC as old version client don't know how to deal with new 
exception. We'd better to keep the compatibility here with throwing IOException 
only and translating YarnExcetption to IOException inside server side.

In AppInfo.java,
-    ApplicationResourceUsageReport usage =
-        app.getApplicationResourceUsageReport();
-    if (usage != null) {
-      allocatedMB = usage.getUsedResources().getMemory();
-      allocatedVCores = usage.getUsedResources().getVirtualCores();
-    }
I am not sure if this change (and other changes in this class) is necessary. If 
not, we can remove it.

In YarnMetricsPublisher.java,
+  @SuppressWarnings({ "rawtypes", "unchecked" })
+  protected static class MultiThreadedDispatcher extends CompositeService
+      implements Dispatcher {
+    private List<AsyncDispatcher> dispatchers =
+        new ArrayList<AsyncDispatcher>();
We already have the same implementation of MultiThreadedDispatcher in 
RMApplicationHistoryWriter.java. We should consolidate two to an indepedent 
class rather than two inner classes.

More comments may comes later.

> Investigate merging generic-history into the Timeline Store
> -----------------------------------------------------------
>                 Key: YARN-2033
>                 URL: https://issues.apache.org/jira/browse/YARN-2033
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Vinod Kumar Vavilapalli
>            Assignee: Zhijie Shen
>         Attachments: ProposalofStoringYARNMetricsintotheTimelineStore.pdf, 
> YARN-2033.1.patch, YARN-2033.2.patch, YARN-2033.3.patch, YARN-2033.4.patch, 
> YARN-2033.5.patch, YARN-2033.Prototype.patch, YARN-2033_ALL.1.patch, 
> YARN-2033_ALL.2.patch, YARN-2033_ALL.3.patch, YARN-2033_ALL.4.patch
> Having two different stores isn't amicable to generic insights on what's 
> happening with applications. This is to investigate porting generic-history 
> into the Timeline Store.
> One goal is to try and retain most of the client side interfaces as close to 
> what we have today.

This message was sent by Atlassian JIRA

Reply via email to