[ 
https://issues.apache.org/jira/browse/YARN-4356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15049039#comment-15049039
 ] 

Junping Du commented on YARN-4356:
----------------------------------

Thanks [~sjlee0] for delivering the patch. 004 patch looks good to me in 
overall, but some minor comments:
In JobHistoryEventHandler.java,
{code}
LOG.info("Emitting job history data to the timeline server is enabled");
{code}
server => service, given we don't provide a centralized server in v2.

{code} 
+        LOG.info("Timeline service is enabled; version: " +
+            (timelineServiceV2Enabled? "v2" : "v1"));
{code}
Shall we get configuration version from configuration defined in YARN-3623? 
Especially, we have other versions, like: v1.5.

TestMRTimelineEventHandling.java,
{code}
// enable new timeline serivce
{code}
typo: serivce => service

YarnConfiguration.java,
{code}
@return whether the timelien service is enabled.
{code}
typo: timelien => timeline

{code}
+  public static boolean timelineServiceV2Enabled(Configuration conf) {
+    return timelineServiceEnabled(conf) && getTimelineServiceVersion(conf) == 
2;
+  }
{code}
Would it be possible to have ATS v2.5 in future? If so, may be we should cast 
float number get from version config before comparing with 2?


In NodeHeartbeatRequestPBImpl.java,
{code}
+      this.registeredCollectors = new HashMap<>();
Update new HashMap<>() => new HashMap<ApplicationId, String> ()
{code}

and the same problem in NodeHeartbeatResponsePBImpl.java
{code}
this.appCollectorsMap = new HashMap<>();
{code}

In NodeManager.java,
{code}
this.registeredCollectors = new ConcurrentHashMap<>();
{code}
We should also add back types as requirement/convension for generics.

{code}
+      if (this.registeredCollectors != null) {
+        this.registeredCollectors.putAll(newRegisteredCollectors);
+      }
{code}
This check of null is unnecessary as the only caller - NMCollectorService is 
only running under v2 is enabled. If for some reason, we get NPE here which is 
still better than we ignore it silently.

In NodeStatusUpdaterImpl.java,
{code}
-      /**
-       * Caller should take care of sending non null nodelabels for both
-       * arguments
-       * 
-       * @param nodeLabelsNew
-       * @param nodeLabelsOld
-       * @return if the New node labels are diff from the older one.
-       */
-      private boolean areNodeLabelsUpdated(Set<NodeLabel> nodeLabelsNew,
-          Set<NodeLabel> nodeLabelsOld) {
-        if (nodeLabelsNew.size() != nodeLabelsOld.size()
-            || !nodeLabelsOld.containsAll(nodeLabelsNew)) {
-          return true;
-        }
-        return false;
-      }
{code}
Please remove this unrelated change out for more focus and better tracking.

{code}
+                !context.getRegisteredCollectors().containsKey(appId)) {
{code}
I think this logic could be problemtic if collector address get updated due to 
NM restart or collector service failure. However, this shouldn't be addressed 
in this JIRA. But kindly add a TODO would a good reminder. 

In ContainerManagerImpl.java,
{code}
+        } else {
+          flowContext = null;
         }
{code}
this else branch is not necessary as we can define flowContext to be null at 
the beginning.

In ResourceManager.java,
{code}
+      if (version < 2 &&
...
+      } else if (version == 2 &&
{code}
Can we continuously to use YarnConfiguration.timelineServiceV2Enabled() in 
every place? Or we could miss these places if we need to change version logic 
in future.

In ResourceTrackerService.java,
{code}
     List<ApplicationId> keepAliveApps =
         remoteNodeStatus.getKeepAliveApplications();
-    if (keepAliveApps != null) {
+    if (timelineV2Enabled && keepAliveApps != null) {
{code}
Just a reminder, keepAliveApps is a wrong list to identify running apps on 
specific node. YARN-3586 (with patch) is already filed to fix this. We can 
either merge that patch in or rebase that patch when this patch done.

In TimelineServiceV2Publisher.java,
{code}
- * This class is responsible for posting application, appattempt & Container
+ * This class is responsible for posting application, appattempt &amp; 
Container
{code}
Why we need this change?

In PerNodeTimelineCollectorsAuxService.java,
{code}
+      // enable timeline service v.2
+      conf.setBoolean(YarnConfiguration.TIMELINE_SERVICE_ENABLED, true);
+      conf.setFloat(YarnConfiguration.TIMELINE_SERVICE_VERSION, 2.0f);
{code}
We should disable PerNodeTimelineCollectorsAuxService if we don't enable 
timeline service v2. Isn't it? If so, I think this is not a necessary change 
and we should remove.

In TimelineReaderServer.java,
{code}
+    YarnConfiguration conf = new YarnConfiguration();
+    // enable timeline service v.2
+    conf.setBoolean(YarnConfiguration.TIMELINE_SERVICE_ENABLED, true);
+    conf.setFloat(YarnConfiguration.TIMELINE_SERVICE_VERSION, 2.0f);
{code}
The same question with above.

In addition, I think we should split the change that duplicated with YARN-3623 
and cherry-pick it from trunk/branch-2 when that patch get commit in.

> ensure the timeline service v.2 is disabled cleanly and has no impact when 
> it's turned off
> ------------------------------------------------------------------------------------------
>
>                 Key: YARN-4356
>                 URL: https://issues.apache.org/jira/browse/YARN-4356
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>    Affects Versions: YARN-2928
>            Reporter: Sangjin Lee
>            Assignee: Sangjin Lee
>            Priority: Critical
>              Labels: yarn-2928-1st-milestone
>         Attachments: YARN-4356-feature-YARN-2928.002.patch, 
> YARN-4356-feature-YARN-2928.003.patch, YARN-4356-feature-YARN-2928.004.patch, 
> YARN-4356-feature-YARN-2928.poc.001.patch
>
>
> For us to be able to merge the first milestone drop to trunk, we want to 
> ensure that once disabled the timeline service v.2 has no impact from the 
> server side to the client side. If the timeline service is not enabled, no 
> action should be done. If v.1 is enabled but not v.2, v.1 should behave the 
> same as it does before the merge.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to