[ 
https://issues.apache.org/jira/browse/YARN-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14336707#comment-14336707
 ] 

Junping Du commented on YARN-3039:
----------------------------------

Thanks [~Naganarasimha] and [~rkanter] for the review and comments!

bq. I feel AM should be informed of AggregatorAddr as early as register itself 
than currently being done in ApplicationMasterService.allocate().
That's a good point. Another idea (from Vinod, in an offline discussion) is to 
add a blocking call in AMRMClient to get the aggregator address directly from 
the RM. The AMRMClient call can then be wrapped inside TimelineClient, so that 
a missing aggregator address or an aggregator failure can be handled 
transparently. Thoughts?
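To make it concrete, here is a rough sketch of what I have in mind 
(getAggregatorAddr() is the proposed new call, it does not exist in AMRMClient 
today; all other names here are illustrative only):
{code}
// Rough sketch only: a TimelineClient wrapping the proposed blocking
// AMRMClient call, so callers never deal with the aggregator address.
public class AggregatorAwareTimelineClient {
  private final AMRMClient<ContainerRequest> amRmClient; // would expose the new call
  private final ApplicationId appId;
  private volatile String aggregatorAddr; // cached address of the app aggregator

  public AggregatorAwareTimelineClient(
      AMRMClient<ContainerRequest> amRmClient, ApplicationId appId) {
    this.amRmClient = amRmClient;
    this.appId = appId;
  }

  public void putEntities(TimelineEntity... entities)
      throws IOException, YarnException {
    if (aggregatorAddr == null) {
      // Proposed blocking call: ask the RM where the app aggregator lives.
      aggregatorAddr = amRmClient.getAggregatorAddr(appId);
    }
    try {
      postTo(aggregatorAddr, entities);
    } catch (IOException e) {
      // The aggregator may have failed over; re-resolve once and retry.
      aggregatorAddr = amRmClient.getAggregatorAddr(appId);
      postTo(aggregatorAddr, entities);
    }
  }
  // postTo(...) would do the actual put to the aggregator endpoint.
}
{code}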

bq. For NM's too, would it be better to update during registering itself (may 
be recovered during recovery, not sure though) thoughts?
I think the NM case is slightly different: the NM only needs this knowledge 
once the first container of the app gets allocated/launched on it, so updating 
it via the heartbeat sounds good enough, doesn't it? In addition, if adding a 
new API to AMRMClient is acceptable, the NM will use TimelineClient too, and 
can therefore handle service discovery automatically.
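On the NM side, consuming the heartbeat response could then be as simple as 
the following (sketch only; getAppAggregatorsMap() mirrors the setter the 
patch adds to the heartbeat response, the surrounding code is illustrative):
{code}
// Illustrative NM-side handling (not part of this patch): merge the
// per-app aggregator addresses from the heartbeat response into a map
// that the NM's TimelineClient looks up before posting entities.
private final ConcurrentMap<ApplicationId, String> appAggregators =
    new ConcurrentHashMap<ApplicationId, String>();

private void updateAggregators(NodeHeartbeatResponse response) {
  Map<ApplicationId, String> updated = response.getAppAggregatorsMap();
  if (updated != null) {
    // An aggregator may have moved to another node, so overwrite entries.
    appAggregators.putAll(updated);
  }
}
{code}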


bq. Was not clear about source of RMAppEventType.AGGREGATOR_UPDATE. Based on 
YARN-3030 (Aggregators collection through NM's Aux service), 
PerNodeAggregatorServer(Aux service) launches AppLevelAggregatorService, so 
will AppLevelAggregatorService inform RM about the aggregator for the 
application? and then RM will inform NM about the appAggregatorAddr as part of 
heart beat response ? if this is the flow will there be chances of race 
condition where in before NM gets appAggregatorAddr from RM, NM might require 
to post some AM container Entities/events?
I think we can discuss this flow in two scenarios: the first-time launch of 
the app aggregator, and the app aggregator failing over to another NM.
For the first-time launch of the app aggregator, the NM aux service will bind 
the app aggregator to the perNodeAggregator when the AM container gets 
allocated (per YARN-3030). The NM will then notify the RM that this new 
appAggregator is ready for use in its next heartbeat (still missing in this 
patch). After receiving this message from the NM, the RM will update its 
aggregator list and send RMAppEventType.AGGREGATOR_UPDATE to trigger 
persisting the updated aggregator list in the RMStateStore (for RM failover).
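In code, the RM side of that first-launch step might look roughly like this 
(just a sketch; onAggregatorRegistered() is a made-up name, only 
RMAppEventType.AGGREGATOR_UPDATE comes from the patch):
{code}
// Illustrative sketch of the RM reacting to an NM heartbeat that reports
// a newly bound app aggregator (method name is hypothetical):
void onAggregatorRegistered(ApplicationId appId, String aggregatorAddr) {
  RMApp app = rmContext.getRMApps().get(appId);
  if (app != null) {
    // Record the new address, then fire AGGREGATOR_UPDATE so the RMApp
    // transition persists the updated aggregator list in the RMStateStore
    // (needed to survive RM failover).
    rmContext.getDispatcher().getEventHandler().handle(
        new RMAppEvent(appId, RMAppEventType.AGGREGATOR_UPDATE));
  }
}
{code}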
For the app aggregator failing over: the AM or the NMs (whoever called 
putEntities via timelineClient) will notify the RM of the failure. The RM 
first verifies that this app aggregator is really out of service, then kicks 
off rebinding the appAggregator to another NM's perNodeAggregatorService when 
that NM's next heartbeat comes in. When it hears back from the new NM, the RM 
does the same thing as in the first case.
One gap here today is that we launch the appAggregatorService (via the NM's 
auxiliary service) whenever an AM container gets launched, whether it is the 
first launch or a reschedule after a failure. As in my earlier comments above: 
an AM container that fails over and gets rescheduled to another NM does not 
necessarily have to cause a rebind of the aggregator service, just like an 
out-of-service app aggregator does not necessarily cause the AM container to 
get killed. So I think the appAggregatorService should be launched 
automatically by the NM only for the first attempt, and be taken care of by 
the RM for subsequent attempts.
About the race condition between the NM heartbeat and posting entities: I 
don't think posting entities should block any major logic, especially the NM 
heartbeat. In addition, if we make TimelineClient handle service discovery 
automatically, this will never happen. What do you think?
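For the failover case, the verification step on the RM could look like the 
following (again just a sketch; reportAggregatorFailure(), isReachable() and 
pendingRebinds are all made-up names, only RMApp.getAggregatorAddr() is from 
the patch):
{code}
// Illustrative sketch of the RM verifying a reported aggregator failure
// before rebinding (all names hypothetical):
void reportAggregatorFailure(ApplicationId appId, String reportedAddr) {
  RMApp app = rmContext.getRMApps().get(appId);
  // Ignore stale reports: act only if the reported address is still the
  // one the RM knows about, and it is really unreachable.
  if (app != null && reportedAddr.equals(app.getAggregatorAddr())
      && !isReachable(reportedAddr)) {
    // Queue a rebind: the next heartbeat from a chosen NM will carry a
    // request to bind this app to its perNodeAggregatorService.
    pendingRebinds.add(appId);
  }
}
{code}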

bq. Sorry for not commenting earlier. Thanks for taking this up Junping Du.
No worries. Thanks!

bq. Not using YARN-913 is fine if it's not going to make sense. I haven't 
looked too closely at it either; it just sounded like it might be helpful here.
Agreed. My feeling now is that service discovery there is tightly coupled with 
service lifecycle management. Our app aggregator service does not live inside 
a dedicated container but has many deployment options, and its consumers 
include YARN components, not only the AM. So I think YARN-913 may not be the 
best fit at this moment.
 [~ste...@apache.org] is the main author of YARN-913. Steve, do you have any 
comments here?

bq. Given that a particular NM is only interested in the Applications that are 
running on it, is there some way to have it only receive the aggregator info 
for those apps? This would decrease the amount of "throw away" data that gets 
sent.
In the current patch, the RM only sends an NM the aggregator list for apps 
active on that node. Please check the code in ResourceTrackerService:
{code}
+    ConcurrentMap<ApplicationId, String> liveAppAggregatorsMap =
+        new ConcurrentHashMap<ApplicationId, String>();
+    List<ApplicationId> keepAliveApps =
+        remoteNodeStatus.getKeepAliveApplications();
+    if (keepAliveApps != null) {
+      ConcurrentMap<ApplicationId, RMApp> rmApps = rmContext.getRMApps();
+      for (ApplicationId appId : keepAliveApps) {
+        String appAggregatorAddr = rmApps.get(appId).getAggregatorAddr();
+        if (appAggregatorAddr != null) {
+          liveAppAggregatorsMap.put(appId, appAggregatorAddr);
+        } else {
+          // Log a debug message if the aggregator address is not found.
+          if (LOG.isDebugEnabled()) {
+            LOG.debug("Aggregator for application: " + appId
+                + " hasn't registered yet!");
+          }
+        }
+      }
+      nodeHeartBeatResponse.setAppAggregatorsMap(liveAppAggregatorsMap);
+    }
{code}
In addition, as replied to [~zjshen]'s comments above, we can improve this 
further by only sending the appAggregators the NM is interested in - though 
that assumes the NM can detect an aggregator failure, which may not hold if we 
wrap everything inside TimelineClient (with the new API in AMRMClient to 
retrieve the aggregator address).

bq. Also, can you update the design doc? Looking at the patch, it seems like 
some things have changed. (e.g. it's using protobufs instead of REST; which I 
think makes more sense here anyway).
I will. Many things need to be updated, and many details are becoming clearer.

> [Aggregator wireup] Implement ATS writer service discovery
> ----------------------------------------------------------
>
>                 Key: YARN-3039
>                 URL: https://issues.apache.org/jira/browse/YARN-3039
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Sangjin Lee
>            Assignee: Junping Du
>         Attachments: Service Binding for applicationaggregator of ATS 
> (draft).pdf, YARN-3039-no-test.patch
>
>
> Per design in YARN-2928, implement ATS writer service discovery. This is 
> essential for off-node clients to send writes to the right ATS writer. This 
> should also handle the case of AM failures.


