[
https://issues.apache.org/jira/browse/YARN-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14336707#comment-14336707
]
Junping Du commented on YARN-3039:
----------------------------------
Thanks [~Naganarasimha] and [~rkanter] for review and comments!
bq. I feel the AM should be informed of the AggregatorAddr as early as
registration itself, rather than in ApplicationMasterService.allocate() as is
currently done.
That's a good point. Another idea (from Vinod in an offline discussion) is to
add a blocking call in AMRMClient to get the aggregator address directly from
the RM. AMRMClient could then be wrapped inside TimelineClient, so that
aggregator address discovery and aggregator failures are handled transparently.
Thoughts?
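To make that idea concrete, here is a minimal sketch of what the blocking call
and the wrapping could look like. None of this exists yet; getAggregatorAddress,
AggregatorDiscovery, and DiscoveringTimelineClient are all hypothetical names:
{code}
import java.io.IOException;
import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.exceptions.YarnException;

// Hypothetical sketch only - no such API exists today.
// The proposed blocking call on AMRMClient, expressed as an interface.
interface AggregatorDiscovery {
  // Blocks until the RM returns the current app aggregator address.
  String getAggregatorAddress(ApplicationId appId)
      throws YarnException, IOException;
}

// A timeline client wrapper that hides discovery from its callers.
class DiscoveringTimelineClient {
  private final AggregatorDiscovery discovery;
  private final ApplicationId appId;
  private volatile String cachedAddr;  // dropped and refreshed on failure

  DiscoveringTimelineClient(AggregatorDiscovery discovery,
      ApplicationId appId) {
    this.discovery = discovery;
    this.appId = appId;
  }

  // Callers never see the aggregator address or its failover.
  String resolveAggregator() throws YarnException, IOException {
    if (cachedAddr == null) {
      cachedAddr = discovery.getAggregatorAddress(appId);
    }
    return cachedAddr;
  }
}
{code}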
bq. For NMs too, would it be better to update during registration itself
(maybe recovered during recovery, not sure though)? Thoughts?
I think the NM case is slightly different here: the NM needs this knowledge
whenever the first container of the app gets allocated/launched, so getting it
updated in the heartbeat sounds good enough, doesn't it? In addition, if adding
a new API to AMRMClient is acceptable, the NM will use TimelineClient too and
can thus handle service discovery automatically.
bq. Was not clear about the source of RMAppEventType.AGGREGATOR_UPDATE. Based
on YARN-3030 (Aggregators collection through NM's Aux service),
PerNodeAggregatorServer (Aux service) launches AppLevelAggregatorService, so
will AppLevelAggregatorService inform the RM about the aggregator for the
application? And will the RM then inform the NM about the appAggregatorAddr as
part of the heartbeat response? If this is the flow, will there be a chance of
a race condition where, before the NM gets the appAggregatorAddr from the RM,
the NM might need to post some AM container entities/events?
I think we can discuss this flow in two scenarios: the first-time launch of the
app aggregator, and the app aggregator failing over to another NM.
For the first-time launch of the app aggregator, the NM aux service will bind
the app aggregator to the perNodeAggregator when the AM container gets
allocated (per YARN-3030). The NM will then notify the RM that this new
appAggregator is ready for use in its next heartbeat (missing in this patch).
After receiving this message from the NM, the RM will update its aggregator
list and send RMAppEventType.AGGREGATOR_UPDATE to trigger persisting the
updated aggregator list in the RMStateStore (for RM failover).
For the app aggregator failing over, the AM or the NMs (whoever called
putEntities through TimelineClient) will notify the RM of the failure. The RM
first verifies that this app aggregator is really out of service, then kicks
off rebinding the appAggregator to another NM's perNodeAggregatorService when
that NM's next heartbeat comes in. When it hears back from this new NM, the RM
does the same thing as in the first case.
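For illustration, the RM-side handling in the first case might look roughly
like the sketch below. Only RMAppEventType.AGGREGATOR_UPDATE comes from this
patch; the event subclass and the persistence helper are hypothetical names:
{code}
// Illustrative sketch of an RMAppImpl transition for
// RMAppEventType.AGGREGATOR_UPDATE (helper names are hypothetical).
private static final class AggregatorUpdateTransition extends RMAppTransition {
  @Override
  public void transition(RMAppImpl app, RMAppEvent event) {
    // Address reported by the NM that just bound the app aggregator.
    RMAppAggregatorUpdateEvent update = (RMAppAggregatorUpdateEvent) event;
    app.setAggregatorAddr(update.getAggregatorAddr());
    // Persist the new address via the RMStateStore so that, after RM
    // failover, the restarted RM can hand it back out to the AM and NMs.
    app.persistAggregatorAddr(update.getAggregatorAddr());
  }
}
{code}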
One gap here today: we launch the appAggregatorService (via the NM's auxiliary
service) whenever an AM container gets launched, no matter whether it is the
first launch or a reschedule after a failure. As in my earlier comments above,
an AM container failing over and being rescheduled to another NM does not have
to cause a rebind of the aggregator service, just as the app's aggregator going
out of service does not have to cause the AM container to be killed. So I think
the appAggregatorService should be launched by the NM automatically only for
the first attempt, and be taken care of by the RM for subsequent attempts.
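A rough sketch of that rule inside the NM aux service, assuming a hypothetical
launchAppLevelAggregator helper and that the context identifies the AM
container:
{code}
// Illustrative sketch: bind the app aggregator only for the first AM
// attempt; for rescheduled attempts, leave any rebind decision to the RM.
ApplicationAttemptId attemptId =
    context.getContainerId().getApplicationAttemptId();
if (attemptId.getAttemptId() > 1) {
  // The AM container was rescheduled after a failure: keep the existing
  // aggregator binding; the RM rebinds only if the aggregator itself
  // is out of service.
  return;
}
// First attempt: launch the app-level aggregator on this node.
launchAppLevelAggregator(attemptId.getApplicationId());
{code}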
About the race condition between NM heartbeats and posting entities: I don't
think posting entities should block any major logic, especially the NM
heartbeat. In addition, if we make TimelineClient handle service discovery
automatically, this situation will never happen. What do you think?
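As an illustration of "handled transparently": a put could re-resolve the
address and retry once on failure, so posting never blocks the heartbeat path.
Continuing the hypothetical sketch from above:
{code}
// Hypothetical sketch: on a failed write, assume the aggregator may have
// failed over, re-discover its address from the RM, and retry once.
public void putEntities(TimelineEntity... entities)
    throws IOException, YarnException {
  try {
    postTo(resolveAggregator(), entities);   // hypothetical HTTP/RPC post
  } catch (IOException e) {
    cachedAddr = null;                       // drop the stale address
    postTo(resolveAggregator(), entities);   // one retry after re-discovery
  }
}
{code}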
bq. Sorry for not commenting earlier. Thanks for taking this up Junping Du.
No worries. Thanks!
bq. Not using YARN-913 is fine if it's not going to make sense. I haven't
looked too closely at it either; it just sounded like it might be helpful here.
Agreed. My feeling now is that service discovery is tightly coupled with
service lifecycle management. Our app aggregator service does not live inside a
dedicated container but has many deployment options, and its consumers include
YARN components, not only the AM. So I think YARN-913 may not be the best fit
at this moment.
[~ste...@apache.org] is the main author of YARN-913. Steve, do you have any
comments here?
bq. Given that a particular NM is only interested in the Applications that are
running on it, is there some way to have it only receive the aggregator info
for those apps? This would decrease the amount of "throw away" data that gets
sent.
In the current patch, the RM only sends an NM the aggregator list for the
active apps on that node. Please check the code in ResourceTrackerService:
{code}
ConcurrentMap<ApplicationId, String> liveAppAggregatorsMap =
    new ConcurrentHashMap<ApplicationId, String>();
List<ApplicationId> keepAliveApps =
    remoteNodeStatus.getKeepAliveApplications();
if (keepAliveApps != null) {
  ConcurrentMap<ApplicationId, RMApp> rmApps = rmContext.getRMApps();
  for (ApplicationId appId : keepAliveApps) {
    String appAggregatorAddr = rmApps.get(appId).getAggregatorAddr();
    if (appAggregatorAddr != null) {
      liveAppAggregatorsMap.put(appId, appAggregatorAddr);
    } else {
      // Log a debug message if the aggregator address is not found.
      if (LOG.isDebugEnabled()) {
        LOG.debug("Aggregator for application: " + appId
            + " hasn't registered yet!");
      }
    }
  }
  nodeHeartBeatResponse.setAppAggregatorsMap(liveAppAggregatorsMap);
}
{code}
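For the consuming side, the NM could apply this map from the heartbeat response
roughly as follows (knownAggregators is a hypothetical NM-side cache):
{code}
// Illustrative sketch of the NM side: cache the per-app aggregator
// addresses from the heartbeat response so containers on this node can
// post entities without asking the RM each time.
Map<ApplicationId, String> appAggregators = response.getAppAggregatorsMap();
if (appAggregators != null) {
  knownAggregators.putAll(appAggregators);  // hypothetical NM-side cache
}
{code}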
In addition, as a reply to [~zjshen]'s comments above, we could even improve
this so that the NM is sent only the appAggregators it is interested in - this
assumes the NM can detect the failure of an aggregator, which may not be true
if we wrap everything in TimelineClient (with the new API in AMRMClient to
retrieve the aggregator address).
bq. Also, can you update the design doc? Looking at the patch, it seems like
some things have changed. (e.g. it's using protobufs instead of REST; which I
think makes more sense here anyway).
I will. Many things need to be updated, and many details are becoming clearer.
> [Aggregator wireup] Implement ATS writer service discovery
> ----------------------------------------------------------
>
> Key: YARN-3039
> URL: https://issues.apache.org/jira/browse/YARN-3039
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: timelineserver
> Reporter: Sangjin Lee
> Assignee: Junping Du
> Attachments: Service Binding for applicationaggregator of ATS
> (draft).pdf, YARN-3039-no-test.patch
>
>
> Per design in YARN-2928, implement ATS writer service discovery. This is
> essential for off-node clients to send writes to the right ATS writer. This
> should also handle the case of AM failures.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)