[ 
https://issues.apache.org/jira/browse/YARN-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14359603#comment-14359603
 ] 

Sangjin Lee commented on YARN-3039:
-----------------------------------

[~djp], thanks for your super prompt reply and the update! I'm going to go over 
your reply and the new patch soon, but wanted to clarify one point. You said

{quote}
In steady state, there is no initialized delay because timelineServiceAddress 
is already there (in timelineClient). The cost here only happens for the first 
event when timelineClient start to post or after timelineServiceAddress get 
updated (for failure or other reasons). We design this to make sure 
TimelineClient can handle service discovery itself rather than letting caller 
to figure it out.
{quote}

I'm not sure I follow this part. Take the code in question, for example:

{code}
368       @Private
369       public void putObjects(String path, MultivaluedMap<String, String> params,
370           Object obj) throws IOException, YarnException {
371         
372         // timelineServiceAddress could haven't be initialized yet 
373         // or stale (only for new timeline service)
374         int retries = pollTimelineServiceAddress(this.maxServiceRetries);
375
{code}

The putObjects() method can be called in a steady state (i.e. long after the 
timeline service address is initialized), right? Then *don't we want to check 
if timelineServiceAddress is null* before proceeding to poll for it? Something 
like this (see lines 372 and 376):

{code}
368       @Private
369       public void putObjects(String path, MultivaluedMap<String, String> params,
370           Object obj) throws IOException, YarnException {
371         
372         if (timelineServiceAddress == null) {
373           // timelineServiceAddress could haven't be initialized yet 
374           // or stale (only for new timeline service)
375           int retries = pollTimelineServiceAddress(this.maxServiceRetries);
376         }
377
{code}

Without that null check, invocations of putObjects() would *always* call 
pollTimelineServiceAddress() even if timelineServiceAddress is already set, 
right? My understanding is that we shouldn't even poll if 
timelineServiceAddress is already populated. It is possible for the value to 
have changed, but that's covered by the later check when you handle the 
exception thrown while posting the entities. Did I miss something?
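
A minimal, self-contained sketch of the flow described above: poll only when 
no address is cached, and fall back to re-polling when a post fails. The class 
AddressCachingClient and the interfaces AddressLookup and Poster are 
hypothetical stand-ins for illustration, not the actual TimelineClient code in 
the patch, and a real poll would presumably wait between attempts.

```java
/** Hypothetical stand-in for the client discussed above; not the real TimelineClient. */
class AddressCachingClient {

    /** Hypothetical discovery hook; returns null while the address is unknown. */
    interface AddressLookup {
        String lookup();
    }

    /** Hypothetical transport hook; throws if the post to the address fails. */
    interface Poster {
        void post(String address, Object obj) throws java.io.IOException;
    }

    private final AddressLookup discovery;
    private final int maxRetries;
    private volatile String serviceAddress; // null until discovered

    AddressCachingClient(AddressLookup discovery, int maxRetries) {
        this.discovery = discovery;
        this.maxRetries = maxRetries;
    }

    /** Asks discovery up to maxRetries times; a real client would sleep between tries. */
    private void pollServiceAddress() throws java.io.IOException {
        for (int attempt = 0; attempt < maxRetries; attempt++) {
            String address = discovery.lookup();
            if (address != null) {
                serviceAddress = address;
                return;
            }
        }
        throw new java.io.IOException(
            "service address not available after " + maxRetries + " attempts");
    }

    void putObjects(Object obj, Poster poster) throws java.io.IOException {
        // Steady state: the address is cached, so no polling cost is paid here.
        if (serviceAddress == null) {
            pollServiceAddress();
        }
        try {
            poster.post(serviceAddress, obj);
        } catch (java.io.IOException e) {
            // The cached address may be stale: drop it, rediscover, retry once.
            serviceAddress = null;
            pollServiceAddress();
            poster.post(serviceAddress, obj);
        }
    }
}
```

With this shape, repeated putObjects() calls against a healthy service trigger 
a single lookup, and a changed address is only rediscovered on the failure 
path.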

> [Aggregator wireup] Implement ATS app-appgregator service discovery
> -------------------------------------------------------------------
>
>                 Key: YARN-3039
>                 URL: https://issues.apache.org/jira/browse/YARN-3039
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Sangjin Lee
>            Assignee: Junping Du
>         Attachments: Service Binding for applicationaggregator of ATS 
> (draft).pdf, Service Discovery For Application Aggregator of ATS (v2).pdf, 
> YARN-3039-no-test.patch, YARN-3039-v2-incomplete.patch, 
> YARN-3039-v3-core-changes-only.patch, YARN-3039-v4.patch, YARN-3039-v5.patch
>
>
> Per design in YARN-2928, implement ATS writer service discovery. This is 
> essential for off-node clients to send writes to the right ATS writer. This 
> should also handle the case of AM failures.



