Sangjin Lee commented on YARN-3039:

Sorry for chiming in late. Some of the questions may have been addressed 
already, but I'll add my 2 cents.

bq. What could differ among the modes you mentioned above is who does the 
registration and how. I would prefer that some other JIRA, such as YARN-3033, 
address these differences. Thoughts?
That sounds fine.

bq. We need the RM to write some initial app info on its own. However, do we 
expect the RM to write all app-specific info, or only the info at the 
beginning? We have a similar case in launching an app's containers: the first 
AM container gets launched by the RM, but subsequent containers get launched 
by the AM. Do we want to follow this pattern if we want to consolidate all app 
info with only one app aggregator?

bq. Didn't we want a singleton app aggregator for all app-related events, 
logs, etc.? Ideally, only this singleton aggregator would have the magic to 
sort out app info during aggregation. If not, we could even give up the 
current flow "NM(s) -> app aggregator (deployed on one NM) -> backend" and let 
the NMs talk to the backend directly to save a hop. Can you clarify more on 
this?
All the application lifecycle events (app state transitions) should be written 
*directly* by the RM. The main reason is that the app-level aggregator may not 
even be up when the application lifecycle starts. So it 
seems pretty natural for the RM to be in charge of handling application 
lifecycle events and writing them directly (it would be even more awkward to 
have the RM write the first lifecycle event directly, and all subsequent ones 
through the app-level aggregator).

The container lifecycle events (container state transitions) should be written 
by the respective NMs that handled the container state transitions through the 
right app-level aggregator. So, all the app-related writes *do* go through the 
app-level aggregator. The only exception is the RM directly writing the 
application lifecycle events.

As an aside, I'd like to note that as a rule all "system" events (YARN-generic 
events that pertain to lifecycles, etc.) must be written by YARN daemons 
(directly or through the app-level aggregator), and they should *not* be 
written by the AMs, as we cannot rely on the AMs to write them.
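
To make the rule concrete, here is a minimal sketch of the routing described 
above. It is only an illustration: the class, enum, and method names are 
hypothetical and are not part of any YARN API or of the attached patches, and 
the NOT_DESIGNATED_WRITER outcome is an assumption of the sketch, used to mark 
combinations that the rule does not allow.

```java
/** Hypothetical sketch of the proposed write routing; not a real YARN API. */
final class TimelineWriteRouting {

    /** Who is attempting the write. */
    enum Source { RM, NM, AM }

    /** What kind of data is being written. */
    enum EventKind { APP_LIFECYCLE, CONTAINER_LIFECYCLE, APP_SPECIFIC }

    /** Where the write goes. NOT_DESIGNATED_WRITER is an assumption of this sketch. */
    enum Path { DIRECT_TO_BACKEND, VIA_APP_AGGREGATOR, NOT_DESIGNATED_WRITER }

    static Path route(Source source, EventKind kind) {
        switch (kind) {
            case APP_LIFECYCLE:
                // The app-level aggregator may not be up yet when the
                // application lifecycle starts, so the RM writes directly.
                return source == Source.RM
                        ? Path.DIRECT_TO_BACKEND
                        : Path.NOT_DESIGNATED_WRITER;
            case CONTAINER_LIFECYCLE:
                // Written by the NM that handled the transition, through
                // the app-level aggregator. AMs are never relied upon here.
                return source == Source.NM
                        ? Path.VIA_APP_AGGREGATOR
                        : Path.NOT_DESIGNATED_WRITER;
            default:
                // All other app-related writes go through the aggregator.
                return Path.VIA_APP_AGGREGATOR;
        }
    }

    public static void main(String[] args) {
        // Print the routing decision for every source/kind combination.
        for (Source s : Source.values()) {
            for (EventKind k : EventKind.values()) {
                System.out.println(s + " + " + k + " -> " + route(s, k));
            }
        }
    }
}
```

The point of the sketch is that the RM's direct write of application lifecycle 
events is the single exception; everything else app-related funnels through 
the app-level aggregator.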

Does that clarify the point?

> [Aggregator wireup] Implement ATS app-appgregator service discovery
> -------------------------------------------------------------------
>                 Key: YARN-3039
>                 URL: https://issues.apache.org/jira/browse/YARN-3039
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Sangjin Lee
>            Assignee: Junping Du
>         Attachments: Service Binding for applicationaggregator of ATS 
> (draft).pdf, Service Discovery For Application Aggregator of ATS (v2).pdf, 
> YARN-3039-no-test.patch, YARN-3039-v2-incomplete.patch, 
> YARN-3039-v3-core-changes-only.patch
> Per design in YARN-2928, implement ATS writer service discovery. This is 
> essential for off-node clients to send writes to the right ATS writer. This 
> should also handle the case of AM failures.

This message was sent by Atlassian JIRA
