Sangjin Lee commented on YARN-3039:
Sorry for chiming in late. Some of the questions may have been addressed
already, but I'll add my 2 cents.
bq. What could differ across the modes you mentioned above is who does the
registration and how. I would prefer that some other JIRA, like YARN-3033,
address these differences. Thoughts?
That sounds fine.
bq. We need the RM to write some initial app info on its own. However, do we
expect the RM to write all app-specific info or just the info at the beginning?
We have a similar case in launching an app's containers: the first (AM)
container gets launched by the RM, but subsequent containers get launched by
the AM. Do we want to follow this pattern if we want to consolidate all app
info with only one app aggregator?
bq. Didn't we want a singleton app aggregator for all app-related events, logs,
etc.? Ideally, only this singleton aggregator would have the logic to sort out
app info during aggregation. If not, we could even give up the current flow
"NM(s) -> app aggregator (deployed on one NM) -> backend" and let the NMs talk
to the backend directly to save a hop. Can you clarify more on this?
All the application lifecycle events (app state transitions) should be written
*directly* by the RM. The main reason is that the app-level aggregator may not
even be up when the application lifecycle starts. So it
seems pretty natural for the RM to be in charge of handling application
lifecycle events and writing them directly (it would be even more awkward to
have the RM write the first lifecycle event directly, and all subsequent ones
through the app-level aggregator).
The container lifecycle events (container state transitions) should be written
by the respective NMs that handled the container state transitions through the
right app-level aggregator. So, all the app-related writes *do* go through the
app-level aggregator. The only exception is the RM directly writing the
application lifecycle events.
As an aside, I'd like to note that as a rule all "system" events (YARN-generic
events that pertain to lifecycles, etc.) must be written by YARN daemons
(directly or through the app-level aggregator), and they should *not* be
written by the AMs, because we cannot rely on the AMs to write them.
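To make the rule above concrete, here is a rough sketch of the routing as I
described it. Every name in it (TimelineWriteRouting, Writer, EventKind, Path,
route) is made up for illustration and is not an existing YARN class or API.
Combinations the comment does not cover (e.g. the RM writing container events)
are simply rejected here, which is an assumption of the sketch rather than a
design decision.

```java
/** Illustrative only: hypothetical names, not actual YARN classes. */
public class TimelineWriteRouting {

    /** Who is attempting the write. */
    enum Writer { RM, NM, AM }

    /** The two kinds of "system" events discussed in this comment. */
    enum EventKind { APP_LIFECYCLE, CONTAINER_LIFECYCLE }

    /** How the write reaches the backend, if it is allowed at all. */
    enum Path { DIRECT_TO_BACKEND, VIA_APP_AGGREGATOR, NOT_ALLOWED }

    static Path route(Writer writer, EventKind kind) {
        // System events must never be written by the AM.
        if (writer == Writer.AM) {
            return Path.NOT_ALLOWED;
        }
        switch (kind) {
            case APP_LIFECYCLE:
                // The app-level aggregator may not be up yet when the app
                // lifecycle starts, so the RM writes these directly.
                return writer == Writer.RM
                        ? Path.DIRECT_TO_BACKEND : Path.NOT_ALLOWED;
            case CONTAINER_LIFECYCLE:
                // The NM that handled the transition writes through the
                // app-level aggregator.
                return writer == Writer.NM
                        ? Path.VIA_APP_AGGREGATOR : Path.NOT_ALLOWED;
            default:
                return Path.NOT_ALLOWED;
        }
    }

    public static void main(String[] args) {
        // Print the full routing table for every writer/event combination.
        for (Writer w : Writer.values()) {
            for (EventKind k : EventKind.values()) {
                System.out.println(w + " + " + k + " -> " + route(w, k));
            }
        }
    }
}
```

The only direct-to-backend path in this sketch is the RM writing application
lifecycle events; everything else either goes through the app-level aggregator
or is rejected.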
Does that clarify the point?
> [Aggregator wireup] Implement ATS app-appgregator service discovery
> Key: YARN-3039
> URL: https://issues.apache.org/jira/browse/YARN-3039
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: timelineserver
> Reporter: Sangjin Lee
> Assignee: Junping Du
> Attachments: Service Binding for applicationaggregator of ATS
> (draft).pdf, Service Discovery For Application Aggregator of ATS (v2).pdf,
> YARN-3039-no-test.patch, YARN-3039-v2-incomplete.patch,
> Per design in YARN-2928, implement ATS writer service discovery. This is
> essential for off-node clients to send writes to the right ATS writer. This
> should also handle the case of AM failures.