Junping Du commented on YARN-3045:

Thanks [~sjlee0] and [~Naganarasimha] for the quick replies.
bq. If these events are attributes of applications, then they should be on the 
application entities. If I want to find out all events for some application, 
then I should be able to query only the application entity and get all events.
Some of these events relate to both the application and the NodeManager. We can 
claim that they belong to the application, but some events are too detailed for 
the application and would be of more interest to YARN daemons. I understand 
that our design is application centric now, but it should be generic enough to 
store and retrieve YARN daemon centric entities later. Anyway, until NM/RM come 
on board as first-class consumers of ATSv2, I am fine with making them 
application events.

bq. The need to have NodeManagerEntity is something different IMO. Note that 
today there are challenges in emitting data without any application context 
(e.g. node manager's configuration) as we discussed a few times. If we need to 
support that, that needs a different discussion.
I see. I remember seeing a JIRA for getting rid of the application context, but 
I cannot find it now. If we don't have one, how about moving this discussion to 
YARN-3959? The original scope of that JIRA is application-related configuration 
only, but we could extend it to include daemon configuration if necessary.

bq. my assumption was that the sync/async distinction from the client 
perspective mapped to whether the writer may be flushed or not. If not, then we 
need to support a 2x2 matrix of possibilities: sync put w/ flush, sync put w/o 
flush, async put w/ flush, and async put w/o flush. I thought it would be a 
simplifying assumption to align those dimensions.
I think we can simplify the 2x2 matrix by omitting the case of sync put w/o 
flush, as I cannot think of a valid case where an ack from TimelineCollector 
without a flush would help. The remaining three cases sound solid to me. To let 
TimelineCollector identify flush strategies for async calls, we may need to set 
a severity on the entities being put and configure TimelineCollector to flush 
only entities above a specific severity, just as a log level does.
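
A minimal sketch of the severity-based flush idea above, only to make the three 
remaining cases concrete. Everything here is hypothetical: the Severity enum, 
the Collector class, and its put() signature are illustrative names and are not 
part of the ATSv2 API. The sketch also assumes that "above a specific severity" 
means at or above a configured threshold.

```java
import java.util.ArrayList;
import java.util.List;

public class SeverityFlushSketch {
    // Hypothetical severity levels, ordered from least to most important.
    enum Severity { LOW, NORMAL, CRITICAL }

    // Hypothetical stand-in for a collector in front of a buffered writer.
    static class Collector {
        private final Severity flushThreshold;
        private final List<String> buffer = new ArrayList<>();
        private final List<String> flushed = new ArrayList<>();

        Collector(Severity flushThreshold) {
            this.flushThreshold = flushThreshold;
        }

        // Buffers the entity and returns true if this put triggered a flush.
        // A sync put always flushes (sync w/o flush is the omitted case);
        // an async put flushes only at or above the configured threshold.
        boolean put(String entity, Severity severity, boolean sync) {
            buffer.add(entity);
            boolean flush = sync || severity.compareTo(flushThreshold) >= 0;
            if (flush) {
                flushed.addAll(buffer);
                buffer.clear();
            }
            return flush;
        }

        int buffered() {
            return buffer.size();
        }

        int flushedCount() {
            return flushed.size();
        }
    }

    public static void main(String[] args) {
        Collector collector = new Collector(Severity.CRITICAL);
        // async put w/o flush
        System.out.println(collector.put("metric", Severity.LOW, false));
        // async put w/ flush
        System.out.println(collector.put("container-failed", Severity.CRITICAL, false));
        // sync put w/ flush
        System.out.println(collector.put("app-finished", Severity.LOW, true));
    }
}
```

With this shape the client only states how important an entity is, and the 
flush decision stays a collector-side configuration, much like a log level.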

bq. I was under the impression that YARN-3367 is only for invoking REST calls 
in nonblocking way and thus avoiding threads in the clients. Is it also related 
to flush when called only putEntities and not on putEntitiesAsync?
You are right that the goal of YARN-3367 is to get rid of blocking calls when 
putting entities, no matter whether the client calls putEntities() or something 
else. putEntitiesAsync() is exactly what we need, and using putEntities() 
should be rare once we have putEntitiesAsync(), except where client logic 
relies tightly on the returned results.
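
To illustrate the distinction, here is a rough sketch of a client where the 
blocking call is just the async call plus a wait. SketchClient, send(), and the 
return types are hypothetical and chosen for the example; they do not reflect 
the actual TimelineClient signatures or what YARN-3367 will implement.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class AsyncPutSketch {
    // Hypothetical client; names mirror the discussion, not the real API.
    static class SketchClient implements AutoCloseable {
        private final ExecutorService sender = Executors.newSingleThreadExecutor();

        // Stand-in for the REST call; returns the number of entities accepted.
        private int send(List<String> entities) {
            return entities.size();
        }

        // Non-blocking: hands the entities to a sender thread and returns at once.
        CompletableFuture<Integer> putEntitiesAsync(List<String> entities) {
            return CompletableFuture.supplyAsync(() -> send(entities), sender);
        }

        // Blocking: only for callers that need the result before continuing.
        int putEntities(List<String> entities) {
            return putEntitiesAsync(entities).join();
        }

        @Override
        public void close() {
            // Already submitted puts still complete before the thread exits.
            sender.shutdown();
        }
    }

    public static void main(String[] args) {
        try (SketchClient client = new SketchClient()) {
            // Common case: fire and continue without waiting.
            client.putEntitiesAsync(Arrays.asList("container-started"));
            // Rare case: the caller depends on the returned result.
            int accepted = client.putEntities(Arrays.asList("app-created", "app-finished"));
            System.out.println("accepted=" + accepted);
        }
    }
}
```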

bq. I see currently "async" parameter as part of REST request is ignored now, 
so i thought based on this param we may need to further flush the writer or is 
your thoughts similar to support 2*2 matrix as Sangjin was informing?
Actually, per my comments above, I would prefer the (2*2 - 1) approach. :) To 
speed up this JIRA's progress, I am fine with continuing to ignore the 
sync/async parameter and doing everything async for now, leaving the rest to a 
dedicated JIRA to figure out.

Will look at the latest patch soon.

> [Event producers] Implement NM writing container lifecycle events to ATS
> ------------------------------------------------------------------------
>                 Key: YARN-3045
>                 URL: https://issues.apache.org/jira/browse/YARN-3045
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Sangjin Lee
>            Assignee: Naganarasimha G R
>         Attachments: YARN-3045-YARN-2928.002.patch, 
> YARN-3045-YARN-2928.003.patch, YARN-3045-YARN-2928.004.patch, 
> YARN-3045-YARN-2928.005.patch, YARN-3045-YARN-2928.006.patch, 
> YARN-3045-YARN-2928.007.patch, YARN-3045-YARN-2928.008.patch, 
> YARN-3045.20150420-1.patch
> Per design in YARN-2928, implement NM writing container lifecycle events and 
> container system metrics to ATS.
