Vinod Kumar Vavilapalli commented on YARN-3044:

It's not just events. We need to model entities, their information, events, 
metrics and configuration. The obvious list of entities is Application, 
ApplicationAttempt and Container.
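
To make the modelling point concrete, here is a minimal sketch of one entity record carrying all five kinds of data. Every class and field name below is hypothetical and chosen only for illustration; this is not the ATS data model or API.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class EntityType(Enum):
    # The three entity kinds named in the comment.
    APPLICATION = "Application"
    APPLICATION_ATTEMPT = "ApplicationAttempt"
    CONTAINER = "Container"


@dataclass
class Event:
    # Hypothetical event shape: a name plus a timestamp.
    name: str
    timestamp_ms: int


@dataclass
class Entity:
    # Hypothetical record grouping info, events, metrics and configuration.
    entity_type: EntityType
    entity_id: str
    info: Dict[str, str] = field(default_factory=dict)
    events: List[Event] = field(default_factory=list)
    metrics: Dict[str, float] = field(default_factory=dict)
    config: Dict[str, str] = field(default_factory=dict)


# Made-up identifier and values, for illustration only.
app = Entity(EntityType.APPLICATION, "app_0001")
app.events.append(Event("CREATED", 0))
app.metrics["memory_mb"] = 1024.0
```

Keeping metrics separate from events in such a record reflects the point that the two have very different write volumes.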

Application and ApplicationAttempt are easy to tackle and scalable for all 
their data.

Container information has scalability concerns, since thousands of 
containers are allocated and finish every second in a big enough cluster. 
Again, the container life-cycle events alone are okay, but sending metrics 
for each container (a very important use-case) is a challenge. Assuming just 
the metric snapshots, we are talking about 5K containers * 10 metrics = 50K 
writes per second on a big cluster. If we also have to track time-series (I 
want to), it's (50K writes per second) * (container duration) events on 
storage.
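
The arithmetic above can be sketched as a rough back-of-the-envelope calculation. The 5K containers and 10 metrics come from the comment; the 60-second container duration and the one-sample-per-metric-per-second rate are purely illustrative assumptions, not figures from this thread.

```python
# Figures taken from the comment.
CONTAINERS = 5_000
METRICS_PER_CONTAINER = 10

# Hypothetical value, assumed only for illustration.
ASSUMED_DURATION_SECONDS = 60


def writes_per_second(containers: int, metrics_per_container: int) -> int:
    # One snapshot write per metric per container per second (assumed rate).
    return containers * metrics_per_container


def time_series_points(rate: int, duration_seconds: int) -> int:
    # Keeping every sample means storage grows with container duration.
    return rate * duration_seconds


rate = writes_per_second(CONTAINERS, METRICS_PER_CONTAINER)
total = time_series_points(rate, ASSUMED_DURATION_SECONDS)

print(f"snapshot writes per second: {rate}")
print(f"stored points over {ASSUMED_DURATION_SECONDS}s: {total}")
```

Under these assumptions the snapshot rate is 50,000 writes per second, and retaining the full time-series for 60 seconds yields 3,000,000 stored points.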

> [Event producers] Implement RM writing app lifecycle events to ATS
> ------------------------------------------------------------------
>                 Key: YARN-3044
>                 URL: https://issues.apache.org/jira/browse/YARN-3044
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Sangjin Lee
>            Assignee: Naganarasimha G R
> Per design in YARN-2928, implement RM writing app lifecycle events to ATS.

This message was sent by Atlassian JIRA
