Hi all,
I have studied YARN for several months and have some thoughts on its event
model.
1. The event model does help YARN's performance by allowing asynchronous calls.
2. But the event model makes the boundaries between components unclear. The
event receiver does not know which component sent an event, which makes it
difficult for a reader to follow the event flow.
E.g. in the node manager, there are several event senders and handlers,
including the container, the application, the localization service, the log
aggregation service and so on. One component sends events to another
component. Because the receiver has no record of the sender, it is not easy
to read the code and understand the event flow.
The event flow in the resource manager is even more complex; it involves
RMApp, RMAppAttempt, RMContainer, RMNode and the Scheduler.
3. IMHO, the complexity of the event model makes the code base hard for new
contributors to understand and hard to maintain in the future. One small
change in a state machine may affect other components, and the cause of the
resulting problem can be difficult to find.
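
To make point 2 concrete, here is a minimal sketch of what I mean. It does
not use the real YARN classes: EventFlowSketch, Event, Dispatcher and the
event type names are all hypothetical, and the dispatcher is synchronous for
brevity, whereas YARN dispatches asynchronously. Two different components send
the same event type, and the handler cannot tell them apart.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical sketch, not YARN code: all names here are assumptions.
public class EventFlowSketch {

    enum EventType { INIT_CONTAINER, LOCALIZE_RESOURCES }

    // The event carries a type and a payload, but nothing about its sender.
    static final class Event {
        final EventType type;
        final String payload;

        Event(EventType type, String payload) {
            this.type = type;
            this.payload = payload;
        }
    }

    // Routes each event to the handler registered for its type.
    // Synchronous for brevity; a real dispatcher would queue events.
    static final class Dispatcher {
        private final Map<EventType, Consumer<Event>> handlers = new HashMap<>();

        void register(EventType type, Consumer<Event> handler) {
            handlers.put(type, handler);
        }

        void dispatch(Event event) {
            Consumer<Event> handler = handlers.get(event.type);
            if (handler == null) {
                throw new IllegalStateException("No handler for " + event.type);
            }
            handler.accept(event);
        }
    }

    // Returns what the handler recorded for each event it received.
    static List<String> run() {
        List<String> seenByHandler = new ArrayList<>();
        Dispatcher dispatcher = new Dispatcher();

        // The "localizer" handler only sees the event itself.
        dispatcher.register(EventType.LOCALIZE_RESOURCES,
            e -> seenByHandler.add("localizer got " + e.type + " for " + e.payload));

        // Imagine this call lives in the "application" component...
        dispatcher.dispatch(new Event(EventType.LOCALIZE_RESOURCES, "container_1"));
        // ...and this one in the "container" component.
        dispatcher.dispatch(new Event(EventType.LOCALIZE_RESOURCES, "container_1"));

        return seenByHandler;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

Both entries recorded by the handler are identical, so someone reading only
the handler code has to search the whole code base to find out who could have
sent the event.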
I am just wondering whether there has already been some thinking about the
event model of YARN. And correct me if my understanding is wrong.
Thanks
Jeff Zhang