[ https://issues.apache.org/jira/browse/YARN-9618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17303124#comment-17303124 ]
Qi Zhu edited comment on YARN-9618 at 3/18/21, 2:46 AM:
--------------------------------------------------------

[~Jim_Brennan] [~gandras] [~ebadger] [~pbacsko]

I added the EventDispatcher<NodeListManagerEvent> in the creation logic, to be on the safe side.

I have also added a stress test with 1000 nodes and 1000 rmApps, and confirmed that a total of 100000 events are triggered in the RMApp handler.

Do you have any other advice?

Thanks.

> NodeListManager event improvement
> ---------------------------------
>
>                 Key: YARN-9618
>                 URL: https://issues.apache.org/jira/browse/YARN-9618
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Bibin Chundatt
>            Assignee: Qi Zhu
>            Priority: Critical
>         Attachments: YARN-9618.001.patch, YARN-9618.002.patch,
> YARN-9618.003.patch, YARN-9618.004.patch, YARN-9618.005.patch
>
>
> In the current implementation, NodeListManager events block the async
> dispatcher, which can crash the RM and slow down event processing.
> # Cluster restart with 1K running apps. Each usable event creates 1K
> events; overall, this could be 5K*1K events for a 5K-node cluster.
> # Event processing is blocked until the new events are added to the queue.
> Solution:
> # Add another async event handler, similar to the scheduler's.
> # Instead of adding events to the dispatcher, call the RMApp event handler
> directly.
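
The two-step solution in the issue description can be sketched roughly as follows. This is only an illustration of the idea, not the code in the attached patches: NodeEventFanOut, registerApp, and the String event payload are hypothetical stand-ins, and no Hadoop classes are used. The main dispatcher thread only enqueues the node event; a dedicated worker thread then calls each registered app handler directly, instead of posting one event per app back onto the main dispatcher queue.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

/** Hypothetical stand-in for a dedicated node-event dispatcher (not a YARN class). */
final class NodeEventFanOut implements AutoCloseable {
    // Unique sentinel instance; compared by identity so no real event can match it.
    private static final String STOP = new String("STOP");

    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    private final List<Consumer<String>> appHandlers = new CopyOnWriteArrayList<>();
    private final Thread worker;

    NodeEventFanOut() {
        worker = new Thread(this::drain, "node-event-fanout");
        worker.setDaemon(true);
        worker.start();
    }

    /** Registers one app-level handler (assumed to play the role of an RMApp handler). */
    void registerApp(Consumer<String> handler) {
        appHandlers.add(handler);
    }

    /** Called from the main dispatcher thread: enqueue one event and return at once. */
    void handle(String nodeEvent) {
        queue.add(nodeEvent);
    }

    /** Worker loop: fan each node event out by calling the app handlers directly. */
    private void drain() {
        try {
            while (true) {
                String event = queue.take();
                if (event == STOP) {
                    return;
                }
                for (Consumer<String> app : appHandlers) {
                    app.accept(event);
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Processes everything queued so far, then stops the worker. */
    @Override
    public void close() {
        queue.add(STOP);
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

With this shape, a node event costs the main dispatcher a single enqueue regardless of how many apps are running; the per-app work happens on the worker thread.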