zhenzhao wang created YARN-10398:
------------------------------------

             Summary: Every NM will try to upload Jar/Archives/Files/Resources to Yarn Shared Cache Manager Like DDOS
                 Key: YARN-10398
                 URL: https://issues.apache.org/jira/browse/YARN-10398
             Project: Hadoop YARN
          Issue Type: Bug
          Components: yarn
    Affects Versions: 3.1.3, 3.2.1, 3.1.2, 3.0.3, 2.9.2, 3.1.1, 3.2.0, 3.0.2, 3.0.1, 2.9.1, 3.1.0, 3.0.0, 2.9.0, 3.0.4, 3.3.0, 2.9.3, 3.2.2, 3.1.4, 3.4.0, 3.3.1, 3.1.5
            Reporter: zhenzhao wang
            Assignee: zhenzhao wang

By design, only the application master should upload jars/files/resources to the YARN shared cache manager. However, a bug present in the code since 2.9.0 causes every node manager that runs a task of the job to try to upload the jars/resources as well. For example, if a job has 5000 tasks, up to 5000 NMs may try to upload the same jar. This acts like a DDoS and creates a snowball effect: the YARN shared cache manager becomes unavailable, which causes timeouts in localization and leads to job failure.
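
To make the scale of the problem concrete, here is a minimal sketch that counts upload attempts under the intended policy versus the behavior reported here. It is not the actual YARN code: the class, enum, and method names are hypothetical, and it assumes the worst case in which every task runs on a distinct NM.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model only; none of these names come from the Hadoop code base.
class SharedCacheUploadSketch {
    // Hypothetical container roles within one job.
    enum Role { APPLICATION_MASTER, TASK }

    // Intended policy from the description: only the application master uploads.
    static boolean intendedPolicy(Role role) {
        return role == Role.APPLICATION_MASTER;
    }

    // Behavior described in this report: every container's NM attempts the upload.
    static boolean reportedBehavior(Role role) {
        return true;
    }

    // Counts upload attempts for a job with one AM container and taskCount
    // task containers, assuming each container runs on a different NM.
    static int countUploadAttempts(int taskCount, boolean useIntendedPolicy) {
        List<Role> containers = new ArrayList<>();
        containers.add(Role.APPLICATION_MASTER);
        for (int i = 0; i < taskCount; i++) {
            containers.add(Role.TASK);
        }
        int attempts = 0;
        for (Role role : containers) {
            boolean upload = useIntendedPolicy ? intendedPolicy(role) : reportedBehavior(role);
            if (upload) {
                attempts++;
            }
        }
        return attempts;
    }

    public static void main(String[] args) {
        System.out.println("intended: " + countUploadAttempts(5000, true));   // prints "intended: 1"
        System.out.println("reported: " + countUploadAttempts(5000, false));  // prints "reported: 5001"
    }
}
```

In this model the reported behavior yields 5001 attempts for a 5000-task job: the one legitimate upload from the application master plus up to 5000 redundant attempts from the NMs running the tasks.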