zhenzhao wang created YARN-10398:
------------------------------------
Summary: Every NM will try to upload Jar/Archives/Files/Resources
to Yarn Shared Cache Manager Like DDOS
Key: YARN-10398
URL: https://issues.apache.org/jira/browse/YARN-10398
Project: Hadoop YARN
Issue Type: Bug
Components: yarn
Affects Versions: 3.1.3, 3.2.1, 3.1.2, 3.0.3, 2.9.2, 3.1.1, 3.2.0, 3.0.2,
3.0.1, 2.9.1, 3.1.0, 3.0.0, 2.9.0, 3.0.4, 3.3.0, 2.9.3, 3.2.2, 3.1.4, 3.4.0,
3.3.1, 3.1.5
Reporter: zhenzhao wang
Assignee: zhenzhao wang
The YARN shared cache manager is designed so that only the application master
uploads a job's jars/files/resources. However, a bug present in the code since
2.9.0 causes every node manager that runs one of the job's tasks to attempt
the upload as well. For example, if a job has 5000 tasks, up to 5000 NMs may
try to upload the same jar. This behaves like a DDoS and creates a snowball
effect: the shared cache manager becomes unavailable, localization times out,
and the job fails.
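
To make the scale concrete, here is a minimal sketch of the difference between
the intended and the buggy behaviour. It is a toy model, not YARN code: the
Container class, the uploaders() and build_job() helpers, and the node names
are all hypothetical, and it assumes the worst case where every task lands on
a different NM.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Container:
    """Hypothetical stand-in for a container running on a node manager."""
    node: str
    is_application_master: bool


def uploaders(containers, am_only):
    """Return the set of nodes that would attempt an upload.

    am_only=True models the intended design (only the AM uploads);
    am_only=False models the reported bug (every NM with a task uploads).
    """
    return {
        c.node
        for c in containers
        if c.is_application_master or not am_only
    }


def build_job(num_tasks):
    """Build one AM container plus num_tasks task containers.

    Assumption: each task runs on its own NM, and the AM shares nm-0.
    """
    containers = [Container("nm-0", True)]
    containers += [Container(f"nm-{i}", False) for i in range(num_tasks)]
    return containers


job = build_job(5000)
intended = len(uploaders(job, am_only=True))
buggy = len(uploaders(job, am_only=False))
print(f"intended uploaders: {intended}, buggy uploaders: {buggy}")
```

Under these assumptions the intended design produces a single uploader for the
job, while the buggy behaviour produces one per NM, i.e. 5000 concurrent
upload attempts for the same resource.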