[ 
https://issues.apache.org/jira/browse/YARN-10398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17182949#comment-17182949
 ] 

zhenzhao wang commented on YARN-10398:
--------------------------------------

[~jiwq] I double-checked and confirmed that the PR fixes the problem. The 
reason non-application masters try to upload is that the clearing code did 
not work. Both the code and the bug are in YARN; MR uses the YARN shared 
cache. I'm not sure we should move it to the MR project. Thanks.

> Every NM will try to upload Jar/Archives/Files/Resources to Yarn Shared Cache 
> Manager Like DDOS
> -----------------------------------------------------------------------------------------------
>
>                 Key: YARN-10398
>                 URL: https://issues.apache.org/jira/browse/YARN-10398
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: yarn
>    Affects Versions: 2.9.0, 3.0.0, 3.1.0, 2.9.1, 3.0.1, 3.0.2, 3.2.0, 3.1.1, 
> 2.9.2, 3.0.3, 3.0.4, 3.1.2, 3.3.0, 3.2.1, 2.9.3, 3.1.3, 3.2.2, 3.1.4, 3.4.0, 
> 3.3.1, 3.1.5
>            Reporter: zhenzhao wang
>            Assignee: zhenzhao wang
>            Priority: Major
>
> The yarn shared cache manager is designed so that only the application 
> master uploads the jar/files/resources. However, there has been a bug in the 
> code since 2.9.0: every node manager that takes a task of the job will try to 
> upload the jar/resources. Say one job has 5000 tasks. Then up to 5000 NMs 
> will try to upload the jar. This is like a DDoS and creates a snowball 
> effect. It ends up making the yarn shared cache manager unavailable, which 
> causes timeouts in localization and leads to job failure.
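
To illustrate the difference described above, here is a minimal sketch in 
Java. It is not the actual YARN code: the class SharedCacheUploadSketch, the 
Container type, the uploadFlag field and the countUploadAttempts method are 
all hypothetical names made up for this example. It assumes the upload 
request is carried as a per-container flag and that the intended behavior is 
to clear that flag for every non-AM container.

```java
import java.util.ArrayList;
import java.util.List;

public class SharedCacheUploadSketch {

    /** Hypothetical stand-in for a container launched on a node manager. */
    static final class Container {
        final boolean isApplicationMaster;
        // Assumption: whether this container's resources are marked for
        // upload to the shared cache.
        final boolean uploadFlag;

        Container(boolean isApplicationMaster, boolean uploadFlag) {
            this.isApplicationMaster = isApplicationMaster;
            this.uploadFlag = uploadFlag;
        }
    }

    /**
     * Simulates one job with a single AM container and taskCount task
     * containers, and returns how many containers would attempt an upload.
     *
     * @param taskCount         number of non-AM task containers
     * @param clearFlagForTasks true models the intended design (flag cleared
     *                          for non-AM containers), false models the bug
     */
    public static int countUploadAttempts(int taskCount, boolean clearFlagForTasks) {
        List<Container> containers = new ArrayList<>();

        // The application master is always allowed to upload.
        containers.add(new Container(true, true));

        // Task containers keep the flag only when the clearing step is skipped.
        for (int i = 0; i < taskCount; i++) {
            containers.add(new Container(false, !clearFlagForTasks));
        }

        int attempts = 0;
        for (Container c : containers) {
            if (c.uploadFlag) {
                attempts++;
            }
        }
        return attempts;
    }

    public static void main(String[] args) {
        // Clearing step not effective: the AM plus every task container uploads.
        System.out.println("without clearing: " + countUploadAttempts(5000, false));
        // Clearing step effective: only the AM uploads.
        System.out.println("with clearing:    " + countUploadAttempts(5000, true));
    }
}
```

With 5000 tasks the sketch counts 5001 attempts when the flag is not cleared 
(the AM plus one per task container, the upper bound when every task lands on 
a different NM) and a single attempt when it is cleared.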



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
