[ https://issues.apache.org/jira/browse/YARN-9861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16939320#comment-16939320 ]
Zhankun Tang commented on YARN-9861:
------------------------------------
[~billie.rinaldi], if you get a chance, could you please take a look at this?
Per offline discussion, the issue happens when running Submarine. It seems to
be caused by the YARN native service leaking socket/HDFS file handles. Thoughts?
> The ResourceManager log reports an error "Too many open files", the analysis
> is related to the service
> ------------------------------------------------------------------------------------------------------
>
> Key: YARN-9861
> URL: https://issues.apache.org/jira/browse/YARN-9861
> Project: Hadoop YARN
> Issue Type: Bug
> Components: yarn-native-services
> Affects Versions: 3.3.0
> Environment: yarn version:3.3.0-SNAPSHOT
> hdfs version:2.7.1
> Reporter: jason
> Priority: Major
> Attachments: picture1.png, picture2.png, picture3.png, picture4.png,
> picture5.png, submarine_kerasgesv2date20190807.json
>
>
> The ResourceManager log reports "Too many open files", and the RM cannot
> commit a new task.
> 1. The error first appears as shown in picture 1.
> 2. The file handles opened by the RM (lsof -p PID) are shown in picture 2.
> 3. The NameNode audit log is shown in picture 3.
> 4. The service involved is confirmed from the path of the service
> configuration (picture 4).
> 5. The growth trend of the handle count is shown in picture 5.
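
The checks in steps 2 and 5 of the quoted description can also be scripted
rather than read off lsof output by hand. The following is only a minimal
sketch, assuming a Linux host where /proc/<pid>/fd is readable for the RM
process. The names classify_fds and is_growing are hypothetical helpers written
for this illustration. They are not part of YARN or Hadoop and were not used in
the original analysis.

```python
import os
from collections import Counter


def classify_fds(pid, proc_root="/proc"):
    """Count the open descriptors of a process by kind (Linux only).

    Reads the symlinks under <proc_root>/<pid>/fd. proc_root is a
    parameter only so the function can be exercised on a fake tree.
    """
    fd_dir = os.path.join(proc_root, str(pid), "fd")
    kinds = Counter()
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
        if target.startswith("socket:"):
            kinds["socket"] += 1
        elif target.startswith("pipe:"):
            kinds["pipe"] += 1
        elif target.startswith("anon_inode:"):
            kinds["anon_inode"] += 1
        else:
            kinds["file"] += 1
    return kinds


def is_growing(samples, min_increase=1):
    """True if each sample exceeds the previous one by at least min_increase."""
    if len(samples) < 2:
        return False
    return all(b - a >= min_increase for a, b in zip(samples, samples[1:]))
```

Calling classify_fds(rm_pid) at a fixed interval and passing the per-sample
totals to is_growing gives a rough signal of a leak. A steadily rising
"socket" count would be consistent with the handle growth in picture 5, but it
does not by itself identify which component holds the handles.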
--
This message was sent by Atlassian Jira
(v8.3.4#803005)