[ https://issues.apache.org/jira/browse/YARN-11530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated YARN-11530:
----------------------------------
Labels: pull-request-available (was: )
> Server$Listener throws "too many open files" when
> ipc.server.read.threadpool.size is set too large
> ---------------------------------------------------------------------------------------------------
>
> Key: YARN-11530
> URL: https://issues.apache.org/jira/browse/YARN-11530
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: ConfX
> Priority: Critical
> Labels: pull-request-available
> Attachments: reproduce.sh
>
>
> h2. What happened?
> Running org.apache.hadoop.yarn.TestRPCFactories#test fails with an
> IOException stating "Too many open files".
> h2. Where's the bug?
> In the constructor of org.apache.hadoop.ipc.Server$Listener, the listener
> starts one Reader per configured read thread:
> {code:java}
> readers = new Reader[readThreads];
> for (int i = 0; i < readThreads; i++) {
>   Reader reader = new Reader(
>       "Socket Reader #" + (i + 1) + " for port " + port);
>   readers[i] = reader;
>   reader.start();
> }
> {code}
> without validating readThreads. Each Reader opens a java.nio selector,
> which consumes a file descriptor, so when ipc.server.read.threadpool.size
> is set large enough the process exhausts its file-descriptor limit. The
> listener should validate the value, or catch the exceptions thrown during
> reader creation; a sketch follows.
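> To make that suggestion concrete, below is a minimal, self-contained
> sketch of the defensive handling we have in mind. It is *not* the actual
> Server$Listener code: ReaderCreationSketch and openReaders are
> illustrative names, and a plain Selector.open() stands in for the Reader
> constructor, which fails the same way when NIO runs out of file
> descriptors (see the stack trace below).
> {code:java}
> import java.io.IOException;
> import java.nio.channels.Selector;
>
> public class ReaderCreationSketch {
>
>   // Mirrors the loop in the Listener constructor, but cleans up and
>   // rethrows with context instead of leaking selectors on failure.
>   public static Selector[] openReaders(int readThreads) throws IOException {
>     Selector[] selectors = new Selector[readThreads];
>     for (int i = 0; i < readThreads; i++) {
>       try {
>         selectors[i] = Selector.open(); // may throw "Too many open files"
>       } catch (IOException e) {
>         // Close the selectors opened so far instead of leaking them.
>         for (int j = 0; j < i; j++) {
>           selectors[j].close();
>         }
>         throw new IOException("Failed to create reader " + (i + 1) + " of "
>             + readThreads + "; check ipc.server.read.threadpool.size and"
>             + " the process file-descriptor limit (ulimit -n)", e);
>       }
>     }
>     return selectors;
>   }
>
>   public static void main(String[] args) throws IOException {
>     int readThreads = args.length > 0 ? Integer.parseInt(args[0]) : 50000;
>     Selector[] readers = openReaders(readThreads);
>     System.out.println("Opened " + readers.length + " selectors");
>     for (Selector s : readers) {
>       s.close();
>     }
>   }
> }
> {code}
> Run with 50000 under a low ulimit -n, this hits the same IOException but
> fails with a clear message and no leaked descriptors.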
> h3. Stacktrace
> {code}
> java.lang.ExceptionInInitializerError
> ...
> Caused by: java.io.IOException: Too many open files
> at java.base/sun.nio.ch.FileDispatcherImpl.init(Native Method)
> at
> java.base/sun.nio.ch.FileDispatcherImpl.<clinit>(FileDispatcherImpl.java:38)
> ...
> {code}
> h2. How to reproduce?
> (1) set ipc.server.read.threadpool.size to 50000
> (2) run org.apache.hadoop.yarn.TestRPCFactories#test
> You can use the reproduce.sh in the attachment to reproduce the bug easily.
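> For reference, step (1) can also be done programmatically; a minimal
> sketch, assuming only the stock org.apache.hadoop.conf.Configuration API
> (ReproduceConfig is an illustrative name, not part of reproduce.sh):
> {code:java}
> import org.apache.hadoop.conf.Configuration;
>
> public class ReproduceConfig {
>   public static void main(String[] args) {
>     // Any org.apache.hadoop.ipc.Server built from this Configuration
>     // will try to start 50000 Reader threads, one selector each.
>     Configuration conf = new Configuration();
>     conf.setInt("ipc.server.read.threadpool.size", 50000);
>     System.out.println(conf.getInt("ipc.server.read.threadpool.size", -1));
>   }
> }
> {code}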
> We have tested this bug on both Ubuntu and macOS. *The bug is volatile and
> manifests differently on the two operating systems we tested.* On macOS
> the "too many open files" error is printed to stderr; on Ubuntu the forked
> JVM crashes outright:
> {code}
> [WARNING] Corrupted STDOUT by directly writing to native stream in forked JVM 1.
> ...
> ExecutionException The forked VM terminated without properly saying goodbye.
> VM crash or System.exit called?
> ...
> Error occurred in starting fork, check output in log
> Process Exit Code: 1
> Crashed tests:
> org.apache.hadoop.yarn.TestRPCFactories
> ...
> Caused by: org.apache.maven.surefire.booter.SurefireBooterForkException: The
> forked VM terminated without properly saying goodbye. VM crash or System.exit
> called?
> {code}
> We are happy to provide a patch after this issue is confirmed.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)