[
https://issues.apache.org/jira/browse/YARN-5939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16628380#comment-16628380
]
Weiwei Yang edited comment on YARN-5939 at 9/26/18 8:03 AM:
------------------------------------------------------------
Hi [~bibinchundatt]
Let me make sure I understand your question. Let's assume
{{ResourceLocalizationService#PublicLocalizer}} is configured with a thread pool
size of 4, so there are 4 {{FSDownload}} threads doing the localization
concurrently. Each such thread is started with a new {{FSDownload}} instance:
{code:java}
pending.put(queue.submit(new FSDownload(lfs, null, conf, publicDirDestPath,
resource, request.getContext().getStatCache())), request);
{code}
Each {{FSDownload}} instance then takes care of downloading resources from a
file system. When it starts to download, e.g. by calling {{downloadAndUnpack}},
it first gets a {{FileSystem}} instance, and normally it creates a few
{{DFSOutputStream}} instances for I/O operations on certain files. After the
download completes, calling #close, as added in this patch, closes those
streams. So it is supposed to close only the streams that belong to that
{{FSDownload}} thread, not those of other threads. Does that make sense to you?
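To illustrate the per-task ownership described above, here is a minimal, self-contained sketch. It does not use the Hadoop API: {{TrackedFs}}, {{download}} and {{runDownloads}} are hypothetical stand-ins invented for this example, and it assumes each task obtains its own non-cached instance. It only shows that closing inside a task affects that task's instance and no other.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicBoolean;

public class PerTaskCloseSketch {

  /** Hypothetical stand-in for a non-cached FileSystem instance. */
  static final class TrackedFs implements AutoCloseable {
    private final AtomicBoolean closed = new AtomicBoolean(false);
    final int owner;

    TrackedFs(int owner) {
      this.owner = owner;
    }

    boolean isClosed() {
      return closed.get();
    }

    @Override
    public void close() {
      closed.set(true);
    }
  }

  /** Hypothetical stand-in for one download task: open, use, close its own instance. */
  static TrackedFs download(int taskId) {
    TrackedFs fs = new TrackedFs(taskId);
    try (TrackedFs own = fs) {
      // ... copy and unpack the resource using `own` ...
    }
    return fs;
  }

  /** Submits the tasks to a fixed-size pool and returns the instance each task used. */
  static List<TrackedFs> runDownloads(int tasks, int poolSize) {
    ExecutorService pool = Executors.newFixedThreadPool(poolSize);
    List<TrackedFs> used = new ArrayList<>();
    try {
      List<Future<TrackedFs>> pending = new ArrayList<>();
      for (int i = 0; i < tasks; i++) {
        final int id = i;
        pending.add(pool.submit(() -> download(id)));
      }
      for (Future<TrackedFs> f : pending) {
        used.add(f.get());
      }
    } catch (Exception e) {
      throw new IllegalStateException(e);
    } finally {
      pool.shutdown();
    }
    return used;
  }

  public static void main(String[] args) {
    List<TrackedFs> used = runDownloads(8, 4);
    long closed = used.stream().filter(TrackedFs::isClosed).count();
    System.out.println(closed + " of " + used.size() + " instances closed");
  }
}
```

Because every task closes only the instance it created, no task can see a stream closed by another one. This holds only under the stated assumption that instances are not shared between tasks.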
> FSDownload leaks FileSystem resources
> -------------------------------------
>
> Key: YARN-5939
> URL: https://issues.apache.org/jira/browse/YARN-5939
> Project: Hadoop YARN
> Issue Type: Bug
> Components: nodemanager
> Affects Versions: 2.5.1, 2.7.3
> Reporter: liuxiangwei
> Assignee: Weiwei Yang
> Priority: Major
> Labels: leak
> Attachments: YARN-5939.004.patch, YARN-5939.005.patch,
> YARN-5939.01.patch, YARN-5939.02.patch, YARN-5939.03.patch
>
> Original Estimate: 24h
> Remaining Estimate: 24h
>
> Background
> To use our self-defined FileSystem class, the configuration item
> "fs.%s.impl.disable.cache" should be set to true.
> In YARN's source code, the class
> "org.apache.hadoop.yarn.util.FSDownload" calls getFileSystem but never closes
> the result, which leads to a file descriptor leak, because our self-defined
> FileSystem class closes the file descriptor when the close function is invoked.
> My questions:
> 1. Is invoking "getFileSystem" without ever calling close YARN's expected
> behavior?
> 2. What should we do in our self-defined FileSystem to resolve it?
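The leak described in the issue can be modelled with a small sketch. This is not Hadoop code: {{ModelFs}}, {{ModelFsFactory}} and {{openAfter}} are hypothetical names, and the factory only imitates the documented idea that a cached lookup returns a shared instance while a lookup with the cache disabled creates a new instance on every call. The scheme name "myfs" is likewise made up.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

public class DisableCacheLeakSketch {

  /** Hypothetical stand-in for a FileSystem that holds one descriptor until closed. */
  static final class ModelFs implements AutoCloseable {
    private final AtomicInteger openCount;
    private boolean closed;

    ModelFs(AtomicInteger openCount) {
      this.openCount = openCount;
      openCount.incrementAndGet();
    }

    @Override
    public synchronized void close() {
      if (!closed) {
        closed = true;
        openCount.decrementAndGet();
      }
    }
  }

  /** Hypothetical factory: shared instance per scheme, or a fresh one if caching is off. */
  static final class ModelFsFactory {
    private final boolean disableCache;
    private final AtomicInteger openCount = new AtomicInteger();
    private final Map<String, ModelFs> cache = new HashMap<>();

    ModelFsFactory(boolean disableCache) {
      this.disableCache = disableCache;
    }

    synchronized ModelFs get(String scheme) {
      if (disableCache) {
        return new ModelFs(openCount);
      }
      return cache.computeIfAbsent(scheme, s -> new ModelFs(openCount));
    }

    int open() {
      return openCount.get();
    }
  }

  /** Runs a number of simulated downloads and returns how many descriptors stay open. */
  static int openAfter(int downloads, boolean disableCache, boolean closeAfterUse) {
    ModelFsFactory factory = new ModelFsFactory(disableCache);
    for (int i = 0; i < downloads; i++) {
      ModelFs fs = factory.get("myfs");
      // ... download one resource using fs ...
      if (closeAfterUse) {
        fs.close();
      }
    }
    return factory.open();
  }

  public static void main(String[] args) {
    System.out.println(openAfter(3, false, false)); // cached, never closed: 1
    System.out.println(openAfter(3, true, false));  // cache disabled, never closed: 3
    System.out.println(openAfter(3, true, true));   // cache disabled, closed after use: 0
  }
}
```

In this model the number of open descriptors grows with the number of downloads once caching is disabled and nothing calls close, which matches the symptom reported here. Closing after each use brings it back to zero.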