[ 
https://issues.apache.org/jira/browse/YARN-5939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16628380#comment-16628380
 ] 

Weiwei Yang edited comment on YARN-5939 at 9/26/18 8:03 AM:
------------------------------------------------------------

Hi [~bibinchundatt]

Let me make sure I understand your question. Let's assume 
{{ResourceLocalizationService#PublicLocalizer}} is configured with a thread pool 
size of 4, so there are 4 {{FSDownload}} threads doing the localization 
concurrently. Each such thread is started with a new {{FSDownload}} instance:
{code:java}
pending.put(queue.submit(new FSDownload(lfs, null, conf, publicDirDestPath, 
resource, request.getContext().getStatCache())), request);
{code}
Each {{FSDownload}} instance then takes care of downloading resources from a 
file system. When it starts to download, e.g. by calling {{downloadAndUnpack}}, 
it first gets a {{FileSystem}} instance, and normally creates a few 
{{DFSOutputStream}} instances for I/O operations on certain files. After the 
download completes, calling #close, as added in this patch, closes those 
streams. So it is supposed to close only the streams belonging to that 
particular {{FSDownload}} thread, not those of the others. Does that make sense 
to you?
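
To illustrate the point, here is a minimal standalone sketch. It does not use the Hadoop API: {{SketchFileSystem}}, {{SketchDownload}} and {{runDownloads}} are hypothetical stand-ins, and the sketch assumes every task obtains its own uncached file system instance (the situation in the issue description, where the FileSystem cache is disabled). Under that assumption, closing in a finally block releases only the streams the task itself opened.

```java
import java.io.Closeable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

public class PerTaskCloseSketch {

    // Streams currently open across all stand-in file systems.
    static final AtomicInteger OPEN_STREAMS = new AtomicInteger();

    /** Hypothetical stand-in for an uncached FileSystem; not a Hadoop class. */
    static final class SketchFileSystem implements Closeable {
        private int myOpenStreams = 0;
        private boolean closed = false;

        synchronized void openStream() {
            if (closed) {
                throw new IllegalStateException("file system already closed");
            }
            myOpenStreams++;
            OPEN_STREAMS.incrementAndGet();
        }

        // Releases only the streams opened through this instance.
        @Override
        public synchronized void close() {
            if (closed) {
                return;
            }
            closed = true;
            OPEN_STREAMS.addAndGet(-myOpenStreams);
            myOpenStreams = 0;
        }
    }

    /** Hypothetical stand-in for one download task. */
    static final class SketchDownload implements Callable<String> {
        private final String resource;

        SketchDownload(String resource) {
            this.resource = resource;
        }

        @Override
        public String call() {
            // Assumption: a fresh instance per task, never shared with other tasks.
            SketchFileSystem fs = new SketchFileSystem();
            try {
                fs.openStream();
                fs.openStream();
                return "localized " + resource;
            } finally {
                fs.close(); // affects this task's streams only
            }
        }
    }

    /** Runs all downloads on a fixed pool and returns the number of streams left open. */
    static int runDownloads(int poolSize, List<String> resources) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (String r : resources) {
                pending.add(pool.submit(new SketchDownload(r)));
            }
            for (Future<String> f : pending) {
                f.get(); // rethrows if a task hit an already-closed file system
            }
        } finally {
            pool.shutdown();
        }
        return OPEN_STREAMS.get();
    }

    public static void main(String[] args) throws Exception {
        List<String> resources = Arrays.asList("a.jar", "b.jar", "c.jar", "d.jar", "e.zip");
        System.out.println("streams left open: " + runDownloads(4, resources));
    }
}
```

If the instance were shared between tasks instead (for example, handed out from a cache), one task's close could invalidate streams another task is still using, so the per-task close in this sketch is only safe under the stated assumption.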


> FSDownload leaks FileSystem resources
> -------------------------------------
>
>                 Key: YARN-5939
>                 URL: https://issues.apache.org/jira/browse/YARN-5939
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.5.1, 2.7.3
>            Reporter: liuxiangwei
>            Assignee: Weiwei Yang
>            Priority: Major
>              Labels: leak
>         Attachments: YARN-5939.004.patch, YARN-5939.005.patch, 
> YARN-5939.01.patch, YARN-5939.02.patch, YARN-5939.03.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Background
> To use our self-defined FileSystem class, the configuration item 
> "fs.%s.impl.disable.cache" should be set to true.
> In YARN's source code, the class 
> "org.apache.hadoop.yarn.util.FSDownload" calls getFileSystem but never closes 
> the result, which leads to a file descriptor leak, because our self-defined 
> FileSystem class closes the file descriptor when the close function is invoked.
> My questions:
> 1. Is invoking "getFileSystem" without ever calling close YARN's expected 
> behavior?
> 2. What should we do in our self-defined FileSystem to resolve it?


