[ https://issues.apache.org/jira/browse/YARN-5939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16628380#comment-16628380 ]
Weiwei Yang edited comment on YARN-5939 at 9/26/18 8:03 AM:
------------------------------------------------------------

Hi [~bibinchundatt]

Let me make sure I understand your question. Let's assume {{ResourceLocalizationService#PublicLocalizer}} is configured with a thread pool size of 4, so 4 {{FSDownload}} threads do the localization concurrently. Each of these threads is started with a new {{FSDownload}} instance:
{code:java}
pending.put(queue.submit(new FSDownload(lfs, null, conf,
    publicDirDestPath, resource, request.getContext().getStatCache())),
    request);
{code}
Each {{FSDownload}} instance then takes care of downloading resources from a file system. When it starts to download, e.g. by calling {{downloadAndUnpack}}, it first gets a {{FileSystem}} instance, and normally it creates a few {{DFSOutputStream}} for I/O operations on certain files. Once the download has finished, calling #close, as added in this patch, closes those streams. So it is supposed to close only the streams of that particular {{FSDownload}} thread, not those of the others. Does that make sense to you?
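The per-thread isolation described in this comment can be illustrated with a minimal sketch in plain Java. It does not use the Hadoop API: {{FakeFileSystem}} and {{FakeDownload}} are hypothetical stand-ins for a {{FileSystem}} and for {{FSDownload}}. It also assumes that every download task obtains its own, non-shared file system instance. Under those assumptions, closing one task's instance releases only the streams that task opened.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

class PerTaskCloseSketch {

    /** Hypothetical stand-in for a non-shared FileSystem instance that owns streams. */
    static final class FakeFileSystem implements AutoCloseable {
        /** Streams currently open across all instances (for the leak check). */
        static final AtomicInteger OPEN_STREAMS = new AtomicInteger();

        private int myStreams = 0;      // streams opened through this instance only
        private boolean closed = false;

        void openStream() {
            if (closed) {
                throw new IllegalStateException("file system already closed");
            }
            myStreams++;
            OPEN_STREAMS.incrementAndGet();
        }

        @Override
        public void close() {
            if (closed) {
                return;
            }
            closed = true;
            // Release only the streams that belong to this instance.
            OPEN_STREAMS.addAndGet(-myStreams);
            myStreams = 0;
        }
    }

    /** Hypothetical stand-in for FSDownload: one new instance per submitted resource. */
    static final class FakeDownload implements Callable<String> {
        private final String resource;

        FakeDownload(String resource) {
            this.resource = resource;
        }

        @Override
        public String call() {
            // Each task gets its own file system and closes it when done.
            try (FakeFileSystem fs = new FakeFileSystem()) {
                fs.openStream();
                fs.openStream();
                return resource;
            }
        }
    }

    /** Runs all downloads on a fixed pool and returns the number of streams left open. */
    static int runDownloads(int poolSize, List<String> resources) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (String resource : resources) {
                pending.add(pool.submit(new FakeDownload(resource)));
            }
            for (Future<String> f : pending) {
                f.get(); // rethrows if a task touched an already-closed instance
            }
        } finally {
            pool.shutdown();
        }
        return FakeFileSystem.OPEN_STREAMS.get();
    }

    public static void main(String[] args) throws Exception {
        int leaked = runDownloads(4,
            Arrays.asList("a.jar", "b.jar", "c.jar", "d.jar", "e.jar"));
        System.out.println("streams still open: " + leaked);
    }
}
```

If the tasks shared a single instance instead, the first {{close()}} would make {{openStream()}} fail in the other tasks. The sketch avoids that case only by construction, since each task creates its own instance.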

> FSDownload leaks FileSystem resources
> -------------------------------------
>
>                 Key: YARN-5939
>                 URL: https://issues.apache.org/jira/browse/YARN-5939
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.5.1, 2.7.3
>            Reporter: liuxiangwei
>            Assignee: Weiwei Yang
>            Priority: Major
>              Labels: leak
>         Attachments: YARN-5939.004.patch, YARN-5939.005.patch,
>                      YARN-5939.01.patch, YARN-5939.02.patch, YARN-5939.03.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Background
> To use our self-defined FileSystem class, the configuration item
> "fs.%s.impl.disable.cache" should be set to true.
> In YARN's source code, the class "org.apache.hadoop.yarn.util.FSDownload"
> calls getFileSystem but never closes the result, which leads to a file
> descriptor leak, because our self-defined FileSystem class closes its file
> descriptors only when its close function is invoked.
> My questions:
> 1. Is invoking "getFileSystem" without ever calling close YARN's expected
> behavior?
> 2. What should we do in our self-defined FileSystem to resolve this?

--
This message was sent by Atlassian JIRA (v7.6.3#76005)