[
https://issues.apache.org/jira/browse/YARN-8079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16459698#comment-16459698
]
Billie Rinaldi commented on YARN-8079:
--------------------------------------
Hi [~suma.shivaprasad], thanks for picking this up. The patch is looking good
overall; I only have a couple of small comments.

It looks like AbstractClientProvider does not check whether srcFile is a
directory; it only checks whether the file exists. It would be good to add
that check to AbstractClientProvider so that srcFile is validated at app
submission time.
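
To make the suggestion concrete, here is a minimal sketch of the kind of check
I mean. It uses java.nio.file instead of the Hadoop FileSystem API so that it
runs standalone; the class name SrcFileCheck and the method validateSrcFile
are hypothetical and are not taken from the patch.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class SrcFileCheck {
    // Hypothetical helper: the real check would live in AbstractClientProvider
    // and would go through the Hadoop FileSystem API rather than java.nio.
    static void validateSrcFile(Path srcFile) {
        if (!Files.exists(srcFile)) {
            throw new IllegalArgumentException(
                "Source file does not exist: " + srcFile);
        }
        // The additional check suggested above: reject directories up front.
        if (Files.isDirectory(srcFile)) {
            throw new IllegalArgumentException(
                "Source file is a directory: " + srcFile);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("srcfile-check");
        Path file = Files.createTempFile(dir, "conf", ".properties");

        validateSrcFile(file); // a regular file passes both checks

        try {
            validateSrcFile(dir); // a directory should be rejected
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Failing here means a bad srcFile is reported when the app is submitted rather
than when a component instance is launched.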

The comments added to createConfigFileAndAddLocalResource are not quite
accurate. They say: "When source file not specified, upload new configs to
compInstanceDir/fileName. Otherwise, use sourceFile." and "Output properties to
sourceFile if not existed." In fact, the method always writes the configs to
compInstanceDir/fileName. When srcFile isn't specified, it writes only the new
configs. When srcFile is specified, it reads srcFile, performs variable
substitution, merges in the new configs, and writes the result as a new file at
compInstanceDir/fileName.
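
The behavior described above can be sketched roughly as follows. This is a
simplified standalone illustration, not the actual ProviderUtils code: it uses
java.util.Properties and java.nio.file in place of the Hadoop classes, and the
names ConfigFileSketch, writeConfigFile and substitute, the ${KEY} token
syntax, and the rule that new configs override values read from srcFile are
all assumptions made for the example.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;
import java.util.Map;
import java.util.Properties;

public class ConfigFileSketch {
    // Replace each ${KEY} occurrence with its value (assumed token syntax).
    static String substitute(String value, Map<String, String> tokens) {
        String result = value;
        for (Map.Entry<String, String> e : tokens.entrySet()) {
            result = result.replace("${" + e.getKey() + "}", e.getValue());
        }
        return result;
    }

    // srcFile may be null, meaning "not specified".
    static Path writeConfigFile(Path compInstanceDir, String fileName,
            Path srcFile, Map<String, String> newConfigs,
            Map<String, String> tokens) throws IOException {
        Properties merged = new Properties();
        if (srcFile != null) {
            // Read srcFile and apply variable substitution to its values.
            Properties fromSrc = new Properties();
            try (Reader in = Files.newBufferedReader(srcFile)) {
                fromSrc.load(in);
            }
            for (String key : fromSrc.stringPropertyNames()) {
                merged.setProperty(key,
                    substitute(fromSrc.getProperty(key), tokens));
            }
        }
        // Merge in the new configs (assumption: they win on key conflicts).
        for (Map.Entry<String, String> e : newConfigs.entrySet()) {
            merged.setProperty(e.getKey(), e.getValue());
        }
        // In both cases the output goes to compInstanceDir/fileName,
        // never back to srcFile.
        Path remoteFile = compInstanceDir.resolve(fileName);
        try (Writer out = Files.newBufferedWriter(remoteFile)) {
            merged.store(out, null);
        }
        return remoteFile;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("comp-instance");
        Path src = dir.resolve("src.properties");
        Files.write(src, "host=${HOST}\n".getBytes());

        Path out = writeConfigFile(dir, "app.properties", src,
            Collections.singletonMap("port", "8080"),
            Collections.singletonMap("HOST", "node1"));
        System.out.println("wrote " + out);
    }
}
```

The point for the code comments is that the destination is the same in both
branches; srcFile only changes what goes into the file.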
> Support specify files to be downloaded (localized) before containers launched
> by YARN
> -------------------------------------------------------------------------------------
>
> Key: YARN-8079
> URL: https://issues.apache.org/jira/browse/YARN-8079
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Wangda Tan
> Assignee: Suma Shivaprasad
> Priority: Critical
> Attachments: YARN-8079.001.patch, YARN-8079.002.patch,
> YARN-8079.003.patch, YARN-8079.004.patch, YARN-8079.005.patch,
> YARN-8079.006.patch, YARN-8079.007.patch
>
>
> Currently, {{srcFile}} is not respected. {{ProviderUtils}} doesn't properly
> read srcFile; instead, it always constructs {{remoteFile}} from
> componentDir and the fileName of {{destFile}}:
> {code}
> Path remoteFile = new Path(compInstanceDir, fileName);
> {code}
> To me it is a common use case that services have files that already exist in
> HDFS and need to be localized when components are launched. For example, if
> we want to serve a Tensorflow model, we need to localize the model (typically
> not huge, less than a GB) to local disk. Otherwise the launched docker
> container has to access HDFS.