[
https://issues.apache.org/jira/browse/YARN-9083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Zhankun Tang updated YARN-9083:
-------------------------------
Description:
While refining YARN-8714, I found that the YARN localizer appears to be able to
handle a remote directory directly. FSDownload.java#downloadAndUnpack uses
"FileUtil.copy", which can copy a directory. This ability was added by YARN-2185.
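As a rough illustration of what a directory-aware copy means here, the sketch below uses only java.nio.file rather than Hadoop's FileUtil. The class name DirCopySketch and the method copyRecursively are hypothetical, and this is not the code in FSDownload; it only shows one copy call handling either a single file or a whole directory tree.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.stream.Stream;

public class DirCopySketch {
    // Hypothetical helper: copies src to dst. If src is a directory,
    // the whole tree below it is copied as well.
    public static void copyRecursively(Path src, Path dst) throws IOException {
        if (!Files.isDirectory(src)) {
            Files.copy(src, dst, StandardCopyOption.REPLACE_EXISTING);
            return;
        }
        try (Stream<Path> walk = Files.walk(src)) {
            // Files.walk visits a directory before its entries,
            // so each parent directory exists before its children are copied.
            walk.forEach(p -> {
                Path target = dst.resolve(src.relativize(p).toString());
                try {
                    if (Files.isDirectory(p)) {
                        Files.createDirectories(target);
                    } else {
                        Files.copy(p, target, StandardCopyOption.REPLACE_EXISTING);
                    }
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
    }

    public static void main(String[] args) throws IOException {
        // Build a small source tree that mimics the "mydir" layout.
        Path src = Files.createTempDirectory("mydir");
        Files.createDirectories(src.resolve("dir1"));
        Files.write(src.resolve("1.py"), "print(1)\n".getBytes(StandardCharsets.UTF_8));
        Files.write(src.resolve("dir1").resolve("2.py"),
            "print(2)\n".getBytes(StandardCharsets.UTF_8));

        Path dst = Files.createTempDirectory("localized").resolve("mydir");
        copyRecursively(src, dst);
        System.out.println("nested file copied: "
            + Files.exists(dst.resolve("dir1").resolve("2.py")));
    }
}
```

The same entry point serves both the file and the directory case, which is the property the localizer relies on when it is handed a directory.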
For testing purposes, I changed distributedShell's client so that it localizes
an HDFS directory "mydir" directly.
{code:java}
// Point at an HDFS directory rather than a single file.
Path p = new Path("hdfs:///user/yarn/submarine/jobs/tf-job-001/staging"
    + "/mydir");
FileStatus scFileStatus = fs.getFileStatus(p);
// Register the directory as a FILE-type local resource.
LocalResource r =
    LocalResource.newInstance(URL.fromURI(p.toUri()),
        LocalResourceType.FILE, LocalResourceVisibility.APPLICATION,
        scFileStatus.getLen(), scFileStatus.getModificationTime());
localResources.put("mydir", r);
{code}
The YARN localizer does indeed download the HDFS directory to the local disk for DistributedShell.
{code}
yarn@master0-VirtualBox:~/apache-hadoop-install-dir/hadoop-dev-workspace$ ls /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/13/mydir
1.py 2.py dir1 test_kill9.sh
yarn@master0-VirtualBox:~/apache-hadoop-install-dir/hadoop-dev-workspace$ ls /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/container_1543898305524_0004_01_000001/ -l
total 20
lrwxrwxrwx 1 yarn hadoop 111 Dec 5 10:08 AppMaster.jar -> /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/10/AppMaster.jar
lrwxrwxrwx 1 yarn hadoop 103 Dec 5 10:08 mydir -> /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/13/mydir
{code}
However, the YARN native service does not seem to be aware of this localizer
ability and rejects a directory as a source file.
{code}
2018-12-05 10:06:40,286 ERROR client.ApiServiceClient: srcFile=hdfs://192.168.50.191:9000/user/yarn/submarine/jobs/tf-job-001/staging/mydir is a directory, which is not supported.
{code}
We should enable this ability in the YARN native service.
was:
While refining YARN-8714, I found that the YARN localizer appears to be able to
handle a remote directory directly. FSDownload.java#downloadAndUnpack uses
"FileUtil.copy", which can copy a directory. This ability was added by YARN-2185.
For testing purposes, I changed distributedShell's client so that it localizes
an HDFS directory "mydir" directly.
{code:java}
// Point at an HDFS directory rather than a single file.
Path p = new Path("hdfs:///user/yarn/submarine/jobs/tf-job-001/staging"
    + "/mydir");
FileStatus scFileStatus = fs.getFileStatus(p);
// Register the directory as a FILE-type local resource.
LocalResource r =
    LocalResource.newInstance(URL.fromURI(p.toUri()),
        LocalResourceType.FILE, LocalResourceVisibility.APPLICATION,
        scFileStatus.getLen(), scFileStatus.getModificationTime());
localResources.put("mydir", r);
{code}
The YARN localizer does indeed download the HDFS directory to the local disk for DistributedShell.
{code}
yarn@master0-VirtualBox:~/apache-hadoop-install-dir/hadoop-dev-workspace$ ls /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/13/mydir
1.py 2.py dir1 test_kill9.sh
yarn@master0-VirtualBox:~/apache-hadoop-install-dir/hadoop-dev-workspace$ ls /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/container_1543898305524_0004_01_000001/ -l
total 20
lrwxrwxrwx 1 yarn hadoop 111 Dec 5 10:08 AppMaster.jar -> /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/10/AppMaster.jar
lrwxrwxrwx 1 yarn hadoop 103 Dec 5 10:08 mydir -> /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/13/mydir
{code}
However, the YARN native service does not seem to be aware of this localizer
ability and rejects a directory as a source file.
{code}
2018-12-05 10:06:40,286 ERROR client.ApiServiceClient: srcFile=hdfs://192.168.50.191:9000/user/yarn/submarine/jobs/tf-job-001/staging/mydir is a directory, which is not supported.
{code}
We should utilize this ability in the YARN native service.
> Support remote directory localization in yarn native service
> ------------------------------------------------------------
>
> Key: YARN-9083
> URL: https://issues.apache.org/jira/browse/YARN-9083
> Project: Hadoop YARN
> Issue Type: Improvement
> Reporter: Zhankun Tang
> Assignee: Zhankun Tang
> Priority: Major