Angus Davis created YARN-6725:
---------------------------------
Summary: NodeManager cannot localize files when URL contains opaque authority
Key: YARN-6725
URL: https://issues.apache.org/jira/browse/YARN-6725
Project: Hadoop YARN
Issue Type: Bug
Components: yarn
Affects Versions: 2.8.1
Reporter: Angus Davis
Priority: Minor
When presented with a URI or URL that is structurally valid, but whose
authority cannot be interpreted as a valid hostname, Java's URI#getHost
method returns null. In the YARN API records, the URL message is decomposed
into hostname, username, port, etc., but it provides no means to
transport an opaque authority. This becomes a problem when using object
stores or other file systems that use something other than a hostname as an
authority.
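A minimal sketch of the java.net.URI behavior described above. The class name,
helper methods, and bucket name are hypothetical and chosen only for
illustration; the URI calls themselves are standard JDK methods.

```java
import java.net.URI;

public class OpaqueAuthorityDemo {
    // Returns the host component, or null when the authority cannot be
    // parsed as a server-based authority (for example, it contains '_').
    static String hostOf(String uri) {
        return URI.create(uri).getHost();
    }

    // Returns the raw authority component, which survives even when
    // the host component is null.
    static String authorityOf(String uri) {
        return URI.create(uri).getAuthority();
    }

    public static void main(String[] args) {
        String uri = "gs://my_bucket/path/to/object"; // hypothetical bucket name
        System.out.println("host      = " + hostOf(uri));      // prints "host      = null"
        System.out.println("authority = " + authorityOf(uri)); // prints "authority = my_bucket"
    }
}
```

The authority is still available through URI#getAuthority even though
URI#getHost returns null, so the information is lost only when a URI is
decomposed into host, user info, and port.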
In our particular case, Google Cloud Storage allows bucket names to contain
underscores, which causes the NodeManagers to attempt to localize resources
of the form 'gs:///path/to/object' when they should be attempting to localize
'gs://bucket/path/to/object'. Components that are written in terms of
o.a.h.fs.Path instances seem to handle this case properly, as do most Path
utility methods (toUri, etc.).
It seems that by transporting the authority along with the host, user info,
etc., a YARN API URL can properly represent o.a.h.fs.Path instances of this
particular form.
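The following sketch illustrates the difference using only java.net.URI. It
is not the YARN record or converter code; the class and method names are
hypothetical, and the host-only rebuild is an assumption about how a record
without an authority field would be turned back into a URI.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class AuthorityRoundTrip {
    // Rebuilds a URI from host, user info, and port only, as a record
    // without an authority field would have to (assumed behavior).
    static URI rebuildFromHost(URI original) {
        String authority = "";
        if (original.getHost() != null) {
            authority = original.getHost();
            if (original.getUserInfo() != null) {
                authority = original.getUserInfo() + "@" + authority;
            }
            if (original.getPort() > 0) {
                authority += ":" + original.getPort();
            }
        }
        return build(original.getScheme(), authority, original.getPath());
    }

    // Rebuilds a URI while carrying the raw authority along unchanged.
    static URI rebuildWithAuthority(URI original) {
        return build(original.getScheme(), original.getAuthority(), original.getPath());
    }

    private static URI build(String scheme, String authority, String path) {
        try {
            return new URI(scheme, authority, path, null, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        URI original = URI.create("gs://my_bucket/path/to/object"); // hypothetical bucket
        System.out.println(rebuildFromHost(original));      // gs:///path/to/object
        System.out.println(rebuildWithAuthority(original)); // gs://my_bucket/path/to/object
    }
}
```

With the host-only rebuild the bucket name disappears, which matches the
'gs:///path/to/object' form reported above; carrying the authority preserves
it.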
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)