Steve Loughran created YARN-10444:
-------------------------------------
Summary: use openFile() with sequential IO for localizing files.
Key: YARN-10444
URL: https://issues.apache.org/jira/browse/YARN-10444
Project: Hadoop YARN
Issue Type: Bug
Affects Versions: 3.3.0
Reporter: Steve Loughran
Assignee: Steve Loughran
HADOOP-16202 adds standard options for declaring the read/seek
policy when reading a file. These should be set to sequential IO
when localising resources, so that even if the default/cluster settings
for a file system are optimized for random IO, artifact downloads
are still read at the maximum speed possible (one big GET to the EOF).
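
To illustrate why the policy matters for a whole-file download, here is a toy model (not Hadoop code) that counts the GET ranges needed to read an object from start to end. It assumes a random-IO policy fetches one bounded range per request, while a sequential policy issues a single GET to EOF. The class name, method name and block size are hypothetical and chosen only for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

public class ReadPolicyModel {

  /**
   * Plan the GET requests needed to read a whole object.
   * Each entry is a {start, endExclusive} byte range.
   */
  static List<long[]> plannedGets(long length, boolean sequential, long blockSize) {
    List<long[]> gets = new ArrayList<>();
    if (length <= 0) {
      return gets; // nothing to read
    }
    if (sequential) {
      // Sequential policy: one request from the start to EOF.
      gets.add(new long[] {0, length});
      return gets;
    }
    // Random policy (as modelled here): one request per block-sized range.
    for (long pos = 0; pos < length; pos += blockSize) {
      gets.add(new long[] {pos, Math.min(pos + blockSize, length)});
    }
    return gets;
  }

  public static void main(String[] args) {
    long length = 100L * 1024 * 1024; // a 100 MiB artifact
    long block = 64L * 1024;          // assumed 64 KiB range per GET
    System.out.println("sequential GETs: " + plannedGets(length, true, block).size());
    System.out.println("random GETs:     " + plannedGets(length, false, block).size());
  }
}
```

In this model the sequential read always needs one request, while the number of requests under the random policy grows with the file size.
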
Most of this happens in hadoop-common, but some tuning of FSDownload
can assist:
* tar/jar download must also be sequential
* if the FileStatus is passed around, that can be used
in the open request to skip checks when loading the file.
Together this can save 3 HEAD requests per resource, with the sequential
IO avoiding any splitting of the big read into separate block GETs.
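
A minimal sketch of what such a download could look like with the openFile() builder API. This is not the actual FSDownload change. The class SequentialLocalizeSketch and its download() helper are hypothetical, and the option key "fs.option.openfile.read.policy" is assumed to be the standard key introduced by HADOOP-16202. It is set with opt() rather than must(), so a file system that does not recognize it should simply ignore it. It needs hadoop-common 3.3.0 or later on the classpath for withFileStatus().

```java
import java.io.FileOutputStream;
import java.io.OutputStream;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class SequentialLocalizeSketch {

  /** Copy one remote file to a local file, reading it sequentially. */
  static void download(FileSystem fs, FileStatus status, String localFile)
      throws Exception {
    try (FSDataInputStream in = fs.openFile(status.getPath())
             // Assumed key from HADOOP-16202; a hint, not a requirement.
             .opt("fs.option.openfile.read.policy", "sequential")
             // Hand over the status we already have so the store
             // can skip its own existence/length probe on open.
             .withFileStatus(status)
             .build()
             .get();
         OutputStream out = new FileOutputStream(localFile)) {
      // Streams are closed by try-with-resources, so pass close=false.
      IOUtils.copyBytes(in, out, 64 * 1024, false);
    }
  }

  public static void main(String[] args) throws Exception {
    Path source = new Path(args[0]);
    FileSystem fs = source.getFileSystem(new Configuration());
    // In a real localizer the status would come from an earlier call
    // and be passed around rather than fetched again here.
    FileStatus status = fs.getFileStatus(source);
    download(fs, status, args[1]);
  }
}
```
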