[
https://issues.apache.org/jira/browse/YARN-7712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16321724#comment-16321724
]
Miklos Szegedi commented on YARN-7712:
--------------------------------------
Thank you for the reply, [~chris.douglas]. The scenario is mainly for testing
and demonstrating the REST API behavior for future users.
Here are the steps currently required to launch an AM through the REST API:
1. The client has to upload a dependency to localize to HDFS
2. The client has to grab the timestamp from HDFS
3. The client runs a job through the REST API specifying the localized file
with the timestamp
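As a rough illustration of steps 2 and 3, here is a minimal Python sketch that turns a file status record into one local resource entry for the submission request. It assumes the status comes from a WebHDFS GETFILESTATUS call (fields {{length}} and {{modificationTime}}) and that the ResourceManager submission body accepts {{local-resources}} entries with {{resource}}, {{type}}, {{visibility}}, {{size}} and {{timestamp}}. The helper name, host and path are hypothetical, and the field names should be checked against the Hadoop version in use.

```python
import json


def local_resource_entry(name, hdfs_uri, file_status):
    """Build one local-resources entry from a file status dict.

    Assumption: file_status has the shape of a WebHDFS FileStatus object,
    i.e. it carries "length" (bytes) and "modificationTime" (ms since epoch).
    """
    return {
        "key": name,
        "value": {
            "resource": hdfs_uri,
            "type": "FILE",
            "visibility": "APPLICATION",
            # Both values must match the file on HDFS, which is why the
            # client needs the extra round trip in step 2.
            "size": file_status["length"],
            "timestamp": file_status["modificationTime"],
        },
    }


if __name__ == "__main__":
    # Hypothetical status, as it might be returned for the uploaded jar.
    status = {"length": 43004, "modificationTime": 1405452071209}
    entry = local_resource_entry(
        "AppMaster.jar", "hdfs://namenode.example:8020/user/test/AppMaster.jar", status
    )
    print(json.dumps({"local-resources": {"entry": [entry]}}, indent=2))
```

The extra hop is the status lookup that produces {{file_status}}; the suggested change would let the client skip it.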
The client can run a job faster and with less effort with the suggested change:
1. The client has to upload a jar to HDFS
2. The client runs a job through the REST API specifying the localized file
with the timestamp ignored
In my opinion, the timestamp specification requirement has multiple issues.
1. It does not provide security. The client gets the failing timestamp in the
error message
2. It is an annoyance in basic clusters and testing scenarios, especially for
REST API users
3. The user can restrict the directory it uploads to in order to protect
consistency
4. The additional hop adds latency that is not necessary in cases 2 and 3.
5. If I had to design a way to use the timestamp to protect consistency, I
would
a) make sure time is trusted in the cluster and the modification timestamp is
trusted in HDFS
b) grab a launch timestamp {{tl}} (or desired minimum timestamp) when the
client starts and place it in ContainerLaunchContext just like it is now
c) verify at localization time that the file modification time {{tm}} is
earlier than the launch or any other specified timestamp: {{tm < tl}}.
This would ensure the same level of consistency without additional latency for
REST users, for example those calling the API from Python.
6. The PathHandle that you suggested is a better option, I admit.
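The check proposed in 5 c) can be sketched in a few lines of Python. This is only an illustration of the comparison {{tm < tl}}, not NodeManager code; the function and parameter names are hypothetical, and both values are assumed to be milliseconds since the epoch from clocks that are trusted as described in 5 a).

```python
def may_localize(modification_time_ms, launch_time_ms):
    """Return True if the file was last modified before the launch timestamp.

    modification_time_ms: tm, the file's modification time on HDFS.
    launch_time_ms: tl, taken by the client when it starts (or any other
    timestamp the client specifies as the upper bound).
    """
    # Strict comparison, matching tm < tl: a file modified at or after the
    # launch timestamp is rejected.
    return modification_time_ms < launch_time_ms


if __name__ == "__main__":
    print(may_localize(1_000, 2_000))  # file is older than the launch
    print(may_localize(2_500, 2_000))  # file changed after the launch
```

With this scheme the client never has to read the timestamp back from HDFS; it only supplies its own launch time.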
> Add ability to ignore timestamps in localized files
> ---------------------------------------------------
>
> Key: YARN-7712
> URL: https://issues.apache.org/jira/browse/YARN-7712
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: nodemanager
> Reporter: Miklos Szegedi
> Assignee: Miklos Szegedi
>
> YARN currently requires and checks the timestamp of localized files and
> fails, if the file on HDFS does not match to the one requested. This jira
> adds the ability to ignore the timestamp based on the request of the client.