[
https://issues.apache.org/jira/browse/YARN-7001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16187583#comment-16187583
]
Chris Trezzo commented on YARN-7001:
------------------------------------
Looking at this patch, I am not entirely sure if this fixes the issue. I am
thinking about these two scenarios:
# If {{uploadFile}} returns false: {{FileUtil.copy}} has returned false. If we
look into that method, I think the only way it will return false is if the file
has not been created yet, since we pass in deleteSource as false. In this case,
we do not need a deleteTempFile call.
# If {{uploadFile}} throws an IOException: Here we might have an issue. If copy
throws an IOException after it has created the tmp file, but before it has
finished writing it, we may strand the tmp file. It seems like we would want a
try/catch around the {{uploadFile}} call. If we get an IOException, we would
want to delete the tmp file if it exists.
More generally, we could strand tmp files if the node manager fails at any
point between the file creation in {{uploadFile}} and the file rename later in
the method. In practice, this doesn't seem to be an issue because the window
between those two points is small. Maybe we could add a try/finally around that
span and attempt to delete the tmp file in the finally? That would at least
cover the case where there is an unexpected exception.
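To make the try/finally idea concrete, here is a rough sketch. It is not the actual uploader code: it uses plain java.nio instead of the Hadoop FileSystem API so that it runs on its own, and the class name, the {{Copier}} interface, the {{upload}} method, and the ".tmp" suffix are all made up for illustration. The point is only the shape: the tmp file is deleted in the finally unless the rename has succeeded.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class SharedCacheUploadSketch {

  /** Hypothetical stand-in for the copy step; it may throw after creating tmp. */
  interface Copier {
    void copy(Path src, Path tmp) throws IOException;
  }

  /**
   * Copies src to a tmp file in dir and then renames it to finalName.
   * If anything fails between creation and rename, the tmp file is removed.
   */
  static boolean upload(Path src, Path dir, String finalName, Copier copier)
      throws IOException {
    Path tmp = dir.resolve(finalName + ".tmp"); // assumed naming, illustration only
    boolean renamed = false;
    try {
      copier.copy(src, tmp);                    // may leave a partially written tmp
      Files.move(tmp, dir.resolve(finalName));  // the "rename later in the method"
      renamed = true;
      return true;
    } finally {
      if (!renamed) {
        try {
          Files.deleteIfExists(tmp);            // best-effort cleanup
        } catch (IOException e) {
          // Swallowed so it does not mask the original exception; real code
          // would probably log a warning here.
        }
      }
    }
  }
}
```

Note that this only covers exceptions. If the process itself dies between the two points, the finally never runs and the tmp file is still left behind.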
Let me know if you think I have missed something. Thanks!
> If shared cache upload is terminated in the middle, the temp file will never
> be deleted
> ---------------------------------------------------------------------------------------
>
> Key: YARN-7001
> URL: https://issues.apache.org/jira/browse/YARN-7001
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Miklos Szegedi
> Assignee: Sen Zhao
> Attachments: YARN-7001.001.patch, YARN-7001.002.patch,
> YARN-7001.003.patch, YARN-7001.004.patch
>
>
> There is a missing deleteTempFile(tempPath);
> {code}
> tempPath = new Path(directoryPath, getTemporaryFileName(actualPath));
> if (!uploadFile(actualPath, tempPath)) {
>   LOG.warn("Could not copy the file to the shared cache at " +
>       tempPath);
>   return false;
> }
> {code}
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)