[ 
https://issues.apache.org/jira/browse/YARN-1185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13795461#comment-13795461
 ] 

Omkar Vinit Joshi commented on YARN-1185:
-----------------------------------------

I think it is fair to assume that the rename operation is atomic, so we can 
split the existing writeFile operation into two steps:
* First, write the data to a .tmp file.
* Then rename it to the actual file.

Similarly, when loading the state, we will discard any file we encounter with 
a ".tmp" extension. I am attaching a patch that does this. Let me know your 
thoughts.
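
To make the proposal concrete, here is a minimal sketch of the write-then-rename 
and skip-on-load idea. It is not the code in the attached patch: it uses 
java.nio.file on a local filesystem instead of the Hadoop FileSystem API, and 
the class name TmpRenameSketch, the method names, and the TMP_SUFFIX constant 
are all made up for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not FileSystemRMStateStore itself.
class TmpRenameSketch {
    // Assumed suffix for in-progress files, matching the ".tmp" idea above.
    static final String TMP_SUFFIX = ".tmp";

    // Step 1: write everything to "<dest>.tmp". Step 2: rename to "<dest>".
    // A crash during step 1 leaves only the .tmp file behind.
    static void writeFile(Path dest, byte[] data) throws IOException {
        Path tmp = dest.resolveSibling(dest.getFileName() + TMP_SUFFIX);
        Files.write(tmp, data);
        // ATOMIC_MOVE throws AtomicMoveNotSupportedException if the
        // filesystem cannot perform the rename atomically.
        Files.move(tmp, dest, StandardCopyOption.ATOMIC_MOVE);
    }

    // Load every file in dir, discarding leftover ".tmp" files.
    static Map<String, byte[]> loadState(Path dir) throws IOException {
        Map<String, byte[]> state = new TreeMap<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                String name = entry.getFileName().toString();
                if (name.endsWith(TMP_SUFFIX)) {
                    continue; // partial write from an interrupted store
                }
                state.put(name, Files.readAllBytes(entry));
            }
        }
        return state;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("rmstate-sketch");
        writeFile(dir.resolve("app_1"), "state".getBytes(StandardCharsets.UTF_8));
        // Simulate a crash mid-write: only the temporary file exists.
        Files.write(dir.resolve("app_2" + TMP_SUFFIX), new byte[] {1, 2});
        System.out.println(loadState(dir).keySet());
    }
}
```

Whether the rename is really atomic depends on the underlying filesystem, so 
the same assumption would need to hold for whichever store backs the RM state.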

> FileSystemRMStateStore can leave partial files that prevent subsequent 
> recovery
> -------------------------------------------------------------------------------
>
>                 Key: YARN-1185
>                 URL: https://issues.apache.org/jira/browse/YARN-1185
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>    Affects Versions: 2.1.0-beta
>            Reporter: Jason Lowe
>            Assignee: Omkar Vinit Joshi
>         Attachments: YARN-1185.1.patch
>
>
> FileSystemRMStateStore writes directly to the destination file when storing 
> state. However if the RM were to crash in the middle of the write, the 
> recovery method could encounter a partially-written file and either outright 
> crash during recovery or silently load incomplete state.
> To avoid this, the data should be written to a temporary file and renamed to 
> the destination file afterwards.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
