[ 
https://issues.apache.org/jira/browse/YARN-403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13724186#comment-13724186
 ] 

rohithsharma commented on YARN-403:
-----------------------------------

Hash verification is one authentication process at Reducer/NM for Http 
request/response i.e fetching map output from NodeManager's. 

One specific scenario I have observed above exception is 
     bq. When app master is killed after map phase is completed. 2nd app master 
attempt start reducer phase which intern reducers request for fetching map 
output. 

In the above scenario, 
1. 1st attemp AM has SecretKey.This key has sent to Node Manager during start 
of container(Map Task). Node Manager stores in memory i.e 
*ShuffleHandler.secretManager* which will be removed only after application is 
finished. 

After Map Task is completed, app master is killed.

2. When 2nd attempt app master started, new SecretKey is generated. This 
Secretkey is sent along with startContainer request to NodeManager and Reducer 
JVM is started by NM. Fetcher at reducer encrypt request url using SecretKey 
and sends fetch request to Node Manager where Map Task has run (Old SecretKey 
is present). 
At NodeManager , hash verification is verified with old 1st attempt app master 
SecretKey. This verification fails since keys are different.
                
> Node Manager throws java.io.IOException: Verification of the hashReply failed
> -----------------------------------------------------------------------------
>
>                 Key: YARN-403
>                 URL: https://issues.apache.org/jira/browse/YARN-403
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.0.2-alpha, 0.23.6
>            Reporter: Devaraj K
>            Assignee: Omkar Vinit Joshi
>
> {code:xml}
> 2013-02-09 22:59:47,490 WARN org.apache.hadoop.mapred.ShuffleHandler: Shuffle 
> failure 
> java.io.IOException: Verification of the hashReply failed
>       at 
> org.apache.hadoop.mapreduce.security.SecureShuffleUtils.verifyReply(SecureShuffleUtils.java:98)
>       at 
> org.apache.hadoop.mapred.ShuffleHandler$Shuffle.verifyRequest(ShuffleHandler.java:436)
>       at 
> org.apache.hadoop.mapred.ShuffleHandler$Shuffle.messageReceived(ShuffleHandler.java:383)
>       at 
> org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:80)
>       at 
> org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:545)
>       at 
> org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:754)
>       at 
> org.jboss.netty.handler.stream.ChunkedWriteHandler.handleUpstream(ChunkedWriteHandler.java:148)
>       at 
> org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:545)
>       at 
> org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:754)
>       at 
> org.jboss.netty.handler.codec.http.HttpChunkAggregator.messageReceived(HttpChunkAggregator.java:116)
>       at 
> org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:80)
>       at 
> org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:545)
>       at 
> org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:754)
>       at 
> org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:302)
>       at 
> org.jboss.netty.handler.codec.replay.ReplayingDecoder.unfoldAndfireMessageReceived(ReplayingDecoder.java:522)
>       at 
> org.jboss.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:506)
>       at 
> org.jboss.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:443)
>       at 
> org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:80)
>       at 
> org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:545)
>       at 
> org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:540)
>       at 
> org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:274)
>       at 
> org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:261)
>       at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:349)
>       at 
> org.jboss.netty.channel.socket.nio.NioWorker.processSelectedKeys(NioWorker.java:280)
>       at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:200)
>       at 
> org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
>       at 
> org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:44)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>       at java.lang.Thread.run(Thread.java:662)
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to