[ 
https://issues.apache.org/jira/browse/YARN-1901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13960026#comment-13960026
 ] 

Jason Lowe commented on YARN-1901:
----------------------------------

This appears to be a duplicate of HIVE-6638.  As [~ozawa] mentioned, AMs are 
restarted when the RM restarts until YARN-556 is addressed.  When an AM 
restarts, it is not automatically the case that completed tasks will be 
recovered -- it must be supported by the output committer.  HIVE-6638 is 
updating Hive's OutputCommitter so it can support task recovery upon AM restart.
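
As a rough illustration of why the committer matters, here is a simplified model in plain Java. It is not Hadoop code: the RecoverableCommitter interface and the tasksToRun helper are hypothetical stand-ins, with method names chosen to mirror isRecoverySupported() and recoverTask() on Hadoop's OutputCommitter. The task counts (11 maps, 6 finished before the restart) are taken from the report quoted below.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for the two recovery hooks of an output committer.
interface RecoverableCommitter {
    boolean isRecoverySupported();
    void recoverTask(int taskId);
}

final class AmRestartSketch {
    // Returns the IDs of the tasks a restarted AM still has to run.
    static List<Integer> tasksToRun(int totalTasks,
                                    Set<Integer> completedBeforeRestart,
                                    RecoverableCommitter committer) {
        List<Integer> toRun = new ArrayList<>();
        for (int id = 0; id < totalTasks; id++) {
            if (committer.isRecoverySupported() && completedBeforeRestart.contains(id)) {
                // Completed output can be reused, so the task is not rerun.
                committer.recoverTask(id);
            } else {
                // No recovery support, or the task never finished: run it again.
                toRun.add(id);
            }
        }
        return toRun;
    }

    // Builds a trivial committer that either supports recovery or does not.
    static RecoverableCommitter committer(final boolean recoverySupported) {
        return new RecoverableCommitter() {
            @Override
            public boolean isRecoverySupported() {
                return recoverySupported;
            }

            @Override
            public void recoverTask(int taskId) {
                // A real committer would re-register the task's committed output here.
            }
        };
    }

    public static void main(String[] args) {
        // 6 of 11 maps finished before the AM restarted.
        Set<Integer> done = new TreeSet<>(Arrays.asList(0, 1, 2, 3, 4, 5));
        System.out.println(tasksToRun(11, done, committer(false)).size()); // prints 11
        System.out.println(tasksToRun(11, done, committer(true)).size());  // prints 5
    }
}
```

In this model, a committer without recovery support makes the restarted AM rerun all 11 maps, which matches the reported behaviour; with recovery support only the 5 unfinished maps would run again.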

> All tasks restart during RM failover on Hive
> --------------------------------------------
>
>                 Key: YARN-1901
>                 URL: https://issues.apache.org/jira/browse/YARN-1901
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.4.0
>            Reporter: Fengdong Yu
>
> I built from trunk and configured RM HA, then submitted a Hive job.
> There were 11 maps in total, and I stopped the active RM when 6 maps had
> finished. Hive then showed all map tasks restarting, which conflicts with the
> design description.
> Job progress:
> {code}
> 2014-03-31 18:44:14,088 Stage-1 map = 68%,  reduce = 0%, Cumulative CPU 713.84 sec
> 2014-03-31 18:44:15,128 Stage-1 map = 68%,  reduce = 0%, Cumulative CPU 722.83 sec
> 2014-03-31 18:44:16,160 Stage-1 map = 68%,  reduce = 0%, Cumulative CPU 731.95 sec
> 2014-03-31 18:44:17,191 Stage-1 map = 68%,  reduce = 0%, Cumulative CPU 744.17 sec
> 2014-03-31 18:44:18,220 Stage-1 map = 68%,  reduce = 0%, Cumulative CPU 756.22 sec
> 2014-03-31 18:44:19,250 Stage-1 map = 68%,  reduce = 0%, Cumulative CPU 762.4 sec
> 2014-03-31 18:44:20,281 Stage-1 map = 68%,  reduce = 0%, Cumulative CPU 774.64 sec
> 2014-03-31 18:44:21,306 Stage-1 map = 70%,  reduce = 0%, Cumulative CPU 786.49 sec
> 2014-03-31 18:44:22,334 Stage-1 map = 70%,  reduce = 0%, Cumulative CPU 792.59 sec
> 2014-03-31 18:44:23,363 Stage-1 map = 73%,  reduce = 0%, Cumulative CPU 807.58 sec
> 2014-03-31 18:44:24,392 Stage-1 map = 77%,  reduce = 0%, Cumulative CPU 815.96 sec
> 2014-03-31 18:44:25,416 Stage-1 map = 80%,  reduce = 0%, Cumulative CPU 823.83 sec
> 2014-03-31 18:44:26,443 Stage-1 map = 80%,  reduce = 0%, Cumulative CPU 826.84 sec
> 2014-03-31 18:44:27,472 Stage-1 map = 82%,  reduce = 0%, Cumulative CPU 832.16 sec
> 2014-03-31 18:44:28,501 Stage-1 map = 84%,  reduce = 0%, Cumulative CPU 839.73 sec
> 2014-03-31 18:44:29,531 Stage-1 map = 86%,  reduce = 0%, Cumulative CPU 844.45 sec
> 2014-03-31 18:44:30,564 Stage-1 map = 82%,  reduce = 0%, Cumulative CPU 760.34 sec
> 2014-03-31 18:44:31,728 Stage-1 map = 0%,  reduce = 0%
> 2014-03-31 18:45:06,918 Stage-1 map = 2%,  reduce = 0%, Cumulative CPU 213.81 sec
> 2014-03-31 18:45:07,952 Stage-1 map = 2%,  reduce = 0%, Cumulative CPU 216.83 sec
> 2014-03-31 18:45:08,979 Stage-1 map = 7%,  reduce = 0%, Cumulative CPU 229.15 sec
> 2014-03-31 18:45:10,007 Stage-1 map = 11%,  reduce = 0%, Cumulative CPU 244.42 sec
> 2014-03-31 18:45:11,040 Stage-1 map = 14%,  reduce = 0%, Cumulative CPU 247.31 sec
> 2014-03-31 18:45:12,072 Stage-1 map = 18%,  reduce = 0%, Cumulative CPU 259.5 sec
> 2014-03-31 18:45:13,105 Stage-1 map = 23%,  reduce = 0%, Cumulative CPU 274.72 sec
> 2014-03-31 18:45:14,135 Stage-1 map = 23%,  reduce = 0%, Cumulative CPU 280.76 sec
> 2014-03-31 18:45:15,170 Stage-1 map = 23%,  reduce = 0%, Cumulative CPU 292.9 sec
> 2014-03-31 18:45:16,202 Stage-1 map = 23%,  reduce = 0%, Cumulative CPU 305.16 sec
> 2014-03-31 18:45:17,233 Stage-1 map = 23%,  reduce = 0%, Cumulative CPU 314.21 sec
> 2014-03-31 18:45:18,264 Stage-1 map = 23%,  reduce = 0%, Cumulative CPU 323.34 sec
> 2014-03-31 18:45:19,294 Stage-1 map = 23%,  reduce = 0%, Cumulative CPU 335.6 sec
> 2014-03-31 18:45:20,325 Stage-1 map = 23%,  reduce = 0%, Cumulative CPU 344.71 sec
> 2014-03-31 18:45:21,355 Stage-1 map = 23%,  reduce = 0%, Cumulative CPU 353.8 sec
> 2014-03-31 18:45:22,385 Stage-1 map = 23%,  reduce = 0%, Cumulative CPU 366.06 sec
> 2014-03-31 18:45:23,415 Stage-1 map = 23%,  reduce = 0%, Cumulative CPU 375.2 sec
> 2014-03-31 18:45:24,449 Stage-1 map = 23%,  reduce = 0%, Cumulative CPU 384.28 sec
> {code}
> I am using hive-0.12.0, with ZKRMStateRoot as the RM store class. The Hive
> job reads a simple external table (only one column).



--
This message was sent by Atlassian JIRA
(v6.2#6252)
