[ 
https://issues.apache.org/jira/browse/YARN-1578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889782#comment-13889782
 ] 

Zhijie Shen commented on YARN-1578:
-----------------------------------

[~sinchii], thanks for the patch, which I think has found the buggy code to 
fix. I have some comments on the fix:

1. Please refer to getContainer(). We should still merge partial information 
into the container history data object when we have only the start data or only 
the finish data. The get APIs should be tolerant of missing information. 

2. getApplicationAttempts needs to be fixed as well.

3. Please create some test cases that imitate missing partial data and verify 
that the get APIs still work.
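
To illustrate points 1 and 3, here is a minimal sketch of a null-tolerant merge. 
It is not the actual FileSystemApplicationHistoryStore code: the class 
PartialHistoryMerge, its nested StartData/FinishData/HistoryData types, and the 
merge() method are hypothetical stand-ins, and returning null when both records 
are missing is an assumption about the desired behavior.

```java
// Hypothetical, simplified stand-ins for the start/finish records and the
// merged history object. These are NOT the real YARN classes.
public class PartialHistoryMerge {

    /** Record written when a container starts (illustrative fields only). */
    static final class StartData {
        final long startTime;
        StartData(long startTime) { this.startTime = startTime; }
    }

    /** Record written when a container finishes (illustrative fields only). */
    static final class FinishData {
        final long finishTime;
        final int exitStatus;
        FinishData(long finishTime, int exitStatus) {
            this.finishTime = finishTime;
            this.exitStatus = exitStatus;
        }
    }

    /** Merged view; a field stays null when its source record is missing. */
    static final class HistoryData {
        final String containerId;
        Long startTime;
        Long finishTime;
        Integer exitStatus;
        HistoryData(String containerId) { this.containerId = containerId; }
    }

    /**
     * Merges whatever is available. Either argument may be null.
     * Assumption: when neither record was read, there is nothing to report,
     * so the method returns null instead of an empty object.
     */
    static HistoryData merge(String containerId, StartData start, FinishData finish) {
        if (start == null && finish == null) {
            return null;
        }
        HistoryData data = new HistoryData(containerId);
        if (start != null) {
            data.startTime = start.startTime;
        }
        if (finish != null) {
            data.finishTime = finish.finishTime;
            data.exitStatus = finish.exitStatus;
        }
        return data;
    }

    public static void main(String[] args) {
        // A container that was started (or reserved) but never finished.
        HistoryData d = merge("container_1", new StartData(100L), null);
        System.out.println(d.containerId + " start=" + d.startTime
            + " finish=" + d.finishTime);
    }
}
```

A test for the missing-data case would then call merge() with only a start 
record, only a finish record, and neither, and check that no exception is 
thrown and that the absent fields are left unset.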

> Fix how to read history file in FileSystemApplicationHistoryStore
> -----------------------------------------------------------------
>
>                 Key: YARN-1578
>                 URL: https://issues.apache.org/jira/browse/YARN-1578
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>    Affects Versions: YARN-321
>            Reporter: Shinichi Yamashita
>            Assignee: Shinichi Yamashita
>         Attachments: YARN-1578-2.patch, YARN-1578.patch, 
> application_1390978867235_0001, resoucemanager.log, screenshot.png, 
> screenshot2.pdf
>
>
> I ran a PiEstimator job on a Hadoop cluster with YARN-321 applied.
> After the job ended, I accessed the Web UI of the HistoryServer and it 
> displayed "500". The HistoryServer daemon log showed the following.
> {code}
> 2014-01-09 13:31:12,227 ERROR org.apache.hadoop.yarn.webapp.Dispatcher: error 
> handling URI: 
> /applicationhistory/appattempt/appattempt_1389146249925_0008_000001
> java.lang.reflect.InvocationTargetException
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at 
> org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:153)
>         at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
> (snip...)
> Caused by: java.lang.NullPointerException
>         at 
> org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore.mergeContainerHistoryData(FileSystemApplicationHistoryStore.java:696)
>         at 
> org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore.getContainers(FileSystemApplicationHistoryStore.java:429)
>         at 
> org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerImpl.getContainers(ApplicationHistoryManagerImpl.java:201)
>         at 
> org.apache.hadoop.yarn.server.webapp.AppAttemptBlock.render(AppAttemptBlock.java:110)
> (snip...)
> {code}
> From the ApplicationHistory file, I confirmed that there was a container 
> that had not finished.
> According to the ResourceManager daemon log, the ResourceManager reserved 
> this container but did not allocate it.
> This problem occurs when FileSystemApplicationHistoryStore reads container 
> information that has no finish data in the history file.
> To handle the case where there is no finish data, we should fix how 
> FileSystemApplicationHistoryStore reads the history file.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
