[ 
https://issues.apache.org/jira/browse/YARN-4736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15169753#comment-15169753
 ] 

Naganarasimha G R commented on YARN-4736:
-----------------------------------------

Thanks for the analysis [~sjlee0],
bq. Both the thread dump and the HBase exception log are from the client 
process (NM side), correct? 
Yes, both are client-side exceptions (i.e. NM). I was not sure whether issue 2 
is related to issue 1, but based on your explanation it seems to be. 

bq. some time after that, it looks like you issued a signal to stop the client 
process (NM)?
Yes. The job completed at 00:02:28, the error log appeared a significant time 
later (00:39:03), and I tried to stop the NM a significant time after that, at 
01:09:19. Even when I try to stop the NM immediately after the job completes, 
I see the same issue (the NM does not stop completely).

bq. That's why I thought there seems to be a HBase bug that is causing the 
flush operation to be wedged in this state. At least that explains why you were 
not able to shut down the collector (and therefore NM).
Yes, maybe. To confirm, are the server-side logs required? I am using *HBase 
- Version 1.0.2*. I can share any other information if required. cc/ 
[[email protected]]. 
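
To illustrate the hang discussed above: if a flush call never returns and shutdown waits on it unconditionally, the process cannot exit. The following is only a minimal sketch of bounding that wait, not the actual HBaseTimelineWriterImpl or HBase client code. The names {{BoundedFlushSketch}}, {{FlushTarget}} and {{flushWithTimeout}} are hypothetical and exist only for this example.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BoundedFlushSketch {

    /** Hypothetical stand-in for a writer whose flush may block indefinitely. */
    interface FlushTarget {
        void flush() throws Exception;
    }

    /**
     * Runs flush() on a daemon worker thread and waits at most timeoutMs.
     * Returns true if the flush finished, false if it timed out or failed.
     */
    static boolean flushWithTimeout(FlushTarget target, long timeoutMs) {
        ExecutorService pool = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "flush-worker");
            t.setDaemon(true); // a stuck flush must not keep the JVM alive
            return t;
        });
        Future<?> result = pool.submit(() -> {
            target.flush();
            return null;
        });
        try {
            result.get(timeoutMs, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            // Interrupt the worker; a call that ignores interrupts stays stuck,
            // but the caller is no longer blocked on it.
            result.cancel(true);
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        } catch (Exception e) {
            return false; // the flush itself threw
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // A flush that completes immediately.
        System.out.println(flushWithTimeout(() -> { }, 1000));

        // A flush that never returns, simulating the wedged state.
        CountDownLatch never = new CountDownLatch(1);
        System.out.println(flushWithTimeout(never::await, 200));
    }
}
```

With a bound like this the caller can log the timeout and continue stopping. Whether that is acceptable for the collector depends on how much unflushed data may be lost, which this sketch does not address.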




> Issues with HBaseTimelineWriterImpl
> -----------------------------------
>
>                 Key: YARN-4736
>                 URL: https://issues.apache.org/jira/browse/YARN-4736
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>    Affects Versions: YARN-2928
>            Reporter: Naganarasimha G R
>            Assignee: Vrushali C
>            Priority: Critical
>              Labels: yarn-2928-1st-milestone
>         Attachments: hbaseException.log, threaddump.log
>
>
> Faced some issues while running ATSv2 on a single-node Hadoop cluster, with 
> HBase (embedded ZooKeeper) launched on the same node.
> # Due to some NPE issues, I could see the NM trying to shut down, but the 
> NM daemon process did not exit because of the locks.
> # Got some exceptions related to HBase after the application finished 
> execution successfully. 
> Will attach logs and the trace for the same.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
