[
https://issues.apache.org/jira/browse/YARN-4736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15198175#comment-15198175
]
Sangjin Lee commented on YARN-4736:
-----------------------------------
Yes, sorry I meant HBASE-15436.
I think we're more OK with the situation where the entire HBase cluster is down
or the master is down. That's a critical situation, and all bets are off at
that point.
My concern is more if one region server went down or is in a state where it
times out writes and your {{BufferedMutatorImpl}} needs to flush to it. If that
flush operation times out after 30+ minutes, that would be a significant
problem. [~anoop.hbase], would things take 30+ minutes to time out if a region
server (rather than the cluster itself) is down or misbehaving? Thoughts?
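To illustrate the concern, here is a minimal sketch of one way a caller could put an upper bound on a blocking flush, using only {{java.util.concurrent}}. It does not use the HBase client API: the {{Runnable}} stands in for whatever call performs the flush, and the class and method names ({{BoundedFlush}}, {{flushWithTimeout}}) are hypothetical. Whether this approach is appropriate for the timeline writer is an open question, not something settled in this thread.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of HBase or YARN.
final class BoundedFlush {

    private BoundedFlush() {
    }

    /**
     * Runs the given flush action on a worker thread and waits at most
     * timeoutMs for it to finish.
     *
     * @return true if the action completed in time, false if it timed out
     *         or the calling thread was interrupted while waiting
     */
    static boolean flushWithTimeout(Runnable flush, long timeoutMs) {
        // Daemon thread so a stuck flush cannot keep the JVM alive on shutdown.
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "bounded-flush");
            t.setDaemon(true);
            return t;
        });
        Future<?> future = executor.submit(flush);
        try {
            future.get(timeoutMs, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            // Give up waiting and interrupt the worker; the action itself
            // must respond to interruption for this to actually stop it.
            future.cancel(true);
            return false;
        } catch (InterruptedException e) {
            // Preserve the caller's interrupt status.
            Thread.currentThread().interrupt();
            future.cancel(true);
            return false;
        } catch (ExecutionException e) {
            throw new RuntimeException("flush failed", e.getCause());
        } finally {
            executor.shutdownNow();
        }
    }
}
```

Note that the timeout only bounds how long the caller waits. If the underlying operation ignores interruption, the worker thread may keep running in the background until its own timeouts expire.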
> Issues with HBaseTimelineWriterImpl
> -----------------------------------
>
> Key: YARN-4736
> URL: https://issues.apache.org/jira/browse/YARN-4736
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: timelineserver
> Affects Versions: YARN-2928
> Reporter: Naganarasimha G R
> Assignee: Vrushali C
> Priority: Critical
> Labels: yarn-2928-1st-milestone
> Attachments: NM_Hang_hbase1.0.3.tar.gz, hbaseException.log,
> threaddump.log
>
>
> Faced some issues while running ATSv2 on a single-node Hadoop cluster, with
> HBase (using embedded ZooKeeper) launched on the same node.
> # Due to some NPE issues, I could see that the NM was trying to shut down,
> but the NM daemon process did not exit because of the locks.
> # Got some HBase-related exceptions after the application finished executing
> successfully.
> Will attach logs and the trace for the same.