[ https://issues.apache.org/jira/browse/YARN-9374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16867631#comment-16867631 ]
Szilard Nemeth commented on YARN-9374:
--------------------------------------

Hi [~Prabhu Joseph]!
Thanks for this patch! Just a minor comment:
{code:java}
HBaseTimelineWriterImpl hbi = new HBaseTimelineWriterImpl();
{code}
Could you rename hbi to writer?

One more thing: I understand what happens here:
{code:java}
util.shutdownMiniHBaseCluster();
GenericTestUtils.waitFor(() -> hbi.isHBaseDown(), 1000, 100000);
boolean exceptionCaught = false;
try{
  hbi.write(new TimelineCollectorContext("ATS1", "user1", "flow2",
      "AB7822C10F1111", 1002345678919L, appId), te,
      UserGroupInformation.createRemoteUser("user1"));
} catch (Exception e) {
  exceptionCaught = true;
}
assertTrue("HBaseStorageMonitor failed to detect HBase Down", exceptionCaught);
{code}
Here you expect an exception because you call write on the writer while HBase is down.

Right after this code block, you have:
{code:java}
util.startMiniHBaseCluster(1, 1);
GenericTestUtils.waitFor(() -> !hbi.isHBaseDown(), 1000, 100000);
try {
  hbi.write(new TimelineCollectorContext("ATS", "user1", "flow3",
      "AB7822C10F1111", 1002345678919L, appId), te,
      UserGroupInformation.createRemoteUser("user1"));
} catch (Exception e) {
  Assert.fail("HbaseStorageMonitor failed to detect HBase Up");
}
{code}
I don't really get this part. You simulate HBase coming back up and then try to write a timeline entity. But if an exception is thrown at that point, I don't think you can be sure it happened because the HbaseStorageMonitor failed to detect that HBase is up again; the write could fail for other reasons. I would therefore rethink the assertion message. What was your intention here?

Thanks!
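To illustrate the point about the assertion message: one option is to report that the write failed after the restart and include the caught exception, so the real cause stays visible instead of being attributed to the monitor. The following is only a minimal sketch under assumptions. The {{Writer}} interface and the {{describeWriteFailure}} helper are hypothetical stand-ins made up for this example; they are not part of HBaseTimelineWriterImpl or of the patch.

```java
import java.io.IOException;

public class WriteAfterRestartSketch {

  /** Hypothetical stand-in for the writer under test (not the real API). */
  public interface Writer {
    void write(String entity) throws IOException;
  }

  /**
   * Attempts a write and returns null on success. On failure it returns a
   * message that states what was observed (the write failed) and appends
   * the caught exception, rather than guessing at the cause.
   */
  public static String describeWriteFailure(Writer writer, String entity) {
    try {
      writer.write(entity);
      return null;
    } catch (Exception e) {
      return "Write failed after HBase was restarted: " + e;
    }
  }

  public static void main(String[] args) {
    // A writer that succeeds and one that fails for an unrelated reason.
    Writer healthy = entity -> { };
    Writer broken = entity -> {
      throw new IOException("table not found");
    };

    System.out.println(describeWriteFailure(healthy, "entity1"));
    System.out.println(describeWriteFailure(broken, "entity1"));
  }
}
```

In the actual test, the equivalent would be to pass such a message (including the exception) to Assert.fail, so that a failure unrelated to the storage monitor is not misreported.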
> HBaseTimelineWriterImpl sync writes has to avoid thread blocking if storage down
> --------------------------------------------------------------------------------
>
>                 Key: YARN-9374
>                 URL: https://issues.apache.org/jira/browse/YARN-9374
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: ATSv2
>    Affects Versions: 3.2.0
>            Reporter: Prabhu Joseph
>            Assignee: Prabhu Joseph
>            Priority: Major
>         Attachments: YARN-9374-001.patch, YARN-9374-002.patch, YARN-9374-003.patch
>
>
> HBaseTimelineWriterImpl sync writes has to avoid thread blocking if storage
> is down. Currently we check if hbase storage is down in TimelineReader before
> reading entities and fail immediately in YARN-8302. Similar fix is needed for
> write. Async is handled in YARN-9335.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)