[ https://issues.apache.org/jira/browse/YARN-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sangjin Lee updated YARN-4074:
------------------------------
    Attachment: YARN-4074-YARN-2928.POC.003.patch

POC v.3 patch posted.

Key changes include
- switched from Get.setMaxResultSize() to PageFilter (more on that below)
- major refactoring of HBaseTimelineReaderImpl
-- introduced TimelineEntityReader and the hierarchy of classes to isolate proper reading per type
- added unit tests to test HBaseTimelineReaderImpl for flow activity and flow runs
- fixed an issue with FlowScanner where the cells were returned in the wrong order so it was breaking Column.readResult()
- made *RowKey classes real object classes, and added the parseRowKey method that returns an instance of the RowKey
- fixed the order of the add and pollLast
- renamed FlowEntity to FlowRunEntity
- added the compareTo() method for FlowActivityEntity
- passed the type into the FlowActivityEntity constructor
- set configs for FlowActivityEntity and FlowRunEntity to null
- improved the way we get string values from info for FlowActivityEntity and FlowRunEntity
- added getNumberOfRuns() to FlowActivityEntity

It is actually pretty close to being ready, but since YARN-3901 is still outstanding, I'm not making it an official patch yet.

As for the PageFilter issue, I concluded that setMaxResultSize() is not the right API to use to limit the number of rows. I believe PageFilter is the right thing to use. I also added counting logic so that we get the right number of records even if the result iterator advances.

As for the FlowScanner issue mentioned above, [~vrushalic] and [~jrottinghuis] debugged this and tracked it down to a bug in YARN-3901. As such, this change will likely be made in the final YARN-3901 patch. I just included it here for completeness and to make the unit tests pass. You should be able to apply the YARN-3901 v.3 patch and then this patch cleanly.

Let me know if you have any questions. I'd greatly appreciate review feedback. I understand it's a lot of code...
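
For reviewers who want a picture of the "order of the add and pollLast" item: the patch code is not reproduced in this message, so the following is only a minimal sketch of the general pattern, not the actual HBaseTimelineReaderImpl code. It assumes a sorted set is used to keep the first N entries under some ordering (for example, most recent first). The class name BoundedTopN and the method firstN are hypothetical. The point it illustrates is that adding first and then evicting the last element keeps the right N entries, whereas evicting before adding can throw away an entry that should have been kept.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

// Hypothetical helper; not part of the YARN-4074 patch.
class BoundedTopN {

    /**
     * Returns at most 'limit' items, namely those that sort first under 'order'.
     * Items that compare as equal are collapsed, because a TreeSet is used.
     */
    static <T> List<T> firstN(Iterable<T> items, int limit, Comparator<? super T> order) {
        TreeSet<T> kept = new TreeSet<>(order);
        for (T item : items) {
            // Add first, so the new item competes with everything already kept...
            kept.add(item);
            // ...then evict whatever now sorts last if we are over the limit.
            if (kept.size() > limit) {
                kept.pollLast();
            }
        }
        return new ArrayList<>(kept);
    }

    public static void main(String[] args) {
        // Timestamps standing in for flow activity; reverse order = most recent first.
        List<Long> timestamps = Arrays.asList(5L, 1L, 9L, 3L);
        Comparator<Long> mostRecentFirst = Comparator.reverseOrder();
        System.out.println(firstN(timestamps, 2, mostRecentFirst)); // prints [9, 5]
    }
}
```

With the calls reversed (pollLast when the set is full, then add), the last input 3 would evict 5 and the result would be [9, 3], which is not the two most recent values.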

> [timeline reader] implement support for querying for flows and flow runs
> ------------------------------------------------------------------------
>
>                 Key: YARN-4074
>                 URL: https://issues.apache.org/jira/browse/YARN-4074
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>    Affects Versions: YARN-2928
>            Reporter: Sangjin Lee
>            Assignee: Sangjin Lee
>         Attachments: YARN-4074-YARN-2928.POC.001.patch, YARN-4074-YARN-2928.POC.002.patch, YARN-4074-YARN-2928.POC.003.patch
>
>
> Implement support for querying for flows and flow runs.
> We should be able to query for the most recent N flows, etc.
> This includes changes to the {{TimelineReader}} API if necessary, as well as implementation of the API.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)