[ 
https://issues.apache.org/jira/browse/YARN-3863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15054308#comment-15054308
 ] 

Varun Saxena commented on YARN-3863:
------------------------------------

[~sjlee0], [~djp], kindly review.
The latest WIP patch adds the following over the first WIP patch:

# Relationships (relatesTo/isRelatedTo) and event filters are now represented 
as timeline filter lists, in addition to info, config and metric filters, so 
that queries combining ANDs and ORs can be supported for them as well.
# Events in the entity and application tables are represented as follows:
The column qualifier is of the form {{e!\[eventid\]=\[event_timestamp\]=\{event 
info key\}}}. The info key is part of the column qualifier only if event info 
exists. The value associated with the column qualifier is the info value; if no 
info exists, the value is empty.
Event filters check for the existence of a particular event. With this 
arrangement there is no analogous HBase filter that can filter out rows for us 
(given complex filters containing ANDs and ORs), so event filters are applied 
in the timeline reader after fetching rows from HBase.
What we can do to reduce the amount of data fetched from HBase is fetch only 
those columns which are required for matching the event filters. This is what 
the patch does, using QualifierFilter.
Please note that we do this only if the fields to retrieve do not contain 
EVENTS, because in that case all events have to be fetched anyway.
# Relationships (isRelatedTo and relatesTo) are stored as follows:
The column qualifier is {{r!\[entitytype\]}} or {{s!\[entitytype\]}} and the 
associated value is a list of entity ids separated by =, i.e. 
{{entityid1=entityid2=entityid3}}
Storing the value this way makes it difficult to use SingleColumnValueFilter. 
We could probably use a regex comparator, but building the regex dynamically 
from the query may not be feasible and would in any case make matching slow on 
the HBase side.
So here too we fetch only the required columns, as we do for event filters.
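
To make the reader-side matching described above concrete, here is a minimal 
sketch in plain Java. It is not code from the patch: the class and method names 
are hypothetical, qualifiers are treated as plain strings, and it assumes event 
ids and entity ids never contain the = separator. It shows how event ids can be 
read from fetched qualifiers, how a relationship value can be split into entity 
ids, how nested AND/OR existence filters can be evaluated on the result, and 
how the qualifier prefixes worth fetching could be derived from the event ids a 
filter mentions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical sketch; none of these names come from the patch or from HBase.
class TimelineFilterSketch {

  /** Collects event ids from qualifiers of the form e!eventid=timestamp[=infokey]. */
  static Set<String> eventIds(Iterable<String> qualifiers) {
    Set<String> ids = new HashSet<>();
    for (String q : qualifiers) {
      if (q.startsWith("e!")) {
        // The event id is everything between the prefix and the first '='.
        String rest = q.substring(2);
        int sep = rest.indexOf('=');
        ids.add(sep < 0 ? rest : rest.substring(0, sep));
      }
    }
    return ids;
  }

  /** Splits a relationship column value such as "id1=id2=id3" into entity ids. */
  static Set<String> relatedIds(String columnValue) {
    if (columnValue.isEmpty()) {
      return new HashSet<>();
    }
    return new HashSet<>(Arrays.asList(columnValue.split("=")));
  }

  /** Qualifier prefixes that cover the columns needed to check the given event ids. */
  static List<String> eventColumnPrefixes(Collection<String> wantedEventIds) {
    List<String> prefixes = new ArrayList<>();
    for (String id : wantedEventIds) {
      prefixes.add("e!" + id + "=");
    }
    return prefixes;
  }

  /** Leaf filter: the given id must be present. */
  static Predicate<Set<String>> exists(String id) {
    return ids -> ids.contains(id);
  }

  /** AND list: every child filter must match. */
  @SafeVarargs
  static Predicate<Set<String>> and(Predicate<Set<String>>... filters) {
    return ids -> Arrays.stream(filters).allMatch(f -> f.test(ids));
  }

  /** OR list: at least one child filter must match. */
  @SafeVarargs
  static Predicate<Set<String>> or(Predicate<Set<String>>... filters) {
    return ids -> Arrays.stream(filters).anyMatch(f -> f.test(ids));
  }

  public static void main(String[] args) {
    // Made-up qualifiers of one fetched row; the last one is not an event column.
    List<String> qualifiers = Arrays.asList(
        "e!start=1449800000000",
        "e!finish=1449800005000=exitCode",
        "x!other");
    Set<String> events = eventIds(qualifiers);

    // start AND (finish OR killed)
    Predicate<Set<String>> filter =
        and(exists("start"), or(exists("finish"), exists("killed")));
    System.out.println(filter.test(events)); // prints true

    Set<String> related = relatedIds("entityid1=entityid2=entityid3");
    System.out.println(exists("entityid2").test(related)); // prints true
  }
}
```

In a real reader the prefixes returned by eventColumnPrefixes would be handed 
to whatever column-level HBase filter the scan uses, and the nested predicate 
would then be evaluated on the returned rows. Byte encoding, separator 
escaping and timestamp handling are deliberately left out of this sketch.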

Also, Naga told me that in the meeting you wanted the reader API to be 
refactored as well.
I have that in mind. As this patch is already quite large, I think we can do 
that refactoring in another JIRA. Or do you want to do it here?
I have to raise a few JIRAs, including one for this refactoring.

> Enhance filters in TimelineReader
> ---------------------------------
>
>                 Key: YARN-3863
>                 URL: https://issues.apache.org/jira/browse/YARN-3863
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>    Affects Versions: YARN-2928
>            Reporter: Varun Saxena
>            Assignee: Varun Saxena
>              Labels: yarn-2928-1st-milestone
>         Attachments: YARN-3863-feature-YARN-2928.wip.003.patch, 
> YARN-3863-feature-YARN-2928.wip.01.patch, 
> YARN-3863-feature-YARN-2928.wip.02.patch
>
>
> Currently filters in timeline reader will return an entity only if all the 
> filter conditions hold true i.e. only AND operation is supported. We can 
> support OR operation for the filters as well. Additionally as primary backend 
> implementation is HBase, we can design our filters in a manner, where they 
> closely resemble HBase Filters.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
