[
https://issues.apache.org/jira/browse/YARN-3863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15177509#comment-15177509
]
Varun Saxena commented on YARN-3863:
------------------------------------
Thanks [~sjlee0] for the review.
bq. One high level question: am I correct in understanding that if a relations
filter is specified for example but relation was not specified as part of
fields to retrieve, we would try to fetch the relation?
Yes, we would try to fetch only those relations which are required to match the
relation filters. Same goes for event filters. We will try to fetch only those
events which are required to match event filters if fields to retrieve does not
specify EVENTS.
bq. What if we simply reject or ignore the filters if they do not match the
fields to retrieve? Would it make the implementation simpler or harder?
It will remove the need for some of the code in GenericEntityReader and
ApplicationEntity, i.e. primarily the code in methods
{{fetchPartialColsFromInfoFamily}} and {{createFilterListForColsOfInfoFamily}}.
bq. To me, supporting more contents even if the filters and the fields to
retrieve are not consistent seems very much optional, and I'm not sure if it is
worth it especially if it adds a lot more complexity. What do you think?
Personally I think fields to retrieve and filters should be treated separately.
Filters decide which entities to carry back in the response, and
fields/configs/metrics to retrieve decide what should be carried in each entity.
Treating filters and fields to retrieve separately is consistent with code
written previously in the branch, but as this is new code we can change the
behavior too. I am not very sure we should do so, though.
For instance, if I want to get the IDs of all the FINISHED apps, I can make a
query with eventfilters as APPLICATION_FINISHED and not specify anything in
fields to retrieve, as I am only interested in the application ID. If I link it to
fields to retrieve, I will have to unnecessarily fetch other events as well,
which I have no interest in. This also increases the number of bytes transferred
across the wire. Moreover, events have associated info as well.
Maybe along the lines of confs/metrics to retrieve we can have something like
events to retrieve as well, but in all these cases one query param depends on
another, which doesn't sound right to me.
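To make the FINISHED apps example concrete, here is a minimal sketch of keeping filters and fields to retrieve independent. It assumes a simplified in-memory model; the class, enum and method names ({{FilterVsFieldsSketch}}, {{Field}}, {{query}}) are hypothetical and are not the actual reader code in the patch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified model; NOT the actual TimelineReader classes.
final class FilterVsFieldsSketch {
  enum Field { EVENTS }

  static final class Entity {
    final String id;
    final List<String> eventIds;

    Entity(String id, List<String> eventIds) {
      this.id = id;
      this.eventIds = eventIds;
    }
  }

  // The event filter decides WHICH entities come back; fieldsToRetrieve
  // decides WHAT each returned entity carries. Neither depends on the other.
  static List<Entity> query(List<Entity> stored, String eventFilter,
      Set<Field> fieldsToRetrieve) {
    List<Entity> result = new ArrayList<>();
    for (Entity e : stored) {
      // Events are consulted for matching even when EVENTS is not requested.
      if (eventFilter != null && !e.eventIds.contains(eventFilter)) {
        continue;
      }
      // Events are carried in the response only if explicitly requested.
      List<String> events = fieldsToRetrieve.contains(Field.EVENTS)
          ? e.eventIds
          : Collections.<String>emptyList();
      result.add(new Entity(e.id, events));
    }
    return result;
  }
}
```

With an event filter of APPLICATION_FINISHED and an empty set of fields to retrieve, this returns only the matching entities with their IDs and no events, which is the behavior argued for above.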
Thoughts?
We can discuss further on this in today's meeting.
bq. I know Vrushali C had some thoughts on how to split this monolithic
TestHBaseTimelineStorage. It might be good to come to a consensus on how to
split it...
Ok. I had split it across apps and entities. We can seek her opinion too on
this in today's meeting.
I will check the other comments when I start coding the next version of the
patch. Most sound valid and fixable.
> Support complex filters in TimelineReader
> -----------------------------------------
>
> Key: YARN-3863
> URL: https://issues.apache.org/jira/browse/YARN-3863
> Project: Hadoop YARN
> Issue Type: Sub-task
> Affects Versions: YARN-2928
> Reporter: Varun Saxena
> Assignee: Varun Saxena
> Labels: yarn-2928-1st-milestone
> Attachments: YARN-3863-YARN-2928.v2.01.patch,
> YARN-3863-YARN-2928.v2.02.patch, YARN-3863-YARN-2928.v2.03.patch,
> YARN-3863-feature-YARN-2928.wip.003.patch,
> YARN-3863-feature-YARN-2928.wip.01.patch,
> YARN-3863-feature-YARN-2928.wip.02.patch,
> YARN-3863-feature-YARN-2928.wip.04.patch,
> YARN-3863-feature-YARN-2928.wip.05.patch
>
>
> Currently filters in timeline reader will return an entity only if all the
> filter conditions hold true i.e. only AND operation is supported. We can
> support OR operation for the filters as well. Additionally as primary backend
> implementation is HBase, we can design our filters in a manner, where they
> closely resemble HBase Filters.