[jira] [Commented] (YARN-6027) Improve /flows API for more flexible filters fromid, collapse, userid

Rohith Sharma K S (JIRA) Tue, 07 Feb 2017 18:41:48 -0800

    [ 
https://issues.apache.org/jira/browse/YARN-6027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15857267#comment-15857267
 ]


Rohith Sharma K S commented on YARN-6027:
-----------------------------------------

Thanks [~varun_saxena] for the review.
bq. Do we need cluster ID in fromId because we are ignoring it completely?
Yes, it is required even though it is ignored, considering how fromId is 
used. We do not want the user to parse something and provide it as fromId. The 
user can directly provide the flow entity ID as fromId and let the reader 
server handle it. A cluster ID check can be done to verify that the context 
cluster ID and the cluster ID in fromId are equal. Ideally both should match; 
otherwise we can throw an exception.
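
A rough sketch of such a cluster ID check is shown here. It is not the code 
from the patch: the class name, the method names, and the "!" separator 
between row key components are all assumptions made for illustration, and the 
real flow entity ID encoding may differ.

```java
// Hypothetical helper, not taken from the patch.
final class FromIdClusterCheck {
  // Assumed separator between row key components; the real encoding may differ.
  static final String SEPARATOR = "!";

  // Returns the leading component of fromId, assumed to be the cluster ID.
  static String clusterOf(String fromId) {
    int idx = fromId.indexOf(SEPARATOR);
    if (idx <= 0) {
      throw new IllegalArgumentException("Invalid fromId: " + fromId);
    }
    return fromId.substring(0, idx);
  }

  // Throws if the cluster ID embedded in fromId differs from the context one.
  static void verifyCluster(String contextClusterId, String fromId) {
    String fromCluster = clusterOf(fromId);
    if (!fromCluster.equals(contextClusterId)) {
      throw new IllegalArgumentException("fromId cluster '" + fromCluster
          + "' does not match context cluster '" + contextClusterId + "'");
    }
  }
}
```

With this shape the user passes the flow entity ID back unchanged, and the 
reader server alone decides whether it is usable.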

bq. If there is a / in cluster ID we may have to escape it to avoid parsing 
errors.
If parsing errors are a concern, then why does the flow entity ID provide the 
full row key as the ID? I think we need to change the flow entity ID format 
itself. 

bq. If we use collapse, even with fromId, there seems to be a full table scan 
which will impact
Yes, it does a table scan. But collapse is expected to be used with a date 
range; otherwise the default behavior of /flows should be changed to return 
one day of flows rather than the full table data. It is an engineering issue, 
and maybe we can mention that performance will be a bit slow. 

bq. Maybe we can send the last real ID in info field of last flow activity 
entity if previous query was made with collapse field
Initially the idea was to send the last real ID as fromId in the info field. 
But flows are stored per day for each user, so that is not useful. Note that 
when collapse is used, we must scan to get all entities and then apply fromId. 
The scan cannot be done halfway, because that would end up producing redundant 
entries for the user. Provided the previous comment is addressed, this should 
not be an issue. 

bq. you have mentioned that fromId validation is happening in getResult method. 
Could not find it
Ahh, I think I missed it at the global level. I am validating it in only one 
condition. I will validate it at the global level.

bq. In processResults we first get the result from backend while applying limit 
and then process result for collapse and fromId filters.
If you look at the patch, I have removed the PageFilter while scanning, so the 
scan returns all the data. One optimization I can make is to apply PageFilter 
in non-collapse mode, because there the scan starts from the given fromId. But 
the same logic cannot be used for collapse mode. 
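
To illustrate why the limit cannot be pushed into the scan in collapse mode, 
here is a minimal in-memory sketch. It is not the patch itself: the 
FlowActivity class, the "user/flowName" collapse key, and the inclusive 
treatment of fromKey are assumptions made for this example. The per-day rows 
are collapsed first, and only then are fromKey and the limit applied.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch, not taken from the patch.
final class CollapseSketch {
  // One scanned row: a flow is stored once per day per user.
  static final class FlowActivity {
    final String day;
    final String user;
    final String flowName;

    FlowActivity(String day, String user, String flowName) {
      this.day = day;
      this.user = user;
      this.flowName = flowName;
    }

    // Assumed collapse key: rows for the same user and flow merge across days.
    String key() {
      return user + "/" + flowName;
    }
  }

  // Collapses all scanned rows, then applies fromKey (inclusive) and the limit.
  static List<String> collapseThenPage(List<FlowActivity> scanned,
      String fromKey, int limit) {
    // LinkedHashSet drops duplicate keys while keeping scan order.
    Set<String> collapsed = new LinkedHashSet<>();
    for (FlowActivity fa : scanned) {
      collapsed.add(fa.key());
    }
    List<String> page = new ArrayList<>();
    boolean started = (fromKey == null);
    for (String key : collapsed) {
      if (!started && key.equals(fromKey)) {
        started = true;
      }
      if (started && page.size() < limit) {
        page.add(key);
      }
    }
    return page;
  }
}
```

If the limit were applied during the scan instead, a page of raw rows could 
contain the same user and flow several times, and the collapsed result would 
come back shorter than the requested limit.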

> Improve /flows API for more flexible filters fromid, collapse, userid
> ---------------------------------------------------------------------
>
>                 Key: YARN-6027
>                 URL: https://issues.apache.org/jira/browse/YARN-6027
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Rohith Sharma K S
>            Assignee: Rohith Sharma K S
>              Labels: yarn-5355-merge-blocker
>         Attachments: YARN-6027-YARN-5355.0001.patch
>
>
> In YARN-5585, fromId is supported for retrieving entities. We need a similar 
> filter for flows/flowRun apps and flow run and flow as well. 
> Along with supporting fromId, this JIRA should also discuss the following 
> points:
> * Should we throw an exception for entities/entity retrieval if duplicates 
> are found?
> * TimelineEntity:
> ** Should the equals method also check for idPrefix?
> ** Is idPrefix part of the identifiers?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
