[ 
https://issues.apache.org/jira/browse/YARN-6027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15857803#comment-15857803
 ] 

Varun Saxena commented on YARN-6027:
------------------------------------

bq.  But it is expected to collapse with date range
Ok, then it's fine. I was thinking we would try to display all the flows (say 10 
per page) in the UI. If it is based on a date range, then it should be fine in 
terms of performance.
I guess we will probably display flows for the current day only.
We can probably leave a note in the javadoc that, in general, a suitable date 
range should be provided for this REST endpoint.

bq.  User can directly provide flow entity ID as fromId.
Ohh, you are providing the ID itself. Maybe we should leave a note in the 
javadoc and documentation that the cluster part of it will be ignored, and that 
in the collapse case, cluster and timestamp will both be ignored. In the UI 
case, the cluster would be the same as the one in the REST endpoint, but you can 
also form fromId manually and provide a different cluster ID than the one in the 
REST URL path param. So we should make the behavior clear.
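Roughly, the parsing could look like the self-contained sketch below. Note this is illustrative only: the class name, the "!" delimiter, and the three-part "cluster!user!flowName" layout are assumptions for the example, not the actual ATSv2 row-key format.

```java
// Hypothetical sketch: '!' delimiter and id layout are assumptions.
public class FromIdParser {
  // Splits a fromId of the assumed form "cluster!user!flowName" and drops
  // the cluster component, since the cluster from the REST URL path param
  // takes precedence over whatever the caller embedded in fromId.
  static String[] parseIgnoringCluster(String fromId) {
    String[] parts = fromId.split("!", 3);
    if (parts.length != 3) {
      throw new IllegalArgumentException("Malformed fromId: " + fromId);
    }
    return new String[] {parts[1], parts[2]}; // user, flowName
  }
}
```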

bq. If need to parse the errors, then why flow entity id is providing full row 
key as id? I think need to change flow entity id format itself.
That is just for read; we do not make any decisions with it. But now we will. 
We can encode or escape the cluster and other parts while creating the ID in 
FlowActivityEntity itself, but when the UI displays it, it may have to unescape 
it. We would also need to unescape it after splitting fromId. Changing the 
format won't make much difference, as some delimiter or other will have to be 
used and that will have to be escaped too. Right? Cluster ID is a plain string 
and we have to assume it can be anything. This would have to be done just to 
make the system more robust, even if we are unlikely to see a given delimiter 
in the cluster name or elsewhere.
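To make the escaping point concrete, here is a minimal sketch. The "!" delimiter and "%"-style escape sequences are made up for illustration; the real implementation would use whatever delimiters the row-key code already defines. The key property is that escape/unescape round-trip, so a "!" inside a cluster name cannot be confused with the delimiter.

```java
// Illustrative only: delimiter and escape sequences are assumptions.
public class IdEscaper {
  // Escape the escape char first, then the delimiter, so unescaping in
  // the reverse order is unambiguous.
  static String escape(String s) {
    return s.replace("%", "%25").replace("!", "%21");
  }
  static String unescape(String s) {
    return s.replace("%21", "!").replace("%25", "%");
  }
  // Build an id whose components may themselves contain the delimiter.
  static String buildId(String cluster, String user, String flow) {
    return escape(cluster) + "!" + escape(user) + "!" + escape(flow);
  }
}
```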

bq. One optimization I can do is PageFilter can be applied in non-collapse mode
Yeah that can be done.

bq. If you look at the patch, I have removed PageFilter while scanning which 
gives all the data. 
Ok... Can't we apply PageFilter in steps in collapse mode? Maybe override 
getResults itself. When we use it with a date range it should be fine, but in 
cases where a date range is not specified, this may help. What I mean is: get 
results from the backend with a PageFilter equivalent to the limit, then 
collapse, and go back and fetch results again if more records are required 
(based on the limit). Something like below. We need to check, however, whether 
PageFilter with limited but possibly multiple fetches will be better than 
fetching all the data. I suspect the former may be better, especially as the 
size of the table grows. Not 100% sure though.
{code}
int tmp = 0;
while (tmp < limit)
   get results with PageFilter = limit
   collapse records
   tmp = tmp + number of collapsed flow entities in this iteration
   break if the backend returned no more rows
end while
{code}
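To sanity-check the loop, here is a self-contained simulation. The "backend" is just a list of flow keys standing in for HBase scan results; real code would re-issue the scan with a PageFilter and a start row past the last page, and the method name is made up for the example.

```java
import java.util.*;

// Simulation of fetch-then-collapse paging; not real HBase code.
public class CollapseLoop {
  static List<String> collapseFetch(List<String> rows, int limit) {
    // LinkedHashSet dedupes ("collapses") while preserving scan order.
    LinkedHashSet<String> collapsed = new LinkedHashSet<>();
    int offset = 0;
    while (collapsed.size() < limit && offset < rows.size()) {
      // Fetch at most 'limit' raw rows, like a PageFilter(limit) scan.
      int end = Math.min(offset + limit, rows.size());
      for (String row : rows.subList(offset, end)) {
        collapsed.add(row);           // collapse = dedupe by flow key
        if (collapsed.size() == limit) {
          break;
        }
      }
      offset = end;                   // next "scan" starts past this page
    }
    return new ArrayList<>(collapsed);
  }
}
```

With many duplicate rows per flow, this touches only a couple of pages instead of the whole table, which is why the stepped approach may win as the table grows.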

Additionally, a few other comments.
# In the TimelineEntityFilters class javadoc, we should document collapse.
# In the javadoc for fromId you mention "The fromId values should be same as 
fromId info field in flow entities. It defines flow entity id.". We do not have 
a fromId field in flow entities. I guess you mean id.
# In TimelineReaderWebServices#getFlows, NumberFormatException can occur for 
fromId as well. In handleException we should pass the correct message for this 
case.

> Improve /flows API for more flexible filters fromid, collapse, userid
> ---------------------------------------------------------------------
>
>                 Key: YARN-6027
>                 URL: https://issues.apache.org/jira/browse/YARN-6027
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Rohith Sharma K S
>            Assignee: Rohith Sharma K S
>              Labels: yarn-5355-merge-blocker
>         Attachments: YARN-6027-YARN-5355.0001.patch
>
>
> In YARN-5585, fromId is supported for retrieving entities. We need a similar 
> filter for flow run apps, flow runs, and flows as well. 
> Along with supporting fromId, this JIRA should also discuss the following 
> points
> * Should we throw an exception for entities/entity retrieval if duplicates 
> are found?
> * TimelineEntity:
> ** Should the equals method also check for idPrefix?
> ** Is idPrefix part of identifiers?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
