[ https://issues.apache.org/jira/browse/YARN-6027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15857803#comment-15857803 ]
Varun Saxena commented on YARN-6027:
------------------------------------

bq. But it is expected to collapse with date range
OK, then it's fine. I was thinking we would try to display all the flows (say 10 on each page) on the UI. If it is based on daterange, it should be fine in terms of performance. I guess we will probably display flows for the current day only. We can leave a note in the javadoc that a suitable daterange should generally be provided for this REST endpoint.

bq. User can directly provide flow entity ID as fromId.
Oh, you are providing the ID itself. Maybe we should leave a note in the javadoc and documentation that the cluster part of it will be ignored, and that in case of collapse both cluster and timestamp will be ignored. In the UI case, the cluster would be the same as the one in the REST endpoint, but fromId can also be formed manually with a cluster ID different from the one in the REST URL path param. So we should make the behavior clear.

bq. If need to parse the errors, then why flow entity id is providing full row key as id? I think need to change flow entity id format itself.
That is just for reading. Until now we did not make any decisions with it, but now we will. We can encode or escape the cluster and the other parts while creating the ID in FlowActivityEntity itself, but when the UI displays it, it may have to unescape it. We would also need to unescape it after splitting fromId. Changing the format won't make much difference, as some delimiter or other will have to be used, and that will have to be escaped too. Right? Cluster ID is a plain string and we have to assume it can be anything. This would have to be done just to make the system more robust, even if a given delimiter is unlikely to appear in the cluster ID or elsewhere.

bq. One optimization I can do is PageFilter can be applied in non-collapse mode
Yeah, that can be done.

bq. If you look at the patch, I have removed PageFilter while scanning which gives all the data.
OK... can't we apply PageFilter in steps in collapse mode? Maybe override getResults itself. With a daterange it should be fine, but in cases where daterange is not specified, this may help. What I mean is: get results from the backend with a PageFilter equivalent to limit, then collapse, and go back and fetch results again if more records are required (based on limit). Something like below. We need to check, however, whether PageFilter with limited but possibly multiple fetches will be better than getting all the data. I suspect the former may be better, especially as the size of the table grows. Not 100% sure though.
{code}
int tmp = 0;
while (tmp < limit)
  get results with PageFilter = limit
  stop if no results are returned
  collapse records
  tmp = tmp + number of collapsed flow entities in this iteration
end while
{code}

Additionally, a few other comments.
# In the TimelineEntityFilters class javadoc, we should document collapse.
# In the javadoc for fromId you mention "The fromId values should be same as fromId info field in flow entities. It defines flow entity id.". We do not have a fromId field in flow entities. I guess you mean id.
# In TimelineReaderWebServices#getFlows, a NumberFormatException can come for fromId as well. In handleException we should pass the correct message for this.

> Improve /flows API for more flexible filters fromid, collapse, userid
> ---------------------------------------------------------------------
>
>                 Key: YARN-6027
>                 URL: https://issues.apache.org/jira/browse/YARN-6027
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Rohith Sharma K S
>            Assignee: Rohith Sharma K S
>              Labels: yarn-5355-merge-blocker
>         Attachments: YARN-6027-YARN-5355.0001.patch
>
>
> In YARN-5585, fromId is supported for retrieving entities. We need a similar
> filter for flows/flowRun apps and flow run and flow as well.
> Along with supporting fromId, this JIRA should also discuss the following points
> * Should we throw an exception for entities/entity retrieval if duplicates
> are found?
> * TimelineEntity :
> ** Should equals method also check for idPrefix?
> ** Is idPrefix part of the identifiers?
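
For illustration only, the stepwise fetch suggested in the comment (fetch a page capped at limit, collapse, fetch again if more flows are needed) could look roughly like the sketch below. It is not taken from the patch: an in-memory list stands in for the flow activity table, and `FlowRow`, `fetchPage`, and `collapsedFlows` are hypothetical names, with `fetchPage` playing the role of a scan capped by a page-sized filter.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class PagedCollapseSketch {
    /** Hypothetical stand-in for one flow activity row (one flow on one day). */
    static final class FlowRow {
        final long day;
        final String user;
        final String flow;

        FlowRow(long day, String user, String flow) {
            this.day = day;
            this.user = user;
            this.flow = flow;
        }
    }

    /** Hypothetical backend call: at most pageSize rows starting at offset, in row order. */
    static List<FlowRow> fetchPage(List<FlowRow> table, int offset, int pageSize) {
        int end = Math.min(table.size(), offset + pageSize);
        if (offset >= end) {
            return Collections.emptyList();
        }
        return table.subList(offset, end);
    }

    /** Fetches pages until limit distinct user/flow pairs are collected or no rows remain. */
    static List<String> collapsedFlows(List<FlowRow> table, int limit) {
        // Insertion-ordered map: collapsed key -> number of rows merged into it.
        Map<String, Integer> collapsed = new LinkedHashMap<>();
        int offset = 0;
        while (collapsed.size() < limit) {
            List<FlowRow> page = fetchPage(table, offset, limit);
            if (page.isEmpty()) {
                break; // table exhausted before reaching limit
            }
            for (FlowRow row : page) {
                String key = row.user + "/" + row.flow;
                // Merge rows of known flows; admit new flows only while under limit.
                if (collapsed.containsKey(key) || collapsed.size() < limit) {
                    collapsed.merge(key, 1, Integer::sum);
                }
            }
            offset += page.size();
        }
        return new ArrayList<>(collapsed.keySet());
    }
}
```

One consequence of stopping as soon as limit flows are found: rows of those flows that sit on later pages are never merged, so any per-flow aggregate would be partial. Whether that is acceptable, and whether several small fetches actually beat one full scan, would need to be measured as the comment notes.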