[ https://issues.apache.org/jira/browse/YARN-1530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14125908#comment-14125908 ]
Sangjin Lee commented on YARN-1530:
-----------------------------------

{quote}
The bottleneck is still there. Essentially I don't see any difference between publishing entities via HTTP REST interface and via HDFS in terms of scalability.
{quote}

IMO, option (1) necessarily entails less frequent imports into the store by ATS. Obviously, if ATS still imports the HDFS files at the same rate at which the timeline entries are generated, there would be no difference in scalability. This option makes sense only if the imports are less frequent; the trade-off is that reads would be more stale. I believe Robert's document covers all of these points.

Regarding option (2), your point that it would be a transition from a thin client to a fat client is valid, and that would bring some complications, as you point out. However, I'm not too sure it would make changing the data store much more complicated than in other scenarios.

I think the main problem with switching the data store arises when not all writers have been updated to point to the new store. If writes are in progress while the clients are being upgraded, there would be inconsistencies between clients that have already been upgraded and started writing to the new store and those that have not yet been upgraded and are still writing to the old store. If you have a single writer (as in the current ATS design), it would be simpler. But then again, if we consider a scenario such as a cluster of ATS instances, the same problem exists there. I think that specific problem could be solved by holding the writes in some sort of backup area (e.g. HDFS) before the switch starts, and recovering/re-enabling them once all the writers are upgraded.

The idea of a cluster of ATS instances (multiple write/read instances) sounds interesting. It might be able to address the scalability/reliability problem at hand. We'd need to think it through and poke holes in it to see whether the idea holds up, however. For example, it would need to address how load balancing would be done and whether it would be left up to the user.

> [Umbrella] Store, manage and serve per-framework application-timeline data
> --------------------------------------------------------------------------
>
>                 Key: YARN-1530
>                 URL: https://issues.apache.org/jira/browse/YARN-1530
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Vinod Kumar Vavilapalli
>         Attachments: ATS-Write-Pipeline-Design-Proposal.pdf, ATS-meet-up-8-28-2014-notes.pdf, application timeline design-20140108.pdf, application timeline design-20140116.pdf, application timeline design-20140130.pdf, application timeline design-20140210.pdf
>
>
> This is a sibling JIRA for YARN-321.
> Today, each application/framework has to store and serve per-framework data all by itself, as YARN doesn't have a common solution. This JIRA attempts to solve the storage, management and serving of per-framework data from various applications, both running and finished. The aim is to change YARN to collect and store data in a generic manner with plugin points for frameworks to do their own thing w.r.t. interpretation and serving.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
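To make the option (1) trade-off discussed in the comment concrete, here is a minimal, hypothetical sketch of an importer that drains entity files from an HDFS staging directory into the ATS store on a coarse, configurable interval, rather than importing entries as fast as they are produced. The staging path, the one-JSON-entity-per-line file format, and the StorePutter interface are assumptions for illustration only and are not part of the actual YARN-1530 design; only the Hadoop FileSystem calls are real APIs.

{code:java}
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class PeriodicTimelineImporter {

  /** Hypothetical hook into whatever store backs the ATS (LevelDB, HBase, ...). */
  public interface StorePutter {
    void put(String entityJson) throws IOException;
  }

  private final FileSystem fs;
  private final Path stagingDir;   // assumed location, e.g. /ats/staging
  private final StorePutter store;
  private final ScheduledExecutorService scheduler =
      Executors.newSingleThreadScheduledExecutor();

  public PeriodicTimelineImporter(Configuration conf, Path stagingDir,
      StorePutter store) throws IOException {
    this.fs = FileSystem.get(conf);
    this.stagingDir = stagingDir;
    this.store = store;
  }

  /** Import staged files every intervalMinutes; a larger interval trades
      read staleness for fewer, larger imports into the store. */
  public void start(long intervalMinutes) {
    scheduler.scheduleAtFixedRate(this::importOnce, intervalMinutes,
        intervalMinutes, TimeUnit.MINUTES);
  }

  private void importOnce() {
    try {
      if (!fs.exists(stagingDir)) {
        return;
      }
      for (FileStatus file : fs.listStatus(stagingDir)) {
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
            fs.open(file.getPath()), StandardCharsets.UTF_8))) {
          String line;
          while ((line = reader.readLine()) != null) {
            store.put(line);               // one serialized entity per line (assumed)
          }
        }
        fs.delete(file.getPath(), false);  // drop the file once imported
      }
    } catch (IOException e) {
      // A real importer would log and retry on the next tick.
    }
  }

  public void stop() {
    scheduler.shutdown();
  }
}
{code}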
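Similarly, the "hold writes in a backup area (e.g. HDFS) during a data store switch, then recover/re-enable" idea from the comment could look roughly like the sketch below. SwitchoverWriter, StorePutter and the backup path are hypothetical names, not an actual YARN API; a real implementation would keep the stream open and batch appends instead of reopening the file per write, and would need a more careful handoff protocol.

{code:java}
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicBoolean;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SwitchoverWriter {

  /** Hypothetical facade over whichever store is currently live. */
  public interface StorePutter {
    void put(String entityJson) throws IOException;
  }

  private final FileSystem fs;
  private final Path backupFile;   // assumed location, e.g. /ats/switchover/backlog
  private final AtomicBoolean switchInProgress = new AtomicBoolean(false);
  private volatile StorePutter store;

  public SwitchoverWriter(Configuration conf, Path backupFile, StorePutter store)
      throws IOException {
    this.fs = FileSystem.get(conf);
    this.backupFile = backupFile;
    this.store = store;
  }

  /** Normal path: write through to the store; during a switch, stage to HDFS. */
  public synchronized void write(String entityJson) throws IOException {
    if (switchInProgress.get()) {
      try (FSDataOutputStream out = fs.exists(backupFile)
          ? fs.append(backupFile) : fs.create(backupFile, false)) {
        out.write((entityJson + "\n").getBytes(StandardCharsets.UTF_8));
      }
    } else {
      store.put(entityJson);
    }
  }

  /** Called before the first writer is pointed at the new store. */
  public void beginSwitch() {
    switchInProgress.set(true);
  }

  /** Called once every writer has been upgraded: replay the backlog into the
      new store, then resume direct writes. */
  public synchronized void completeSwitch(StorePutter newStore) throws IOException {
    this.store = newStore;
    if (fs.exists(backupFile)) {
      try (BufferedReader reader = new BufferedReader(new InputStreamReader(
          fs.open(backupFile), StandardCharsets.UTF_8))) {
        String line;
        while ((line = reader.readLine()) != null) {
          newStore.put(line);
        }
      }
      fs.delete(backupFile, false);
    }
    switchInProgress.set(false);
  }
}
{code}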