[ https://issues.apache.org/jira/browse/YARN-1530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14125908#comment-14125908 ]

Sangjin Lee commented on YARN-1530:
-----------------------------------

{quote}
The bottleneck is still there. Essentially I don’t see any difference between 
publishing entities via HTTP REST interface and via HDFS in terms of 
scalability.
{quote}

IMO, option (1) necessarily entails less frequent imports into the store by 
ATS. Obviously, if ATS still imports the HDFS files at the same rate at which 
the timeline entries are generated, there would be no scalability gain. This 
option makes sense only if the imports are batched less frequently, and the 
trade-off is that reads would be more stale. I believe Robert's document 
already covers these points.
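Just to make option (1) concrete (a sketch only, not a proposal of the actual 
layout or API): a batching publisher could stage serialized entities on HDFS 
and only cut a new file per batch, so the ATS importer touches the store far 
less often than entities are generated. The class name, staging-directory 
layout, and batch-size knob below are all hypothetical.

{code:java}
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BatchedTimelinePublisher {
  private final FileSystem fs;
  private final Path stagingDir;   // hypothetical staging area, e.g. /ats/staging/<appId>
  private final int batchSize;     // larger batch => fewer imports, staler reads
  private final List<String> buffer = new ArrayList<String>();

  public BatchedTimelinePublisher(Configuration conf, Path stagingDir, int batchSize)
      throws IOException {
    this.fs = FileSystem.get(conf);
    this.stagingDir = stagingDir;
    this.batchSize = batchSize;
  }

  /** Buffer a serialized entity; HDFS is only touched once a full batch accumulates. */
  public synchronized void publish(String serializedEntity) throws IOException {
    buffer.add(serializedEntity);
    if (buffer.size() >= batchSize) {
      flush();
    }
  }

  /** Write the buffered entities as one new file for the ATS importer to pick up later. */
  public synchronized void flush() throws IOException {
    if (buffer.isEmpty()) {
      return;
    }
    Path batchFile = new Path(stagingDir, "entities-" + System.currentTimeMillis());
    FSDataOutputStream out = fs.create(batchFile, false);
    try {
      for (String entity : buffer) {
        out.write((entity + "\n").getBytes(StandardCharsets.UTF_8));
      }
      out.hsync();  // make the batch durable before the importer can see it
    } finally {
      out.close();
    }
    buffer.clear();
  }
}
{code}

The larger the batch (or the longer the ATS import interval), the fewer store 
writes, at the cost of staler reads, which is exactly the trade-off above.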

Regarding option (2), your point is valid that it would be a transition from a 
thin client to a fat client, and that it brings the complications you point 
out.

However, I'm not sure it would make changing the data store much more 
complicated than the other scenarios. The main problem with switching the data 
store arises when not all writers have been updated to point to the new store. 
If writes are in progress while clients are being upgraded, there would be 
inconsistencies between clients that have already been upgraded and started 
writing to the new store and those that have not been upgraded yet and are 
still writing to the old one. With a single writer (as in the current ATS 
design) it would be simpler, but the same problem reappears as soon as we 
consider something like a cluster of ATS instances. That specific problem 
could be solved by holding the writes in some sort of backup area (e.g. HDFS) 
before the switch starts, and recovering/re-enabling them once all the writers 
are upgraded.
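To make the "hold writes in a backup area" idea a bit more concrete, here is a 
rough sketch. The EntityStore interface is just a stand-in for whatever store 
API the writers actually use, and the spool directory and switch hooks are 
hypothetical.

{code:java}
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicBoolean;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SwitchAwareWriter {

  /** Hypothetical stand-in for whatever store interface the writers actually use. */
  public interface EntityStore {
    void put(String serializedEntity) throws IOException;
  }

  private final EntityStore newStore;
  private final FileSystem fs;
  private final Path spoolDir;   // backup area on HDFS
  private final AtomicBoolean switching = new AtomicBoolean(false);

  public SwitchAwareWriter(Configuration conf, EntityStore newStore, Path spoolDir)
      throws IOException {
    this.newStore = newStore;
    this.fs = FileSystem.get(conf);
    this.spoolDir = spoolDir;
  }

  /** Called before the store switch starts: park incoming writes instead of applying them. */
  public void beginSwitch() {
    switching.set(true);
  }

  /** While switching, hold the write in the spool; otherwise write through to the store. */
  public void write(String serializedEntity) throws IOException {
    if (switching.get()) {
      Path spoolFile = new Path(spoolDir, "spool-" + System.nanoTime());
      FSDataOutputStream out = fs.create(spoolFile, false);
      try {
        out.write(serializedEntity.getBytes(StandardCharsets.UTF_8));
      } finally {
        out.close();
      }
    } else {
      newStore.put(serializedEntity);
    }
  }

  /** Once all writers point at the new store, replay the spooled writes and resume. */
  public void completeSwitch() throws IOException {
    for (FileStatus status : fs.listStatus(spoolDir)) {
      byte[] data = new byte[(int) status.getLen()];
      FSDataInputStream in = fs.open(status.getPath());
      try {
        in.readFully(data);
      } finally {
        in.close();
      }
      newStore.put(new String(data, StandardCharsets.UTF_8));
      fs.delete(status.getPath(), false);
    }
    switching.set(false);
  }
}
{code}

An operator would call beginSwitch() before repointing the writers, upgrade 
them, and then call completeSwitch() against the new store to replay whatever 
was parked in the spool.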

The idea of a cluster of ATS instances (multiple write/read instances) sounds 
interesting. It might be able to address the scalability/reliability problem 
at hand. We'd need to think it through and poke holes to see whether the idea 
holds up, however. For example, it would need to address how load balancing 
would be done and whether that would be left up to the user.
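For instance, one very rough shape of leaving load balancing to the client 
side would be a simple selector over a list of ATS endpoints; the round-robin 
policy and the idea of a configurable endpoint list are purely illustrative 
here.

{code:java}
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class TimelineEndpointSelector {
  // e.g. populated from a hypothetical list-valued config of ATS web addresses
  private final List<String> endpoints;
  private final AtomicInteger next = new AtomicInteger(0);

  public TimelineEndpointSelector(List<String> endpoints) {
    this.endpoints = endpoints;
  }

  /** Plain round-robin; a real design would also need health checks and failover. */
  public String nextEndpoint() {
    int i = (next.getAndIncrement() & Integer.MAX_VALUE) % endpoints.size();
    return endpoints.get(i);
  }
}
{code}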

> [Umbrella] Store, manage and serve per-framework application-timeline data
> --------------------------------------------------------------------------
>
>                 Key: YARN-1530
>                 URL: https://issues.apache.org/jira/browse/YARN-1530
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Vinod Kumar Vavilapalli
>         Attachments: ATS-Write-Pipeline-Design-Proposal.pdf, 
> ATS-meet-up-8-28-2014-notes.pdf, application timeline design-20140108.pdf, 
> application timeline design-20140116.pdf, application timeline 
> design-20140130.pdf, application timeline design-20140210.pdf
>
>
> This is a sibling JIRA for YARN-321.
> Today, each application/framework has to store and serve per-framework 
> data all by itself as YARN doesn't have a common solution. This JIRA attempts 
> to solve the storage, management and serving of per-framework data from 
> various applications, both running and finished. The aim is to change YARN to 
> collect and store data in a generic manner with plugin points for frameworks 
> to do their own thing w.r.t interpretation and serving.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
