[ https://issues.apache.org/jira/browse/YARN-2670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated YARN-2670:
-----------------------------
    Target Version/s:   (was: 2.8.0)

> Adding feedback capability to capacity scheduler from external systems
> ----------------------------------------------------------------------
>
>                 Key: YARN-2670
>                 URL: https://issues.apache.org/jira/browse/YARN-2670
>             Project: Hadoop YARN
>          Issue Type: New Feature
>            Reporter: Mayank Bansal
>            Assignee: Mayank Bansal
>
> The sheer growth in data volume and Hadoop cluster size makes it a
> significant challenge to diagnose and locate problems in a production-level
> cluster environment efficiently and within a short period of time. Often, the
> distributed monitoring systems are not capable of detecting a problem well in
> advance when a large-scale Hadoop cluster starts to deteriorate in
> performance or becomes unavailable. Thus, incoming workloads scheduled
> between the time when the cluster starts to deteriorate and the time when the
> problem is identified suffer from longer execution times. As a result, both
> the reliability and the throughput of the cluster drop significantly. We
> address this problem by proposing a system called Astro, which consists of a
> predictive model and an extension to the Capacity Scheduler. The predictive
> model in Astro takes into account a rich set of cluster behavioral
> information that is collected by monitoring processes and models it using
> machine learning algorithms to predict the future behavior of the cluster.
> The Astro predictive model detects anomalies in the cluster and also
> identifies a ranked set of metrics that have contributed the most to the
> problem. The Astro scheduler uses the prediction outcome and the list of
> metrics to decide whether it needs to move workloads off the problematic
> cluster nodes, reducing their load, or to prevent additional workload
> allocations to them, in order to improve both the throughput and the
> reliability of the cluster.
> This JIRA covers only adding a feedback capability to the Capacity
> Scheduler so that it can take feedback from external systems.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

