[ https://issues.apache.org/jira/browse/YARN-1021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769765#comment-13769765 ]
Alejandro Abdelnur commented on YARN-1021:
------------------------------------------
Wei, the patch looks good, though there are some nits that should be taken care of:
* remove LICENSE.txt/NOTICE.txt (Hadoop has those at its root)
* all the files in src/main/resources should not be there but in a
src/main/sample-conf dir, else they end up in the JAR and may be picked up
without the user knowing
* the src/test/data dir should be in src/main/data (as it ends up in the distro)
* the test running the simulation runs for 2 mins; do we need 2 mins, or can we
make it shorter, like 1 min or 30 secs?
* hadoop-tools-dist/pom.xml: the hadoop-sls entry should not have <version>. There
should be an entry for hadoop-sls with the version in the
hadoop-project/pom.xml dependencyManagement section.
* The slsrunner should set fs.defaultFS=file:/// in the conf used to start the
RM, else the RM will attempt to connect to HDFS.
> Yarn Scheduler Load Simulator
> -----------------------------
>
> Key: YARN-1021
> URL: https://issues.apache.org/jira/browse/YARN-1021
> Project: Hadoop YARN
> Issue Type: New Feature
> Components: scheduler
> Reporter: Wei Yan
> Assignee: Wei Yan
> Attachments: YARN-1021-demo.tar.gz, YARN-1021-images.tar.gz,
> YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch,
> YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch,
> YARN-1021.pdf
>
>
> The Yarn Scheduler is a fertile area of interest with different
> implementations, e.g., Fifo, Capacity and Fair schedulers. Meanwhile,
> several optimizations have also been made to improve scheduler performance
> for different scenarios and workloads. Each scheduler algorithm has its own
> set of features, and drives scheduling decisions by many factors, such as
> fairness, capacity guarantee, resource availability, etc. It is very
> important to evaluate a scheduler algorithm thoroughly before we deploy it in
> a production cluster. Unfortunately, it is currently non-trivial to evaluate
> a scheduling algorithm. Evaluating in a real cluster is always time-consuming
> and costly, and it is also very hard to find a large-enough cluster. Hence, a
> simulator which can predict how well a scheduler algorithm works for some
> specific workload would be quite useful.
> We want to build a Scheduler Load Simulator to simulate large-scale Yarn
> clusters and application loads on a single machine. This would be invaluable
> in furthering Yarn by providing a tool for researchers and developers to
> prototype new scheduler features and predict their behavior and performance
> with a reasonable amount of confidence, thereby aiding rapid innovation.
> The simulator will exercise the real Yarn ResourceManager removing the
> network factor by simulating NodeManagers and ApplicationMasters via handling
> and dispatching NM/AM heartbeat events from within the same JVM.
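
The idea of dispatching heartbeats from within one JVM can be illustrated with a minimal sketch. It is not taken from the patch: the class name HeartbeatSim, the "NM-i" labels, and the choice of a priority queue ordered by simulated time are all assumptions made for illustration, and the real simulator may schedule heartbeats differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical sketch: delivers simulated NodeManager heartbeats in simulated-time order. */
public class HeartbeatSim {
    /** A pending heartbeat, ordered by the simulated time at which it fires. */
    static final class Event implements Comparable<Event> {
        final long timeMs;
        final String source; // illustrative label such as "NM-0"

        Event(long timeMs, String source) {
            this.timeMs = timeMs;
            this.source = source;
        }

        @Override
        public int compareTo(Event other) {
            int byTime = Long.compare(timeMs, other.timeMs);
            // Break ties by source name so the delivery order is deterministic.
            return byTime != 0 ? byTime : source.compareTo(other.source);
        }
    }

    /** Simulates `nodes` NodeManagers, each heartbeating every intervalMs, up to endMs. */
    static List<String> run(int nodes, long intervalMs, long endMs) {
        PriorityQueue<Event> queue = new PriorityQueue<>();
        for (int i = 0; i < nodes; i++) {
            queue.add(new Event(intervalMs, "NM-" + i));
        }
        List<String> delivered = new ArrayList<>();
        while (!queue.isEmpty()) {
            Event e = queue.poll();
            if (e.timeMs > endMs) {
                break; // every remaining event is later than the end of the simulation
            }
            // A real simulator would call into the ResourceManager here (assumption).
            delivered.add(e.timeMs + ":" + e.source);
            // Schedule the next heartbeat from the same source.
            queue.add(new Event(e.timeMs + intervalMs, e.source));
        }
        return delivered;
    }

    public static void main(String[] args) {
        run(2, 1000, 3000).forEach(System.out::println);
    }
}
```

Because no network or wall-clock waiting is involved in this sketch, simulated time advances as fast as events can be processed.
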
> To keep track of scheduler behavior and performance, a scheduler wrapper
> will wrap the real scheduler.
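
A wrapper of this kind might look roughly like the following sketch. The Scheduler interface here is a hypothetical two-method stand-in, not the real YARN scheduler API, and the names TimedSchedulerSketch and TimedScheduler are invented for illustration. The wrapper delegates each call and accumulates a call count and elapsed time per operation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of a timing wrapper around a scheduler. */
public class TimedSchedulerSketch {
    /** Stand-in for the scheduler API; NOT the real YARN interface. */
    interface Scheduler {
        String allocate(String appId);
        void handle(String event);
    }

    /** Delegates to the real scheduler and records count and total nanoseconds per operation. */
    static final class TimedScheduler implements Scheduler {
        private final Scheduler delegate;
        final Map<String, Long> calls = new LinkedHashMap<>();
        final Map<String, Long> nanos = new LinkedHashMap<>();

        TimedScheduler(Scheduler delegate) {
            this.delegate = delegate;
        }

        private void record(String op, long startNanos) {
            calls.merge(op, 1L, Long::sum);
            nanos.merge(op, System.nanoTime() - startNanos, Long::sum);
        }

        @Override
        public String allocate(String appId) {
            long start = System.nanoTime();
            try {
                return delegate.allocate(appId);
            } finally {
                record("allocate", start); // recorded even if the delegate throws
            }
        }

        @Override
        public void handle(String event) {
            long start = System.nanoTime();
            try {
                delegate.handle(event);
            } finally {
                record("handle", start);
            }
        }
    }

    /** Wraps a trivial scheduler, then issues two allocate calls and one handle call. */
    static TimedScheduler demo() {
        Scheduler trivial = new Scheduler() {
            @Override public String allocate(String appId) { return "container-for-" + appId; }
            @Override public void handle(String event) { /* no-op */ }
        };
        TimedScheduler timed = new TimedScheduler(trivial);
        timed.allocate("app-1");
        timed.allocate("app-2");
        timed.handle("NODE_UPDATE");
        return timed;
    }

    public static void main(String[] args) {
        TimedScheduler timed = demo();
        timed.calls.forEach((op, n) ->
            System.out.println(op + ": " + n + " calls, " + timed.nanos.get(op) + " ns total"));
    }
}
```

Since the wrapper implements the same interface as the scheduler it wraps, callers do not need to know whether they hold the wrapped or the unwrapped scheduler.
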
> The simulator will produce real-time metrics while executing, including:
> * Resource usage for the whole cluster and each queue, which can be used to
> configure cluster and queue capacity.
> * The detailed application execution trace (recorded in relation to simulated
> time), which can be analyzed to understand/validate the scheduler behavior
> (individual job turnaround time, throughput, fairness, capacity guarantee,
> etc).
> * Several key metrics of the scheduler algorithm, such as the time cost of each
> scheduler operation (allocate, handle, etc), which can be used by Hadoop
> developers to find the code spots and scalability limits.
> The simulator will provide real-time charts showing the behavior of the
> scheduler and its performance.
> A short demo is available at http://www.youtube.com/watch?v=6thLi8q0qLE,
> showing how to use the simulator to simulate the Fair Scheduler and the
> Capacity Scheduler.