[ https://issues.apache.org/jira/browse/YARN-1021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14281276#comment-14281276 ]
Fabio Colzada commented on YARN-1021:
-------------------------------------
Hi, I am working with SLS on Hadoop 2.6.0. I really need it, but I'm struggling
to get it running properly. The simulation completes on the command line and I
can see the log entries on screen, but:
1- The web interface is not working. I see this exception on screen near the
beginning of the run:
java.lang.NullPointerException
    at org.apache.hadoop.yarn.sls.web.SLSWebApp.<init>(SLSWebApp.java:86)
    at org.apache.hadoop.yarn.sls.scheduler.ResourceSchedulerWrapper.initMetrics(ResourceSchedulerWrapper.java:477)
    at org.apache.hadoop.yarn.sls.scheduler.ResourceSchedulerWrapper.setConf(ResourceSchedulerWrapper.java:176)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createScheduler(ResourceManager.java:291)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:484)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:989)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:255)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.yarn.sls.SLSRunner.startRM(SLSRunner.java:167)
    at org.apache.hadoop.yarn.sls.SLSRunner.start(SLSRunner.java:141)
    at org.apache.hadoop.yarn.sls.SLSRunner.main(SLSRunner.java:528)
I'm not sure which object is null, but the folder sls/html contains the expected
files.
2- I don't get the files realtimetrack.json or jobruntime.csv, although the
metrics folder is correctly populated. I see some recurring exceptions; I don't
know if they are related, since they don't prevent the simulation from
terminating:
java.lang.NullPointerException
    at org.apache.hadoop.yarn.sls.scheduler.ResourceSchedulerWrapper.addAMRuntime(ResourceSchedulerWrapper.java:735)
    at org.apache.hadoop.yarn.sls.appmaster.AMSimulator.lastStep(AMSimulator.java:193)
    at org.apache.hadoop.yarn.sls.appmaster.MRAMSimulator.lastStep(MRAMSimulator.java:396)
    at org.apache.hadoop.yarn.sls.scheduler.TaskRunner$Task.run(TaskRunner.java:100)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
and also
Exception in thread "pool-5-thread-374" java.lang.NullPointerException
    at org.apache.hadoop.yarn.sls.scheduler.TaskRunner$Task.run(TaskRunner.java:104)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Any help would be really appreciated.
> Yarn Scheduler Load Simulator
> -----------------------------
>
> Key: YARN-1021
> URL: https://issues.apache.org/jira/browse/YARN-1021
> Project: Hadoop YARN
> Issue Type: New Feature
> Components: scheduler
> Reporter: Wei Yan
> Assignee: Wei Yan
> Fix For: 2.3.0
>
> Attachments: YARN-1021-demo.tar.gz, YARN-1021-images.tar.gz,
> YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch,
> YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch,
> YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch,
> YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.pdf
>
>
> The Yarn Scheduler is a fertile area of interest with different
> implementations, e.g., Fifo, Capacity and Fair schedulers. Meanwhile,
> several optimizations have also been made to improve scheduler performance
> for different scenarios and workloads. Each scheduler algorithm has its own
> set of features, and drives scheduling decisions by many factors, such as
> fairness, capacity guarantee, resource availability, etc. It is very
> important to evaluate a scheduler algorithm thoroughly before we deploy it in
> a production cluster. Unfortunately, it is currently non-trivial to evaluate
> a scheduling algorithm. Evaluating in a real cluster is always costly in time
> and money, and it is also very hard to find a large-enough cluster. Hence, a
> simulator which can predict how well a scheduler algorithm performs for some
> specific workload would be quite useful.
> We want to build a Scheduler Load Simulator to simulate large-scale Yarn
> clusters and application loads on a single machine. This would be invaluable
> in furthering Yarn by providing a tool for researchers and developers to
> prototype new scheduler features and predict their behavior and performance
> with a reasonable amount of confidence, thereby aiding rapid innovation.
> The simulator will exercise the real Yarn ResourceManager, removing the
> network factor by simulating NodeManagers and ApplicationMasters via handling
> and dispatching NM/AM heartbeat events from within the same JVM.
> To keep track of scheduler behavior and performance, a scheduler wrapper
> will wrap the real scheduler.
> The simulator will produce real-time metrics while executing, including:
> * Resource usage for the whole cluster and each queue, which can be used to
> configure cluster and queue capacities.
> * The detailed application execution trace (recorded in relation to simulated
> time), which can be analyzed to understand/validate the scheduler behavior
> (individual job turnaround time, throughput, fairness, capacity guarantee,
> etc).
> * Several key metrics of scheduler algorithm, such as time cost of each
> scheduler operation (allocate, handle, etc), which can be utilized by Hadoop
> developers to find the code spots and scalability limits.
> The simulator will provide real-time charts showing the behavior of the
> scheduler and its performance.
> A short demo is available at http://www.youtube.com/watch?v=6thLi8q0qLE,
> showing how to use the simulator to simulate the Fair Scheduler and the
> Capacity Scheduler.
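
The scheduler wrapper idea from the description above could look roughly like
the sketch below. This is not the actual SLS ResourceSchedulerWrapper:
SimpleScheduler, GrantAllScheduler and TimingSchedulerWrapper are hypothetical
names made up for illustration, and the real YARN scheduler interface is much
larger. It only shows how a wrapper can delegate to a real scheduler while
recording the time cost of each operation (allocate, handle).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical stand-in for a scheduler; NOT the YARN ResourceScheduler API.
interface SimpleScheduler {
    int allocate(String appId, int containers);
    void handle(String event);
}

// Trivial scheduler that grants every request; exists only to make the sketch runnable.
class GrantAllScheduler implements SimpleScheduler {
    public int allocate(String appId, int containers) { return containers; }
    public void handle(String event) { /* nothing to do in this sketch */ }
}

// Wraps a scheduler and records, per operation, the call count and total elapsed time.
// Not thread-safe; a real wrapper would need synchronization or concurrent counters.
class TimingSchedulerWrapper implements SimpleScheduler {
    private final SimpleScheduler delegate;
    // operation name -> {number of calls, total nanoseconds}
    private final Map<String, long[]> stats = new LinkedHashMap<>();

    TimingSchedulerWrapper(SimpleScheduler delegate) {
        this.delegate = delegate;
    }

    // Runs the call against the wrapped scheduler and updates the stats even if it throws.
    private <T> T timed(String op, Supplier<T> call) {
        long start = System.nanoTime();
        try {
            return call.get();
        } finally {
            long[] s = stats.computeIfAbsent(op, k -> new long[2]);
            s[0]++;
            s[1] += System.nanoTime() - start;
        }
    }

    public int allocate(String appId, int containers) {
        Integer granted = timed("allocate", () -> delegate.allocate(appId, containers));
        return granted;
    }

    public void handle(String event) {
        timed("handle", () -> {
            delegate.handle(event);
            return null;
        });
    }

    long calls(String op) {
        long[] s = stats.get(op);
        return s == null ? 0L : s[0];
    }

    long totalNanos(String op) {
        long[] s = stats.get(op);
        return s == null ? 0L : s[1];
    }
}
```

Wrapping a scheduler this way leaves its decisions unchanged: callers talk to
the wrapper exactly as they would to the scheduler itself, and the per-operation
counters can be read afterwards via calls("allocate") or totalNanos("handle").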
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)