[
https://issues.apache.org/jira/browse/YARN-10427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17237866#comment-17237866
]
Drew Merrill commented on YARN-10427:
-------------------------------------
Yes, it does. Output attached. [^jobruntime.csv]
> Duplicate Job IDs in SLS output
> -------------------------------
>
> Key: YARN-10427
> URL: https://issues.apache.org/jira/browse/YARN-10427
> Project: Hadoop YARN
> Issue Type: Bug
> Components: scheduler-load-simulator
> Affects Versions: 3.0.0, 3.3.0, 3.2.1, 3.4.0
> Environment: I ran the attached inputs on my MacBook Pro, using
> Hadoop compiled from the latest trunk (as of commit 139a43e98e). I also
> tested against 3.2.1 and 3.3.0 release branches.
>
> Reporter: Drew Merrill
> Priority: Major
> Attachments: fair-scheduler.xml, inputsls.json, jobruntime.csv,
> jobruntime.csv, jobruntime.csv, mapred-site.xml, sls-runner.xml, yarn-site.xml
>
>
> Hello, I'm hoping someone can help me resolve or understand some issues I've
> been having with the YARN Scheduler Load Simulator (SLS). I've been
> experimenting with SLS for several months now at work as we're trying to
> build a simulation model to characterize our enterprise Hadoop infrastructure
> for purposes of future capacity planning. In the process of attempting to
> verify and validate the SLS output, I've encountered a number of issues
> including runtime exceptions and bad output. The focus of this issue is the
> bad output. In all my simulation runs, the jobruntime.csv output seems to
> have one or more of the following problems: no output, duplicate job IDs,
> and/or missing job IDs.
>
> Because of where I work, I'm unable to provide the exact inputs I typically
> use, but I'm able to reproduce the problem of the duplicate job IDs using
> some simplified inputs and configuration files, which I've attached, along
> with the output I obtained.
>
> The command I used to run the simulation:
> {{./runsls.sh --tracetype=SLS --tracelocation=./inputsls.json
> --output-dir=sls-run-1 --print-simulation
> --track-jobs=job_1,job_2,job_3,job_4,job_5,job_6,job_7,job_8,job_9,job_10}}
>
> Can anyone help me understand what would cause the duplicate Job IDs in the
> output? Is this a bug in Hadoop or a problem with my inputs? Thanks in
> advance.
>
> PS: This is the first issue I've ever opened, so please be kind if I've missed
> something or am not understanding something obvious about the way Hadoop
> works. I'll gladly follow up with more info as requested.
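
For anyone trying to confirm the symptoms described above (duplicate or missing job IDs in jobruntime.csv), a minimal sketch of a checker follows. It is not part of SLS. It assumes the file is comma-separated, has a single header row, and carries the job ID in the first column; the function names and the example path are hypothetical, so adjust them to match your actual output.

```python
import csv
from collections import Counter
from typing import Dict, Iterable, List, Sequence, Tuple


def check_job_ids(
    rows: Iterable[Sequence[str]], tracked: Iterable[str]
) -> Tuple[Dict[str, int], List[str]]:
    """Return (duplicates, missing) based on the first column of each row.

    duplicates maps a job ID to its row count when that count exceeds 1.
    missing lists tracked job IDs that never appear in the rows.
    """
    # Assumption: the job ID is the first field of every data row.
    counts = Counter(row[0].strip() for row in rows if row and row[0].strip())
    duplicates = {job: n for job, n in counts.items() if n > 1}
    missing = [job for job in tracked if job not in counts]
    return duplicates, missing


def check_file(path: str, tracked: Iterable[str]) -> Tuple[Dict[str, int], List[str]]:
    """Run check_job_ids on a CSV file, skipping one header row."""
    with open(path, newline="") as handle:
        reader = csv.reader(handle)
        next(reader, None)  # assumption: exactly one header row
        return check_job_ids(reader, tracked)
```

Usage would look something like `check_file("sls-run-1/jobruntime.csv", [f"job_{i}" for i in range(1, 11)])`, where the path is a guess at where the output lands under the --output-dir used in the command above and the tracked list mirrors the --track-jobs value.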