[
https://issues.apache.org/jira/browse/YARN-7200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17260416#comment-17260416
]
Agshin Kazimli commented on YARN-7200:
--------------------------------------
Hi [~snemeth],
Thanks for the review.
I've looked into the points you raised above and would like to share my view
on the scenario you described.
As designed, SLS statically reads the job information from the JSON file and
creates AMs for these jobs right after starting the Resource Manager and Node
Managers. SLSRunner.startAM() (org.apache.hadoop.yarn.sls) is invoked to create
AMSimulators from the input traces (SLS, RUMEN or SYNTH) and add them to amMap,
which maps each job ID to its corresponding AMSimulator.
*The call hierarchy of AMSimulator creation from SLS trace*
{code:java}
(org.apache.hadoop.yarn.sls)
SLSRunner.startAM()
  SLSRunner.startAMFromSLSTrace(String inputTrace)
    SLSRunner.createAMForJob(Map jsonJob)
      SLSRunner.runNewAM(String jobType, String user, String jobQueue,
          String oldJobId, long jobStartTimeMS, long jobFinishTimeMS,
          List<ContainerSimulator> containerList,
          Resource amContainerResource, String labelExpr)
        SLSRunner.runNewAM(String jobType, String user, String jobQueue,
            String oldJobId, long jobStartTimeMS, long jobFinishTimeMS,
            List<ContainerSimulator> containerList,
            ReservationId reservationId, long deadline,
            Resource amContainerResource, String labelExpr,
            Map<String, String> params)
{code}
1. SLSRunner.startAM() invokes the corresponding function to create AMs from
the given input trace type, i.e. _SLS, RUMEN, SYNTH_.
2. SLSRunner.startAMFromSLSTrace() reads the input trace (JSON file) and
invokes SLSRunner.createAMForJob() for every job.
3. SLSRunner.createAMForJob() takes the jsonJob map and invokes
SLSRunner.runNewAM() as many times as the given job count.
4. SLSRunner.runNewAM() is called. There are 3 different SLSRunner.runNewAM()
overloads, because the _SLS, RUMEN, SYNTH_ traces differ slightly. One of them
is the base overload, which the other SLSRunner.runNewAM() overloads invoke.
5. In SLSRunner.runNewAM(), the AMSimulator is created and initialized with the
given parameters, including the heartbeatInterval argument. Then a new entry
(jobID, amSim) is added to amMap.
At the end of SLSRunner.startAM(), numAMs is set to amMap.size() and
remainingApps is set to numAMs:
{code:java}
numAMs = amMap.size();
remainingApps = numAMs;
{code}
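The flow above can be modeled in isolation. The following is only a minimal
sketch, not the actual SLSRunner code: AmCreationSketch, FakeAm, the job IDs
and the fixed heartbeat value are hypothetical and used purely for
illustration. It shows that when AMs are created and put into the map in a
plain loop on the calling thread, the map is already complete when numAMs and
remainingApps are assigned.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified model of the flow described above; NOT the real SLSRunner.
// AmCreationSketch and FakeAm are hypothetical names.
class AmCreationSketch {
  // Stand-in for AMSimulator: only carries the values set at creation time.
  static final class FakeAm {
    final String jobId;
    final long heartbeatIntervalMs;

    FakeAm(String jobId, long heartbeatIntervalMs) {
      this.jobId = jobId;
      this.heartbeatIntervalMs = heartbeatIntervalMs;
    }
  }

  final Map<String, FakeAm> amMap = new LinkedHashMap<>();
  int numAMs;
  int remainingApps;

  // Plays the role of runNewAM(): create the AM and register it in amMap.
  void runNewAM(String jobId, long heartbeatIntervalMs) {
    amMap.put(jobId, new FakeAm(jobId, heartbeatIntervalMs));
  }

  // Plays the role of startAM(): every job is handled on the calling thread,
  // so amMap is fully populated before the counters are assigned.
  void startAM(List<String> jobIds) {
    for (String jobId : jobIds) {
      runNewAM(jobId, 1000L); // assumed heartbeat interval, illustration only
    }
    numAMs = amMap.size();
    remainingApps = numAMs;
  }

  public static void main(String[] args) {
    AmCreationSketch sls = new AmCreationSketch();
    sls.startAM(Arrays.asList("job_1", "job_2", "job_3"));
    System.out.println(
        "numAMs=" + sls.numAMs + ", remainingApps=" + sls.remainingApps);
  }
}
```

Because nothing in this model hands the map population to another thread,
remainingApps cannot be read before all entries are present.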
My conclusion is that the creation of AMs is not bound to any other thread:
they are created from the static info, each job ID is mapped to its
AMSimulator, and remainingApps is assigned the size of this map. To support my
argument, I've added some LOG statements to check whether the AMs are created
and added to the map immediately. As expected, they are. In the scenario you
mentioned, AMSimulators can have different heartBeatInterval values and start
times, but that does not happen as part of this process. As described above,
SLSRunner.runNewAM() initializes the AMSimulators, which extend
TaskRunner.Task, which in turn implements the Runnable interface. The mapping
of these AMSimulators, however, happens on the same thread.
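The distinction between creating and mapping a Runnable and actually running
it can be illustrated with a small standalone sketch. It does not use the SLS
classes: MappingThreadSketch, FakeAmTask and the delays are hypothetical, and a
plain ScheduledExecutorService stands in for the task runner as an assumption.
Each task records the thread that created it and the thread that later ran it.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Standalone illustration; NOT the real SLSRunner or TaskRunner.
class MappingThreadSketch {
  // Stand-in for an AMSimulator-like task that implements Runnable.
  static final class FakeAmTask implements Runnable {
    final String jobId;
    // Captured in the constructor, i.e. on the thread that creates the task.
    final String createdOn = Thread.currentThread().getName();
    // Set in run(), i.e. on the thread that executes the task.
    volatile String ranOn;

    FakeAmTask(String jobId) {
      this.jobId = jobId;
    }

    @Override
    public void run() {
      ranOn = Thread.currentThread().getName();
    }
  }

  static Map<String, FakeAmTask> createAndRun(long[] startDelaysMs) {
    Map<String, FakeAmTask> amMap = new LinkedHashMap<>();
    // Creation and mapping: a plain loop on the calling thread.
    for (int i = 0; i < startDelaysMs.length; i++) {
      FakeAmTask task = new FakeAmTask("job_" + i);
      amMap.put(task.jobId, task);
    }
    // Execution: handed to a pool afterwards, each task with its own delay.
    ScheduledExecutorService pool = Executors.newScheduledThreadPool(2);
    int i = 0;
    for (FakeAmTask task : amMap.values()) {
      pool.schedule(task, startDelaysMs[i++], TimeUnit.MILLISECONDS);
    }
    pool.shutdown(); // already scheduled delayed tasks still run by default
    try {
      pool.awaitTermination(5, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return amMap;
  }

  public static void main(String[] args) {
    for (FakeAmTask t : createAndRun(new long[] {10L, 50L}).values()) {
      System.out.println(
          t.jobId + " created on " + t.createdOn + ", ran on " + t.ranOn);
    }
  }
}
```

In this sketch the map size is fixed before any task starts, regardless of how
the individual start delays differ.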
> SLS generates a realtimetrack.json file but that file is missing the closing
> ']'
> --------------------------------------------------------------------------------
>
> Key: YARN-7200
> URL: https://issues.apache.org/jira/browse/YARN-7200
> Project: Hadoop YARN
> Issue Type: Bug
> Components: scheduler-load-simulator
> Reporter: Grant Sohn
> Assignee: Agshin Kazimli
> Priority: Minor
> Labels: newbie, newbie++
> Attachments: YARN-7200-branch-trunk.patch, YARN-7200.002.patch,
> YARN-7200.003.patch, snemeth-testing-20201113.zip
>
>
> File
> hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/SchedulerMetrics.java
> shows:
> {noformat}
> void tearDown() throws Exception {
> if (metricsLogBW != null) {
> metricsLogBW.write("]");
> metricsLogBW.close();
> }
> ....
> {noformat}
> So the exit logic is flawed.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)