[ 
https://issues.apache.org/jira/browse/YARN-7200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17260416#comment-17260416
 ] 

Agshin Kazimli commented on YARN-7200:
--------------------------------------

Hi [~snemeth],
 Thanks for the review.

I've investigated the points you described above and would like to share my 
view of the scenario.

As designed, SLS reads the job information statically from the JSON file and 
creates AMs for these jobs right after starting the Resource Manager and Node 
Managers. SLSRunner.startAM() (org.apache.hadoop.yarn.sls) is invoked to create 
AMSimulators from the input traces (SLS, RUMEN or SYNTH) and add them to amMap, 
which maps each job ID to its corresponding AMSimulator.

*The call hierarchy of AMSimulator creation from SLS trace*
{code:java}
(org.apache.hadoop.yarn.sls)

SLSRunner.startAM()
   SLSRunner.startAMFromSLSTrace(String inputTrace)
      SLSRunner.createAMForJob(Map jsonJob)
         SLSRunner.runNewAM(String jobType, String user,
                           String jobQueue, String oldJobId,
                           long jobStartTimeMS, long jobFinishTimeMS,
                           List<ContainerSimulator> containerList,
                           Resource amContainerResource, String labelExpr)
            SLSRunner.runNewAM(String jobType, String user,
                           String jobQueue, String oldJobId,
                           long jobStartTimeMS, long jobFinishTimeMS,
                           List<ContainerSimulator> containerList,
                           ReservationId reservationId, long deadline,
                           Resource amContainerResource,
                           String labelExpr, Map<String, String> params)

{code}

1. SLSRunner.startAM() invokes the corresponding function to create AMs from 
the given input trace, i.e. _SLS, RUMEN, SYNTH_.
2. SLSRunner.startAMFromSLSTrace() reads the input trace (JSON file) and invokes 
SLSRunner.createAMForJob() for every job.
3. SLSRunner.createAMForJob() takes the job's JSON map and, for the given job 
count, invokes SLSRunner.runNewAM().
4. SLSRunner.runNewAM() is called. There are 3 different SLSRunner.runNewAM() 
overloads because the _SLS, RUMEN, SYNTH_ traces differ slightly. One of them 
is the base version, which the other SLSRunner.runNewAM() overloads invoke.
5. In SLSRunner.runNewAM(), the AMSimulator is created and initialized with the 
given parameters, including the heartbeatInterval argument. Then a new 
(jobID, amSim) entry is added to amMap.
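
To illustrate steps 4 and 5 together with the bookkeeping at the end of 
startAM(), here is a minimal sketch. It is not the actual SLS code: SketchAm 
and SketchRunner are hypothetical stand-ins, and the parameter lists are 
reduced to a few fields for illustration only.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for AMSimulator (not the real SLS class).
class SketchAm {
  final String jobId;
  final long heartbeatIntervalMs;
  final String labelExpr;

  SketchAm(String jobId, long heartbeatIntervalMs, String labelExpr) {
    this.jobId = jobId;
    this.heartbeatIntervalMs = heartbeatIntervalMs;
    this.labelExpr = labelExpr;
  }
}

// Hypothetical stand-in for SLSRunner, reduced to the parts discussed above.
class SketchRunner {
  final Map<String, SketchAm> amMap = new LinkedHashMap<>();
  int numAMs;
  int remainingApps;

  // Shorter overload: delegates to the base overload with a default value.
  void runNewAM(String jobId, long heartbeatIntervalMs) {
    runNewAM(jobId, heartbeatIntervalMs, null);
  }

  // Base overload: creates the simulator and registers it in amMap.
  void runNewAM(String jobId, long heartbeatIntervalMs, String labelExpr) {
    SketchAm amSim = new SketchAm(jobId, heartbeatIntervalMs, labelExpr);
    amMap.put(jobId, amSim);
  }

  // Creates one AM per job on the calling thread, then records the counts.
  void startAM(List<String> jobIds, long heartbeatIntervalMs) {
    for (String jobId : jobIds) {
      runNewAM(jobId, heartbeatIntervalMs);
    }
    numAMs = amMap.size();
    remainingApps = numAMs;
  }
}
```

In this sketch, by the time startAM() returns, amMap already holds one entry 
per job, so numAMs and remainingApps both equal the number of jobs.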

At the end of SLSRunner.startAM(), numAMs is set to amMap.size() and 
remainingApps is set to numAMs:

{code:java}
numAMs = amMap.size();
remainingApps = numAMs;
{code}

My conclusion is that the creation of AMs is not bound to any other thread: 
they are created directly from the static info, each job ID is mapped to its 
AMSimulator, and remainingApps is set to the size of this map. To support my 
argument, I added some LOG statements to check whether the AMs are created and 
added to the map immediately. As expected, they are. In the scenario you 
mentioned, AMSimulators can have different heartBeatInterval and start times, 
but that does not happen as part of the same process. As I described above, 
SLSRunner.runNewAM() initializes the AMSimulators, which extend 
TaskRunner.Task, which itself implements the Runnable interface. The mapping of 
these AMSimulators, however, happens on the same thread.
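
The distinction between registering a simulator and running it can be sketched 
as follows. This is only an illustration under assumptions, not SLS code: 
RegistrationThreadSketch and its fields are hypothetical, and a plain 
ScheduledExecutorService stands in for the SLS task runner.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: registration happens on the caller's thread,
// execution happens later on pool threads.
class RegistrationThreadSketch {
  // jobId -> name of the thread that registered the job
  final Map<String, String> registeredOn = new ConcurrentHashMap<>();
  // jobId -> name of the thread that actually ran the job's task
  final Map<String, String> ranOn = new ConcurrentHashMap<>();

  // Returns the size of the registration map right after the creation loop.
  int start(int jobs) throws InterruptedException {
    ScheduledExecutorService pool = Executors.newScheduledThreadPool(2);
    for (int i = 0; i < jobs; i++) {
      final String jobId = "job_" + i;
      // Each task has its own start delay, like differing AM start times.
      Runnable sim = () -> ranOn.put(jobId, Thread.currentThread().getName());
      registeredOn.put(jobId, Thread.currentThread().getName());
      pool.schedule(sim, 10L * i, TimeUnit.MILLISECONDS);
    }
    // The map is complete here, whether or not any task has run yet.
    int sizeAfterLoop = registeredOn.size();
    pool.shutdown(); // already scheduled delayed tasks still run by default
    pool.awaitTermination(5, TimeUnit.SECONDS);
    return sizeAfterLoop;
  }
}
```

Here the registration map reaches its final size as soon as the loop ends, 
regardless of the individual start delays, because only the calling thread 
writes to it.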


> SLS generates a realtimetrack.json file but that file is missing the closing 
> ']'
> --------------------------------------------------------------------------------
>
>                 Key: YARN-7200
>                 URL: https://issues.apache.org/jira/browse/YARN-7200
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: scheduler-load-simulator
>            Reporter: Grant Sohn
>            Assignee: Agshin Kazimli
>            Priority: Minor
>              Labels: newbie, newbie++
>         Attachments: YARN-7200-branch-trunk.patch, YARN-7200.002.patch, 
> YARN-7200.003.patch, snemeth-testing-20201113.zip
>
>
> File 
> hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/SchedulerMetrics.java
>  shows:
> {noformat}
>   void tearDown() throws Exception {
>     if (metricsLogBW != null)  {
>       metricsLogBW.write("]");
>       metricsLogBW.close();
>     }
>     ....
> {noformat}
> So the exit logic is flawed.


