[
https://issues.apache.org/jira/browse/YARN-5479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
He Tianyi reassigned YARN-5479:
-------------------------------
Assignee: He Tianyi
> FairScheduler: Scheduling performance improvement
> -------------------------------------------------
>
> Key: YARN-5479
> URL: https://issues.apache.org/jira/browse/YARN-5479
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: fairscheduler, resourcemanager
> Affects Versions: 2.6.0
> Reporter: He Tianyi
> Assignee: He Tianyi
>
> Currently the ResourceManager uses a single thread to handle async events
> for scheduling. As the number of nodes grows, more events need to be
> processed in time in FairScheduler. Also, a larger number of applications &
> queues slows down the processing of each individual event.
> There are two cases in which slow processing of nodeUpdate events is
> problematic:
> A. Global throughput (events processed per heartbeat round) is lower than
> the number of nodes. Resources then go unallocated because of this
> inefficiency.
> B. Global throughput meets the need, but in some rounds the events of some
> nodes cannot be processed before their next heartbeat. This makes handling
> of burst requests inefficient (e.g. a newly submitted MapReduce application
> cannot get all of its tasks launched quickly even when enough resources are
> available).
> I am fairly sure some people will run into this problem once a single
> cluster is scaled to several thousand nodes (even with {{assignmultiple}}
> enabled).
> This issue proposes several performance optimizations in the FairScheduler
> {{nodeUpdate}} method. To be specific:
> A. Trading fairness for efficiency, queue & app sorting can be skipped
> (or should this be called 'delayed sorting'?). We can either start a
> dedicated thread to do the sorting & updating, or re-sort only after the
> current result has been used several times (say, sort once in every 100
> calls).
> B. Performing calculations on {{Resource}} instances is expensive, since at
> least 2 objects ({{ResourceImpl}} and its proto builder) are created each
> time (when using the 'immutable' APIs). The overhead can be eliminated with a
> lightweight implementation of Resource that does not instantiate a builder
> until necessary, because most instances are used as intermediate results in
> the scheduler instead of being exchanged via IPC. Also, {{createResource}}
> uses reflection, which can be replaced by a plain {{new}} (for scheduler
> usage only). Furthermore, perhaps we could 'intern' resources to avoid
> allocation.
> C. Other minor changes, such as moving the {{updateRootMetrics}} call to
> {{update}}, which makes root queue metrics eventually consistent (this may
> satisfy most needs), or introducing counters in {{getResourceUsage}} so that
> resource usage is updated incrementally instead of being recalculated each
> time.
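The 'sort once in every 100 calls' idea from item A could be sketched roughly as follows. This is a standalone illustration under stated assumptions, not FairScheduler code: the class `DelayedSorter`, its `resortInterval` parameter, and the size-based staleness check are all hypothetical names and choices made for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/**
 * Hypothetical helper (not part of FairScheduler): caches a sorted copy of
 * the schedulable entities and re-sorts only once every resortInterval calls.
 */
final class DelayedSorter<T> {
    private final Comparator<? super T> comparator;
    private final int resortInterval;
    private List<T> cached = new ArrayList<>();
    private int callsSinceSort;
    private int sortCount;

    DelayedSorter(Comparator<? super T> comparator, int resortInterval) {
        if (resortInterval < 1) {
            throw new IllegalArgumentException("resortInterval must be >= 1");
        }
        this.comparator = comparator;
        this.resortInterval = resortInterval;
        // Start at the threshold so that the very first call sorts.
        this.callsSinceSort = resortInterval;
    }

    /**
     * Returns the cached ordering. A fresh sort happens only when the
     * interval has elapsed or the number of entities has changed (a crude,
     * assumed guard against returning a list with stale membership).
     */
    List<T> sorted(List<T> current) {
        if (callsSinceSort >= resortInterval || cached.size() != current.size()) {
            cached = new ArrayList<>(current);
            cached.sort(comparator);
            callsSinceSort = 0;
            sortCount++;
        }
        callsSinceSort++;
        return cached;
    }

    /** Number of real sorts performed so far (for inspection only). */
    int sortCount() {
        return sortCount;
    }
}
```

Between sorts the ordering may be stale, which is exactly the fairness-for-efficiency trade-off described in item A. A real implementation would also need to decide how to handle entities that are added or removed without changing the list size.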
> With A and B, I was looking at a 4x improvement in a cluster with 2K nodes.
> Suggestions? Opinions?
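As a rough illustration of items B and C, the sketch below pairs a plain-field resource value (no proto builder, no reflection) with a usage counter that is adjusted on each allocation or release instead of being recomputed. Both `LightResource` and `QueueUsage`, and all of their methods, are hypothetical names invented for this example; they are not Hadoop's `Resource` or `ResourceImpl` API.

```java
/**
 * Hypothetical plain-field resource used only inside the scheduler.
 * Arithmetic mutates the receiver, so no intermediate objects are created.
 */
final class LightResource {
    long memoryMb;
    int vcores;

    LightResource(long memoryMb, int vcores) {
        this.memoryMb = memoryMb;
        this.vcores = vcores;
    }

    /** Adds other into this instance in place and returns this. */
    LightResource add(LightResource other) {
        memoryMb += other.memoryMb;
        vcores += other.vcores;
        return this;
    }

    /** Subtracts other from this instance in place and returns this. */
    LightResource subtract(LightResource other) {
        memoryMb -= other.memoryMb;
        vcores -= other.vcores;
        return this;
    }
}

/**
 * Hypothetical per-queue counter: usage is maintained incrementally, so
 * reading it is O(1) rather than a walk over all running applications.
 */
final class QueueUsage {
    private final LightResource usage = new LightResource(0, 0);

    void containerAllocated(LightResource container) {
        usage.add(container);
    }

    void containerReleased(LightResource container) {
        usage.subtract(container);
    }

    long usedMemoryMb() {
        return usage.memoryMb;
    }

    int usedVcores() {
        return usage.vcores;
    }
}
```

The sketch is not thread-safe. Since the description above says scheduling events are handled on a single thread, that may be acceptable there, but any reader on another thread would need synchronization or a snapshot copy.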