[
https://issues.apache.org/jira/browse/YARN-6680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16034834#comment-16034834
]
Daryn Sharp commented on YARN-6680:
-----------------------------------
I use a profiler for performance work – hunches are inevitably wrong. [~jlowe]
and [~nroberts] verified the improvement. 2.8 is currently DOA, see
[details|https://issues.apache.org/jira/browse/YARN-6679?focusedCommentId=16033655&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16033655].
This patch, along with my others under the umbrella, increased overall
performance by ~2X.
The scheduler's fine-grained locking was a bad idea. RW locks are not cheap, esp.
for tiny critical sections. Write barriers are extremely expensive - slower
than a hash lookup. Surprising but true. Eventually these maps should be
concurrent maps, which use no lock for read ops and need only cheap memory
read barriers. The processor just sniffs the cache lines.
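To make the comparison above concrete, here is a minimal sketch of the two read paths. It is not the code from the patch: LabelLookupSketch, LockedLabels and ConcurrentLabels are hypothetical names, and plain strings stand in for the real label objects. The only JDK behavior relied on is that ConcurrentHashMap.get does not acquire a lock.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class LabelLookupSketch {

    // Hypothetical lock-guarded table: every read pays for a read-lock
    // acquire and release around a single hash lookup.
    public static final class LockedLabels {
        private final Map<String, String> labels = new HashMap<>();
        private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

        public void put(String node, String label) {
            lock.writeLock().lock();
            try {
                labels.put(node, label);
            } finally {
                lock.writeLock().unlock();
            }
        }

        public String get(String node) {
            lock.readLock().lock();
            try {
                return labels.get(node);
            } finally {
                lock.readLock().unlock();
            }
        }
    }

    // Hypothetical concurrent-map table: reads take no lock at all.
    public static final class ConcurrentLabels {
        private final ConcurrentHashMap<String, String> labels = new ConcurrentHashMap<>();

        public void put(String node, String label) {
            labels.put(node, label);
        }

        public String get(String node) {
            return labels.get(node);
        }
    }

    public static void main(String[] args) {
        LockedLabels locked = new LockedLabels();
        ConcurrentLabels concurrent = new ConcurrentLabels();
        locked.put("node1", "gpu");
        concurrent.put("node1", "gpu");
        // Both return the same answer; only the cost of the read path differs.
        System.out.println(locked.get("node1").equals(concurrent.get("node1")));
    }
}
```

Both variants return the same result for the same key. Any performance difference should be measured with a profiler on the real workload rather than inferred from this sketch.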
bq. i feel the locks atleast in few places are required to maintain
consistency. [...] read lock is required as intermittently node's partition
mapping could be changed, or node can be deactivated etc... all the ops where
write lock is held ?
The locks currently do not provide guaranteed consistency. Example:
# Consistent:
#* thread1 read locks, gets resource, unlocks
#* thread2 write locks, updates resource
#* thread1 accesses resource – won't see thread2 update immediately
# Inconsistent:
#* thread1 write locks, updates resource
#* thread2 read locks, gets resource, unlocks, accesses resource – will see
thread1 update
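The first interleaving above can be replayed deterministically in a single thread. This is only an illustrative sketch: StaleReadSketch, lookup, update and replay are hypothetical names, and it assumes an update replaces the value stored in the map rather than mutating a shared object in place. It shows that the read lock covers only the lookup, not the later use of the value.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class StaleReadSketch {
    private final Map<String, String> partitions = new HashMap<>();
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    // Reader path: the lock is held only for the duration of the lookup.
    public String lookup(String node) {
        lock.readLock().lock();
        try {
            return partitions.get(node);
        } finally {
            lock.readLock().unlock();
        }
    }

    // Writer path: replaces the value stored for the node.
    public void update(String node, String partition) {
        lock.writeLock().lock();
        try {
            partitions.put(node, partition);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Replays the steps of the first interleaving in order.
    // Returns { value the reader is still holding, value now in the map }.
    public static String[] replay() {
        StaleReadSketch state = new StaleReadSketch();
        state.update("node1", "A");

        String heldByReader = state.lookup("node1"); // thread1: read lock, get, unlock
        state.update("node1", "B");                  // thread2: write lock, update
        String current = state.lookup("node1");      // what a fresh lookup returns

        // thread1 goes on to use heldByReader, which is already stale.
        return new String[] { heldByReader, current };
    }

    public static void main(String[] args) {
        String[] result = replay();
        System.out.println("reader holds " + result[0] + ", map holds " + result[1]);
    }
}
```

Under these assumptions the reader keeps working with the old value even though every map access was correctly locked.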
With my patch, the reader won't see the update in either case (unless it was
also the writer). The question is: does it matter? It's already a race because
there is no coarse-grained lock to provide a point-in-time snapshot view. Is
there any detrimental impact from possibly seeing slightly stale data? If
there is, then there's already a major bug in the code.
In the end, this patch contributed to making 2.8 actually deployable.
> Avoid locking overhead for NO_LABEL lookups
> -------------------------------------------
>
> Key: YARN-6680
> URL: https://issues.apache.org/jira/browse/YARN-6680
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: resourcemanager
> Affects Versions: 2.8.0
> Reporter: Daryn Sharp
> Assignee: Daryn Sharp
> Attachments: YARN-6680.patch
>
>
> Labels are managed via a hash that is protected with a read lock. The lock
> acquire and release are each just as expensive as the hash lookup itself -
> resulting in a 3X slowdown.