[ 
https://issues.apache.org/jira/browse/YARN-6680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16034834#comment-16034834
 ] 

Daryn Sharp commented on YARN-6680:
-----------------------------------

I use a profiler for performance work – hunches are inevitably wrong.  [~jlowe] 
and [~nroberts] verified the improvement.  2.8 is currently DOA, see 
[details|https://issues.apache.org/jira/browse/YARN-6679?focusedCommentId=16033655&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16033655].
  This patch, along with my others under the umbrella, increased overall 
performance by ~2X.  

The scheduler's fine-grained locking was a bad idea.  RW locks are not cheap, 
esp. for tiny critical sections.  Write barriers are extremely expensive - 
slower than a hash lookup.  Surprising but true.  Eventually these maps should 
be concurrent maps, which use no lock for read ops and whose memory read 
barriers are cheap.  The processor just sniffs the cache lines.
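
A minimal sketch of the two read paths being compared, assuming a simple 
node-to-label map.  The class, field, and method names are hypothetical and 
are not taken from the YARN code or the patch.  The locked variant pays for a 
read-lock acquire and release (each an atomic update of shared lock state) 
around a tiny critical section, while ConcurrentHashMap.get() takes no lock.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical illustration only; not the YARN label manager.
class LabelLookupSketch {
    // Variant 1: a plain HashMap guarded by a read/write lock.
    private static final Map<String, String> lockedMap = new HashMap<>();
    private static final ReentrantReadWriteLock rwLock = new ReentrantReadWriteLock();

    // Variant 2: a concurrent map whose get() does not lock.
    private static final Map<String, String> concurrentMap = new ConcurrentHashMap<>();

    // Writes go to both variants so the two read paths can be compared.
    static void put(String node, String label) {
        rwLock.writeLock().lock();
        try {
            lockedMap.put(node, label);
        } finally {
            rwLock.writeLock().unlock();
        }
        concurrentMap.put(node, label);
    }

    // Read path with lock acquire and release around a single hash lookup.
    static String lockedGet(String node) {
        rwLock.readLock().lock();
        try {
            return lockedMap.get(node);
        } finally {
            rwLock.readLock().unlock();
        }
    }

    // Read path with no explicit locking at all.
    static String concurrentGet(String node) {
        return concurrentMap.get(node);
    }

    public static void main(String[] args) {
        put("node1", "gpu");
        System.out.println(lockedGet("node1"));
        System.out.println(concurrentGet("node1"));
    }
}
```

Both reads return the same value; only the cost of the read path differs.  
Any actual speedup would have to be measured with a profiler on the real 
workload.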

bq. i feel the locks atleast in few places are required to maintain 
consistency. [...] read lock is required as intermittently node's partition 
mapping could be changed, or node can be deactivated etc... all the ops where 
write lock is held ?

The locks currently do not provide guaranteed consistency.  Example:
# Consistent:
#* thread1 read locks, gets resource, unlocks
#* thread2 write locks, updates resource
#* thread1 accesses resource – won't see thread2 update immediately
# Inconsistent:
#* thread1 write locks, updates resource
#* thread2 read locks, gets resource, unlocks, accesses resource – will see 
thread1 update

With my patch, the reader won't see the update in either case (unless it was 
also the writer).  The question is: does it matter?  It's already a race, 
since there is no coarse-grained lock to provide a point-in-time snapshot.  
Would possibly seeing slightly stale data have a detrimental impact?  If it 
does, then there's already a major bug in the code.
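
The first interleaving in the list above can be sketched as follows.  This is 
a single-threaded replay of the steps with hypothetical names and values 
(nothing here comes from the YARN code), assuming the writer replaces the 
stored value rather than mutating it in place.  It shows that the read lock 
covers only the lookup, not the later use of the value.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical illustration only; names and values are made up.
class StaleReadSketch {
    private final Map<String, Integer> memoryMbByNode = new HashMap<>();
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    // Writer: replaces the stored value under the write lock.
    void update(String node, int memoryMb) {
        lock.writeLock().lock();
        try {
            memoryMbByNode.put(node, memoryMb);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Reader: the lock is released before the caller uses the result.
    Integer lookup(String node) {
        lock.readLock().lock();
        try {
            return memoryMbByNode.get(node);
        } finally {
            lock.readLock().unlock();
        }
    }

    public static void main(String[] args) {
        StaleReadSketch s = new StaleReadSketch();
        s.update("node1", 4096);

        Integer seen = s.lookup("node1"); // "thread1": read lock, get, unlock
        s.update("node1", 8192);          // "thread2": write lock, update

        // "thread1" now uses a value that is already out of date.
        System.out.println(seen);              // prints 4096
        System.out.println(s.lookup("node1")); // prints 8192
    }
}
```

Holding the value consistent through its use would need a lock spanning both 
the lookup and the use, which the fine-grained locks do not provide.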

In the end, this patch contributed to making 2.8 actually deployable.

> Avoid locking overhead for NO_LABEL lookups
> -------------------------------------------
>
>                 Key: YARN-6680
>                 URL: https://issues.apache.org/jira/browse/YARN-6680
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>    Affects Versions: 2.8.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>         Attachments: YARN-6680.patch
>
>
> Labels are managed via a hash that is protected with a read lock.  The lock 
> acquire and release are each just as expensive as the hash lookup itself - 
> resulting in a 3X slowdown.


