[ 
https://issues.apache.org/jira/browse/YARN-5393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15419470#comment-15419470
 ] 

Jason Lowe commented on YARN-5393:
----------------------------------

A lot of this bloat is from over-sleeping.  It's sad to watch the CPU 
utilization during unit tests because it's often atrociously low.  Some crude 
manual sampling with jstack while the unit tests were running showed many 
tests waiting in one of the following methods:
- MockRM.waitForState
- MockAM.waitForState
- RMStateStoreTestBase.waitNotify

I'm sure I missed a lot of other notorious places, but the common theme is 
sleeping for far too long.  One second is a really, really long time for a 
modern CPU, and if tests hammer on methods that do this (and many tests hammer 
waitForState methods) then the seconds quickly start to pile up into minutes.  
I'd love to see most of the sleeps above 10 milliseconds reduced, since I think 
that's where a lot of our runtime is going.

I noticed that YARN-2921 put in a minimum sleep time because reportedly some 
unit tests were failing without it.  That minimum for a few tests is killing 
the performance on a lot of tests.  We need to track down those racy tests and 
fix them rather than forcing a long sleep time for any test that calls those 
waitForState methods.  The longer sleep isn't a real fix for those racy tests 
anyway; it just reduces the frequency at which they fail.
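
To make the cost concrete, here is a minimal sketch of a polling wait with a 
configurable short interval.  This is not the actual MockRM/MockAM code: the 
class name PollingWait, the method waitFor, and its parameters are all 
hypothetical, chosen only for illustration.  Each call overshoots by at most 
one poll interval once the condition turns true, so a test that makes 100 such 
calls can waste up to about 100 seconds with a 1 second sleep, but only about 
1 second with a 10 millisecond sleep.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

/** Hypothetical helper; not part of Hadoop. */
public final class PollingWait {
  private PollingWait() {}

  /**
   * Polls the condition until it holds or the timeout expires.
   *
   * @return true if the condition became true, false on timeout
   */
  public static boolean waitFor(BooleanSupplier condition,
                                long pollIntervalMs,
                                long timeoutMs) {
    final long deadline =
        System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
    // Check first, sleep second: a condition that already holds costs nothing.
    while (!condition.getAsBoolean()) {
      if (System.nanoTime() - deadline >= 0) {
        return false;  // timed out; the caller decides how to fail the test
      }
      try {
        // A short interval bounds the wasted time per call.
        Thread.sleep(pollIntervalMs);
      } catch (InterruptedException e) {
        // Restore the interrupt flag and surface the problem to the caller.
        Thread.currentThread().interrupt();
        throw new IllegalStateException("Interrupted while waiting", e);
      }
    }
    return true;
  }
}
```

A test would call something like PollingWait.waitFor(() -> stateReached(), 10, 
20000), where stateReached() stands in for whatever check the test needs.  A 
short interval does not fix racy tests either; it only stops the wait itself 
from dominating the runtime.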




> [Umbrella] Optimize YARN tests runtime 
> ---------------------------------------
>
>                 Key: YARN-5393
>                 URL: https://issues.apache.org/jira/browse/YARN-5393
>             Project: Hadoop YARN
>          Issue Type: Test
>            Reporter: Vinod Kumar Vavilapalli
>
> When I originally merged MAPREDUCE-279 into Hadoop, *all* of the YARN tests 
> used to take 10 mins with pretty good coverage.
> Now TestRMRestart alone takes that much time - we haven't been that great 
> at writing pointed, short tests.
> Time for an initiative to optimize YARN tests. And even after that, if it 
> takes too long, we go the MAPREDUCE-670 route.



