[ 
https://issues.apache.org/jira/browse/YARN-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739853#comment-13739853
 ] 

Robert Joseph Evans commented on YARN-1024:
-------------------------------------------

{quote}Sorry for the longwindedness.{quote}

From what people have told me you still have a long ways to go before you 
approach me for longwindedness :).

My initial gut reaction is that only having two numbers to express the request 
seems too simplified, but the more I think about it the more I am OK with it, 
although I think I would change the numbers to be total YCUs requested and 
minimum YCUs per core.  This gives the user better visibility into how the 
scheduler is treating these numbers so they can better reason about them. The 
total YCUs is the value used for scheduling.  The minimum YCUs per core is 
compared to the maxComputeUnitsPerCore, as was suggested, to reject a request 
as not possible, or in the case of a heterogeneous environment restrict the 
hosts that this container can run on.  Although I am OK with the original 
proposal too.
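
To make the two numbers concrete, here is a rough sketch of the check I have 
in mind. Everything in it is made up for illustration: Node, Request, 
eligible_nodes and check_request are hypothetical names, not existing YARN 
APIs, and max_compute_units_per_core just stands in for the 
maxComputeUnitsPerCore setting discussed above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    name: str
    # Hypothetical per-node setting: the most YCUs a single core can deliver.
    max_compute_units_per_core: float


@dataclass
class Request:
    total_ycus: float         # the value the scheduler would schedule on
    min_ycus_per_core: float  # the slowest core the container can tolerate


def eligible_nodes(request: Request, nodes: List[Node]) -> List[Node]:
    """Keep only the nodes whose cores are fast enough for the request."""
    return [
        n for n in nodes
        if n.max_compute_units_per_core >= request.min_ycus_per_core
    ]


def check_request(request: Request, nodes: List[Node]) -> List[Node]:
    """Reject an impossible request, otherwise return the hosts it may use."""
    hosts = eligible_nodes(request, nodes)
    if not hosts:
        raise ValueError("no node has cores fast enough for this request")
    return hosts
```

On a heterogeneous cluster with one slow and one fast node, a request that 
needs fast cores would be restricted to the fast node, and a request that 
needs more than any core can deliver would be rejected up front.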

I would also like us to have a flag that would either limit the container to 
the requested CPU and let it have no more even when more is available, or would 
let it expand to use whatever CPU was free, but would be guaranteed to get at 
least the YCUs requested.  This is likely something that would have to be done 
on a separate JIRA though.  Without this I don't see a way to really get 
simplicity, predictability, or consistency.  1 MB of RAM is fairly simple to 
understand.  It can be measured without too much of a problem just by running 
the process.  Most users do a simple search for the correct value: run with 
the default, and if that does not work, increase the amount and run again.  1 YCU is 
very complex to measure for an application.  If I cannot restrict a container 
to never use more than what was requested I cannot consistently predict how 
long it will take to run later.  Without this I don't know how to answer the 
question I know will come up.

What should I set these values to?
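
To show why the capped/uncapped flag matters for predictability, here is a 
toy model. It is only a sketch under simple assumptions: cpu_granted and 
estimated_runtime are hypothetical helpers, the work is assumed to scale 
perfectly with CPU, and an uncapped container is assumed to get all of the 
free CPU on the node.

```python
def cpu_granted(requested_ycus: float, free_ycus: float, capped: bool) -> float:
    """YCUs the container can use right now.

    The container is always guaranteed what it requested. If it is not
    capped, it may also use whatever is currently free (assumption: all of it).
    """
    if capped:
        return requested_ycus
    return requested_ycus + max(free_ycus, 0.0)


def estimated_runtime(work_ycu_seconds: float, requested_ycus: float,
                      free_ycus: float, capped: bool) -> float:
    """Seconds to finish a fixed amount of work, assuming perfect scaling."""
    return work_ycu_seconds / cpu_granted(requested_ycus, free_ycus, capped)
```

With the cap on, the runtime depends only on what was requested, so a run on 
an idle node tells me how long a run on a busy node will take. With the cap 
off, the same job finishes faster on an idle node and I learn nothing about 
how it behaves when the node is full.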

                
> Define a virtual core unambigiously
> -----------------------------------
>
>                 Key: YARN-1024
>                 URL: https://issues.apache.org/jira/browse/YARN-1024
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Arun C Murthy
>            Assignee: Arun C Murthy
>
> We need to clearly define the meaning of a virtual core unambiguously so that 
> it's easy to migrate applications between clusters.
> For e.g. here is Amazon EC2 definition of ECU: 
> http://aws.amazon.com/ec2/faqs/#What_is_an_EC2_Compute_Unit_and_why_did_you_introduce_it
> Essentially we need to clearly define a YARN Virtual Core (YVC).
> Equivalently, we can use ECU itself: *One EC2 Compute Unit provides the 
> equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor.*

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
