[ 
https://issues.apache.org/jira/browse/YARN-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13730195#comment-13730195
 ] 

Sandy Ryza commented on YARN-1024:
----------------------------------

Jason, Steve, and Arun, you bring up good points that I think have helped me 
understand some of my assumptions.  I agree that simplicity, predictability, 
and consistency are our most important requirements.  I agree with Jason that 
at least two values, processing power per core and number of cores, are 
required to fully express a request, but that we should nonetheless use only 
one, and that a single value is better than nothing.

We have a tradeoff between:
* A definition that offers some predictability between clusters, but only makes 
sense for requests for a single physical core or less per container.
* A definition that offers predictability only on homogeneous hardware, but 
that functions sensibly for requests for both more and less than a single 
physical core.

I thought that one of the exciting things about allowing requests for CPU would 
be that YARN would be able to better accommodate multi-threaded CPU-intensive 
frameworks like MPI and Storm.  Predictability between clusters seems to matter 
a lot less to me. A ton of other factors interfere with this kind of 
predictability.  The speed at which hardware permits a task to read from disk 
or over the network can have just as large an impact on the processing power 
it consumes as whatever the task is doing.  I don't believe that we will be 
able to attain predictability to the degree that it will provide much value.

                
> Define a virtual core unambigiously
> -----------------------------------
>
>                 Key: YARN-1024
>                 URL: https://issues.apache.org/jira/browse/YARN-1024
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Arun C Murthy
>            Assignee: Arun C Murthy
>
> We need to clearly define the meaning of a virtual core unambiguously so that 
> it's easy to migrate applications between clusters.
> For example, here is the Amazon EC2 definition of an ECU: 
> http://aws.amazon.com/ec2/faqs/#What_is_an_EC2_Compute_Unit_and_why_did_you_introduce_it
> Essentially we need to clearly define a YARN Virtual Core (YVC).
> Equivalently, we can use ECU itself: *One EC2 Compute Unit provides the 
> equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor.*

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
