[
https://issues.apache.org/jira/browse/YARN-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13730195#comment-13730195
]
Sandy Ryza commented on YARN-1024:
----------------------------------
Jason, Steve, and Arun, you bring up good points that I think have helped me
understand some of my assumptions. I agree that simplicity, predictability,
and consistency are our most important requirements. I agree with Jason that
at least two values - processing power per core and number of cores - are
required to fully express a request, but that, in spite of this, we should not
use both, and that a single value is better than nothing.
We have a tradeoff between two options:
* A definition that offers some predictability between clusters, but only makes
sense for requests for a single physical core or less per container.
* A definition that offers predictability only on homogeneous hardware, but
that functions sensibly for requests for both more and less than a single
physical core.
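To make the tradeoff concrete, here is a minimal Python sketch of one way to
read the two definitions. The node names, the per-core power figures, and the
function names are all made up for illustration; nothing here is YARN's actual
API or scheduler behavior.

```python
# Hypothetical node types: processing power of ONE physical core, expressed in
# made-up "power units" (loosely analogous to an ECU). These numbers are
# assumptions for illustration only.
NODE_CORE_POWER = {"old_node": 1.0, "new_node": 2.0}


def physical_cores_for_power(request_units, core_power):
    """Power-based definition: the request is a fixed amount of processing
    power, so the number of physical cores it maps to depends on the node."""
    return request_units / core_power


def power_for_core_share(request_cores, core_power):
    """Core-based definition: the request is a number (or fraction) of
    physical cores, so the processing power it yields depends on the node."""
    return request_cores * core_power


def summarize(request):
    """Show what the same numeric request means on each hypothetical node."""
    rows = {}
    for node, core_power in NODE_CORE_POWER.items():
        rows[node] = {
            "cores_if_power_based": physical_cores_for_power(request, core_power),
            "power_if_core_based": power_for_core_share(request, core_power),
        }
    return rows


if __name__ == "__main__":
    for node, row in summarize(4.0).items():
        print(node, row)
```

With these assumed numbers, a request of 4 maps to 4 physical cores on
old_node but only 2 on new_node under the power-based reading, so a task with
4 busy threads cannot count on 4 cores everywhere. Under the core-based
reading it always maps to 4 cores, but delivers 4 units of power on old_node
and 8 on new_node.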
I thought that one of the exciting things about allowing requests for CPU would
be that YARN would be able to better accommodate multi-threaded CPU-intensive
frameworks like MPI and Storm. Predictability between clusters seems to matter
a lot less to me. A ton of other factors interfere with this kind of
predictability. The speed at which hardware permits a task to read from disk or
over the network can have just as large an impact on the processing power
it consumes as whatever the task is doing. I don't believe that we will be
able to attain predictability to the degree that it will provide much value.
> Define a virtual core unambigiously
> -----------------------------------
>
> Key: YARN-1024
> URL: https://issues.apache.org/jira/browse/YARN-1024
> Project: Hadoop YARN
> Issue Type: Improvement
> Reporter: Arun C Murthy
> Assignee: Arun C Murthy
>
> We need to define the meaning of a virtual core unambiguously so that
> it's easy to migrate applications between clusters.
> For example, here is the Amazon EC2 definition of an ECU:
> http://aws.amazon.com/ec2/faqs/#What_is_an_EC2_Compute_Unit_and_why_did_you_introduce_it
> Essentially we need to clearly define a YARN Virtual Core (YVC).
> Equivalently, we can use ECU itself: *One EC2 Compute Unit provides the
> equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor.*