[ https://issues.apache.org/jira/browse/YARN-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739853#comment-13739853 ]
Robert Joseph Evans commented on YARN-1024:
-------------------------------------------

{quote}Sorry for the longwindedness.{quote}
From what people have told me, you still have a long way to go before you approach me for longwindedness :).

My initial gut reaction is that having only two numbers to express the request seems too simplified, but the more I think about it the more I am OK with it, although I think I would change the numbers to be total YCUs requested and minimum YCUs per core. This gives the user better visibility into how the scheduler is treating these numbers so they can better reason about them. The total YCUs is the value used for scheduling. The minimum YCUs per core is compared to maxComputeUnitsPerCore, as was suggested, to reject a request as not possible or, in a heterogeneous environment, to restrict the hosts that the container can run on. That said, I am OK with the original proposal too.

I would also like us to have a flag that would either limit the container to the requested CPU, letting it have no more even when more is available, or let it expand to use whatever CPU is free while still guaranteeing it at least the YCUs requested. This is likely something that would have to be done on a separate JIRA though. Without this I don't see a way to really get simplicity, predictability, or consistency.

1 MB of RAM is fairly simple to understand. It can be measured without too much of a problem just by running the process. Most users do a simple search for the correct value: run with the default, and if that does not work, increase the amount and run again. 1 YCU is very complex to measure for an application. If I cannot restrict a container to never use more than what was requested, I cannot consistently predict how long it will take to run later. Without this I don't know how to answer the question I know will come up: what should I set these values to?
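
To make the two-number request and the cap flag concrete, here is a rough sketch in Python. It is not YARN code: apart from the idea of a per-host maxComputeUnitsPerCore value, every name here (CpuRequest, eligible_hosts, hard_cap, usable_ycus) is hypothetical and only illustrates the semantics described above.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class CpuRequest:
    """Hypothetical container CPU request expressed with two numbers."""
    total_ycus: int          # value the scheduler would account against node capacity
    min_ycus_per_core: int   # slowest core the container can tolerate
    hard_cap: bool = False   # assumed flag: True means never exceed total_ycus


def eligible_hosts(request: CpuRequest,
                   max_compute_units_per_core: Dict[str, int]) -> List[str]:
    """Hosts whose cores are fast enough for the request, in sorted order."""
    return sorted(
        host
        for host, per_core in max_compute_units_per_core.items()
        if per_core >= request.min_ycus_per_core
    )


def is_satisfiable(request: CpuRequest,
                   max_compute_units_per_core: Dict[str, int]) -> bool:
    """A request is rejected as not possible when no host has fast enough cores."""
    return bool(eligible_hosts(request, max_compute_units_per_core))


def usable_ycus(request: CpuRequest, free_ycus_on_node: int) -> int:
    """CPU the container may consume on a node.

    The requested amount is always guaranteed. Spare capacity is added
    only when the container is not hard-capped.
    """
    if request.hard_cap:
        return request.total_ycus
    return request.total_ycus + max(free_ycus_on_node, 0)
```

With hosts described as {"fast": 4, "slow": 2}, a request for 8 total YCUs and a minimum of 3 YCUs per core would be restricted to the "fast" host, and a minimum of 5 YCUs per core would be rejected outright. Setting hard_cap=True is what would make run times repeatable, at the cost of leaving free CPU idle.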
> Define a virtual core unambigiously
> -----------------------------------
>
>                 Key: YARN-1024
>                 URL: https://issues.apache.org/jira/browse/YARN-1024
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Arun C Murthy
>            Assignee: Arun C Murthy
>
> We need to clearly define the meaning of a virtual core unambiguously so that
> it's easy to migrate applications between clusters.
> For e.g. here is Amazon EC2 definition of ECU:
> http://aws.amazon.com/ec2/faqs/#What_is_an_EC2_Compute_Unit_and_why_did_you_introduce_it
> Essentially we need to clearly define a YARN Virtual Core (YVC).
> Equivalently, we can use ECU itself: *One EC2 Compute Unit provides the
> equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor.*

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira