arm: support big.little SoC

Peng Fan Wed, 21 Sep 2016 23:18:07 -0700

On Wed, Sep 21, 2016 at 08:28:32PM +0100, Julien Grall wrote:
>Hi Dario,
>
>On 21/09/2016 16:45, Dario Faggioli wrote:
>>On Wed, 2016-09-21 at 14:06 +0100, Julien Grall wrote:
>>>(CC a couple of ARM folks)
>>>
>>Yay, thanks for this! :-)
>>
>>>I had a few discussions and gave more thought to big.LITTLE support in
>>>Xen.
>>>The main goal of big.LITTLE is power efficiency, achieved by moving tasks
>>>around and being able to idle one cluster. All the solutions suggested
>>>so far (including mine) can be replicated by hand (except the VPIDR),
>>>so they are mostly a way to automate that.
>>>
>>I'm sorry, how is this (going to be) handled in Linux? Is it that any
>>arbitrary task executing any arbitrary binary code can be run on both
>>big and LITTLE pcpus, depending on the scheduler's and energy
>>management's decisions?
>>
>>This does not seem to match with what has been said at some point in
>>this thread... And if it's like that, how's that possible, if the
>>pcpus' ISAs are (even only slightly) different?
>
>Right, at some point I mentioned that the set of errata and features will be
>different between processors.
>
>However, it is possible to sanitize the feature registers to expose a common
>set to the guest. This is what is done in Linux at boot time, only the
>features common to all the CPUs will be enabled.
>
>This allows a task to migrate between big and LITTLE CPUs seamlessly.
>
>>
>>>This will also remove the real
>>>benefits of big.LITTLE because Xen will not be able to migrate vCPU
>>>across cluster for power efficiency.
>>>
>>>If we care about power efficiency, we would have to handle big.LITTLE
>>>seamlessly in Xen (i.e. a guest would only see one kind of CPU).
>>>
>>Well, I'm a big fan of an approach that leaves the guests' scheduler
>>dumb about things like these (i.e., load balancing, energy efficiency,
>>etc), and hence puts Xen in charge. In fact, on a Xen system, it is
>>only Xen that has all the info necessary to make wise decisions (e.g.,
>>the load of the _whole_ host, the effect of any decisions on the
>>_whole_ host, etc).
>>
>>But this case may be a LITTLE.bit ( :-PP ) different.
>>
>>Anyway, I guess I'll wait for your reply to my question above before
>>commenting more.
>>
>>>This raises
>>>quite a few problems, none insurmountable, similar to migration across
>>>two platforms with different micro-architectures (e.g. processors):
>>>errata, supported features... The guest would have to know the union of
>>>all the errata (this is done so far via the MIDR, so we would need a PV
>>>way to do it), and only the intersection of features would be exposed to
>>>the guest. This also means the scheduler would have to be modified to
>>>handle power efficiency (not strictly necessary at the beginning).
>>>
>>>I agree that such a solution would require some work to implement,
>>>although Xen would have better control of the energy consumption of
>>>the platform.
>>>
>>>So the question here, is what do we want to achieve with big.LITTLE?
>>>
>>Just thinking out loud here. So, instead of "just", as George
>>suggested:
>>
>> vcpuclass=["0-1:A35","2-5:A53", "6-7:A72"]
>>
>>we can allow something like the following (note that I'm tossing out
>>random numbers next to the 'A's):
>>
>> vcpuclass = ["0-1:A35", "2-5:A53,A17", "6-7:A72,A24,A31", "12-13:A8"]
>>
>>with the following meaning:
>> - vcpus 0, 1 can only run on pcpus of class A35
>> - vcpus 2,3,4,5 can run on pcpus of class A53 _and_ on pcpus of class
>>   A17
>> - vcpus 6,7 can run on pcpus of class A72, A24, A31
>> - vcpus 8,9,10,11 --since they're not mentioned, can run on pcpus of
>>   any class
>> - vcpus 12,13 can only run on pcpus of class A8
>>
>>This will set the "boundaries", for each vcpu. Then, within these
>>boundaries, once in the (Xen's) scheduler, we can implement whatever
>>complex/magic/silly logic we want, e.g.:
>> - only use a pcpu of class A53 for vcpus that have an average load
>>   above 50%
>> - only use a pcpu of class A31 if there are no idle pcpus of class A24
>> - only use a pcpu of class A17 for a vcpu if the total system load
>>   divided by the vcpu ID gives 42 as the result
>> - whatever
>>
>>This allows us to achieve both the following goals:
>> - allow Xen to take smart decisions, considering the load and the
>>   efficiency of the host as a whole
>> - allow the guest to take smart decisions, like running lightweight
>>   tasks on low power vcpus (which then Xen will run on low
>>   power pcpus, at least on a properly configured system)
>>
>>Of course this **requires** that, for instance, vcpu 6 must be able to
>>run on A72, A24 and A31 just fine, i.e., it must be possible for it to
>>block on I/O when executing on an A72 pcpu, and, later, after wakeup,
>>restart executing on an A24 pcpu.
>
>With a bit of work in Xen, it would be possible to move the vCPU between
>big and LITTLE cpus. As mentioned above, we could sanitize the features to
>only enable a common set. You can view the big.LITTLE problem as a local live
>migration between two kinds of CPUs.
>
>In your suggestion you don't mention what would happen if the guest
>configuration does not contain the affinity. Does it mean the vCPU will be
>scheduled anywhere? Will a pCPU/class be chosen randomly?


According to the doc I read, https://wiki.xen.org/wiki/Tuning_Xen_for_Performance,
the default hard affinity has all bits set to 1, so vcpus can be scheduled on
any pcpu. The scheduler will, however, prefer pcpus according to the soft
affinity.
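
To make the "no affinity given" case concrete, here is a minimal sketch in
Python (not Xen or xl code) of how the vcpuclass list proposed above could be
expanded into a per-vcpu set of allowed pcpu classes. The function name
parse_vcpuclass, the ALL_CLASSES marker and the exact string syntax are
assumptions made for illustration; vcpus that are not mentioned fall back to
"any class", mirroring the all-ones default hard affinity.

```python
# Illustrative only: not part of xl or libxl. All names here are hypothetical.
ALL_CLASSES = frozenset({"*"})  # marker meaning "may run on any pcpu class"


def parse_vcpuclass(entries, nr_vcpus):
    """Expand entries such as "2-5:A53,A17" into {vcpu: set of classes}."""
    allowed = {v: ALL_CLASSES for v in range(nr_vcpus)}  # default: anywhere
    for entry in entries:
        vcpus, classes = entry.split(":")
        first, _, last = vcpus.partition("-")
        last = last or first  # a single vcpu such as "3:A53"
        for v in range(int(first), int(last) + 1):
            if v >= nr_vcpus:
                raise ValueError(f"vcpu {v} out of range in {entry!r}")
            allowed[v] = frozenset(c.strip() for c in classes.split(","))
    return allowed


cfg = ["0-1:A35", "2-5:A53,A17", "6-7:A72,A24,A31", "12-13:A8"]
table = parse_vcpuclass(cfg, nr_vcpus=14)
```

With this table, vcpus 8 to 11 keep the ALL_CLASSES default, so a scheduler
would be free to place them on any pcpu.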

>
>To be honest, I quite like this idea. It could be used as soft/hard affinity
>for the moment, but could be extended in the future if/when the scheduler gains
>knowledge of power efficiency and vCPUs can migrate between big and LITTLE.

To the guest, all vCPUs would have the same vcpu type, but each could be
scheduled on both big and LITTLE pcpus. I cannot foresee how much effort this
would need.
This is a different direction from what we discussed earlier in the thread.
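
As a rough sketch of what "same vcpu type" would imply, following the
description quoted above: the guest is shown only the intersection of the
features and has to know about the union of the errata. The Python below is
not Xen code, and the feature and erratum names are made-up placeholders
rather than real ID register fields or erratum numbers.

```python
# Illustrative only: the feature and erratum names are placeholders,
# not real ARM ID register fields or erratum numbers.
def sanitize(cpu_types):
    """Return (features, errata) for a guest that may run on every given
    CPU type. Expects at least one CPU type."""
    features = set.intersection(*(set(c["features"]) for c in cpu_types))
    errata = set.union(*(set(c["errata"]) for c in cpu_types))
    return features, errata


big = {"features": {"feat_a", "feat_b", "feat_c"}, "errata": {"erratum_1"}}
little = {"features": {"feat_a", "feat_b"}, "errata": {"erratum_2"}}

features, errata = sanitize([big, little])
```

Here feat_c is hidden from the guest because only the big cores have it, while
both errata have to be handled.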

For power efficiency, such as cpufreq etc., it seems little has been done for
xen/arm.
It would be good if Xen could take advantage of big and LITTLE in the future.
I am not sure, though: does a high load from a Linux task mean a high vCPU
load, so that Xen migrates the vCPU to a big pcpu?
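
The kind of rule in question might look like the toy sketch below (Python, not
Xen scheduler code). The function pick_class, the class names and the 50%
threshold are assumptions for illustration. Note that avg_load here is the
vCPU load as seen by Xen, which is exactly the quantity that may or may not
track the load of a task inside the guest.

```python
# Illustrative only: a toy placement rule, not Xen scheduler code.
def pick_class(allowed, avg_load, big="A72", little="A53", threshold=0.5):
    """Prefer the big class for a busy vcpu and the LITTLE class otherwise,
    falling back to the other class if the preferred one is not allowed."""
    preferred, other = (big, little) if avg_load > threshold else (little, big)
    if preferred in allowed:
        return preferred
    if other in allowed:
        return other
    raise ValueError("vcpu is not allowed on either class")


busy = pick_class({"A53", "A72"}, avg_load=0.8)    # vcpu above the threshold
idle = pick_class({"A53", "A72"}, avg_load=0.2)    # vcpu below the threshold
pinned = pick_class({"A53"}, avg_load=0.9)         # busy, but LITTLE only
```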

Regards,
Peng.

>
>Regards,
>
>-- 
>Julien Grall


Re: [Xen-devel] [RFC 0/5] xen/arm: support big.little SoC
