On Mon, 2016-09-19 at 21:33 +0800, Peng Fan wrote:
> On Mon, Sep 19, 2016 at 11:33:58AM +0100, George Dunlap wrote:
> >
> > No, I think it would be a lot simpler to just teach the scheduler
> > about different classes of cpus. credit1 would probably need to be
> > modified so that its credit algorithm would be per-class rather
> > than pool-wide; but credit2 shouldn't need much modification at
> > all, other than to make sure that a given runqueue doesn't include
> > more than one class; and to do load-balancing only with runqueues
> > of the same class.
>
> I try to follow.
> - scheduler needs to be aware of different classes of cpus. ARM
>   big.LITTLE cpus.
>
Yes, I think this is essential.
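To make the runqueue idea above concrete, here is a minimal sketch (plain Python, not actual Xen code; the `Runqueue` type and the naive "spread evenly" policy are invented for illustration) of load balancing that only ever moves load between runqueues of the same cpu class:

```python
# Hypothetical sketch: per-class runqueues, with load balancing
# confined to runqueues of the same cpu class.

from dataclasses import dataclass

@dataclass
class Runqueue:
    cpu_class: str   # e.g. "big" or "LITTLE"
    load: int = 0    # simplified load metric

def balance(runqueues):
    """Even out load, but only within each cpu class."""
    by_class = {}
    for rq in runqueues:
        by_class.setdefault(rq.cpu_class, []).append(rq)
    for rqs in by_class.values():
        total = sum(rq.load for rq in rqs)
        # naive policy: spread the class's total load evenly
        share, rem = divmod(total, len(rqs))
        for i, rq in enumerate(rqs):
            rq.load = share + (1 if i < rem else 0)

rqs = [Runqueue("big", 6), Runqueue("big", 0),
       Runqueue("LITTLE", 4), Runqueue("LITTLE", 0)]
balance(rqs)
print([(rq.cpu_class, rq.load) for rq in rqs])
# -> [('big', 3), ('big', 3), ('LITTLE', 2), ('LITTLE', 2)]
```

Note that the big load never leaks onto the LITTLE runqueues, or vice versa, which is exactly the invariant George describes for credit2.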
> - scheduler schedules vcpus on different physical cpus in one
>   cpupool.
>
Yep, that's what the scheduler does. And personally, I'd start
implementing big.LITTLE support for a situation where both big and
LITTLE cpus coexist in the same pool.

> - different cpu classes need to be in different runqueues.
>
Yes.

So, basically, imagine using vcpu pinning to support big.LITTLE. I've
spoken briefly about this in my reply to Juergen. You can probably
even get something like this up-&-running by writing very little or
zero code (you'll need --for now-- dom0_max_vcpus, dom0_vcpus_pin,
and then, in domain config files, "cpus='...'").

Then, the real goal would be to achieve the same behaviour
automatically, by acting on the runqueues' arrangement and the load
balancing logic in the scheduler(s).

Anyway, sorry for my ignorance on big.LITTLE, but there's something
I'm missing: _when_ is it that it is (or needs to be) decided whether
a vcpu will run on a big or LITTLE core? Thinking of a bare metal
system, I figure that cpu X is, for instance, big, and will always be
like that; similarly, cpu Y is LITTLE. This makes me think that, for
a virtual machine, it is ok to choose/specify at _domain_creation_
time which vcpus are big and which vcpus are LITTLE, is this correct?

If yes, this also means that --whatever way we find to make this
happen: cpupools, scheduler, etc.-- the vcpus that we decided are big
must only be scheduled on actual big pcpus, and the vcpus that we
decided are LITTLE must only be scheduled on actual LITTLE pcpus,
correct again?

> Then for implementation.
> - When creating a guest, specify the physical cpus that the guest
>   will run on.
>
I'd actually do that the other way round. I'd ask the user to specify
how many --and, if that's important, which-- vcpus are big and how
many/which are LITTLE. Knowing that, we also know whether the domain
is a big-only, LITTLE-only or big.LITTLE one. And we also know which
set of pcpus each set of vcpus should be restricted to.
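For the record, the "zero code" pinning approach mentioned above could look something like this (a hypothetical sketch only: the vcpu counts and cpu numbers are made up, and it assumes pcpus 0-3 are big and 4-7 are LITTLE):

```
# Xen boot command line: give dom0 4 vcpus, pinned 1-to-1:
dom0_max_vcpus=4 dom0_vcpus_pin

# Guest xl config: pin each vcpu to a pcpu of the wanted class,
# e.g. vcpus 0,1 on big pcpus 2,3 and vcpus 2,3 on LITTLE pcpus 4,5:
vcpus = 4
cpus = [2, 3, 4, 5]
```

With this, nothing enforces the class split except the pinning itself, which is precisely why the scheduler-level solution is the real goal.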
So, basically (but it's just an example), something like this in the
xl config file of a guest:

1) big.LITTLE guest, with 2 big and 2 LITTLE vcpus. The user doesn't
   care which is which, so a default could be 0,1 big and 2,3 LITTLE:

   vcpus = 4
   vcpus.big = 2

2) big.LITTLE guest, with 8 vcpus, of which 0, 2, 4 and 6 are big:

   vcpus = 8
   vcpus.big = [0, 2, 4, 6]

   Which would be the same as:

   vcpus = 8
   vcpus.little = [1, 3, 5, 7]

3) guest with 4 vcpus, all big:

   vcpus = 4
   vcpus.big = "all"

   Which would be the same as:

   vcpus = 4
   vcpus.little = "none"

   And also the same as just:

   vcpus = 4

Or something like this.

> - If the physical cpus are different cpus, indicate that the guest
>   would like to be a big.LITTLE guest, and have big vcpus and
>   little vcpus.
>
I don't like this as _the_ way of specifying the guest topology, wrt
big.LITTLE-ness (see the alternative proposal right above. :-))

However, right now we support pinning/affinity already. We certainly
need to decide what to do if, e.g., no vcpus.big or vcpus.little is
present, but the vcpus have hard or soft affinity to some specific
pcpus.

So, right now, this, in the xl config file:

  cpus = [2, 8, 12, 13, 15, 17]

means that we want to pin, 1-to-1, vcpu 0 to pcpu 2, vcpu 1 to pcpu
8, vcpu 2 to pcpu 12, vcpu 3 to pcpu 13, vcpu 4 to pcpu 15 and
vcpu 5 to pcpu 17.

Now, if cores 2, 8 and 12 are big, and no vcpus.big or vcpus.little
is specified, I'd put forward the assumption that the user wants
vcpus 0, 1 and 2 to be big, and vcpus 3, 4 and 5 to be LITTLE. If,
instead, vcpus.big or vcpus.little is specified, and there's
disagreement, I'd either error out or decide which overrides the
other (and print a WARNING about that happening).

Still right now, this:

  cpus = "2-12"

means that all the vcpus of the domain have hard affinity (i.e., are
pinned) to pcpus 2-12. And in this case I'd conclude that the user
wants all the vcpus to be big.
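The inference I'm proposing above (derive each vcpu's class from its 1-to-1 "cpus=" pinning, and warn if an explicit vcpus.big disagrees) can be sketched like this. This is plain illustrative Python, not toolstack code, and PCPU_CLASS is an invented topology matching the example: pcpus 2, 8 and 12 are big.

```python
# Hypothetical sketch: derive vcpu big/LITTLE-ness from 1-to-1
# pinning.  PCPU_CLASS is a made-up example topology.

PCPU_CLASS = {2: "big", 8: "big", 12: "big",
              13: "LITTLE", 15: "LITTLE", 17: "LITTLE"}

def infer_vcpu_classes(cpus, vcpus_big=None):
    """Map each vcpu to a class via its pinned pcpu; if an explicit
    vcpus.big list disagrees, warn and let the pinning win (one of
    the two resolutions suggested above)."""
    inferred = [PCPU_CLASS[p] for p in cpus]
    if vcpus_big is not None:
        explicit = ["big" if v in vcpus_big else "LITTLE"
                    for v in range(len(cpus))]
        if explicit != inferred:
            print("WARNING: vcpus.big disagrees with cpus= pinning; "
                  "using the pinning")
    return inferred

print(infer_vcpu_classes([2, 8, 12, 13, 15, 17]))
# -> ['big', 'big', 'big', 'LITTLE', 'LITTLE', 'LITTLE']
```

So with cpus = [2, 8, 12, 13, 15, 17] and no vcpus.big/vcpus.little, vcpus 0-2 come out big and vcpus 3-5 LITTLE, as described.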
I'm less sure what to do if _only_ soft-affinity is specified (via
"cpus_soft="), or if hard-affinity contains both big and LITTLE
pcpus, like, e.g.:

  cpus = "2-15"

> - If no physical cpus are specified, then the guest may run on big
>   cpus or on little cpus. But not both.
>
Yes. If nothing (or something contradictory) is specified, we "just"
have to decide what's the sanest default.

> How to decide whether it runs on big or little physical cpus?
>
I'd default to big.

> - For Dom0, I am still not sure: default to big.LITTLE, or else?
>
Again, if nothing is specified, I'd probably default to:
 - give dom0 as many vcpus as there are big cores;
 - restrict them to big cores.

But, of course, I think we should add boot time parameters like
these ones:

  dom0_vcpus_big = 4
  dom0_vcpus_little = 2

which would mean the user wants dom0 to have 4 big and 2 LITTLE
vcpus... and then we act accordingly, as described above, and in
other emails.

> If we use the scheduler to handle the different classes of cpus, we
> do not need to use cpupools to prevent vcpus from being scheduled
> onto different physical cpus. And using the scheduler to handle
> this gives an opportunity to support big.LITTLE guests.
>
Exactly, this is one strong point in favour of this solution, IMO!

Regards,
Dario
--
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
_______________________________________________
Xen-devel mailing list
Xenfirstname.lastname@example.org
https://lists.xen.org/xen-devel