On 04/10/2018 11:59 PM, Dario Faggioli wrote:
> [Adding Andrew, not because I expect anything, but just because we've
> chatted about this issue on IRC :-) ]
>
> On Tue, 2018-04-10 at 22:37 +0200, Olaf Hering wrote:
>> On Tue, Apr 10, Dario Faggioli wrote:
>>
>> BUG_ON(__vcpu_on_runq(CSCHED_VCPU(vc)));
>>
>> (XEN) Xen BUG at sched_credit.c:876
>> (XEN) ----[ Xen-4.11.20180410T125709.50f8ba84a5-3.bug1087289_411  x86_64  debug=y   Not tainted ]----
>> (XEN) CPU:    118
>> (XEN) RIP:    e008:[<ffff82d080229ab4>] sched_credit.c#csched_vcpu_migrate+0x27/0x51
>> ...
>> (XEN) Xen call trace:
>> (XEN)    [<ffff82d080229ab4>] sched_credit.c#csched_vcpu_migrate+0x27/0x51
>> (XEN)    [<ffff82d080236348>] schedule.c#vcpu_move_locked+0xbb/0xc2
>> (XEN)    [<ffff82d08023764c>] schedule.c#vcpu_migrate+0x226/0x25b
>> (XEN)    [<ffff82d08023935f>] context_saved+0x8d/0x94
>> (XEN)    [<ffff82d08027797d>] context_switch+0xe66/0xeb0
>> (XEN)    [<ffff82d080236943>] schedule.c#schedule+0x5f4/0x627
>> (XEN)    [<ffff82d080239f15>] softirq.c#__do_softirq+0x85/0x90
>> (XEN)    [<ffff82d080239f6a>] do_softirq+0x13/0x15
>> (XEN)    [<ffff82d08031f5db>] vmx_asm_do_vmentry+0x2b/0x30
>>
> Hey... unless I've really put there a totally bogus BUG_ON(), this
> looks interesting and potentially useful.
>
> It says that the vcpu which is being context switched out, and on
> which we are calling vcpu_migrate() because we found it to be
> VPF_migrating, is actually in the runqueue already when we get to
> execute vcpu_migrate()->vcpu_move_locked().
>
> Mmm... let's see.
>
>  CPU A                                 CPU B
>  .                                     .
>  schedule(current == v)                vcpu_set_affinity(v)
>   prev = current // == v               .
>   schedule_lock(CPU A)                 .
>   csched_schedule()                    schedule_lock(CPU A)
>    if (runnable(v)) //YES              x
>     runq_insert(v)                     x
>    return next != v                    x
>   schedule_unlock(CPU A)               x // takes the lock
>   context_switch(prev,next)            set_bit(v, VPF_migrating) [*]
>    context_saved(prev) // still == v   .
>     v->is_running = 0                  schedule_unlock(CPU A)
>     SMP_MB                             .
>     if (test_bit(v, VPF_migrating)) // YES!!
>      vcpu_migrate(v)                   .
>       for {                            .
>        schedule_lock(CPU A)            .
>        SCHED_OP(v, pick_cpu)           .
>         set_bit(v, CSCHED_MIGRATING)   .
>         return CPU C                   .
>        pick_called = 1                 .
>        schedule_unlock(CPU A)          .
>        schedule_lock(CPU A + CPU C)    .
>        if (pick_called && ...) // YES  .
>         break                          .
>       }                                .
>       // v->is_running is 0            .
>       // !test_and_clear(v, VPF_migrating)) is false!!
>       clear_bit(v, VPF_migrating)      .
>       vcpu_move_locked(v, CPU C)       .
>        BUG_ON(__vcpu_on_runq(v))       .
>
> [*] after this point, and until someone manages to call vcpu_sleep(),
>     v sits in CPU A's runqueue with the VPF_migrating pause flag set
>
> So, basically, the race is between context_saved() and
> vcpu_set_affinity(): vcpu_set_affinity() sets the VPF_migrating pause
> flag on a vcpu that is in a runqueue, with the intent of letting
> either a vcpu_sleep_nosync() or a reschedule remove it from there,
> but context_saved() manages to see the flag before that removal can
> happen.
>
> And I think this also explains the original BUG at sched_credit.c:1694
> (it's just a bit more involved).

Yep, that looks correct.  I had considered some sort of race between
set_affinity() and context_switch(), but had just never noticed that it
could omit to take the vcpu off the runqueue.
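
Just to convince myself I was following the interleaving, I wrote it
out as a little stand-alone replay (plain C, every name made up for the
example, nothing taken from the actual tree).  It simply performs the
steps from your diagram in order, on one thread, and trips an assert at
the point where the BUG_ON() would fire:

/* Toy replay of the interleaving above; NOT Xen code, all names are
 * made up.  It runs the steps sequentially just to show the state v
 * ends up in. */
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

struct toy_vcpu {
    bool is_running;
    bool vpf_migrating;   /* stands in for the VPF_migrating pause flag */
    bool on_runq;         /* stands in for __vcpu_on_runq()             */
};

int main(void)
{
    struct toy_vcpu v = { .is_running = true };

    /* CPU A, csched_schedule(): v is still runnable, so it goes back
     * on CPU A's runqueue and someone else is picked.                 */
    v.on_runq = true;

    /* CPU B, vcpu_set_affinity(): takes CPU A's scheduler lock once
     * csched_schedule() has dropped it, and flags v for migration,
     * expecting a later vcpu_sleep_nosync()/reschedule to dequeue it. */
    v.vpf_migrating = true;

    /* CPU A, context_saved(): clears is_running, then sees the flag
     * before anything has dequeued v, and heads into vcpu_migrate().  */
    v.is_running = false;
    if (v.vpf_migrating) {
        v.vpf_migrating = false;          /* the test_and_clear()      */
        /* vcpu_move_locked(v, CPU C), i.e. the point where
         * BUG_ON(__vcpu_on_runq(v)) in csched_vcpu_migrate() fires.   */
        printf("moving v while on_runq=%d\n", v.on_runq);
        assert(!v.on_runq);               /* models the BUG_ON; trips  */
    }
    return 0;
}

(That obviously proves nothing about the real code; it just matches the
sequence in the diagram.)
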
> As can be seen above (and also in the code comment) there is a
> barrier (which further testifies that this is indeed a tricky
> passage), but I guess it is not that effective! :-/
>
> TBH, I have actually never fully understood what that comment really
> meant, what the barrier was protecting, and how... e.g., isn't it
> missing its paired one? In fact, there's another comment, clearly
> related, right in vcpu_set_affinity(). But again I'm a bit at a loss
> at properly figuring out what the big idea is.

I think the idea is to make sure that the change to v->is_running
happens before whatever comes next (i.e., that the compiler doesn't
reorder the write as part of its normal optimization activities).  As
it happens, nothing that comes next looks like it really needs such
ordering (particularly as you can't reorder things across a function
call, AFAIUI), but it's good to have those in place in case anybody
*does* add that sort of thing.

 -George
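
P.S. To make the ordering I mean a bit more concrete, here is roughly
the shape I have in mind, written as a stand-alone toy with C11 atomics
rather than Xen's own primitives (so, again, the names and bodies below
are made up for illustration, not the actual context_saved()):

/* Toy sketch, not Xen code: keep the is_running store from being
 * reordered (by the compiler or the CPU) past the migration-flag
 * test that follows it. */
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

struct toy_vcpu {
    atomic_bool is_running;
    atomic_bool vpf_migrating;
};

static void toy_vcpu_migrate(struct toy_vcpu *v)
{
    (void)v;
    puts("would move v to its new pcpu here");
}

static void toy_context_saved(struct toy_vcpu *v)
{
    /* Clear is_running first... */
    atomic_store_explicit(&v->is_running, false, memory_order_relaxed);

    /* ...then a full fence, playing the role of the SMP_MB in the
     * diagram: the flag test below must not be hoisted above the
     * store.  Whoever sets VPF_migrating and then looks at is_running
     * would need a matching fence on their side.                     */
    atomic_thread_fence(memory_order_seq_cst);

    if (atomic_load_explicit(&v->vpf_migrating, memory_order_relaxed))
        toy_vcpu_migrate(v);
}

int main(void)
{
    struct toy_vcpu v;
    atomic_init(&v.is_running, true);
    atomic_init(&v.vpf_migrating, true);
    toy_context_saved(&v);
    return 0;
}

Whether the pairing with vcpu_set_affinity() actually amounts to that,
I'd have to go and stare at the code again.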