On Tue, Mar 24, 2015 at 4:54 AM, Andres Lagar Cavilla <
and...@lagarcavilla.org> wrote:

>
>
> On Mon, Mar 23, 2015 at 11:25 AM, Tamas K Lengyel <tkleng...@sec.in.tum.de
> > wrote:
>
>> On Mon, Mar 23, 2015 at 6:59 PM, Andres Lagar Cavilla <
>> and...@lagarcavilla.org> wrote:
>>
>>> On Mon, Mar 23, 2015 at 9:10 AM, Tamas K Lengyel <
>>> tkleng...@sec.in.tum.de> wrote:
>>>
>>>> Hello everyone,
>>>> I'm trying to chase down a bug that reproducibly crashes Xen (tested
>>>> with 4.4.1). The problem is somewhere within the mem-sharing subsystem and
>>>> how that interacts with domains that are being actively saved. In my setup
>>>> I use the xl toolstack to rapidly create clones of HVM domains by piping
>>>> "xl save -c" into xl restore with a modified domain config which updates
>>>> the name/disk/vif. However, during such an operation Xen crashes with the
>>>> following log if there are already active clones.
>>>>
>>>> IMHO there should be no conflict between saving the domain and
>>>> memsharing, as long as the domain is actually just being checkpointed "-c"
>>>> - its memory should remain as-is. This is, however, clearly not the case.
>>>> Any ideas?
>>>>
>>>
>>> Tamas, I'm not clear on the use of memsharing in this workflow. As
>>> described, you pipe save into restore, but the internal magic is lost on
>>> me. Are you fanning out to multiple restores? That would seem to be the
>>> case, given the need to update name/disk/vif.
>>>
>>> Anyway, I'm inferring. Instead, could you elaborate?
>>>
>>> Thanks
>>> Andre
>>>
>>
>> Hi Andre,
>> thanks for getting back on this issue. The script I'm using is at
>> https://github.com/tklengyel/drakvuf/blob/master/tools/clone.pl. The
>> script simply creates a FIFO pipe (mkfifo) and saves the domain into that
>> pipe, which is immediately read by xl restore with the updated configuration
>> file. This is mainly just to avoid having to read the memory dump back from
>> disk. That part of the system works as expected, and multiple save/restores
>> running at the same time don't cause any side-effects. Once the domain has
>> thus been cloned, I run memsharing on every page, which also works as
>> expected. The problem only occurs when the cloning procedure runs while a
>> page unshare operation kicks in on an already active clone (as you can see
>> in the log).
>>
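[For reference, the save/restore pipeline described above boils down to roughly the following sketch. The domain name, config file, and FIFO path are made-up placeholders, not taken from the thread:]

```shell
# Sketch of the clone pipeline: "xl save -c" checkpoints the domain,
# leaving the original running, while xl restore reads the image
# concurrently through the FIFO, so it never has to touch the disk.
FIFO=/tmp/clone.fifo
mkfifo "$FIFO"
xl save -c origin-hvm "$FIFO" &     # writer: checkpoint the origin domain
xl restore clone1.cfg "$FIFO"       # reader: clone1.cfg has new name/disk/vif
wait
rm "$FIFO"
```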
>
> Sorry Tamas, I'm a bit slow here. I looked at your script -- it looks
> all right, but there's no mention of memsharing in there.
>
> Re-reading ... memsharing? memshare? Is this memshrtool in tools/testing?
> How are you running it?
>


Hi Andre,
the memsharing happens here:
https://github.com/tklengyel/drakvuf/blob/master/src/main.c#L144, after the
clone script has finished. This is effectively the same approach as in
tools/testing, just automatically looping from 0 to max_gpfn. Afterwards,
all unsharing happens automatically, either induced by the guest itself or
when I map pages into my app with xc_map_foreign_range and PROT_WRITE.
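[For context, a loop like the one described is roughly equivalent to the following sketch against the Xen 4.4-era libxc API. This is not the thread's actual code: the function names are real libxc calls from that era (signatures changed in later releases, e.g. xc_domain_maximum_gpfn), but the structure and error handling here are illustrative only, and it naturally requires a Xen host to run:]

```c
/* Sketch: deduplicate every gfn of a freshly restored clone against its
 * origin domain, in the style of tools/testing/memshrtool, but looping
 * automatically from 0 to max_gpfn. Error handling is abbreviated. */
#include <stdio.h>
#include <stdint.h>
#include <xenctrl.h>

static int share_all(uint32_t origin, uint32_t clone)
{
    xc_interface *xch = xc_interface_open(NULL, NULL, 0);
    if (!xch)
        return -1;

    /* Enable sharing on both domains before nominating any pages. */
    xc_memshr_control(xch, origin, 1);
    xc_memshr_control(xch, clone, 1);

    long max_gpfn = xc_domain_maximum_gpfn(xch, origin);
    for (unsigned long gfn = 0; gfn <= (unsigned long)max_gpfn; gfn++) {
        uint64_t shandle, chandle;

        /* Nominate the page in both domains, then share the pair.
         * A failed nomination (a hole or a special page) is skipped. */
        if (xc_memshr_nominate_gpfn(xch, origin, gfn, &shandle))
            continue;
        if (xc_memshr_nominate_gpfn(xch, clone, gfn, &chandle))
            continue;
        if (xc_memshr_share_gpfns(xch, origin, gfn, shandle,
                                  clone, gfn, chandle))
            fprintf(stderr, "share failed at gfn 0x%lx\n", gfn);
    }

    xc_interface_close(xch);
    return 0;
}
```

Any later write to a shared page -- by the guest, or via a PROT_WRITE foreign mapping -- then triggers the hypervisor's unshare (copy-on-write) path, which is where the crash under discussion occurs.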


>
> Certainly no Xen crash should happen from user-space input. I'm just
> trying to understand what you're doing. The unshare code is not, uhmm,
> brief, so a NULL deref could happen in half a dozen places at first glance.
>

Well, let me know what I can do to help trace it down. I don't think
(potentially buggy) userspace tools should be able to crash Xen either =)

Tamas


>
> Thanks
> Andres
>
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel