On 17/09/2021 10:27, Julien Grall wrote:
> Hi,
> 
> (+ some AWS folks)
> 
> On 17/09/2021 11:17, Jan Beulich wrote:
>> On 16.09.2021 19:52, Andrew Cooper wrote:
>>> On 16/09/2021 13:30, Jan Beulich wrote:
>>>> On 16.09.2021 13:10, Dmitry Isaikin wrote:
>>>>> From: Dmitry Isaykin <[email protected]>
>>>>>
>>>>> This significantly speeds up concurrent destruction of multiple
>>>>> domains on x86.
>>>> This effectively is a simplistic revert of 228ab9992ffb ("domctl:
>>>> improve locking during domain destruction"). There it was found to
>>>> actually improve things;
>>>
>>> Was it?  I recall that it was simply an expectation that performance
>>> would be better...
>>
>> My recollection is that it was, for one of our customers.
>>
>>> Amazon previously identified 228ab9992ffb as a massive perf hit, too.
>>
>> Interesting. I don't recall any mail to that effect.
> 
> Here we go:
> 
> https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fxen-devel%2Fde46590ad566d9be55b26eaca0bc4dc7fbbada59.1585063311.git.hongyxia%40amazon.com%2F&amp;data=04%7C01%7CAndrew.Cooper3%40citrix.com%7C8cf65b3fb3324abe7cf108d979bd7171%7C335836de42ef43a2b145348c2ee9ca5b%7C0%7C0%7C637674676843910175%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=si7eYIxSqsJY77sWuwsad5MzJDMzGF%2F8L0JxGrWTmtI%3D&amp;reserved=0
> 
> 
> We have been using the revert for quite a while in production and didn't
> notice any regression.
> 
>>
>>> Clearly some of the reasoning behind 228ab9992ffb was flawed and/or
>>> incomplete, and it appears as if it wasn't necessarily a wise move in
>>> hindsight.
>>
>> Possible; I continue to think though that the present observation wants
>> properly understanding instead of more or less blindly undoing that
>> change.
> 
> To be honest, I think this is the other way around. You wrote and merged
> a patch with the following justification:
> 
> "
>     There is no need to hold the global domctl lock across domain_kill() -
>     the domain lock is fully sufficient here, and parallel cleanup after
>     multiple domains performs quite a bit better this way.
> "
> 
> Clearly, the original commit message is lacking details on the exact
> setups and numbers. But we now have two stakeholders with proof that
> your patch is harmful to the setup you claim perform better with your
> patch.
> 
> To me this is enough justification to revert the original patch. Anyone
> against the revert, should provide clear details of why the patch should
> not be reverted.

I second a revert.

I was concerned at the time that the claim was unsubstantiated, and now
there is plenty of evidence to counter the claim.

~Andrew

Reply via email to