Re: [PATCH v2] mm/pdx: Add comments throughout the codebase for pdx

Jan Beulich Mon, 10 Jul 2023 00:43:46 -0700

On 07.07.2023 17:55, Alejandro Vallejo wrote:
> On Thu, Jul 06, 2023 at 11:50:58AM +0200, Jan Beulich wrote:
>> On 22.06.2023 16:02, Alejandro Vallejo wrote:
>>> @@ -57,9 +100,25 @@ uint64_t __init pdx_init_mask(uint64_t base_addr)
>>>                           (uint64_t)1 << (MAX_ORDER + PAGE_SHIFT)) - 1);
>>>  }
>>>  
>>> -u64 __init pdx_region_mask(u64 base, u64 len)
>>> +uint64_t __init pdx_region_mask(uint64_t base, uint64_t len)
>>>  {
>>> -    return fill_mask(base ^ (base + len - 1));
>>> +    uint64_t last = base + len - 1;
>>> +    /*
>>> +     * The only bit that matters in base^last is the MSB. There are 2 cases.
>>> +     *
>>> +     * case msb(base) < msb(last):
>>> +     *     then msb(fill_mask(base^last)) == msb(last). This is non
>>> +     *     compressible.
>>> +     * case msb(base) == msb(last):
>>> +     *     This means that there _may_ be a sequence of compressible zeroes
>>> +     *     for all addresses between `base` and `last` iff `base` has enough
>>> +     *     trailing zeroes. That is, it's compressible when
>>
>> Why trailing zeros? [100000f000,10ffffffff] has compressible bits
>> 32-35, but the low bits of base don't matter at all.


This is ...

>>> + * ## PDX compression
>>> + *
>>> + * This is a technique to avoid wasting memory on machines known to have
>>> + * split their machine address space in several big discontinuous and highly
>>> + * disjoint chunks.
>>> + *
>>> + * In its uncompressed form the frame table must have book-keeping metadata
>>> + * structures for every page between [0, max_mfn) (whether they are backed
>>> + * by RAM or not), and a similar condition exists for the direct map. We
>>> + * know some systems, however, that have some sparsity in their address
>>> + * space, leading to a lot of wastage in the form of unused frame table
>>> + * entries.
>>> + *
>>> + * This is where compression becomes useful. The idea is to note that if
>>> + * you have several big chunks of memory sufficiently far apart you can
>>> + * ignore the middle part of the address because it will always contain
>>> + * zeroes as long as the base address is sufficiently well aligned and the
>>> + * length of the region is much smaller than the base address.
>>
>> As per above alignment of the base address doesn't really matter.
> Where above?

... what "above" here meant.

> As far as I understand you need enough alignment to cover the
> hole or you won't have zeroes to compress. Case in point:
> 
>   * region1: [0x0000000000000000 -
>               0x00000000FFFFFFFF]
> 
>   * region2: [0x0001FFFFFFFFF000 -
>               0x00020000FFFFFFFF]
> 
> I can agree this configuration is beyond dumb and statistically unlikely to
> exist in the wild, but it should (IMO) still be covered by that comment.

Right, but this isn't relevant here - in such a case no compression
can occur, yes, but not (just) because of missing alignment. See the
example I gave above (in the earlier reply) for where alignment
clearly doesn't matter for compression to be possible.
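
To make both examples concrete, here is a minimal sketch in C. It is not
the Xen code: fill_mask() is re-implemented here on the assumption that it
sets every bit below the most significant set bit, and hole_bits() is a
hypothetical helper invented for this illustration. It also looks at a
single region in isolation, whereas real compression has to consider all
regions together.

```c
#include <stdint.h>

/*
 * Assumed behaviour of fill_mask(): return the input with every bit below
 * its most significant set bit also set. Stand-in, not copied from Xen.
 */
static inline uint64_t fill_mask(uint64_t mask)
{
    mask |= mask >> 1;
    mask |= mask >> 2;
    mask |= mask >> 4;
    mask |= mask >> 8;
    mask |= mask >> 16;
    mask |= mask >> 32;
    return mask;
}

/* Bits that can differ between any two addresses in [base, last]. */
static inline uint64_t region_mask(uint64_t base, uint64_t last)
{
    return fill_mask(base ^ last);
}

/*
 * Hypothetical helper: bits below msb(base) that lie above the varying
 * part of the region and are zero in base. Every address in [base, last]
 * shares base's bits above the varying part, so these bits are zero
 * across the whole region.
 */
static inline uint64_t hole_bits(uint64_t base, uint64_t last)
{
    uint64_t varying = region_mask(base, last);

    return fill_mask(base) & ~varying & ~base;
}
```

With these definitions, hole_bits(0x100000f000, 0x10ffffffff) evaluates to
0xf00000000, i.e. bits 32-35, and so does the page-aligned variant
hole_bits(0x1000000000, 0x10ffffffff): the low bits of base play no role.
For region2 above, hole_bits(0x0001FFFFFFFFF000, 0x00020000FFFFFFFF) is 0,
because msb(base) < msb(last) and the varying part swallows everything
below msb(last).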

Jan
