On 09.08.21 20:18, Julien Grall wrote:
Hi Julien
On 09/08/2021 18:14, Oleksandr wrote:
On 09.08.21 17:51, Julien Grall wrote:
Hi Julien.
Hi Oleksandr,
I am writing down here what we discussed on another thread and on
IRC. This will be easier to track in a single thread.
On 04/08/2021 23:00, Julien Grall wrote:
On 04/08/2021 21:56, Oleksandr wrote:
Now, I am wondering, would it be possible to update/clarify the
current "reg" purpose and use it to pass a safe unallocated space
for any Xen specific mappings (grant, foreign, whatever) instead
of just for the grant table region. In case, it is not allowed for
any reason (compatibility PoV, etc), would it be possible to
extend a property by passing an extra range separately, something
similar to how I described above?
I think it should be fine to re-use the same region so long the
size of the first bank is at least the size of the original region.
While answering to the DT binding question on the DT ML, I realized
that this is probably not going to be fine because there is a bug in
Xen when mapping grant-table frame.
The function gnttab_map_frame() is used to map the grant table
frame. If there is an old mapping, it will first remove it.
The function is using the helper gnttab_map_frame() to find the
corresponding GFN or return INVALID_GFN if not mapped.
On Arm, gnttab_map_frame() is implementing using an array index by
the grant table frame number. The trouble is we don't update the
array when the page is unmapped. So if the GFN is re-used before the
grant-table is remapped, then we will end up to remove whatever was
mapped there (this could be a foreign page...).
This behavior already happens today as the toolstack will use the
first GFN of the region if Linux doesn't support the acquire
resource interface. We are getting away in the Linux because the
toolstack only map the first grant table frame and:
- Newer Linux will not used the region provided by the DT and
nothing will be mapped there.
- Older Linux will use the region but still map the grant table
frame 0 to the same GFN.
I am not sure about U-boot and other OSes here.
This is not new but it is going to be become a bigger source of
problem (read more chance to hit it) as we try to re-use the first
region.
This means the first region should exclusively used for the
grant-table (in a specific order) until the issue is properly fixed.
Thank you for the explanation, it is clear now.
A potential fix is to update the array in p2m_put_l3_page(). The
default max size of the array is 1024, so it might be fine to just
walk it (it would be simply a comparison).
I think, this would work. Looks like we don't need to walk for each
gfn which is being freed, we could just filter it by p2m_is_ram() ...
Well. This would still potentially result to a few unnecessary walk. I
would consider to introduce a new P2M type or possibly add a check if
the page is in xenheap (grant-table are xenheap pages so far).
Indeed, this would be better, personally I would prefer to check if page
is in xenheap.
Cheers,
--
Regards,
Oleksandr Tyshchenko