LennardHofmann added subscribers: hoo, LennardHofmann.
LennardHofmann added a comment.


  Looking at the source code of mw.wikibase.lua 
<https://phabricator.wikimedia.org/diffusion/EWBA/browse/master/client/includes/DataAccess/Scribunto/mw.wikibase.lua$113>,
 we see:
  
    -- Use a deep clone here, so that people can't modify the cached entity
    return entityModule.create( mw.clone( entity ) )
  
  Using `git log -S'clone here'` I was able to find a commit by @hoo that added 
the cloning and linked to T76946 <https://phabricator.wikimedia.org/T76946>, 
which tells us that the motivation behind the change was to make the cached 
tables immutable so that `#invoke` cannot have side effects that influence the 
next `#invoke`.
  
  I guess hoo decided not to make the cached tables read-only using metatables 
because `rawset` can bypass the protection. I remembered that `mw.loadData` 
uses read-only tables so I wondered how it deals with `rawset`. I found 
rELUAd9c7d5f971f1 
<https://phabricator.wikimedia.org/rELUAd9c7d5f971f18be3c17c512a02249cd1254b48d4>,
 a patch that makes it possible to modify the table returned by `mw.loadData` 
without using `rawset` or introducing cross-invoke leakage.
  
  Below is a proof-of-concept based on the patch. Please don't ask me how 
`dataWrapper` works.
  
    -- This creates an read-only dummy table for accessing the real data.
    --
    -- @param data table Data to access
    -- @param seen table|nil Table of already-seen tables.
    -- @return table
    local function dataWrapper( data, seen )
        local t, dummyValue, changes = {}, {}, nil
        seen = seen or { [data] = t }
        local function pairsfunc( s, k )
                k = next( changes or data, k )
                if k ~= nil then
                        return k, t[k]
                end
                return nil
        end
        local function ipairsfunc( s, i )
                i = i + 1
                if t[i] ~= nil then
                        return i, t[i]
                end
                -- no nil to match default ipairs()
        end
        local mt = {
                __index = function ( tt, k )
                        assert( t == tt )
                        if changes and changes[k] ~= dummyValue then
                                return changes[k]
                        end
                        local v = data[k]
                        if type( v ) == 'table' then
                                seen[v] = seen[v] or dataWrapper( v, seen )
                                return seen[v]
                        end
                        return v
                end,
                __newindex = function ( tt, k, v )
                        assert( t == tt )
                        if not changes then
                                changes = {}
                                for k in pairs( data ) do
                                        changes[k] = dummyValue
                                end
                        end
                        changes[k] = v
                end,
                __pairs = function ( tt )
                        assert( t == tt )
                        return pairsfunc, t, nil
                end,
                __ipairs = function ( tt )
                        assert( t == tt )
                        return ipairsfunc, t, 0
                end,
        }
    
        -- This is just to make setmetatable() fail
        mt.__metatable = mt
    
        return setmetatable( t, mt )
    end
    
    local cached = { P31 = 'Q5' }
    
    local function getEntity()
        return dataWrapper( cached )
    end
    
    -- This demonstrates that cross-invoke communication is impossible
    local e1 = getEntity()
    local e2 = getEntity()
    e1.P31 = 'muhaha'
    local e3 = getEntity()
    print( e1.P31, e2.P31, e3.P31 ) --> muhaha  Q5  Q5
  
  This comes with the same limitations `mw.loadData` has: the length operator 
`#`, `next()`, and the functions in the table library do not work. But we can 
index the table and iterate over it, which is all that really matters. IMHO, 
being able to modify the entity table is unnecessary so using `mw.loadData`'s 
implementation instead of the above patch would also suffice.
  
  On Commons category pages that use the Wikidata Infobox 
<https://commons.wikimedia.org/wiki/Template:Wikidata_Infobox>, the Lua 
profiler often reports that `recursiveClone` takes up most of the time—try it 
yourself here 
<https://commons.wikimedia.org/w/index.php?title=Category:Amrita_Lal_Basu&action=edit>.
 So getting rid of `mw.clone` would be a huge win for performance, especially 
since `getBestStatements` and `getAllStatements` are also effected by this.

TASK DETAIL
  https://phabricator.wikimedia.org/T203357

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: LennardHofmann
Cc: LennardHofmann, hoo, Liuxinyu970226, Aklapper, Danmichaelo, jeblad, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, 
Nandana, lucamauri, Lahi, Gq86, GoranSMilovanovic, QZanden, LawExplorer, 
_jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331
_______________________________________________
Wikidata-bugs mailing list -- [email protected]
To unsubscribe send an email to [email protected]

Reply via email to