LennardHofmann added subscribers: hoo, LennardHofmann. LennardHofmann added a comment.
Looking at the source code of mw.wikibase.lua <https://phabricator.wikimedia.org/diffusion/EWBA/browse/master/client/includes/DataAccess/Scribunto/mw.wikibase.lua$113>, we see: -- Use a deep clone here, so that people can't modify the cached entity return entityModule.create( mw.clone( entity ) ) Using `git log -S'clone here'` I was able to find a commit by @hoo that added the cloning and linked to T76946 <https://phabricator.wikimedia.org/T76946>, which tells us that the motivation behind the change was to make the cached tables immutable so that `#invoke` cannot have side effects that influence the next `#invoke`. I guess hoo decided not to make the cached tables read-only using metatables because `rawset` can bypass the protection. I remembered that `mw.loadData` uses read-only tables so I wondered how it deals with `rawset`. I found rELUAd9c7d5f971f1 <https://phabricator.wikimedia.org/rELUAd9c7d5f971f18be3c17c512a02249cd1254b48d4>, a patch that makes it possible to modify the table returned by `mw.loadData` without using `rawset` or introducing cross-invoke leakage. Below is a proof-of-concept based on the patch. Please don't ask me how `dataWrapper` works. -- This creates an read-only dummy table for accessing the real data. -- -- @param data table Data to access -- @param seen table|nil Table of already-seen tables. -- @return table local function dataWrapper( data, seen ) local t, dummyValue, changes = {}, {}, nil seen = seen or { [data] = t } local function pairsfunc( s, k ) k = next( changes or data, k ) if k ~= nil then return k, t[k] end return nil end local function ipairsfunc( s, i ) i = i + 1 if t[i] ~= nil then return i, t[i] end -- no nil to match default ipairs() end local mt = { __index = function ( tt, k ) assert( t == tt ) if changes and changes[k] ~= dummyValue then return changes[k] end local v = data[k] if type( v ) == 'table' then seen[v] = seen[v] or dataWrapper( v, seen ) return seen[v] end return v end, __newindex = function ( tt, k, v ) assert( t == tt ) if not changes then changes = {} for k in pairs( data ) do changes[k] = dummyValue end end changes[k] = v end, __pairs = function ( tt ) assert( t == tt ) return pairsfunc, t, nil end, __ipairs = function ( tt ) assert( t == tt ) return ipairsfunc, t, 0 end, } -- This is just to make setmetatable() fail mt.__metatable = mt return setmetatable( t, mt ) end local cached = { P31 = 'Q5' } local function getEntity() return dataWrapper( cached ) end -- This demonstrates that cross-invoke communication is impossible local e1 = getEntity() local e2 = getEntity() e1.P31 = 'muhaha' local e3 = getEntity() print( e1.P31, e2.P31, e3.P31 ) --> muhaha Q5 Q5 This comes with the same limitations `mw.loadData` has: the length operator `#`, `next()`, and the functions in the table library do not work. But we can index the table and iterate over it, which is all that really matters. IMHO, being able to modify the entity table is unnecessary so using `mw.loadData`'s implementation instead of the above patch would also suffice. On Commons category pages that use the Wikidata Infobox <https://commons.wikimedia.org/wiki/Template:Wikidata_Infobox>, the Lua profiler often reports that `recursiveClone` takes up most of the time—try it yourself here <https://commons.wikimedia.org/w/index.php?title=Category:Amrita_Lal_Basu&action=edit>. So getting rid of `mw.clone` would be a huge win for performance, especially since `getBestStatements` and `getAllStatements` are also effected by this. TASK DETAIL https://phabricator.wikimedia.org/T203357 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: LennardHofmann Cc: LennardHofmann, hoo, Liuxinyu970226, Aklapper, Danmichaelo, jeblad, Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, Nandana, lucamauri, Lahi, Gq86, GoranSMilovanovic, QZanden, LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331
_______________________________________________ Wikidata-bugs mailing list -- [email protected] To unsubscribe send an email to [email protected]
