Lucas_Werkmeister_WMDE added a comment.
I tinkered a bit more with this in production (`mwscript shell eowiki` on
mwdebug1001). Specifically, I was curious what the entity usages for
eowiki:Perseo looked like before they were deduplicated / combined, so I hacked
together some code to render the revision from scratch.

  $s = mws()
  $rr = $s->getRevisionRenderer()
  $rl = $s->getRevisionLookup()
  $rev = $rl->getRevisionByPageId( 115136 )
  $rendered = $rr->getRenderedRevision( $rev )
  $po = $rendered->getSlotParserOutput( 'main' )
  $usages = $po->getExtensionData( 'wikibase-entity-usage' )

I then fed them into the `UsageDeduplicator` (taking it from the
`UsageAccumulatorFactory`’s private field out of laziness – `sudo` is a special
PsySH command that bypasses access checks):

  $uaf = wbc::getUsageAccumulatorFactory()
  sudo $ud = $uaf->usageDeduplicator
  $euf = new Wikibase\Client\Usage\EntityUsageFactory( wbc::getEntityIdParser() )
  $usageObjects = array_map( fn ( $str ) => $euf->newFromIdentity( $str ), array_keys( $usages ) )

It turns out the original usages have //52// different statement usages for
Q130832 (the connected item). This is well above the configured
`$wgWBClientSettings['entityUsageModifierLimits']['C']` in production (33), so
they correctly get collapsed into a single “C” usage:

  > count( $usageObjects )
  = 93
  > count( $ud->deduplicate( $usageObjects ) )
  = 42

So at this point, it makes sense that the individual `C.P%` usages are
removed and only the `C` usage is kept; the drop from 93 to 42 is consistent
with the 52 statement usages being replaced by a single one (93 − 52 + 1 = 42).
But then at some later point, presumably in
`DataUpdateHookHandler::onParserCacheSaveComplete()` / `AddUsagesForPageJob`,
we somehow have //fewer// than 33 statement usages left, so they get (re)added
individually.
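
To make the collapsing step concrete, here is a minimal standalone sketch. It is not Wikibase's actual `UsageDeduplicator`: the function name `collapseUsages()`, the identity string format `<entityId>#<aspect>[.<modifier>]`, the strict `>` comparison against the limit, and the 41 filler usages are assumptions for illustration. Only the numbers 52, 33, 93 and 42 come from the session above.

```php
<?php
// Hypothetical, simplified stand-in for the deduplication step (not the real class).
// Assumes identity strings look like "<entityId>#<aspect>" or "<entityId>#<aspect>.<modifier>".
function collapseUsages( array $identities, array $limits ): array {
    // Group the modifiers by entity ID and aspect.
    $grouped = [];
    foreach ( $identities as $identity ) {
        [ $entityId, $aspectKey ] = explode( '#', $identity, 2 );
        $parts = explode( '.', $aspectKey, 2 );
        $grouped[$entityId][$parts[0]][] = $parts[1] ?? null;
    }

    $result = [];
    foreach ( $grouped as $entityId => $aspects ) {
        foreach ( $aspects as $aspect => $modifiers ) {
            $modifiers = array_unique( $modifiers );
            $limit = $limits[$aspect] ?? PHP_INT_MAX;
            // An unmodified usage, or too many modified ones, leaves one general usage.
            if ( in_array( null, $modifiers, true ) || count( $modifiers ) > $limit ) {
                $result[] = "$entityId#$aspect";
                continue;
            }
            foreach ( $modifiers as $modifier ) {
                $result[] = "$entityId#$aspect.$modifier";
            }
        }
    }
    return $result;
}

// Made-up input mirroring the counts above: 52 statement usages plus 41 other usages.
$identities = array_merge(
    array_map( fn ( $n ) => "Q130832#C.P$n", range( 1, 52 ) ),
    array_map( fn ( $n ) => "Q$n#T", range( 1, 41 ) )
);
$collapsed = collapseUsages( $identities, [ 'C' => 33 ] );
echo count( $identities ), ' -> ', count( $collapsed ), "\n"; // 93 -> 42
```

With a limit of 33, the 52 `C.P%` entries become a single `Q130832#C` entry, and the other 41 usages pass through unchanged.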
I think we have two possible ways to continue from here:
- Understand where this second, shorter list of usages comes from.
- Fix `DataUpdateHookHandler::onParserCacheSaveComplete()` to not re-add
  `C.P%` usages if a `C` usage already exists (and likewise for other aspects),
  i.e. replace the current
  `$newUsages = array_diff_key( $usages, $currentUsages )`
  with something a bit smarter.
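
One possible shape for the “smarter” diff, as a rough sketch only: the helper name `usagesToAdd()` is hypothetical, and it assumes both arrays are keyed by identity strings of the form `<entityId>#<aspect>[.<modifier>]`, which I have not checked against the actual hook handler.

```php
<?php
// Hypothetical helper, not existing Wikibase code.
// Assumes array keys are well-formed identity strings "<entityId>#<aspect>[.<modifier>]".
function usagesToAdd( array $usages, array $currentUsages ): array {
    // Same starting point as the current code: usages that are not stored yet.
    $newUsages = array_diff_key( $usages, $currentUsages );

    return array_filter(
        $newUsages,
        static function ( string $identity ) use ( $currentUsages ): bool {
            [ $entityId, $aspectKey ] = explode( '#', $identity, 2 );
            $parts = explode( '.', $aspectKey, 2 );
            if ( count( $parts ) === 1 ) {
                // Unmodified usage (e.g. "Q1#C"): nothing more general could cover it.
                return true;
            }
            // Skip a modified usage (e.g. "Q1#C.P31") if "Q1#C" is already stored.
            return !array_key_exists( "$entityId#{$parts[0]}", $currentUsages );
        },
        ARRAY_FILTER_USE_KEY
    );
}

// Made-up example data.
$currentUsages = [ 'Q130832#C' => true, 'Q130832#T' => true ];
$usages = [
    'Q130832#C.P31' => true,
    'Q130832#C.P18' => true,
    'Q130832#S' => true,
    'Q42#C.P31' => true,
];
print_r( array_keys( usagesToAdd( $usages, $currentUsages ) ) );
// Keeps 'Q130832#S' and 'Q42#C.P31'; both 'Q130832#C.P%' entries are skipped.
```

This only covers the “don’t re-add” half. Whether existing `C.P%` rows should also be removed when a new `C` usage arrives is a separate question.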
I think the second option is something we should do sooner or later, but
arguably it’s not the root cause of the problem, so maybe we should look at the
first option first.
TASK DETAIL
https://phabricator.wikimedia.org/T255706