daniel added a comment.

Revisited after a first exploratory coding session showed the proposed solution to be problematic. An ad-hoc discussion with Thiemo and Jan resulted in going back to the one-aspect-per-language solution. Key points:
- The intent of tracking usage by aspect is to reduce the number of pages to purge when a change notification for an entity is received. Note that purging a page purges all renderings/variants in the cache.
- Adding a render_key column greatly increases the size of the table.
  - The number of aspects (per item/page combination) is //multiplied// by the number of render keys.
  - Example: let's say 200,000 image description pages on Commons use Q183 as a "tag", and use the label and the local page title (L and T aspects), resulting in 400k rows in the database. If on average each page is viewed in 2 languages, this would result in 800k rows; not only would the rows for the L usage be doubled, but the rows for the T usage too, even though that kind of usage does not care about language.
- Adding a render_key does not provide any substantial advantage over using one aspect per language.
  - The expected advantage was to cover cases in which some conditional on the page would result in different items and aspects being used when rendering the page for different users.
  - However, this is only possible (and sensible) if the conditional depends on a feature that also causes a parser cache split.
    - Besides the user language, that could be things like the page being editable, the thumbnail size, the numbering of headings, the date format, etc.
    - Besides the user language, these settings are mostly inaccessible to conditionals in wikitext/Lua. And where they are accessible, they are very unlikely to be used.
  - When receiving a change notification, the associated diff is used to determine which aspect of the entity changed, and thus which usages are affected by the change.
    - From the diff, the features available for this decision are the "section" (terms, sitelinks, statements, etc.), the language (for labels, descriptions and aliases), and the site id (for sitelinks).
    - Only the features available from the diff can be used to determine the affected aspects.
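To make the two arguments above concrete, here is a rough Python sketch. It is not Wikibase code: all function names and the tuple shapes are hypothetical, the "L/de" key format just follows the notation used in this comment, and mapping sitelink changes to the T aspect is an assumption made for illustration. The first two functions redo the row-count arithmetic from the example; the last two show that only features present in the diff can narrow down the pages to purge.

```python
# Hypothetical sketch; names and data shapes are illustrative only.

def rows_with_render_key(pages, aspects, render_keys_per_page):
    """render_key column: every aspect row is repeated once per render key."""
    return pages * len(aspects) * render_keys_per_page


def rows_with_language_aspects(pages, aspects, languages_per_page,
                               language_dependent=("L",)):
    """'L/de'-style aspects: only language-dependent aspects are repeated."""
    per_page = sum(
        languages_per_page if aspect in language_dependent else 1
        for aspect in aspects
    )
    return pages * per_page


def affected_aspects(diff_features):
    """Map (section, qualifier) features taken from a diff to aspect keys.

    Anything the diff does not expose (thumbnail size, date format, ...)
    cannot appear here, so it cannot be used for filtering.
    """
    aspects = set()
    for section, qualifier in diff_features:
        if section == "label":
            aspects.add(f"L/{qualifier}")  # qualifier is the language code
        elif section == "sitelink":
            aspects.add("T")  # assumption: sitelink changes affect title usage
    return aspects


def pages_to_purge(usages, changed_aspects):
    """usages is an iterable of (page, aspect) tracking rows."""
    return {page for page, aspect in usages if aspect in changed_aspects}


# The Commons example: 200,000 pages, L and T aspects, 2 languages per page.
render_key_rows = rows_with_render_key(200_000, ["L", "T"], 2)
language_aspect_rows = rows_with_language_aspects(200_000, ["L", "T"], 2)
```

Under these assumptions the render_key variant yields the 800k rows mentioned above, while per-language aspects yield 600k, because only the L rows are repeated per language.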
So if we tracked different usages per page depending on, say, the user's thumbnail size, this information would not help to limit the number of pages to purge, since the diff contains no feature against which the thumbnail size in the render_key could be filtered.

Caveat affecting both options (render_key column, or "L/de"-style aspects): updating the table is difficult.
- When the page is //edited//, all tracking rows referring to it (with any render_key / language) should be removed/invalidated.
- When a page is rendered, only rows referring to the current render_key/language should be added/updated/removed.
- It's unclear whether there is any guarantee about the order in which hooks fire when a page is edited.

TASK DETAIL
  https://phabricator.wikimedia.org/T90563

To: daniel
Cc: daniel, Aklapper, Rical, hoo, Lydia_Pintscher, Daniel_Mietchen, Wikidata-bugs, aude

_______________________________________________
Wikidata-bugs mailing list
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
