dcausse created this task.
dcausse added a project: Wikidata-Query-Service.
Restricted Application added a subscriber: Aklapper.
Restricted Application added a project: Wikidata.
TASK DESCRIPTION
When an item is deleted, the streaming updater should produce a message
instructing the consumer to delete the item from the graph.
Classic page deletions are made by admins and propagated through the
`mediawiki.page-delete` stream.
Example message:
{
  "$schema": "/mediawiki/page/delete/1.0.0",
  "meta": {
    "uri": "https://test.wikidata.org/wiki/Q212433",
    "request_id": "59f87c41-7680-4f8a-bf6e-7dac91530972",
    "id": "00fcac35-5357-4c99-ba9f-e720db9f0197",
    "dt": "2020-07-01T13:16:25Z",
    "domain": "test.wikidata.org",
    "stream": "mediawiki.page-delete"
  },
  "database": "testwikidatawiki",
  "performer": {
    "user_text": "DCausse (WMF)",
    "user_groups": [
      "bureaucrat",
      "sysop",
      "*",
      "user"
    ],
    "user_is_bot": false,
    "user_id": 2490,
    "user_registration_dt": "2017-09-28T06:49:13Z",
    "user_edit_count": 7
  },
  "page_id": 302928,
  "page_title": "Q212433",
  "page_namespace": 0,
  "page_is_redirect": false,
  "rev_id": 529859,
  "rev_count": 1,
  "comment": "content was: \"Test dcausse v2\", and the only contributor was \"[[Special:Contributions/DCausse (WMF)|DCausse (WMF)]]\" ([[User talk:DCausse (WMF)|talk]])",
  "parsedcomment": "content was: \"Test dcausse v2\", and the only contributor was \"<a href=\"/wiki/Special:Contributions/DCausse_(WMF)\" title=\"Special:Contributions/DCausse (WMF)\">DCausse (WMF)</a>\" (<a href=\"/w/index.php?title=User_talk:DCausse_(WMF)&action=edit&redlink=1\" class=\"new\" title=\"User talk:DCausse (WMF) (page does not exist)\">talk</a>)"
}
This task involves:
On the shared model:
- add a new operation type "delete" to
`org.wikidata.query.rdf.tool.stream.MutationEventData`
- add tests to
org.wikidata.query.rdf.tool.stream.MutationEventDataJsonSerializationUnitTest
to make sure that it's serialized properly
On the flink pipeline:
- add a new case class `PageDelete` in the `InputEvent` ADT
- add a new case class `DeleteItem` in the `MutationOperation` ADT
- add a new stream to consume from (kafka topic `mediawiki.page-delete`) and
produce `PageDelete` to downstream operators
- add a new case in `DecideMutationOperation`:
  - produce a `DeleteItem` operation if the map contains a revision of the
    item, and delete the item from the map
  - produce an `IgnoredMutation` otherwise
- add a new case in
  org.wikidata.query.rdf.updater.GenerateEntityDiffPatchOperation to support
  the `DeleteItem` operation and simply produce an EntityPatchOp with a
  `MutationEventData` that has the type "delete".
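
The `DecideMutationOperation` rule described above could be sketched roughly as follows. This is only an illustration in plain Java, not the Flink operator itself: the class `DeleteDecisionSketch`, its `Outcome` enum and the in-memory map standing in for the operator state are all hypothetical names introduced here.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the decision step; names and types are illustrative only.
class DeleteDecisionSketch {
    // Possible outcomes for a page-delete event.
    enum Outcome { DELETE_ITEM, IGNORED_MUTATION }

    // Last revision seen per item, standing in for the operator's state map.
    private final Map<String, Long> lastSeenRevision = new HashMap<>();

    // Record that a revision of the item was processed, keeping the highest one.
    void onRevision(String item, long revision) {
        lastSeenRevision.merge(item, revision, Long::max);
    }

    // Decide what to emit for a page-delete event on the given item.
    Outcome onPageDelete(String item) {
        // remove() returns the previous value, or null when the item was unknown,
        // so the lookup and the removal from the map happen in one step.
        Long known = lastSeenRevision.remove(item);
        return known != null ? Outcome.DELETE_ITEM : Outcome.IGNORED_MUTATION;
    }
}
```

With this sketch, a delete for an item whose revision was seen yields `DELETE_ITEM`, and a second delete for the same item yields `IGNORED_MUTATION` because the entry has already been dropped from the map.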
On the pipeline consumer:
- Refactor RDFPatch so that it has two modes: applying a diff and deleting an
  item
- Refactor org.wikidata.query.rdf.tool.stream.KafkaStreamConsumer so that it
  accumulates item deletions
- Adapt org.wikidata.query.rdf.tool.rdf.RdfRepositoryUpdater#applyPatch to
  support item deletions
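
One possible shape for the consumer-side accumulation is sketched below. It assumes a batch that collects diff statements and deleted entity ids side by side. `PatchBatchSketch` and its methods are hypothetical and do not reflect the actual `RDFPatch` or `KafkaStreamConsumer` API. Statements are represented as plain strings to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical batch accumulator; not the actual KafkaStreamConsumer or RDFPatch API.
class PatchBatchSketch {
    private final List<String> added = new ArrayList<>();
    private final List<String> removed = new ArrayList<>();
    // A set, so that repeated deletes of the same entity are recorded once.
    private final Set<String> deletedEntities = new LinkedHashSet<>();

    // Mode 1: a diff event contributes statements to add and to remove.
    void accumulateDiff(List<String> toAdd, List<String> toRemove) {
        added.addAll(toAdd);
        removed.addAll(toRemove);
    }

    // Mode 2: a delete event only records the id of the entity to delete.
    void accumulateDelete(String entityId) {
        deletedEntities.add(entityId);
    }

    List<String> added() { return added; }
    List<String> removed() { return removed; }
    Set<String> deletedEntities() { return deletedEntities; }
}
```

How deletions interact with diffs for the same entity inside one batch is left open here; the sketch only shows the two kinds of event being accumulated separately.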
AC:
When deleting an item from wikibase:
- an event should be present in the streaming updater output indicating that
this item needs to be deleted
- the data should disappear from the query service when using the streaming
updater
size: XL
TASK DETAIL
https://phabricator.wikimedia.org/T256875