dcausse created this task.
dcausse added a project: Wikidata-Query-Service.
Restricted Application added a subscriber: Aklapper.
Restricted Application added a project: Wikidata.

TASK DESCRIPTION
  As a maintainer of WDQS I want the streaming updater to be able to reconcile 
a wikibase item so that I can fix some inconsistencies without reloading the 
full database.
  
  This can be achieved by introducing a new topic that the streaming updater 
would consume and that would contain two types of messages:
  
  - reconcile a specific item revision
  - reconcile a deleted item
  
  This can be used to reconcile missed events (MW bugs, missing events, late 
events); the third mode will be used on fetch failures.
  When a delete is required, the existing code will be used.
  When the item exists, the mutation message will contain all the entity data 
and the consumer will work like the old updater, performing a full 
reconciliation.
  
  Automatic reconciliation (probably via a batch running from the analytics 
cluster) should be possible by reading the side-outputs:
  
  - late events 
<https://schema.wikimedia.org/repositories/secondary/jsonschema/rdf_streaming_updater/lapsed_action/latest.yaml>
  - failed events 
<https://schema.wikimedia.org/repositories/secondary/jsonschema/rdf_streaming_updater/fetch_failure/latest.yaml>
  
  Ad-hoc reconciliation should be possible via a script (or possibly from 
wikibase itself if this is deemed necessary).
  
  The schema of this new topic should be as follows:
  
  - meta: typical event metadata
  - item: string the wikibase item to update
  - revision: long the revision with
  - type: enum: create or delete
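
  To make the four fields above concrete, here is a minimal Python sketch of 
such a message. It is only an illustration: the names `ReconcileEvent`, 
`ReconcileType` and `parse_reconcile_event` are hypothetical, the content of 
`meta` is left as an opaque mapping because the task does not detail it, and 
the actual schema would be defined in YAML in the schema repository.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Dict


class ReconcileType(Enum):
    """The two message types listed in the task."""
    CREATE = "create"
    DELETE = "delete"


@dataclass(frozen=True)
class ReconcileEvent:
    """Hypothetical in-memory form of a message on the new topic."""
    meta: Dict[str, Any]   # typical event metadata (content assumed, not specified here)
    item: str              # the wikibase item to update, e.g. "Q42"
    revision: int          # the revision carried by the message
    type: ReconcileType    # create or delete

    def to_dict(self) -> Dict[str, Any]:
        """Serialize back to a plain mapping using the schema field names."""
        return {
            "meta": dict(self.meta),
            "item": self.item,
            "revision": self.revision,
            "type": self.type.value,
        }


def parse_reconcile_event(raw: Dict[str, Any]) -> ReconcileEvent:
    """Validate a decoded message and build a ReconcileEvent.

    Raises ValueError when a field is missing or `type` is not create/delete.
    """
    missing = [k for k in ("meta", "item", "revision", "type") if k not in raw]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))
    return ReconcileEvent(
        meta=dict(raw["meta"]),
        item=str(raw["item"]),
        revision=int(raw["revision"]),
        type=ReconcileType(raw["type"]),  # ValueError on an unknown type
    )
```

  An ad-hoc script could build such a mapping for a single item and publish it 
to the new topic; how it is published is outside the scope of this sketch.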
  
  The decide mutation operation should be changed to support a new operation:
  
  - if the revision in the message is older than the one seen in the state then 
an operation corresponding to the state is emitted:
    - `reconcile` if the state is `CREATED`, using the revision seen and 
fetching the data from this revision
    - `delete` if the state is `DELETED`
  - if the revision in the message is newer than the one seen in the state (or 
never seen) then an operation corresponding to the message is emitted:
    - `reconcile` if the message has a type `create` using the revision from 
the message
    - `delete` if the message has a type `delete`
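
  The rules above can be sketched in Python as follows. This is a hedged 
illustration, not the actual operator code: `ItemState`, `ReconcileMessage`, 
`Operation` and `decide_reconcile` are hypothetical names. Two points are 
assumptions because the task does not cover them: a message whose revision 
equals the one in the state is handled like an older one (the state wins), and 
a `delete` operation carries the revision of whichever side decided it.

```python
from typing import NamedTuple, Optional


class ItemState(NamedTuple):
    """What the updater last saw for an item (hypothetical shape)."""
    status: str      # "CREATED" or "DELETED"
    revision: int    # last revision seen


class ReconcileMessage(NamedTuple):
    """The relevant fields of a message from the new topic."""
    item: str
    revision: int
    type: str        # "create" or "delete"


class Operation(NamedTuple):
    """The mutation operation emitted downstream."""
    kind: str        # "reconcile" or "delete"
    item: str
    revision: int


def decide_reconcile(state: Optional[ItemState], msg: ReconcileMessage) -> Operation:
    """Pick the operation to emit for a reconcile message.

    state is None when the item was never seen.
    """
    if state is None or msg.revision > state.revision:
        # Message is newer (or the item was never seen): follow the message.
        if msg.type == "create":
            return Operation("reconcile", msg.item, msg.revision)
        if msg.type == "delete":
            return Operation("delete", msg.item, msg.revision)
        raise ValueError("unknown message type: " + msg.type)

    # Message is older than the state: follow the state.
    # ASSUMPTION: an equal revision also lands here; the task does not say.
    if state.status == "CREATED":
        # Reconcile using the revision seen in the state; the data is then
        # fetched from that revision.
        return Operation("reconcile", msg.item, state.revision)
    if state.status == "DELETED":
        return Operation("delete", msg.item, state.revision)
    raise ValueError("unknown state: " + state.status)
```

  For example, a `delete` message at revision 10 for an item whose state is 
`CREATED` at revision 20 yields a `reconcile` at revision 20, since the state 
is more recent than the message.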
  
  AC:
  
  - a new type of operation `reconcile` is added to MutationEventData
  - streaming-updater-producer operators are adapted to support this new message
  - a new schema is added to 
https://schema.wikimedia.org/repositories/secondary/jsonschema/rdf_streaming_updater
  - the streaming-updater-consumer supports the `reconcile` operation

TASK DETAIL
  https://phabricator.wikimedia.org/T279541

To: dcausse
Cc: dcausse, Aklapper, Invadibot, MPhamWMF, maantietaja, CBogen, Akuckartz, 
Nandana, Namenlos314, Lahi, Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, 
QZanden, EBjune, merbst, LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, 
Xmlizer, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331