dcausse added a comment.
After a test run it seems that we are able to backfill. Unfortunately, we
skip a non-negligible number of revisions:
+----+---+---+---+-------------------+-------+---------------+-----+
|y   |m  |d  |h  |inconsistency      |status |event_type     |count|
+----+---+---+---+-------------------+-------+---------------+-----+
|2020|11 |18 |15 |newer_revision_seen|CREATED|revision-create|190  |
|2020|11 |18 |16 |newer_revision_seen|CREATED|revision-create|85   |
|2020|11 |18 |17 |newer_revision_seen|CREATED|revision-create|406  |
|2020|11 |18 |18 |newer_revision_seen|CREATED|revision-create|584  |
|2020|11 |18 |19 |newer_revision_seen|CREATED|revision-create|333  |
|2020|11 |18 |20 |newer_revision_seen|CREATED|revision-create|361  |
|2020|11 |18 |21 |newer_revision_seen|CREATED|revision-create|86   |
|2020|11 |18 |22 |newer_revision_seen|CREATED|revision-create|278  |
|2020|11 |18 |23 |newer_revision_seen|CREATED|revision-create|63   |
|2020|11 |19 |0  |newer_revision_seen|CREATED|revision-create|110  |
|2020|11 |19 |1  |newer_revision_seen|CREATED|revision-create|42   |
|2020|11 |19 |2  |newer_revision_seen|CREATED|revision-create|48   |
|2020|11 |19 |3  |newer_revision_seen|CREATED|revision-create|18   |
|2020|11 |19 |4  |newer_revision_seen|CREATED|revision-create|13   |
|2020|11 |19 |5  |newer_revision_seen|CREATED|revision-create|94   |
|2020|11 |19 |6  |newer_revision_seen|CREATED|revision-create|27   |
|2020|11 |19 |7  |newer_revision_seen|CREATED|revision-create|148  |
|2020|11 |19 |8  |newer_revision_seen|CREATED|revision-create|34   |
+----+---+---+---+-------------------+-------+---------------+-----+
These are revision-create events that we do receive, but only after we have
already received a newer revision for the same entity (the few cases I
checked manually were out of order by 1 to 30 seconds).
For the wdqs use case I think it's not a big deal to skip a few revisions
(diffing with rev < N-1), but this might be confusing for other use cases we
don't yet have.
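
To make the inconsistency concrete, here is a minimal sketch of how an event could end up tagged newer_revision_seen. It assumes events are reduced to (entity_id, rev_id) pairs and that the latest revision is tracked per entity. The function name and the "ok" tag are hypothetical, and this is not the updater's actual code.

```python
from typing import Dict, Iterable, List, Tuple


def flag_out_of_order(events: Iterable[Tuple[str, int]]) -> List[Tuple[str, int, str]]:
    """Tag each (entity_id, rev_id) event in arrival order.

    Hypothetical helper: an event is tagged 'newer_revision_seen' when a
    higher rev_id was already seen for the same entity, 'ok' otherwise.
    Exact duplicates are not treated specially in this sketch.
    """
    latest: Dict[str, int] = {}  # entity -> highest rev_id seen so far
    tagged = []
    for entity_id, rev_id in events:
        seen = latest.get(entity_id)
        if seen is not None and rev_id < seen:
            tagged.append((entity_id, rev_id, "newer_revision_seen"))
        else:
            latest[entity_id] = rev_id
            tagged.append((entity_id, rev_id, "ok"))
    return tagged
```

With this logic, revision 11 arriving after revision 12 of the same entity is flagged and would be skipped, which is the situation counted in the table above.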
I'm a bit undecided here. One solution could be to delay revisions where we
know that there's likely one in between (current != rev_parent_id), but this
moves the buffering logic to the operator that holds the entity->rev state.
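
The delay idea could look roughly like the sketch below. It is only an illustration under stated assumptions: the class and field names (RevEvent, DelayingOperator, max_delay) are hypothetical, the 60-second timeout is an arbitrary value chosen to cover the 1 to 30 second disorder observed, and time is passed in explicitly instead of using the stream processor's timers.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class RevEvent:
    entity_id: str
    rev_id: int
    rev_parent_id: Optional[int]


@dataclass
class DelayingOperator:
    """Hypothetical operator holding the entity->rev state plus a small buffer."""
    max_delay: float = 60.0  # assumed timeout in seconds, not a measured value
    current: Dict[str, int] = field(default_factory=dict)
    pending: Dict[str, List[Tuple[int, float, RevEvent]]] = field(default_factory=dict)

    def process(self, ev: RevEvent, now: float) -> List[RevEvent]:
        cur = self.current.get(ev.entity_id)
        if cur is not None and ev.rev_id <= cur:
            return []  # arrived too late: a newer revision was already emitted
        queue = self.pending.setdefault(ev.entity_id, [])
        queue.append((ev.rev_id, now, ev))
        queue.sort(key=lambda item: item[0])  # lowest rev_id first
        return self._drain(ev.entity_id, now)

    def flush(self, now: float) -> List[RevEvent]:
        """Release buffered events whose wait exceeded max_delay."""
        out: List[RevEvent] = []
        for entity_id in list(self.pending):
            out.extend(self._drain(entity_id, now))
        return out

    def _drain(self, entity_id: str, now: float) -> List[RevEvent]:
        queue = self.pending.get(entity_id, [])
        emitted = []
        while queue:
            _, arrived, head = queue[0]
            cur = self.current.get(entity_id)
            # Emit when nothing is missing in between (current == rev_parent_id),
            # when the entity is unknown, or when the head waited long enough.
            in_order = cur is None or head.rev_parent_id == cur
            timed_out = now - arrived >= self.max_delay
            if not (in_order or timed_out):
                break
            queue.pop(0)
            self.current[entity_id] = head.rev_id
            emitted.append(head)
        return emitted
```

In this sketch a revision whose parent is not the current one is held until its parent shows up or the timeout expires, so the cost is extra state and latency in the operator in exchange for fewer skipped revisions.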
TASK DETAIL
https://phabricator.wikimedia.org/T267029