Hi Elisavet,

You can identify reverts using the SHA-1 checksums of revisions: a revision
whose checksum matches that of an earlier revision of the same page restores
that earlier state. You can use the mwreverts library[0] to do this on the
dump. The editquality[1] repository has such a use case for detecting
reverts. You will not be able to detect partial reverts this way, but it
will detect identity reverts, which form the majority of reverts.
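
To make the idea concrete, here is a minimal sketch of checksum-based
identity revert detection in plain Python. It is not the mwreverts
implementation; the function name, the (rev_id, sha1) input shape and the
default radius of 15 are my own assumptions for illustration. It expects the
revisions of a single page in chronological order, with the checksum taken
from the sha1 element of each revision in the dump.

```python
from collections import deque


def detect_identity_reverts(revisions, radius=15):
    """Yield (reverting_id, reverted_ids, reverted_to_id) tuples.

    revisions: iterable of (rev_id, sha1) pairs for ONE page, oldest first.
    radius: assumed maximum number of revisions a single revert may undo.
    """
    # Keep the last radius + 1 revisions: the restored one plus up to
    # `radius` revisions that were undone in between.
    window = deque(maxlen=radius + 1)
    for rev_id, sha1 in revisions:
        recent = list(window)
        # Look for the most recent earlier revision with the same checksum.
        for i in range(len(recent) - 1, -1, -1):
            if recent[i][1] == sha1:
                reverted = [r for r, _ in recent[i + 1:]]
                # No revisions in between means a null edit, not a revert.
                if reverted:
                    yield rev_id, reverted, recent[i][0]
                break
        window.append((rev_id, sha1))


# Hypothetical history: revision 3 restores the content of revision 1.
history = [(1, "aaa"), (2, "bbb"), (3, "aaa"), (4, "ccc")]
print(list(detect_identity_reverts(history)))  # [(3, [2], 1)]
```

A partial revert produces new content with a new checksum, so a comparison
like this cannot see it.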

Regards,
Sumit Asthana

[0] - https://pythonhosted.org/mwreverts/
[1] - https://github.com/wikimedia/editquality/blob/master/editquality/utilities/extract_damaging.py#L160


On Fri, Sep 11, 2020 at 2:55 AM Elisavet Koutsiana <
[email protected]> wrote:

> Hello,
>
> I wanted to ask if there is any canonical way to identify deletion,
> reverts etc in the edit history xml files. I can understand that the action
> of every revision is described in the "comment" element of the xml format,
> but is there a code name or number or anything else that will help me to
> identify one revision for example as deletion?
>
> Thank you,
> Elisavet
> _______________________________________________
> Wikidata mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>