Hi Elisavet,

You can identify reverts by comparing the SHA-1 checksums of revisions: when a revision has the same checksum as an earlier one, the page was restored to that earlier state. The mwreverts library [0] does this on the dumps, and the editquality repository [1] has an example of using it to detect reverts. This approach will not detect partial reverts, but it does detect identity reverts, which make up the majority of reverts.
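
To make the idea concrete, here is a minimal sketch of identity-revert detection in plain Python. It is not the mwreverts implementation, only an illustration of the checksum comparison. The names `sha1_of` and `detect_identity_reverts`, the `(rev_id, checksum)` input format and the default `radius` of 15 are assumptions made for this example. With a real dump you would pass the checksum stored with each revision instead of hashing the text yourself, since the comparison only needs the values to be equal or not.

```python
from collections import deque
import hashlib


def sha1_of(text):
    """Hex SHA-1 of a revision's text (stand-in for the checksum in the dump)."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def detect_identity_reverts(revisions, radius=15):
    """Find identity reverts in one page's history.

    revisions: iterable of (rev_id, checksum) pairs in chronological order.
    radius: maximum number of revisions a single revert may undo.
    Returns a list of (reverting_id, reverted_ids, reverted_to_id) tuples.
    """
    # Keep the last radius + 1 revisions: one possible "reverted to"
    # revision plus up to `radius` revisions that it could have undone.
    window = deque(maxlen=radius + 1)
    reverts = []
    for rev_id, checksum in revisions:
        recent = list(window)
        # Look backwards for the most recent revision with the same checksum.
        for i in range(len(recent) - 1, -1, -1):
            if recent[i][1] == checksum:
                reverted_ids = [rid for rid, _ in recent[i + 1:]]
                # Identical neighbours (e.g. null edits) undo nothing.
                if reverted_ids:
                    reverts.append((rev_id, reverted_ids, recent[i][0]))
                break
        window.append((rev_id, checksum))
    return reverts


# Toy history: revision 3 restores the text of revision 1, undoing revision 2.
history = [
    (1, sha1_of("good text")),
    (2, sha1_of("vandalism")),
    (3, sha1_of("good text")),
]
print(detect_identity_reverts(history))  # [(3, [2], 1)]
```

A revision that restores only part of an earlier state has a different checksum from every earlier revision, which is why partial reverts go unnoticed by this method.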

Regards,
Sumit Asthana

[0] https://pythonhosted.org/mwreverts/
[1] https://github.com/wikimedia/editquality/blob/master/editquality/utilities/extract_damaging.py#L160

On Fri, Sep 11, 2020 at 2:55 AM Elisavet Koutsiana <[email protected]> wrote:
> Hello,
>
> I wanted to ask if there is any canonical way to identify deletion,
> reverts etc in the edit history xml files. I can understand that the action
> of every revision is described in the "comment" element of the xml format,
> but is there a code name or number or anything else that will help me to
> identify one revision for example as deletion?
>
> Thank you,
> Elisavet
> _______________________________________________
> Wikidata mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/wikidata
