Hi Maarten,

Thank you very much for your answer and your pointers. The page (which I did not know existed) containing a federated SPARQL query is definitely close to what I mean. It just misses one more step: deciding who is right. If we look at the first result of the table of mismatches <https://www.wikidata.org/wiki/Property_talk:P1006/Mismatches> (Dmitry Bortniansky <https://www.wikidata.org/wiki/Q316505>) and we draw a little graph, the result is:
[image: Diagram.png]

We can see that the error (probably) comes from VIAF, which contains a duplicate, and from NTA, which obviously created an authority record based on this bad VIAF ID.

My research is very close to this kind of case, and I am very interested to know what is already implemented in Wikidata.

Cheers,

Ettore Rizza

On Sat, 29 Sep 2018 at 13:03, Maarten Dammers <[email protected]> wrote:

> Hi Ettore,
>
> On 26-09-18 14:31, Ettore RIZZA wrote:
> > Dear all,
> >
> > I'm looking for Wikidata bots that perform accuracy audits. For
> > example, comparing the birth dates of persons with the same date
> > indicated in databases linked to the item by an external-id.
>
> Let's have a look at the evolution of automated editing. The first step
> is to add missing data from anywhere. Bots importing date of birth are
> an example of this. The next step is to add data from somewhere with a
> source or add sources to existing unsourced or badly sourced statements.
> As far as I can see that's where we are right now, see for example edits
> like
> https://www.wikidata.org/w/index.php?title=Q41264&type=revision&diff=619653838&oldid=616277912 .
> Of course the next step would be to be able to compare existing
> sourced statements with external data to find differences. But how would
> the workflow be? Take for example Johannes Vermeer (
> https://www.wikidata.org/wiki/Q41264 ). Extremely well documented and
> researched, but
> http://www.getty.edu/vow/ULANFullDisplay?find=&role=&nation=&subjectid=500032927
> and https://rkd.nl/nl/explore/artists/80476 combined provide 3 different
> dates of birth and 3 different dates of death. When it comes to this
> kind of date mismatch, it's generally first come, first served (first
> date added doesn't get replaced). This mismatch could show up in some
> report. I can check it as a human and maybe do some adjustments, but how
> would I sign it off to prevent other people from doing the same thing
> over and over again?
>
> With federated SPARQL queries it becomes much easier to generate reports
> of mismatches. See for example
> https://www.wikidata.org/wiki/Property_talk:P1006/Mismatches .
>
> Maarten
>
> _______________________________________________
> Wikidata mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/wikidata
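
To make the "deciding who is right" step discussed in this thread a bit more concrete, here is a minimal sketch in Python. It assumes the values for one property (say, a date of birth) have already been collected per source, for instance by a federated SPARQL query; the function name `find_mismatch`, the source labels and the sample dates are all hypothetical and not taken from Wikidata or from any existing bot. It uses a naive majority vote, which is only one possible heuristic: in a case like the one above, where one database copies an error from another, the sources are not independent and a vote could well pick the wrong value.

```python
from collections import Counter


def find_mismatch(claims):
    """Compare the values that several sources give for one property.

    `claims` maps a source name to the value it states (hypothetical
    input shape, e.g. {"wikidata": "1800-01-01", "source_a": ...}).
    Returns None when all sources agree, otherwise a small report.
    """
    tally = Counter(claims.values())
    if len(tally) <= 1:
        return None  # no disagreement, nothing to report

    ranked = tally.most_common()
    top_value, top_count = ranked[0]
    # Only propose a candidate when it strictly beats the runner-up.
    has_winner = top_count > ranked[1][1]

    if has_winner:
        dissenting = sorted(s for s, v in claims.items() if v != top_value)
    else:
        dissenting = sorted(claims)  # a tie: every source needs review

    return {
        "values": dict(tally),
        "candidate": top_value if has_winner else None,
        "dissenting": dissenting,
    }


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    sample = {
        "wikidata": "1800-01-01",
        "source_a": "1800-01-01",
        "source_b": "1801-01-01",
    }
    print(find_mismatch(sample))
```

A report like this still leaves the final decision to a human reviewer; it only narrows down which sources disagree with the most common value.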
