Hi Maarten,

Thank you very much for your answer and your pointers. The page (which I
did not know existed) containing a federated SPARQL query is definitely
close to what I mean. It is missing just one step: deciding who is right.
If we look at the first result of the table
<https://www.wikidata.org/wiki/Property_talk:P1006/Mismatches> of
mismatches (Dmitry Bortniansky <https://www.wikidata.org/wiki/Q316505>) and
we draw a little graph, the result is:

[image: Diagram.png]

We can see that the error (probably) comes from VIAF, which contains a
duplicate, and from NTA, which evidently created an authority record based
on this bad VIAF ID.
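
To make the "deciding who is right" step a bit more concrete, here is a
rough sketch of one possible approach: a naive majority vote over the
identifiers that each source gives for the same entity. This is only my
assumption of how such a step could look, not something I know to be
implemented in Wikidata. The function names, source names and identifiers
below are hypothetical placeholders, not real authority records.

```python
from collections import Counter


def pick_majority(claims):
    """Return the value asserted by a strict majority of sources, or None.

    `claims` maps a source name to the identifier that source gives.
    """
    if not claims:
        return None
    counts = Counter(claims.values())
    value, votes = counts.most_common(1)[0]
    # Require a strict majority so that a tie is left for a human to review.
    return value if votes * 2 > len(claims) else None


def dissenting_sources(claims):
    """List the sources that disagree with the majority value, sorted by name."""
    winner = pick_majority(claims)
    if winner is None:
        return []
    return sorted(src for src, val in claims.items() if val != winner)


# Hypothetical example: placeholder names and identifiers only.
example = {
    "source_a": "ID-1",
    "source_b": "ID-1",
    "source_c": "ID-1",
    "source_d": "ID-2",
    "source_e": "ID-2",
}
print(pick_majority(example))       # ID-1
print(dissenting_sources(example))  # ['source_d', 'source_e']
```

A plain vote like this can be misled when sources copy each other, as NTA
seems to have done with VIAF here, so it would at best produce candidates
for human review.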

My research is very close to this kind of case, and I am very interested to
know what is already implemented in Wikidata.

Cheers,

Ettore Rizza

On Sat, 29 Sep 2018 at 13:03, Maarten Dammers <[email protected]> wrote:

> Hi Ettore,
>
>
> On 26-09-18 14:31, Ettore RIZZA wrote:
> > Dear all,
> >
> > I'm looking for Wikidata bots that perform accuracy audits. For
> > example, comparing the birth dates of persons with the same date
> > indicated in databases linked to the item by an external-id.
> Let's have a look at the evolution of automated editing. The first step
> is to add missing data from anywhere. Bots importing date of birth are
> an example of this. The next step is to add data from somewhere with a
> source or add sources to existing unsourced or badly sourced statements.
> As far as I can see, that's where we are right now; see for example edits
> like
>
> https://www.wikidata.org/w/index.php?title=Q41264&type=revision&diff=619653838&oldid=616277912
>
> Of course, the next step would be to be able to compare existing
> sourced statements with external data to find differences. But what would
> the workflow be? Take for example Johannes Vermeer (
> https://www.wikidata.org/wiki/Q41264 ). Extremely well documented and
> researched, but
>
> http://www.getty.edu/vow/ULANFullDisplay?find=&role=&nation=&subjectid=500032927
> and https://rkd.nl/nl/explore/artists/80476 combined provide 3 different
> dates of birth and 3 different dates of death. When it comes to this
> kind of date mismatch, it's generally first come, first served (the first
> date added doesn't get replaced). This mismatch could show up in some
> report. I can check it as a human and maybe make some adjustments, but how
> would I sign it off to prevent other people from doing the same thing
> over and over again?
>
> With federated SPARQL queries it becomes much easier to generate reports
> of mismatches. See for example
> https://www.wikidata.org/wiki/Property_talk:P1006/Mismatches .
>
> Maarten
>
> _______________________________________________
> Wikidata mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
