That sounds good in principle, but people might get upset ("why did we put
this into Wikipedia then?").

A compromise could be to import from both Wikipedia and IMDb (in this
example), attach both as references to the same claim if they agree, and
add them as separate claims if not. We can then resolve the contradictions
manually.
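
To make the idea concrete, here is a rough sketch of that merge step in
Python. It is not taken from any existing bot: the function name
merge_sources, the dict layout, and the sample values are all made up for
illustration, and it assumes each source has already been reduced to a
simple property-to-value mapping.

```python
def merge_sources(values_by_source):
    """Merge per-source values into claims that carry their references.

    values_by_source: {source_name: {property: value}} (hypothetical layout;
    values must be hashable).
    Returns (claims, conflicts): one claim per distinct (property, value),
    listing every source that agrees on it, plus the properties where the
    sources disagree and a human needs to look.
    """
    # property -> value -> list of sources that state this value
    by_prop = {}
    for source, props in values_by_source.items():
        for prop, value in props.items():
            by_prop.setdefault(prop, {}).setdefault(value, []).append(source)

    claims = []
    conflicts = []
    for prop, by_value in by_prop.items():
        for value, sources in by_value.items():
            claims.append(
                {"property": prop, "value": value, "references": sorted(sources)}
            )
        # More than one distinct value means the sources contradict each other.
        if len(by_value) > 1:
            conflicts.append(prop)
    return claims, conflicts


# Invented sample data, only to show the shape of the result.
wikipedia = {"director": "A. Director", "duration_min": 117}
imdb = {"director": "A. Director", "duration_min": 116}

claims, conflicts = merge_sources({"Wikipedia": wikipedia, "IMDb": imdb})
```

With this sample input, the agreeing "director" value becomes a single claim
referenced by both sources, while "duration_min" yields two claims with one
reference each and is listed in conflicts for manual review.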


On Tue, Apr 2, 2013 at 5:21 PM, Michael Hale <[email protected]> wrote:

> I think that's a fine threshold, but it will probably vary some per the
> type of data. Our goal is ultimately that every claim will have a
> reference, and then we can sort of pass the burden of accuracy on to the
> references. Wikipedia has grown well with that principle. Adding references
> will be much easier when we batch import data from sources dedicated to a
> particular type of data as opposed to parsing information out of Wikipedia
> (the IMDb has more consistency checks than Infobox film).
>
> ------------------------------
> Date: Tue, 2 Apr 2013 10:39:22 -0400
> From: [email protected]
>
> To: [email protected]
> Subject: Re: [Wikidata-l] Running "Infobox film" import script
>
> On Tue, Apr 2, 2013 at 12:58 AM, Michael Hale <[email protected]> wrote:
>
> > It will definitely have some errors, but I scanned the results for the
> > first 100 movies before I started importing them, and I think the value-add
> > will be much greater than the number of errors.
>
>
> Does Wikidata have a quality goal or error rate threshold?  For example,
> Freebase has a nominal quality goal of 99% accuracy and this is the metric
> that new data loads are judged against (they also want to be in the 95%
> confidence interval, which determines how big a sample you need when doing
> evaluations).
>
> I haven't looked at this bot, but a develop/test/deploy cycle measured in
> hours seems, on the surface, to be very aggressive.
>
> Tom
>
> _______________________________________________ Wikidata-l mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/wikidata-l
>
