What you are encountering here is a major bottleneck and time sink for any data 
import into Wikidata. Correctly matching lists of concepts (names of people, 
places, buildings, whatever) from external datasets with the right Wikidata 
items always takes me hours and hours of work.

To solve it, we need a working and user-friendly reconciliation tool that is 
integrated into a common data management platform (e.g. OpenRefine; it would 
also be fantastic to have it for Google Spreadsheets).

Magnus has developed a basic API for it 
<https://tools.wmflabs.org/wikidata-reconcile/>, but a working and 
user-friendly interface in one of the tools mentioned above is the missing 
link.

I want to emphasize again that there is a bounty (money!) to be earned 
<https://www.bountysource.com/issues/985941-implement-wikidata-reconciliation-was-freebase>
 for those who develop this for OpenRefine.

I have outlined the task in Phabricator too: 
<https://phabricator.wikimedia.org/T146740>

Just putting this out here to give it attention again. It is such an important 
missing link in the workflow of anyone who wants to import data into Wikidata.
I’m so desperate for it that I’m considering collecting funding and then hiring 
an external developer to build it, but of course it would be best if it were 
developed and maintained from within our community ;-)

Greetings, Sandra

> On 13 Oct 2016, at 11:16, Markus Bärlocher <markus.baerloc...@lau-net.de> 
> wrote:
> 
> Hi Tom,
> 
>> This is a lighthouse case for my Google Sheets add-on 
> 
> Great tool - thanks!
> And more great tools included there :-)
> 
>> just add new terms to the "Terms" column, everything else fills 
>> automagically.
> 
> I checked the first results by hand:
> 30% of the WP articles found are actually helpful
> 70% of the URLs lead to content that does not match the term
> 
> My idea:
> Maybe a "reliability index" could help?
> 
> (1. manually confirmed match between Term and WP article)
> 2. Term and Lemma identical
> 3. Term and section title identical
> 4. all words in Term found in Lemma
> 5. all words in Term found in section title
> 6. Term found as string in article text
> 
> But I have no idea how to do this myself:
> https://github.com/tomayac/wikipedia-tools-for-google-spreadsheets/issues/11
> 
> Best regards,
> Markus
> 
> 
> _______________________________________________
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
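
A side note on the "reliability index" Markus proposes in the quoted message: 
levels 2 to 6 of his list could be computed automatically. Below is a minimal 
Python sketch of that idea. It is not taken from Tom's add-on or from any 
existing tool; the function name reliability_level and its parameters are 
hypothetical, and it assumes the article title (lemma), section titles and 
article text have already been fetched. Level 1 (manual confirmation) is just 
passed in as a flag.

```python
import re


def _words(text):
    """Lower-cased set of the word tokens in text."""
    return set(re.findall(r"\w+", text.lower()))


def reliability_level(term, lemma, section_titles=(), article_text="",
                      manually_approved=False):
    """Return the best (lowest) matching level from the quoted list, 1 to 6.

    Returns None if no criterion matches. All names here are illustrative
    assumptions, not part of any existing API.
    """
    if manually_approved:
        return 1  # level 1: a human has confirmed the match
    t = term.strip().lower()
    if not t:
        return None
    if t == lemma.strip().lower():
        return 2  # level 2: term and lemma identical
    if t in (s.strip().lower() for s in section_titles):
        return 3  # level 3: term and a section title identical
    term_words = _words(term)
    if term_words and term_words <= _words(lemma):
        return 4  # level 4: all words of the term occur in the lemma
    if term_words and any(term_words <= _words(s) for s in section_titles):
        return 5  # level 5: all words of the term occur in one section title
    if t in article_text.lower():
        return 6  # level 6: term occurs as a string in the article text
    return None
```

Matches could then be sorted by this level, with anything at level 6 or None 
flagged for manual review. The comparison is deliberately naive (case-folding 
only, no stemming or handling of diacritics), so it is a starting point rather 
than a finished matcher.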
