What you are encountering here is a major bottleneck and timesuck for any data
import into Wikidata. Matching lists of concepts (names of people,
places, buildings, whatever) from external datasets with the right
Wikidata items is a thing that always takes me hours and hours and hours of
In order to solve it, we need a working and user-friendly reconciliation tool
that is integrated into a common data management platform (e.g. OpenRefine; it
would also be fantastic to have it for Google Spreadsheets).
Magnus has developed a basic API for it
<https://tools.wmflabs.org/wikidata-reconcile/>, but a working and
user-friendly interface in one of those tools mentioned above is the missing
I want to emphasize again that there is a bounty (money!) to be earned
for those who develop this for OpenRefine.
I have outlined the task in Phabricator too.
Just putting this out here to give it attention again. It is such an important
missing link in the workflow of anyone who wants to import data into Wikidata.
I’m so desperate for it that I’m considering collecting funding and then hiring
an external developer to make it, but of course it would be best if it were
developed and maintained from within our community ;-)
> On 13 Oct 2016, at 11:16, Markus Bärlocher <markus.baerloc...@lau-net.de>
> Hi Tom,
>> This is a lighthouse case for my Google Sheets add-on
> Great tool - thanks!
> And more great tools included there :-)
>> just add new terms to the "Terms" column, everything else fills
> I checked the first results by hand:
> 30% of the found WP-articles are specific helpful
> 70% of the URLs lead to not concordant content
> My idea:
> A "reliability index" could maybe help?
> (1. manually approved match of Term and WP-article)
> 2. Term and Lemma identical
> 3. Term and section title identical
> 4. all words in Term found in Lemma
> 5. all words in Term found in section title
> 6. Term found as string in article text
> But I have no idea how to do this myself:
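For what it's worth, the six levels listed above could be sketched roughly as follows. This is only a minimal illustration, not how any existing tool works: the function name `reliability_level`, its parameters, and the normalisation (case-folding, splitting on non-word characters) are all my own assumptions, and the article's lemma, section titles and text are expected to have been fetched already.

```python
import re


def _norm(text):
    # Assumed normalisation: case-fold and collapse everything to single-spaced words.
    return " ".join(re.findall(r"\w+", text.casefold()))


def _words(text):
    # Set of case-folded words in the text.
    return set(re.findall(r"\w+", text.casefold()))


def reliability_level(term, lemma, section_titles=(), article_text="",
                      manually_approved=False):
    """Return 1 (most reliable) to 6, or None if no criterion matches.

    Hypothetical helper that follows the numbered criteria in the quoted list.
    """
    if manually_approved:                      # 1. match approved by hand
        return 1
    term_norm, term_words = _norm(term), _words(term)
    if not term_words:                         # nothing to match on
        return None
    if term_norm == _norm(lemma):              # 2. Term and Lemma identical
        return 2
    if any(term_norm == _norm(s) for s in section_titles):
        return 3                               # 3. Term and section title identical
    if term_words <= _words(lemma):            # 4. all words in Term found in Lemma
        return 4
    if any(term_words <= _words(s) for s in section_titles):
        return 5                               # 5. all words found in a section title
    if term_norm in _norm(article_text):       # 6. Term found as string in the text
        return 6
    return None
```

A sheet or OpenRefine column could then be sorted by this level, so that only the weaker matches (or those returning None) need to be checked by hand.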
> Best regards,
> Wikidata mailing list