We are proposing GLAMpipe for a Wikimedia grant right now
<https://meta.wikimedia.org/wiki/Grants:Project/GLAMpipe>. GLAMpipe is a
data import, manipulation and export tool. It can read a set of files,
APIs, tabular data etc., manipulate it (split, merge, format, make
lookups...) and export as files or to web services like Wikimedia Commons.
We are applying for funding to create a Wikidata manipulation and export
"New nodes for transforming data for Wikidata will be created. The
transformation node needs to take into account the triplet data structure,
be able to map data to Wikidata and format appropriately. The Wikidata
export node takes transformed data hash as parameter, makes sanity checks
for it and saves it to Wikidata. The export node checks that there are no
duplicate values and merges statement to existing ones if possible. The
export node is made using a Widar-like OAuth interface and plans to share
The kind of matching described here is what we aim to cater for.
We'd be enormously happy if you decided to endorse the project!
2016-10-14 13:17 GMT+03:00 Sandra Fauconnier <sandra.fauconn...@gmail.com>:
> What you are encountering here, is a major bottleneck and timesuck for any
> data import into Wikidata. Matching external lists of concepts (names of
> people, places, buildings, whatever) from external datasets correctly with
> the right Wikidata items is a thing that always takes me hours and hours
> and hours of work.
> In order to solve it, we need a working and user-friendly reconciliation
> tool that is integrated into a common data management platform (i.e.
> OpenRefine; it would also be fantastic to have it for Google Spreadsheets).
> Magnus has developed a basic API for it
> <https://tools.wmflabs.org/wikidata-reconcile/>, but a working and
> user-friendly interface in one of those tools mentioned above is the
> missing link.
> I want to emphasize again that there is a bounty (money!) to be earned
> by those who develop this for OpenRefine.
> I have outlined the task in Phabricator too. https://phabricator.
> Just putting this out here to give it attention again. It is such an
> important missing link in the workflow of anyone who wants to import data
> into Wikidata.
> I’m so desperate for it that I’m considering collecting funding and then
> hiring an external developer to make it, but of course it would be best if
> it were developed and maintained from within our community ;-)
> Greetings, Sandra
> On 13 Oct 2016, at 11:16, Markus Bärlocher <markus.baerloc...@lau-net.de>
> Hi Tom,
> This is a lighthouse case for my Google Sheets add-on
> Great tool - thanks!
> And more great tools included there :-)
> just add new terms to the "Terms" column, everything else fills
> I checked the first results by hand:
> 30% of the WP articles found are specifically helpful
> 70% of the URLs lead to content that does not match the term
> My idea:
> Maybe a "reliability index" could help?
> (1. manually approved match between Term and WP article)
> 2. Term and Lemma identical
> 3. Term and section title identical
> 4. all words in Term found in Lemma
> 5. all words in Term found in section title
> 6. Term found as string in article text
> But I have no idea how to do this myself:
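One possible way to compute such a reliability index is sketched below in Python. It is only an illustration of the six levels listed above, not part of any existing tool: the function name `reliability_level`, its parameters, and the choice of case-insensitive comparison are all assumptions made for this sketch. It returns the best (lowest) level that applies, or `None` if nothing matches.

```python
import re
from typing import Iterable, Optional


def _words(text: str) -> set:
    """Return the set of lower-cased word tokens in a string."""
    return set(re.findall(r"\w+", text.lower()))


def reliability_level(
    term: str,
    lemma: str,
    section_titles: Iterable[str] = (),
    article_text: str = "",
    approved: bool = False,
) -> Optional[int]:
    """Hypothetical helper: rank a Term/article match from 1 (best) to 6.

    All comparisons are case-insensitive, which is an assumption of this
    sketch rather than something stated in the proposal.
    """
    if approved:  # 1. match was confirmed by hand
        return 1

    t = term.strip().lower()
    if not t:  # an empty term cannot match anything
        return None

    titles = [s.strip().lower() for s in section_titles]

    if t == lemma.strip().lower():  # 2. Term and Lemma identical
        return 2
    if t in titles:  # 3. Term and a section title identical
        return 3

    term_words = _words(term)
    if term_words and term_words <= _words(lemma):  # 4. all words in Lemma
        return 4
    if term_words and any(term_words <= _words(s) for s in titles):
        return 5  # 5. all words found in one section title
    if t in article_text.lower():  # 6. Term appears in the article text
        return 6
    return None
```

Called with a term and the title, section titles and text of the article that was found, the function gives a number that could go into an extra column next to each URL, so that the weakest matches can be reviewed first.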
> Best regards,
> Wikidata mailing list