Tabula: Turn tables within PDFs into CSVs.
http://tabula.nerdpower.org/
More information at http://source.mozillaopennews.org/en-US/code/tabula/ .

I imagine some people on this list have access to PDFs of openly licensed
data that they'd like to get into Wikidata, where the corporate or
government source doesn't provide easy-to-work-with dumps or APIs.
I heard about Tabula last night and thought the following workflow sounded
plausible:

1) get PDFs
2) run them through Tabula to get CSVs
3) use a pywikipediabot script to upload rows to Wikidata
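
For anyone who wants to try this, here is a rough sketch of the glue
between steps 2 and 3, using only the Python standard library. It is
not a tested bot. The column names, the "item" column holding Wikidata
item IDs, the sample rows, and the column-to-property mapping are all
assumptions made up for illustration. The actual upload is left as a
hypothetical callback (add_statement) that you would implement with
pywikipediabot. By default it only prints what it would add.

```python
import csv
import io

# Assumed mapping from CSV column headers to Wikidata property IDs.
# Illustrative only: check every property ID before using it for real.
COLUMN_TO_PROPERTY = {"population": "P1082"}

# Made-up sample standing in for a CSV exported by Tabula.
# The item IDs are placeholders, not real matches for these rows.
SAMPLE_CSV = (
    "item,name,population\n"
    "Q1,Example A,100\n"
    ",Unmatched row,200\n"
    "Q2,Example B,\n"
)


def rows_to_statements(csv_text, column_to_property, item_column="item"):
    """Yield (item_id, property_id, value) for each non-empty mapped cell."""
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        item_id = (row.get(item_column) or "").strip()
        if not item_id:
            continue  # skip rows that have not been matched to an item
        for column, property_id in column_to_property.items():
            value = (row.get(column) or "").strip()
            if value:
                yield item_id, property_id, value


def upload(statements, add_statement=print):
    """Pass each statement to add_statement and return how many were sent.

    add_statement is a hypothetical hook: replace the default (print)
    with a function that writes to Wikidata through your bot framework.
    """
    count = 0
    for item_id, property_id, value in statements:
        add_statement(item_id, property_id, value)
        count += 1
    return count


if __name__ == "__main__":
    # Dry run: prints the statements instead of editing anything.
    upload(rows_to_statements(SAMPLE_CSV, COLUMN_TO_PROPERTY))
```

Keeping the upload behind a callback means the CSV can be checked in a
dry run before any edit is made, which seems wise since tables extracted
from PDFs often need manual cleanup first.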

Happy adding!
-- 
Sumana Harihareswara
Engineering Community Manager
Wikimedia Foundation

_______________________________________________
Wikidata-l mailing list
https://lists.wikimedia.org/mailman/listinfo/wikidata-l