Lots of ways to do this, but I just wanted to throw another option out there. I've put a subset of Wikidata and Wikipedia from December 2019 into a Kaggle dataset, and you can use the free Kaggle kernels to explore it and try out algorithms. There is one file called "statements.csv" with integer triples for "qpq" statements (i.e., truthy statements where the source and the target are both Wikidata items).
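
A minimal sketch of pulling just the "instance of" (P31) and "subclass of" (P279) edges out of statements.csv. It assumes the columns are named source_item_id, edge_property_id and target_item_id and that property IDs are stored as bare integers (check the header on the Kaggle page first). The function name and the sample rows are only illustrative:

```python
import csv
import io

INSTANCE_OF = 31   # P31, stored without the "P" prefix (assumption)
SUBCLASS_OF = 279  # P279


def filter_statements(rows, keep_properties=(INSTANCE_OF, SUBCLASS_OF)):
    """Yield (source, property, target) integer triples for the kept properties."""
    keep = set(keep_properties)
    for row in rows:
        prop = int(row["edge_property_id"])  # column names are assumed
        if prop in keep:
            yield int(row["source_item_id"]), prop, int(row["target_item_id"])


# Tiny in-memory stand-in for statements.csv (illustrative rows only).
sample = io.StringIO(
    "source_item_id,edge_property_id,target_item_id\n"
    "42,31,5\n"
    "42,106,36180\n"
    "5,279,215627\n"
)
triples = list(filter_statements(csv.DictReader(sample)))
print(triples)
```

On the real file you would pass csv.DictReader(open("statements.csv", newline="")) instead of the in-memory sample. Because the function is a generator, it streams the file rather than loading it all into memory.
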

* blog post explaining the dataset:
  https://blog.kensho.com/announcing-the-kensho-derived-wikimedia-dataset-5d1197d72bcf
* kaggle page for the dataset:
  https://www.kaggle.com/kenshoresearch/kensho-derived-wikimedia-data
* example kernel that uses a simple subclass path model to label items as person/location/organization:
  https://www.kaggle.com/gabrielaltay/kdwd-subclass-path-ner

Also, if you are interested in working with the raw Wikidata JSON dumps using a Python library, you can check this package out:

* https://qwikidata.readthedocs.io/en/stable/json_dump.html

best,
-Gabriel

On Wed, Apr 29, 2020 at 7:43 PM Denny Vrandečić <[email protected]> wrote:

> CONSTRUCT would be best, but I am not sure that there's any system that
> allows you to do that.
>
> What I would do is get the truthy dump in ntriples, and filter out all
> lines with the respective properties. The Wikidata Toolkit allows you to do
> that and more.
>
> https://www.mediawiki.org/wiki/Wikidata_Toolkit
>
> On Mon, Apr 27, 2020 at 2:35 AM Ece Toprak <[email protected]> wrote:
>
>> Hi,
>>
>> I am currently working on a NER project at school and would like to know
>> if there is a way to generate RDF dumps that only contain "instance of" or
>> "subclass of" relations.
>> I have found these dumps:
>> RDF Exports from Wikidata
>> <https://tools.wmflabs.org/wikidata-exports/rdf/exports/20160801/dump_download.html>
>> Here, under "simplified and derived dumps" taxonomy and instances dumps
>> are very useful for me but unfortunately very old.
>> It would be great if I could generate up to date dumps.
>>
>> Thank You,
>> Alkım Ece Toprak
>> Bogazici University
>> _______________________________________________
>> Wikidata mailing list
>> [email protected]
>> https://lists.wikimedia.org/mailman/listinfo/wikidata
>>
> _______________________________________________
> Wikidata mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
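
For the N-Triples route Denny describes above, a rough sketch of the line filter might look like the following. It assumes each line of the truthy dump has the form "<subject> <predicate> <object> ." and that truthy predicates live under http://www.wikidata.org/prop/direct/. The function name and the sample lines are only illustrative:

```python
DIRECT = "http://www.wikidata.org/prop/direct/"
# Predicates to keep: "instance of" (P31) and "subclass of" (P279).
KEEP = {f"<{DIRECT}P31>", f"<{DIRECT}P279>"}


def filter_ntriples(lines, keep_predicates=KEEP):
    """Yield only the lines whose predicate (second term) is in keep_predicates."""
    for line in lines:
        # Subject and predicate never contain whitespace, so two splits suffice.
        parts = line.split(None, 2)
        if len(parts) == 3 and parts[1] in keep_predicates:
            yield line


# Illustrative sample lines standing in for the dump.
sample = [
    "<http://www.wikidata.org/entity/Q42> <http://www.wikidata.org/prop/direct/P31> <http://www.wikidata.org/entity/Q5> .\n",
    "<http://www.wikidata.org/entity/Q42> <http://www.wikidata.org/prop/direct/P106> <http://www.wikidata.org/entity/Q36180> .\n",
    "<http://www.wikidata.org/entity/Q5> <http://www.wikidata.org/prop/direct/P279> <http://www.wikidata.org/entity/Q215627> .\n",
]
kept = list(filter_ntriples(sample))
print("".join(kept), end="")
```

For a real dump you would pass a file handle instead of the list, e.g. gzip.open(path, "rt", encoding="utf-8") for a gzipped .nt file, and write the kept lines to a new file.
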
