Dear Alek, dear list,

DBpedia (http://dbpedia.org) was created for exactly this use case: it lets you query Wikipedia like a database. DBpedia already loads the extracted data into a "semantic engine", which you can query.

Below I drafted some queries. The first one counts all the persons in Wikipedia that have a "deathDate". In total there are 187739, which should be the most complete list you will find. The query is then refined to all persons who died in 1941 (yielding 1318 persons), then to all artists who died in 1941, and finally to all of those artists together with their works!
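For scripted access, here is a minimal sketch of how the paged "died in a given year" query could be generated. It assumes Python and the public endpoint http://dbpedia.org/sparql; the helper names (build_query, build_url, page_offsets) are purely illustrative, and the prefixes are written out rather than relying on endpoint defaults. Unlike the drafted queries further down, it uses a strict upper bound so that 1 January of the following year is excluded. It only builds the query text and the request URL; sending the request is left to whatever HTTP client you prefer.

```python
from urllib.parse import urlencode

# Assumption: the public endpoint named in this mail; swap in the live or a
# language-specific endpoint as needed.
ENDPOINT = "http://dbpedia.org/sparql"

# Prefixes are spelled out so the query does not depend on endpoint defaults.
QUERY_TEMPLATE = """\
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT * WHERE {{
  ?person dbo:deathDate ?deathDate .
  ?person foaf:page ?page .
  FILTER(?deathDate >= "{year}-01-01"^^xsd:date)
  FILTER(?deathDate < "{next_year}-01-01"^^xsd:date)
}}
ORDER BY ?deathDate
LIMIT {limit} OFFSET {offset}
"""


def build_query(year, limit=1000, offset=0):
    """Return the SPARQL text for one page of persons who died in `year`."""
    return QUERY_TEMPLATE.format(
        year=year, next_year=year + 1, limit=limit, offset=offset
    )


def build_url(year, limit=1000, offset=0):
    """Return a GET URL carrying the query in the standard `query` parameter."""
    return ENDPOINT + "?" + urlencode({"query": build_query(year, limit, offset)})


def page_offsets(total, limit=1000):
    """Offsets needed to page through `total` rows, `limit` rows at a time."""
    return list(range(0, total, limit))
```

With the 1318 persons mentioned above, page_offsets(1318) gives [0, 1000], i.e. two requests of at most 1000 rows each.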
Note that there is a static database, which uses the latest dump:

http://dbpedia.org/sparql
http://dbpedia.org/snorql

as well as a live version, which is synchronized directly (each edit is loaded into the engine):

http://live.dbpedia.org/sparql

Language-specific versions also exist for some of the Wikipedias other than the English one:

Polish: http://pl.dbpedia.org/
German: http://de.dbpedia.org
Greek: http://el.dbpedia.org

DBpedia has quite a large community. I would estimate that over 1000 volunteers from computer science and the Semantic Web area have worked on or with it since 2006. (This does not count industry partners or the like.)

@Alek: I drafted some queries for you. There are a total of 5 result formats to choose from. JSON, plain or HTML are probably the ones you are looking for. Here are links to some user interfaces:

http://wiki.dbpedia.org/OnlineAccess
http://wiki.dbpedia.org/Applications

Feel free to improve the data directly in Wikipedia (and use the live endpoint 5 minutes later) or to tailor the data to your needs in the mappings wiki: http://mappings.dbpedia.org . Actually, the information contained there could help to clean up the infoboxes, which would also help the start of WikiData.

There is one catch, though: the more precise the queries get, the worse the recall will be, as minor errors in the data add up with each constraint.

Hope I could help,
Sebastian

Queries below:
*******************

A count of all persons that have a deathDate [1]:

SELECT count (*) WHERE {
  ?person <http://dbpedia.org/ontology/deathDate> ?deathDate .
  ?person <http://xmlns.com/foaf/0.1/page> ?page .
}

All persons that died in 1941 [2]. Note that http://dbpedia.org limits each result set to 1000 rows, so you need to use OFFSET to page through the results:

SELECT * WHERE {
  ?person <http://dbpedia.org/ontology/deathDate> ?deathDate .
  ?person <http://xmlns.com/foaf/0.1/page> ?page .
  FILTER(?deathDate >= "1941-01-01"^^xsd:date)
  FILTER(?deathDate <= "1942-01-01"^^xsd:date)
}
ORDER BY ?deathDate
LIMIT 1000 OFFSET 0

SELECT * WHERE {
  ?person <http://dbpedia.org/ontology/deathDate> ?deathDate .
  ?person <http://xmlns.com/foaf/0.1/page> ?page .
  FILTER(?deathDate >= "1941-01-01"^^xsd:date)
  FILTER(?deathDate <= "1942-01-01"^^xsd:date)
}
ORDER BY ?deathDate
LIMIT 1000 OFFSET 1000

All artists that died in 1941 [3]:

SELECT * WHERE {
  ?person <http://dbpedia.org/ontology/deathDate> ?deathDate .
  ?person <http://xmlns.com/foaf/0.1/page> ?page .
  ?person rdf:type <http://dbpedia.org/ontology/Artist>
  FILTER(?deathDate >= "1941-01-01"^^xsd:date)
  FILTER(?deathDate <= "1942-01-01"^^xsd:date)
}

All artists and their work [4]:

SELECT * WHERE {
  ?person <http://dbpedia.org/ontology/deathDate> ?deathDate .
  ?person rdf:type <http://dbpedia.org/ontology/Artist> .
  ?person <http://xmlns.com/foaf/0.1/page> ?page .
  OPTIONAL {
    ?person ?works ?work .
    FILTER (?works in (<http://dbpedia.org/property/works>, <http://dbpedia.org/property/notableworks>, <http://dbpedia.org/property/writer>) )
  }
  FILTER(?deathDate >= "1941-01-01"^^xsd:date) .
  FILTER(?deathDate <= "1942-01-01"^^xsd:date) .
}

[1] http://dbpedia.org/snorql/?query=SELECT+count+%28*%29+WHERE+{%0D%0A%3Fperson+%3Chttp%3A%2F%2Fdbpedia.org%2Fontology%2FdeathDate%3E+%3FdeathDate+.%0D%0A%3Fperson+%3Chttp%3A%2F%2Fxmlns.com%2Ffoaf%2F0.1%2Fpage%3E+%3Fpage+.%0D%0A}
[2] http://dbpedia.org/snorql/?query=SELECT+*+WHERE+{%0D%0A%3Fperson+%3Chttp%3A%2F%2Fdbpedia.org%2Fontology%2FdeathDate%3E+%3FdeathDate+.%0D%0A%3Fperson+%3Chttp%3A%2F%2Fxmlns.com%2Ffoaf%2F0.1%2Fpage%3E+%3Fpage+.%0D%0AFILTER%28%3FdeathDate+%3E%3D+%221941-01-01%22^^xsd%3Adate%29+%0D%0AFILTER%28%3FdeathDate+%3C%3D+%221942-01-01%22^^xsd%3Adate%29+%0D%0A}+order+by+%3FdeathDate%0D%0ALimit+1000+%0D%0AOFFSET+0
[3] http://dbpedia.org/snorql/?query=SELECT+*+WHERE+{%0D%0A%3Fperson+%3Chttp%3A%2F%2Fdbpedia.org%2Fontology%2FdeathDate%3E+%3FdeathDate+.%0D%0A%3Fperson+%3Chttp%3A%2F%2Fxmlns.com%2Ffoaf%2F0.1%2Fpage%3E+%3Fpage+.%0D%0A%3Fperson+rdf%3Atype+%3Chttp%3A%2F%2Fdbpedia.org%2Fontology%2FArtist%3E+%0D%0AFILTER%28%3FdeathDate+%3E%3D+%221941-01-01%22^^xsd%3Adate%29+%0D%0AFILTER%28%3FdeathDate+%3C%3D+%221942-01-01%22^^xsd%3Adate%29+%0D%0A}+
[4] http://dbpedia.org/snorql/?query=SELECT+*+WHERE+{%0D%0A%3Fperson+%3Chttp%3A%2F%2Fdbpedia.org%2Fontology%2FdeathDate%3E+%3FdeathDate+.%0D%0A%3Fperson+rdf%3Atype+%3Chttp%3A%2F%2Fdbpedia.org%2Fontology%2FArtist%3E+.%0D%0A%3Fperson+%3Chttp%3A%2F%2Fxmlns.com%2Ffoaf%2F0.1%2Fpage%3E+%3Fpage+.%0D%0A%0D%0AOPTIONAL+{%0D%0A++%3Fperson+%3Fworks+%3Fwork+.%0D%0A++FILTER+%28%3Fworks+in+%28%3Chttp%3A%2F%2Fdbpedia.org%2Fproperty%2Fworks%3E%2C+%3Chttp%3A%2F%2Fdbpedia.org%2Fproperty%2Fnotableworks%3E%2C+%3Chttp%3A%2F%2Fdbpedia.org%2Fproperty%2Fwriter%3E%29+%29%0D%0A}%0D%0AFILTER%28%3FdeathDate+%3E%3D+%221941-01-01%22^^xsd%3Adate%29+.%0D%0AFILTER%28%3FdeathDate+%3C%3D+%221942-01-01%22^^xsd%3Adate%29+.%0D%0A}+%0D%0A

On 12/23/2011 01:35 PM, Alek Tarkowski wrote:
> Hello everyone,
>
> I've been until now a lurker on this list, let me introduce myself - I'm
> a sociologist studying digital technologies, an activist (I run Creative
> Commons Poland) and I run a digital think tank / NGO in Poland.
>
> I'm hoping someone on this list might be able to help me: I'm involved
> in the celebrations of the Public Domain Day - on the 1st of January
> each year works pass into the public domain of authors who've died 70
> years ago (at least in Poland, and in most countries, but it might
> differ in some jurisdictions).
>
> I'm looking for a good way to determine, who died in 1941 - and thought
> that Wikipedia will be a good place to find this out. I know there are
> lists of people who died in a given year, but they are not complete. Is
> there any way to automatically query Wikipedia for such information? I
> know that it's to some extent structured, as this information is
> provided in templates for biographical articles, but I don't know
> whether there is any mechanism for querying?
>
> Any advice will be much appreciated.
>
> All the best,
>
> Alek
>

--
Dipl. Inf. Sebastian Hellmann
Department of Computer Science, University of Leipzig
Projects: http://nlp2rdf.org , http://dbpedia.org
Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
Research Group: http://aksw.org

_______________________________________________
Wiki-research-l mailing list
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
