cc-ed xmldatadumps-l

Hi,

2012/10/23 Dario Taraborelli <[email protected]>:
> 2012/10/23 James Forrester <[email protected]>:
>> On 22 October 2012 16:03, Hydriz Wikipedia <[email protected]> wrote:
>>> I have long been wanting to say this, but is it possible for the team behind
>>> compiling such datasets to put future (and if possible, current) datasets
>>> into dumps.wikimedia.org so that it is easier for everyone to find stuff and
>>> not be all over the place? Thanks for that!
>>
>> Many one-off and regular datasets, from query results to data dumps
>> and similar, are now indexed[0] on The Data Hub (formerly CKAN) run by
>> the Open Knowledge Foundation for precisely this reason - so that data
>> researchers can easily find data about Wikimedia, and see when it's
>> updated.
>>
>> [0] - http://thedatahub.org/en/group/wikimedia
>
> The dumps server was never meant to become a permanent open data repository,
> but it started being used as an ad-hoc solution to host all sort of datasets
> published by WMF on top of the actual XML dumps: that's the problem we're
> trying to fix.
>
> Regardless of where the data is physically hosted, your go-to point to
> discover WMF datasets from now on is the DataHub. Think of it as a data
> registry: the registry is all you need to know in order to find where the
> data is hosted and to extract the appropriate metadata/documentation.

That's fine for me but I think more communication about this would be
welcome. I've added a link to meta:Data_dumps¹ and I'll communicate
about this on the French Wikipedia, but a link on the dumps' page for
other downloads² would be great.

Most people I've helped to find data on the Wikimedia projects now
know about dumps.wikimedia.org, but AFAIK none of them is reading
wiki-research-l.

Best regards,

¹ https://meta.wikimedia.org/wiki/Data_dumps
² http://dumps.wikimedia.org/other/

-- 
Jérémie
