Hi Felipe,

Thanks for the great effort. This will save us hours of downloading and importing older dumps.
bilal

On Tue, Jun 23, 2009 at 12:26 PM, Felipe Ortega <[email protected]> wrote:
>
> Hello.
>
> A few hours ago, a new public repository was created to host WikiXRay
> database dumps, containing information extracted from the public
> Wikipedia database dumps. The image is hosted by RedIRIS (in short, the
> Spanish equivalent of Kennisnet in the Netherlands).
>
> http://sunsite.rediris.es/mirror/WKP_research
>
> ftp://ftp.rediris.es/mirror/WKP_research
>
> These new dumps are meant to save other researchers time and effort,
> since they won't need to parse the complete XML dumps to extract the
> relevant activity metadata. We used mysqldump to create the dumps from
> our databases.
>
> As of today, only some of the biggest Wikipedias are available. However,
> in the following days the full set of available languages will be ready
> for download. The files will be updated regularly.
>
> The procedure is as follows:
>
> 1. Find the research dump of your interest. Download and decompress it
> in your local system.
>
> 2. Create a local database to import the information into.
>
> 3. Load the dump file, using a MySQL user with INSERT privileges:
>
> $> mysql -u user -p myDB < dumpfile.sql
>
> (With -p and no value, mysql prompts for the password; passing the
> password as a separate argument does not work.)
>
> And you're done.
>
> A final warning: three fields in the revision table are not reliable yet:
>
> rev_num_inlinks
> rev_num_outlinks
> rev_num_trans
>
> All remaining fields/values are trustworthy (in particular rev_len,
> rev_num_words, and so forth).
>
> Regards,
>
> Felipe.
>
> _______________________________________________
> Wikitech-l mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
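The three-step procedure quoted above can be sketched as a short shell script. The mirror URL is from the announcement; the dump file name, local database name, and MySQL user are placeholders I made up, not real names from the mirror. The download and server-side commands are left commented out since they need network access and a running MySQL server; the script only assembles and prints the final import command.

```shell
# Sketch of the import procedure; file, DB, and user names are hypothetical.
MIRROR="ftp://ftp.rediris.es/mirror/WKP_research"
DUMP="enwiki_research.sql.gz"   # hypothetical dump file name
DB="wkx_enwiki"                 # hypothetical local database name

# 1. Download and decompress the research dump (uncomment to run for real):
# wget "${MIRROR}/${DUMP}"
# gunzip "${DUMP}"

# 2. Create a local database to hold the data (needs CREATE privilege):
# mysqladmin -u researcher -p create "${DB}"

# 3. Load the dump with a user holding INSERT privileges; with -p and no
#    value, mysql prompts for the password interactively.
IMPORT_CMD="mysql -u researcher -p ${DB} < ${DUMP%.gz}"
echo "${IMPORT_CMD}"
# prints: mysql -u researcher -p wkx_enwiki < enwiki_research.sql
```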
