Hi Ryan, pages-meta-history hasn't been generated for enwiki in a while (it's gotten too big), so I can't tell you anything about it. We're importing pages-articles.xml (currently about 20 GB uncompressed, 5 GB as bzip2) using mwdumper. We're using MyISAM, not InnoDB. The import takes about 8 hours; most of that time (about 80%) goes to creating the indexes.
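In case it's useful, our import roughly follows the standard mwdumper pipeline. This is just a sketch: the database name, user, and file names below are placeholders, and mwdumper's options can vary by version, so check against your own setup.

```shell
# Create the MediaWiki schema first (tables.sql ships with MediaWiki).
mysql -u wiki -p wikidb < tables.sql

# Stream the compressed article dump through mwdumper into MySQL.
# (--format=sql:1.5 targets the MediaWiki 1.5+ schema; adjust for your version.)
java -jar mwdumper.jar --format=sql:1.5 enwiki-pages-articles.xml.bz2 \
  | mysql -u wiki -p wikidb

# Load the supplementary link tables directly from the SQL dumps.
for f in categorylinks imagelinks image langlinks templatelinks; do
  mysql -u wiki -p wikidb < enwiki-$f.sql
done
```

mwdumper converts the XML into INSERT statements, so piping its output straight into the mysql client avoids materializing an intermediate SQL file on disk.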
Besides pages-articles.xml, we also import categorylinks.sql, imagelinks.sql, image.sql, langlinks.sql and templatelinks.sql. The MySQL database filled from all these files takes up 39 GB of disk space; the largest file is text.MYD, at about 20 GB. With the indexes defined in tables.sql, query performance is OK. For example, selecting the titles of all articles that are not redirects takes five or ten minutes (I didn't profile it exactly).

Hope that helps.

Christopher

On Fri, Nov 20, 2009 at 14:13, Ryan Chan <[email protected]> wrote:
> Hello,
>
> Does anyone have experience importing the enwiki database dump at
> http://download.wikimedia.org/backup-index.html into a real MySQL
> server?
>
> 1. pages-meta-history seems to be the largest file to download; how
> much storage space does it take when imported into a table,
> including indexes?
> 2. How much total storage is needed to import the whole enwiki?
> 3. Have you experienced performance problems when querying the
> database? I think most tables are over 10 GB in size. Any
> suggestions?
>
> I need this information because I have to prepare a budget plan to
> buy a proper server for the job.
>
> Thank you.
>
> Ryan
>
> _______________________________________________
> Wikitech-l mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
