George Herbert wrote:
> This discussion brings to mind several historical threads.
>
> I wonder if a project to simply mine the whole article contents and
> provide a DB of some sort with the articles and infobox contents would
> be worthwhile. Develop a specific parser and generate and publish the
> complete set of article-infobox-(key-value) sets...
>
I don't know anybody on the data side at Metaweb anymore, but I know
that they did something like that to import a lot of structured
Wikipedia data into their Freebase project. They publish some sort of
data dump here:

http://download.freebase.com/wex/

Perhaps they'd be willing to open-source their parser.

William

_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
