In case anyone missed this... This paper outlines a new in-memory approach to running distributed map/reduce jobs:
http://www.usenix.org/event/osdi10/tech/full_papers/Power.pdf

Definitely some interesting optimizations going on in there (like the use of partitioned tables) that might be relevant when setting up "big data" infrastructure for mining WMF data. Worth a read if you are into distributed computing.

-P-

--
Peter Adams
Open Web Analytics
http://www.openwebanalytics.com/

_______________________________________________
Wiki-research-l mailing list
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
