MapReduce has become one of the two dominant paradigms in distributed computing (along with MPI). Yet implementing an algorithm as a MapReduce job - especially in Python - often forces us to sacrifice efficient numerical routines (BLAS, etc.) in favor of data parallelism.
In my work writing distributed learning algorithms to process terabytes of Twitter data at SocialFlow, I've come to advocate a form of "vectorized MapReduce" that integrates efficient numerical libraries like numpy/scipy into the MapReduce setting, yielding both faster per-machine performance and reduced I/O - often the major bottleneck. I'll also highlight some features of Disco (a Python/Erlang MapReduce implementation from Nokia) that make it a very compelling choice for writing scientific MapReduce jobs in Python.
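To make the idea concrete, here is a minimal sketch (not Disco's actual API - the function names, the toy squaring operation, and the batch size are all illustrative assumptions) contrasting a conventional per-record mapper with a "vectorized" one that buffers records and applies a single numpy operation per batch, amortizing Python interpreter overhead:

```python
import numpy as np

def plain_map(records):
    # Conventional mapper: one Python-level operation per record.
    for key, x in records:
        yield key, float(x) * float(x)

def vectorized_map(records, batch_size=1024):
    # "Vectorized" mapper (illustrative sketch): buffer records, then
    # apply one numpy operation to the whole batch at once.
    keys, vals = [], []

    def flush():
        # One vectorized call replaces batch_size scalar operations.
        for out in zip(keys, np.square(np.asarray(vals, dtype=float))):
            yield out

    for key, x in records:
        keys.append(key)
        vals.append(x)
        if len(vals) == batch_size:
            yield from flush()
            keys, vals = [], []
    if vals:  # emit any leftover partial batch
        yield from flush()
```

Both mappers emit the same key-value pairs; the vectorized version simply trades a little buffering for the chance to hand the inner loop to numpy's compiled routines.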