Disk-Locality in Datacenter Computing Considered Irrelevant
- Ganesh Ananthanarayanan ,
- Ali Ghodsi ,
- Scott Shenker ,
- Ion Stoica
USENIX HotOS |
Data center computing is becoming pervasive in many organizations. Computing frameworks such as MapReduce [17], Hadoop [6] and Dryad [25], split jobs into small tasks that are run on the cluster’s compute nodes. Through these frameworks, computation can be performed on large datasets in a fault-tolerant way, while hiding the complexities of the distributed nature of the cluster. For these reasons, a considerable work has been done to improve the efficiency of these frameworks