CRA: A Common Runtime for Applications

MSR-TR-2019-2

Published by Microsoft

https://github.com/Microsoft/CRA

Today, a modern data center hosts a wide variety of applications, including batch, interactive, machine learning, and streaming applications. A large majority of these applications can be abstracted as a “distributed dataflow” graph. Even though this commonality exists and can be exploited in the applications’ design and implementation, each application is typically written from scratch, which significantly hurts developer productivity. In this paper, we factor out the commonalities of many types of big data applications into a generic dataflow layer called Common Runtime for Applications (CRA). In parallel, another trend, containerization, has taken serious hold in cloud-scale data centers, with direct implications for the design, implementation, and deployment of the next generation of data-center applications. Container engines (e.g., Docker and CoreOS) and cloud-scale container orchestrators (e.g., Kubernetes and Docker Swarm) are two important technologies that enable this trend. Container orchestrators have made deployment much easier, and they solve many infrastructure-level problems, e.g., service discovery, auto-restart, and replication. For best-in-class performance, there is a need to marry next-generation applications with containerization technologies. To that end, CRA leverages and builds upon the containerization and resource orchestration capabilities of Kubernetes/Docker, and makes it easy to build a wide range of cloud-edge applications on top. To the best of our knowledge, we are the first to present a cloud-native runtime for building data center applications. To show the practicality of our approach, we built a distributed analytics engine, namely Quill, on top of CRA. We show through in-depth micro- and macro-benchmark results that CRA provides significant performance improvement over an unoptimized implementation on modern cloud platforms. CRA is available as open source and can be downloaded at https://github.com/Microsoft/CRA.