A.M.B.R.O.S.I.A: Providing Performant Virtual Resiliency for Distributed Applications

MSR-TR-2018-40

Published by Microsoft

This paper describes the main ideas and research behind the open-source Ambrosia platform for writing resilient distributed applications.

When writing today’s distributed programs, which frequently span both devices and cloud services, programmers are faced with complex decisions and coding tasks around coping with failure, especially when these distributed components are stateful. If their application can be cast as pure data processing, they benefit from the past 40 to 50 years of work from the database community, which has shown how declarative database systems can completely isolate the developer from the possibility of failure in a performant manner. Unfortunately, while there have been some attempts at bringing similar functionality into the more general distributed programming space, a compelling general-purpose system must handle non-determinism, be performant, support a variety of machine types with varying resiliency goals, and be language-agnostic, allowing distributed components written in different languages to communicate. This paper introduces the first system, Ambrosia, to satisfy all these requirements. We coin the term “virtual resiliency”, analogous to virtual memory, for the platform feature which allows failure-oblivious code to run in a failure-resilient manner. We also introduce a programming construct, the “impulse”, which resiliently handles non-deterministic information originating from outside the resilient component. Of further interest to our community is the effective reapplication of much database performance optimization technology to make Ambrosia more performant than many of today’s non-resilient cloud solutions.
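The abstract does not spell out the mechanism behind virtual resiliency or impulses. As a rough illustration only, the toy sketch below assumes one well-known approach: record every incoming call in a log before applying it, and rebuild state after a crash by replaying that log. The names `Counter`, `ResilientHost`, `call`, and `impulse` are hypothetical and are not Ambrosia's API; the in-memory list stands in for durable storage.

```python
import json
import random
from typing import Callable, List


class Counter:
    """Failure-oblivious application code: deterministic given its inputs."""

    def __init__(self) -> None:
        self.total = 0

    def add(self, amount: int) -> None:
        self.total += amount


class ResilientHost:
    """Hypothetical wrapper that logs calls and recovers state by replay."""

    def __init__(self, log: List[str]) -> None:
        self.log = log            # stand-in for a durable log (assumption)
        self.app = Counter()
        for record in self.log:   # recovery: replay every logged call in order
            self._apply(json.loads(record))

    def _apply(self, call: dict) -> None:
        getattr(self.app, call["method"])(*call["args"])

    def call(self, method: str, *args: int) -> None:
        record = json.dumps({"method": method, "args": list(args)})
        self.log.append(record)   # log first, then apply
        self._apply(json.loads(record))

    def impulse(self, method: str, source: Callable[[], int]) -> None:
        # The non-deterministic value is sampled once, outside the application,
        # and then logged, so a later replay sees exactly the same value.
        self.call(method, source())


log: List[str] = []
host = ResilientHost(log)
host.call("add", 5)
host.impulse("add", lambda: random.randint(1, 100))

# Simulate a crash and restart: a fresh host rebuilds its state from the log.
recovered = ResilientHost(log)
assert recovered.app.total == host.app.total
```

Note that `Counter` contains no failure-handling code at all; in this sketch everything related to recovery lives in the wrapper, which is the sense in which the application code is failure-oblivious.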

Publication Downloads

AMBROSIA

December 14, 2018

Ambrosia is a programming-language-independent approach for authoring and deploying highly robust distributed applications. Ambrosia dramatically lowers development and deployment costs and time to market by automatically providing recovery and high availability.