Towards General-Purpose Resource Management in Shared Cloud

HotDep'14: Proceedings of the 10th USENIX conference on Hot Topics in System Dependability

Published by USENIX Association

In distributed services shared by multiple tenants, managing
resource allocation is an important pre-requisite to
providing dependability and quality of service guarantees.
Many systems deployed today experience contention, slowdown,
and even system outages due to aggressive tenants
and a lack of resource management. Improperly throttled
background tasks, such as data replication, can overwhelm
a system; conversely, high-priority background tasks, such
as heartbeats, can be subject to resource starvation. In this
paper, we outline five design principles necessary for effective
and efficient resource management policies that could
provide guaranteed performance, fairness, or isolation. We
present Retro, a resource instrumentation framework that
is guided by these principles. Retro instruments all system
resources and exposes detailed, real-time statistics of per-tenant
resource consumption, and could serve as a base
for the implementation of such policies.
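
As a rough illustration of what per-tenant resource accounting might look like, the following is a minimal Python sketch. It is not Retro's API: the class `ResourceTracker`, its methods, and the tenant and resource names are all hypothetical, chosen only to show how consumption could be attributed to tenants (including background tasks) and then queried by a policy.

```python
from collections import defaultdict
from contextlib import contextmanager
import time


class ResourceTracker:
    """Hypothetical per-tenant accounting table (an assumption, not Retro's API)."""

    def __init__(self):
        # usage[tenant][resource] -> accumulated units (seconds, bytes, ...)
        self._usage = defaultdict(lambda: defaultdict(float))

    def record(self, tenant, resource, amount):
        """Attribute `amount` units of `resource` to `tenant`."""
        self._usage[tenant][resource] += amount

    @contextmanager
    def timed(self, tenant, resource):
        """Charge the wall-clock time spent inside the block to `tenant`."""
        start = time.perf_counter()
        try:
            yield
        finally:
            self.record(tenant, resource, time.perf_counter() - start)

    def snapshot(self):
        """Return a plain-dict copy of the current per-tenant statistics."""
        return {t: dict(r) for t, r in self._usage.items()}

    def share(self, resource):
        """Return each known tenant's fraction of total use of `resource`."""
        totals = {t: r.get(resource, 0.0) for t, r in self._usage.items()}
        total = sum(totals.values())
        return {t: (v / total if total else 0.0) for t, v in totals.items()}
```

A policy built on such statistics could, for instance, call `share("disk_bytes")` periodically and throttle a replication task whose share exceeds a configured bound, while leaving heartbeat traffic untouched.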