Towards Building Autonomous Data Services on Azure
- Yiwen Zhu
- Yuanyuan Tian
- Joyce Cahoon
- Subru Krishnan
- Ankita Agarwal
- Rana Alotaibi
- Jesús Camacho-Rodríguez
- Bibin A Chundatt
- Andrew Chung
- Niharika Dutta
- Andrew Fogarty
- Anja Gruenheid
- Brandon Haynes
- Matteo Interlandi
- Minu Iyer
- Nick Jurgens
- Sumeet Khushalani
- Brian Kroth
- Manoj Kumar
- Jyoti Leeka
- Sergiy Matusevych
- Minni Mittal
- Andreas C. Müller
- Kartheek Muthyala
- Harsha Nagulapalli
- Yoonjae Park
- Hiren Patel
- Anna Pavlenko
- Olga Poppe
- Santhosh Ravindran
- Karla Saur
- Rathijit Sen
- Steve Suh
- Arijit Tarafdar
- Kunal Waghray
- Demin Wang
- Carlo Curino
- Raghu Ramakrishnan
The modern cloud has turned data services into easily accessible commodities. With just a few clicks, users can access a catalog of data processing systems for a wide range of tasks. However, the cloud brings both complexity and opportunity. While cloud users can quickly start an application using various data services, it can be difficult to configure and optimize these services to gain the most value from them. For cloud providers, managing every aspect of an ever-increasing set of data services, while meeting customer SLAs and minimizing operational cost, is becoming more challenging. Cloud technology enables the collection of significant amounts of workload traces and system telemetry. With the progress in data science (DS) and machine learning (ML), it is feasible and desirable to use a data-driven, ML-based approach to automate various aspects of data services, resulting in autonomous data services. This paper presents our perspectives and insights on creating autonomous data services on Azure. It also covers the future work we plan to undertake and the unresolved issues that still need attention.