Unexpected cloud cost spike spurs optimization movement inside Microsoft

Paul Rojas looks at a whiteboard drawing of the Azure optimization recommendation platform.
Paul Rojas is a program manager on Microsoft Digital’s Azure Optimization Team. (Photo by Aleenah Ansari | Inside Track)

When Microsoft moved its internal workloads to Azure, operational costs shot up—something the team had to find a way to fix quickly.

The transition from physical datacenters to virtual cloud computing happens in waves. Microsoft’s journey began five years ago when the company’s Microsoft Digital organization deployed many of its internal computing resources to the cloud using Microsoft Azure.

The results were unexpected. The cost of doing business shot up, surprising employees like Paul Rojas, a program manager who now works on Microsoft Digital’s Azure Optimization Team, which didn’t exist at the time.

“If we were just doing a lift-and-shift, why were we seeing an increase in spending?” Rojas says. “We realized that if we were seeing these issues, our customers would be having them as well.”

The challenge with cloud computing is that once you flip the switch on, the meter keeps running until you switch it off. At Microsoft, which has a legacy of building its own always-on datacenters, employees were used to provisioning a server in a datacenter and forgetting about it.

“That can make Azure expensive,” Rojas says. “That was something we weren’t ready for.”

Microsoft employees were billed for their Azure use just like customers, so optimization became an external and internal priority. Optimization starts with building awareness of spending and usage down to the resource level. To do this, teams needed tools for metering and tracking current usage, but they also needed to change their mindset and prioritize workflow optimization.
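
As a rough illustration of what resource-level awareness can look like, the sketch below rolls metered usage up into spend per resource. It is a minimal sketch, not the tooling Microsoft built: the `UsageRecord` fields, the sample records, and the hourly rates are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageRecord:
    """One metered line item. All field names are hypothetical."""
    subscription: str
    resource: str
    hours: float
    hourly_rate: float  # assumed to be USD per hour


def spend_by_resource(records):
    """Sum cost per (subscription, resource) pair, highest spend first."""
    totals = defaultdict(float)
    for record in records:
        key = (record.subscription, record.resource)
        totals[key] += record.hours * record.hourly_rate
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Made-up sample data, for illustration only.
records = [
    UsageRecord("sub-a", "vm-1", 720, 0.5),
    UsageRecord("sub-a", "vm-2", 100, 0.5),
    UsageRecord("sub-b", "db-1", 720, 1.0),
    UsageRecord("sub-a", "vm-1", 24, 0.5),
]

report = spend_by_resource(records)
for (subscription, resource), cost in report:
    print(f"{subscription}/{resource}: ${cost:,.2f}")
```

Sorting by spend puts the most expensive resources at the top of the report, which is usually where a team would look first.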

It took a few stops and starts before leaders inside Microsoft realized that they would need to build tools to track spending and find opportunities for cost optimization.

[Find out how Microsoft is reducing its carbon footprint by tracking internal Microsoft Azure usage. Learn how to use Azure Cost Management to monitor Azure spending and optimize resource use.]

Optimizing for efficiency and cost savings

The Azure product group called on Microsoft Digital, which manages all of Microsoft’s internal workloads, to carry the flag of Azure optimization for Microsoft employees and customers. This led to the creation of the Azure Optimization Team. Their charter? Enable modern and cost-optimized cloud platforms for Microsoft Digital and Microsoft.

“When you design or build something, you want to make sure you’re not using a semi-truck to accomplish a task when you could use a small car,” Rojas says. “The same principle applies to your compute and storage resources in Azure.”

One of Rojas’s successful partnerships has been with Deepak Agrawal, a senior program manager who works on Azure Data Explorer (ADX), a cloud service for storing and running interactive analytics. Agrawal’s team responds to support tickets from Azure customers about optimization, and working with Rojas’s team presented the opportunity to provide recommendations on an even larger scale.

“This was our first effort to go from a reactive engagement to a proactive engagement,” Agrawal says.

The partnership began when Agrawal presented his strategy for Azure optimization at a cloud optimization conference.

“ADX is in the top three consumers of computing storage,” Agrawal says. “It was paramount for us to go in front of customers to talk about how they could also save money.”

Rojas also attended this conference and saw an opportunity for a partnership to optimize his own team’s use of ADX and share these recommendations with all customers.

“If you can widen the scope and provide recommendations to all Azure teams using ADX, that would have a bigger impact,” Agrawal says. “Paul wanted to bring insights from the conference into a more consumable stream that we could share externally.”

Currently, Agrawal’s team is partnering with Rojas’s team to generate Azure optimization recommendations for teams in Microsoft Digital based on current usage.

“We hope to provide actionable recommendations based on customer usage of the ADX platform,” Agrawal says.

Developing a partnership

When developing relationships with teams like ADX, Rojas starts by understanding a team’s goals.

“We want to create the understanding that we are there to help them achieve their goals or navigate obstacles they may have had in the past,” Rojas says. “Our goal is to have a symbiotic relationship.”

Next, Rojas’s team meets with the partner team’s members to highlight wins from past projects and make a plan for the future. To come up with personalized recommendations, Rojas requests access to the team’s data, which is used to measure everything from utilization to overall system health.

“The ADX team was more than happy to give us the data, and they wanted us to find instances of idle subscriptions that hadn’t been used,” Rojas says.

From there, developers on Rojas’s team used the data to develop recommendations, such as adjusting the size of instances or deleting unused software. Rojas’s team found that internal teams were spending $250,000 a month on idle subscriptions. The release of the idle-subscription recommendation has led to $10,000 in cost savings so far, and Rojas says that it could eventually lead to hundreds of thousands of dollars in savings.
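
The kind of recommendation described above can be sketched as a simple rule over utilization data. This is a minimal sketch under stated assumptions, not the team’s recommendation engine: the `ResourceStats` fields, the CPU thresholds, and the sample fleet are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceStats:
    """Utilization summary for one resource. Field names are hypothetical."""
    name: str
    avg_cpu_percent: float
    monthly_cost: float  # assumed to be USD per month


# Assumed thresholds; a real engine would tune these per workload.
IDLE_CPU_PERCENT = 1.0
OVERSIZED_CPU_PERCENT = 20.0


def recommend(stats):
    """Return 'delete' for idle, 'downsize' for oversized, else 'keep'."""
    if stats.avg_cpu_percent < IDLE_CPU_PERCENT:
        return "delete"
    if stats.avg_cpu_percent < OVERSIZED_CPU_PERCENT:
        return "downsize"
    return "keep"


def idle_spend(fleet):
    """Total monthly cost of resources flagged as idle."""
    return sum(s.monthly_cost for s in fleet if recommend(s) == "delete")


# Made-up sample data, for illustration only.
fleet = [
    ResourceStats("cluster-a", 0.2, 4000.0),
    ResourceStats("cluster-b", 12.0, 2500.0),
    ResourceStats("cluster-c", 65.0, 9000.0),
]

actions = {s.name: recommend(s) for s in fleet}
monthly_idle = idle_spend(fleet)
print(actions)
print(f"Idle spend per month: ${monthly_idle:,.2f}")
```

Average CPU is only one possible signal; query volume, storage growth, or last-access time could feed the same rule structure.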

“We know that making users aware of optimization opportunities creates less demand on the system,” Rojas says. The money that’s saved “can be used for other avenues like training, headcount, or investment in high-priority items.”

Beyond cost savings, optimization recommendations also empower engineers to make informed decisions about how they use Azure based on usage and spending data. After creating optimization recommendations, Rojas’s team consulted with the ADX team to see if their suggestions for Azure optimization aligned with the team’s needs and expectations.

The long-term goal is to create a production-grade optimization recommendation engine that surfaces specific recommendations for optimizing Azure use. Agrawal hopes to share more than 20 recommendations for cost savings, best practices, and performance improvement on Azure Advisor and internally with Microsoft employees.

“It’s valuable to provide best practices in Azure Advisor by looking at the queries that have been executed on the data and provide specific recommendations,” Agrawal says. “This enables customers to better manage their configurations and optimize their use of the platform.”

“It’s a new exercise for us, and leveraging this platform, experience, and knowledge is a great way to go forward and still provide a good experience for customers,” Agrawal says.
