Improve scalability with system page latch concurrency enhancements in SQL Server 2022

July 21, 2022

Part of the SQL Server 2022 blog series.

Over the past several SQL Server releases, Microsoft has improved the concurrency and performance of the tempdb database. In SQL Server 2022 we are addressing one of the last areas of contention by introducing concurrent global allocation map (GAM) and shared global allocation map (SGAM) updates which will give SQL Server 2022 a big improvement for scalability as tempdb is arguably the most important database in your environment.

Tempdb performance challenges

Historically, tempdb has been one of those common pain points in SQL Server. Why was it a pain point? Well, usage is one of the key reasons. By usage, we are referring to creating temp tables and other user objects, but tempdb is also used internally to spill to disk when there isn’t enough memory available for a process or there is an inaccurate estimate that causes SQL Server to spill to tempdb.

What is the tempdb database?

Tempdb is a special purpose system database, but the structure is essentially just like any other user database. As the name suggests, the tempdb database was designed for temporary storage meaning that nothing written to tempdb is intended to be persisted.
What is important to know is that while SQL Server uses the tempdb database for nearly every SQL Server workload, there is only one tempdb per SQL Server instance and tempdb is recreated every time SQL Server is restarted.

Tempdb workloads

The main difference between tempdb and other databases is the workload. With tempdb we are constantly creating and destroying objects such as temp tables. This is especially true in heavy OLTP environments where you may have many threads doing all kinds of work and if you are already seeing resource contention on the system the impact will be amplified.

So, what is stored in tempdb?

Of course, temp tables and table variables will go into tempdb – this is usually the first object type we think about. So, whenever you create a temporary table in a stored procedure or in a regular batch, that is going to go to tempdb.

Row versions go into tempdb, if you are using snapshot isolation, read committed snapshot where every time a row is modified by a specific transaction, the database engine will store a version of the previously committed image of the row in tempdb.

Hash operations will spill to tempdb. Worktables are also used for spools, cursors, sorts, and temporary large object (LOB) storage – this will all go to tempdb.

Triggers use the row version store in tempdb—this will go to tempdb.

Online index operations—if you are maintaining your indexes with the ONLINE ON keyword then we are creating temporary shadow copies in tempdb.

DBCC CHECKDB also creates shadow copies in tempdb

As you can see, there’s a lot that goes into tempdb—tempdb will be used for user objects such as global or local temporary tables and indexes, stored procedures, table variables, table-valued functions, and cursors. But we also use tempdb for internal scenarios—such as spills to disk and as worktables that store intermediate results for spools and sorts.

What causes contention in tempdb?

Since tempdb is used for so many different scenarios, there is only one tempdb database per SQL Server instance, and we have started pushing towards bigger machines with larger workloads, we have started seeing concurrency issues emerge in the tempdb space in three key areas:

Object allocation contention
Metadata contention
Temp table cache contention

What is object allocation contention?

Again, tempdb is structured just like any other database, but remember—the workloads are different in tempdb, so object allocation contention matters more because of the constant creation and destruction of objects.

On a server that is experiencing object allocation contention, you may notice severe blocking especially when the server is experiencing a heavy load. As a result, SQL Server workloads will be slowed, but the server’s CPU may appear to be underutilized. This is because the contention resides in the system metadata. To help avoid these areas of contention, a number of long-standing best practices have been recommended by SQL Server support teams.

For tempdb, one of the key best practices has always been to create multiple primary data files (mdf) at the same size and same growth rate. The reason for this was to help alleviate the object allocation contention by distributing tempdb activity across multiple partitions.

SQL Server databases must have a primary data file that uses the .mdf file extension by default and a log data file that uses the .ldf file extension. The primary tempdb data file, tempdb.mdf has key pages that track how objects are allocated in SQL Server. As we see in the image below, the table under the tempdb.mdf title represents pages.

Page types used in the data files of the tempdb database.

Page 0 is a header page, and this is true for any primary or secondary data files in SQL Server.

Page 1 is what is called a page free space (PFS) page which is used any time SQL needs to allocate space to an object. Basically, the PFS page contains info on how full the pages are, for the next 8088 pages, in the database. If SQL Server needs to add some data, SQL Server uses the PFS page to see how full the associated object is to see where the data can fit.

After 8088 pages, there is another PFS page in the same data file – it repeats itself. So, you will have more than one PFS page depending on how large the file is.

Page 2 is always the global allocation map (GAM), and this is where the extent allocation comes in as the GAM tracks when SQL Server needs to allocate a uniform extent.

An extent in SQL Server is comprised of 8 x 8KB pages, 64KB, and this is usually the unit of data allocation so if you have a table that’s larger than eight pages, any time we allocate space to that table we will create a full extent, and this is a uniform extent because all eight pages in that extent belong to that object.

So, any time SQL needs to allocate a uniform extent to an object SQL will go to the GAM page and check the availability. The GAM is a bitmap, so if the bit is 1 then that extent is available to be allocated, and if it is 0 then it is not available to be allocated. Once SQL Server has allocated the extent, it just flips the bit for that GAM page from 1 to 0 to show that it is no longer available.

The larger the file, the more GAM Pages you will have, for the GAM you will get another GAM page after 63,904 extents.

Page 3 is for the shared global allocation map (SGAM) and this page is used if SQL needs to allocate space on a mixed extent. This SGAM page tracks mixed extent usage for when an extent is being used by more than one object. So, if I have an object that is less than eight pages and I don’t want to allocate a full extent, we will use a mixed extent. By default, when a brand-new object is created the first eight pages will be allocated on a mixed extent.

The SGAM is a bitmap, so if the bit is 1 then it’s being used as a mixed extent and has space available to be allocated, then we would look for the corresponding PFS page to find the empty pages within that extent, and then we would allocate that page. The important point here is the SGAM is used in conjunction with the PFS page in order to allocate space on a mixed extent and again after about 64,000 extents you get another SGAM on the same file.

When you have multiple files in tempdb, you get another immediate header, PFS, GAM, and SGAM because all files will have the same structure and with multiple files we try and share the workload over these files.

SQL Server spreads out object allocations across files in the same filegroup based on the proportional fill algorithm. We try to keep the same percentage of free space in each file within the file group so if all the files in the file group start at the same size and they stay the same size and they have the same amount of free space then we turn that proportional fill algorithm into a round robin algorithm—each subsequent allocation will hit the next file and so on—and that was the reason we recommend having multiple files at the same size—spreading object allocations across all the files allows you to get around this object allocation bottleneck. That recommendation came out in SQL Server 2000, and it is still true in SQL Server 2019.

Multiple files equally sized is our best practice and this will stand until testing proves otherwise.

Tracking object allocation contention

Prior to SQL Server 2019, the best approach was to monitor the sys.dm_os_waiting_tasks dynamic management view and log the contention history over time.

Consider the SQL Server statement listed below:

When looking at the wait resource, you can monitor the contention knowing that the first number refers to the database, the second number is the file id, and the last number is the page type.

This means that contention on wait resource 2:7:2 is tempdb contention as tempdb database is always database id 2, there is contention on file id #7 with GAM contention as the figure illustrates (page #1 is the PFS, #2 is the GAM, and #3 is SGAM).

These wait resource references are commonly in the format 2:1:1, 2:1:3, and so on.

Any results found on database id 2 indicate that there are requests waiting for tempdb resources and the accumulation of these requests can help database administrators narrow down the root cause of the contention.

In SQL Server 2019 we created new functions to improve tempdb troubleshooting. The sys.fn_PageResCracker dynamic management function returns the db_id, file_id, and page_id for the given page_resource value and sys.dm_db_page_info dynamic management function returns page information like page_id, file_id, index_id, object_id, and more that are present in a page header. This information is useful for troubleshooting and debugging various performance (lock and latch contention) and corruption issues.

The example query below can be used to better resolve wait resource information for any SQL Server release post SQL Server 2019:

What is metadata contention?

The other main type of contention is called metadata contention. This type of contention is not about I/O. This contention occurs in memory when multiple threads are trying to modify the same page in memory at the same time.

You can track metadata contention using the same methods you would use to track object allocation contention, the difference is instead of the wait resource being 2:1:1, 2:1:2, 2:1:3 on the PFS, GAM, and SGAM, you are more likely to see the contention occurring on index and data pages and the page number in the wait resource will be a higher value such as 2:1:111, 2:1:118, or 2:1:122, for example.

For metadata contention it is useful to make note of page numbers greater than single digits, track the object name, and the page type description. The object names will show as system tables such as sysallocunits, syscolpars, sysjobactivity, sysscalartypes, sysschobjs, and so on.

Metadata contention was addressed in SQL Server 2019 with the memory-optimized tempdb metadata improvement.

Memory-optimized metadata tables for tempdb is basically a combination of the in-memory OLTP and the temp table metadata features. We took the system tables and the tempdb system tables—and we moved those into non-durable memory-optimized tables.

Remember, that tempdb is temporary—it gets dropped and recreated every time you restart SQL Server so there was no reason for the metadata to be durable. We converted the 12 system tables that are involved in object tracking and tempdb into memory optimized non-durable tables.

We don’t need a specialized memory-optimized file group for tempdb since it is non-durable anyway. All of it is “in memory”—no disk is needed and with memory-optimized tables there’s no latching and no locking. We can massively increase the concurrency against these metadata tables using these lock-free, latch-free data structures.

Enabling memory-optimized metadata tables does require a restart since we must configure the Hekaton DLLs. This was a big improvement in SQL Server 2019 that will eliminate a lot of the metadata contention.

Learn more about how this improvement removes the metadata contention bottleneck for tempdb-heavy workloads in our memory-optimized tempdb documentation.

Temp table cache contention

In past SQL Server releases starting with SQL Server 2005, we introduced temp table caching to get around some of the metadata caching contention. Basically, when you cache a temp table object—when you delete that table SQL Server doesn’t actually drop the metadata—we keep a cache of all the temporary objects that are used through a stored procedure and then we reuse the metadata for those objects when you call the same stored procedure with the same temp table again.

As a result, temp table caching has fewer hits to the metadata and alleviates some of that metadata contention—but not completely.

Temp table caching helped address metadata contention by allowing us to reuse tables that didn’t change between stored procedure executions. As long as the table was not altered after it was created, it would be eligible to be reused by another execution of the same stored procedure. However, if the table is altered (by adding an index or a column, for example), then it can’t be reused and must be dropped when the stored procedure completes.

There are several different tables that we need to delete metadata from in order to completely drop the table, and this was all being done synchronously at the end of the stored procedure execution. Additionally, every time a new feature is added to SQL Server (ColumnStore indexes, temporal tables, In Memory OLTP, etc.) all these new features require new metadata to be tracked, therefore the number of system tables we need to delete from is increasing, which makes the process more impactful.

Temp table cache contention can be more prominent on larger SQL Server environments, larger core counts and as the size of the cache and the number of concurrent threads accessing the cache grows, this can introduce slower cache access as well as contention for the memory object associated with the cache.

This condition can manifest in two different ways: CMEMTHREAD waits and SOS_CACHESTORE spinlock waits. To address temp table cache contention, it is recommended to track these wait conditions for evidence and ensure you have installed the latest cumulative updates (CUs) for SQL Server.

Check out our Tech Community blog for more information on temp table caching in SQL Server.

SQL Server 2022 tempdb improvements

In SQL Server 2022 we have addressed the final common areas of contention by introducing concurrent GAM and SGAM updates similar to the Concurrent PFS updates.

We use the Global Allocation Map (GAM) pages when we are looking for uniform extents and the Shared Global Allocation Map (SGAM) pages when we are looking for mixed extents in tempdb.

In previous releases, under higher concurrent workloads we may have GAM contention where many different threads attempt to allocate extents on the same GAM page and each thread must first wait for another thread to release their UPDATE latch before they can obtain their own latch to allow them to make changes—so, we are just waiting in line.

As can be seen in the workload example below, on SQL Server 2019 there is a wall of GAM contention driving over 123,000 counts of contention with the longest wait taking 949 milliseconds.

Dashboard showing the performance of SQL Server 2019 and 123,000 latch contentions over the last five minutes.

The reason for this is that with the update latch, only one thread can modify the GAM page at a time, leading to contention. This is the primary reason we still need multiple data files and because of this contention, SQL Server throughput is decreased and workloads that require many updates to the GAM page will take longer to complete while the machine’s CPU will be underutilized. This contention is due to the workload volume and especially the use of repetitive create-and-drop operations.

Starting with SQL Server 2016, we changed the default behavior to always allocate uniform extents when creating new objects in tempdb. This helped avoid most of the SGAM contention, but we still use mixed extents for Index Allocation Map (IAM) page allocations. IAM pages are used to track all the pages that belong to an object, so every object that gets created has at least one IAM page. For most workloads, these IAM page allocations don’t cause any issues, but for extremely busy tempdb workloads with many threads of concurrent allocations, these IAM page allocations can still cause SGAM contention.

SQL Server 2022 addresses GAM and SGAM contention

SQL Server tempdb contention is near completely addressed in SQL Server 2022 and these benefits are on by default. With these improvements in SQL Server 2022 we allow concurrent updates to the GAM and SGAM under a shared latch rather than using the update latch. This improvement erases nearly all tempdb contention allowing parallel threads to be able to modify the GAM and SGAM pages as can be seen in the example below.

Dashboard showing a similar workload run on SQL Server 2022 with only 607 points of contention over the same period of time.

In the SQL Server 2022 workload example shown here, we only have 607 points of contention over the same time period compared to SQL Server 2019 with the longest wait at only 342 milliseconds. The only contention in the environment was metadata contention in this example because we did not enable the SQL Server optimized tempdb metadata improvement.

There are still possible points of metadata contention, but with SQL Server 2022, the points of contention will be rare and should not lead to any significant performance challenges.

If concurrent GAM and concurrent SGAM updates are some of the last areas of contention, do we still need the best practices to maintain multiple data files for tempdb?

Out of the gate, we are going to continue recommending the same best practices, but we may adjust if we find that it is no longer required through customer feedback.

Summary

In SQL Server 2022 we have improved tempdb performance to a factor that may need to revise the tempdb best practices that have stood true for nearly a quarter of a century.

We have greatly improved the performance of tempdb. So much of what runs on SQL Server relies on tempdb, these enhancements will likely be more than enough to make SQL Server 2022 a mandatory upgrade in most organizations.

The key point is that it’s important for DBAs to optimize tempdb performance, it’s important to track and address tempdb bottlenecks, and SQL Server has improved tempdb in every single release—SQL Server 2022 is no exception.

Next steps

System page latch concurrency enhancements in SQL Server 2022 are just one of the many benefits of migrating to SQL Server 2022.

Download the latest release of SQL Server 2022 if you haven’t already done so and check out the SQL Server 2022 Overview and What’s New references. There are many new features and improved functionality being added to this release.