Concurrent Prefix Recovery: Performing CPR on a Database

2019 ACM SIGMOD Conference

With increasing multi-core parallelism, modern databases and key-value stores are designed for scalability and yield very high throughput for the in-memory working set. These systems typically depend on group commit using a write-ahead log (WAL) to provide durability and crash recovery. However, a WAL is expensive, particularly for update-intensive workloads, where the log itself becomes a concurrency bottleneck in addition to the overheads of log creation and I/O. In this paper, we propose a new recovery model based on group commit, called concurrent prefix recovery (CPR). CPR differs from traditional group commit implementations in two ways: (1) it provides a semantic description of committed operations, of the form “all operations until time T_i from session i”; and (2) it uses asynchronous incremental checkpointing instead of a WAL to implement group commit in a scalable, bottleneck-free manner. CPR provides the same consistency as a point-in-time commit, but allows a scalable concurrent implementation. We used CPR to make two systems durable: (1) a custom in-memory transactional database; and (2) FASTER, our state-of-the-art, scalable, larger-than-memory key-value store. Our detailed evaluation of these modified systems shows that CPR is highly scalable and supports concurrent performance reaching hundreds of millions of operations per second on a multi-core machine.
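The commit guarantee described above ("all operations until time T_i from session i") can be illustrated with a small sketch. This is not the paper's implementation, which relies on asynchronous incremental checkpointing; it is a toy model that only shows which operations a CPR commit covers. The names `Op`, `cpr_recovered_state`, and `commit_points` are hypothetical, and per-session serial numbers stand in for the times T_i.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    session: str  # session identifier (hypothetical naming)
    serial: int   # per-session sequence number, standing in for time
    key: str
    value: int


def cpr_recovered_state(ops, commit_points):
    """Replay only the operations covered by a CPR-style commit.

    For each session i, every op with serial <= commit_points[i] is
    applied and every later op from that session is dropped. Sessions
    missing from commit_points contribute nothing.
    """
    state = {}
    for op in ops:  # ops are listed in the order they were issued
        if op.serial <= commit_points.get(op.session, 0):
            state[op.key] = op.value
    return state


# Two sessions issue interleaved writes.
ops = [
    Op("s1", 1, "a", 10),
    Op("s2", 1, "b", 20),
    Op("s1", 2, "a", 11),
    Op("s2", 2, "c", 30),
    Op("s1", 3, "d", 40),
]

# Assumed commit descriptor: s1 committed through serial 2, s2 through serial 1.
commit_points = {"s1": 2, "s2": 1}

recovered = cpr_recovered_state(ops, commit_points)
print(recovered)
```

Each session learns its own commit point, so the recovered state is a prefix of every session's operation sequence rather than a single global cut across all sessions.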