FPGA for Aggregate Processing: The Good, The Bad, and The Ugly

Zubeyr F. Eryilmaz; Aarati Kakaraparthy; Jignesh M. Patel; Rathijit Sen; Kwanghyun Park

International Conference on Data Engineering (ICDE) | April 2021

Published by IEEE

In this paper, we focus on current CPU-FPGA architectures and study their usability for database management systems. To narrow our scope, we choose aggregation as the query processing primitive for this investigation. We implement a fully pipelined, stall-free module that performs aggregation on the FPGA, and we describe a performance model that predicts the runtime of this module with 99% accuracy. We study the performance of this module on two different CPU-FPGA architectures, namely remote-main-memory and bump-in-the-wire. Compared to an implementation of aggregation on the CPU, we find that the remote-main-memory architecture is 1.7× slower, whereas the bump-in-the-wire architecture is 2.2× faster. This significant performance gap points to two important architectural considerations when designing CPU-FPGA systems, namely the bandwidth ceiling and the resource ceiling, while also highlighting issues of switching times and programmer efficiency. We consider broader hardware trends to study the suitability of the two FPGA architectures for accelerating the aggregation operation, and find that the performance gap is likely to persist for the foreseeable future. Based on these observations, we discuss some challenges and opportunities for CPU-FPGA architectures.