Aggify

Loops that iterate over SQL query results are common, both in application programs that run outside the DBMS and in User Defined Functions (UDFs) and stored procedures that run within it. It can be argued that set-oriented operations are more efficient and should be preferred over iteration, but real-world use cases make it clear that loops over query results are unavoidable in many situations and are preferred by many users. Such loops, known as cursor loops, carry significant overheads in performance, resource consumption, and concurrency.

Aggify is a technique for optimizing loops over query results that overcomes these overheads. It automatically generates a custom aggregate that is semantically equivalent to the loop, then rewrites the query to use this aggregate, eliminating the loop entirely. The technique has several advantages:

  • Pipelining of entire cursor loop operations instead of materialization
  • Pushing down loop computation from the application layer into the DBMS, closer to the data
  • Leveraging existing work on optimization of aggregate functions, resulting in efficient query plans
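
To make the loop-to-aggregate idea concrete, here is a minimal hand-written sketch in Python using SQLite's user-defined aggregates. It is an illustration under stated assumptions, not Aggify's generated output: Aggify produces the aggregate automatically, whereas this one is written by hand, and the `partsupp` table, its columns, and the function names are hypothetical. The loop body becomes the aggregate's `step` method, and the value read after the loop becomes `finalize`.

```python
import sqlite3


def build_db():
    """Create a small in-memory table (hypothetical schema, for illustration)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE partsupp (part_id INTEGER, supplier TEXT, cost REAL)"
    )
    conn.executemany(
        "INSERT INTO partsupp VALUES (?, ?, ?)",
        [
            (1, "acme", 9.5),
            (1, "globex", 7.25),
            (1, "initech", 8.0),
            (2, "acme", 3.0),
        ],
    )
    return conn


def min_cost_supplier_loop(conn, part_id):
    """Cursor-loop version: every row is fetched and processed by the caller."""
    best_cost, best_name = None, None
    cursor = conn.execute(
        "SELECT supplier, cost FROM partsupp WHERE part_id = ?", (part_id,)
    )
    for supplier, cost in cursor:
        # Loop body: keep the cheapest supplier seen so far.
        if best_cost is None or cost < best_cost:
            best_cost, best_name = cost, supplier
    return best_name


class MinCostSupplier:
    """Custom aggregate with the same state and body as the loop above."""

    def __init__(self):
        # Corresponds to the variable initialization before the loop.
        self.best_cost = None
        self.best_name = None

    def step(self, supplier, cost):
        # Corresponds to one iteration of the loop body.
        if self.best_cost is None or cost < self.best_cost:
            self.best_cost, self.best_name = cost, supplier

    def finalize(self):
        # Corresponds to the value that is live after the loop ends.
        return self.best_name


def min_cost_supplier_agg(conn, part_id):
    """Rewritten version: the loop is replaced by one aggregate query."""
    conn.create_aggregate("min_cost_supplier", 2, MinCostSupplier)
    row = conn.execute(
        "SELECT min_cost_supplier(supplier, cost) "
        "FROM partsupp WHERE part_id = ?",
        (part_id,),
    ).fetchone()
    return row[0]
```

Calling `min_cost_supplier_loop(conn, 1)` and `min_cost_supplier_agg(conn, 1)` on the database from `build_db()` yields the same supplier. In the second form the per-row logic runs inside the query, so no explicit loop over the result set remains in the caller.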

Aggify integrates seamlessly with Froid, thereby enabling Froid-style inlining for UDFs that contain cursor loops.

Talk on Aggify at ACM SIGMOD 2020