The Fine-Grained Complexity of CFL Reachability

Principles of Programming Languages (POPL 2023) |

Many problems in static program analysis can be modeled as the context-free language (CFL) reachability problem on directed labeled graphs. The CFL reachability problem can be generally solved in time O(n^3), where n is the number of vertices in the graph, with some specific cases that can be solved faster. In this work, we ask the following question: given a specific CFL, what is the exact exponent in the monomial of the running time? In other words, for which cases do we have linear, quadratic or cubic algorithms, and are there problems with intermediate runtimes? This question is inspired by recent efforts to classify classic problems in terms of their exact polynomial complexity, known as {\em fine-grained complexity}. Although recent efforts have shown some conditional lower bounds (mostly for the class of combinatorial algorithms), a general picture of the fine-grained complexity landscape for CFL reachability is missing.

Our main contribution is lower bound results that pinpoint the exact running time of several classes of CFLs or specific CFLs under widely believed lower bound conjectures (e.g., Boolean Matrix Multiplication, $k$-Clique, APSP, 3SUM). We particularly focus on the family of Dyck-$k$ languages (which are strings with well-matched parentheses), a fundamental class of CFL reachability problems. One of our key results is a O(n^{2.5}) lower bound for Dyck-2 reachability, which to the best of our knowledge is the first super-quadratic lower bound that applies to all algorithms, and shows that CFL reachability is strictly harder that Boolean Matrix Multiplication. We also focus on lower bounds that use as input parameter the number of edges of the graph m, and hence also apply to sparse input graphs.