The inner magic behind the Z3 theorem prover

Published

By , Partner Researcher , Senior Principal Researcher

Microsoft researchers Nikolaj Bjørner (left) and Leonardo de Moura (center) received the 2019 Herbrand Award for Distinguished Contributions to Automated Reasoning in recognition of their work in advancing theorem proving. They’re pictured with Jürgen Giesl (right) of the award committee.

(opens in new tab) Microsoft researchers Nikolaj Bjørner (left) and Leonardo de Moura (center) received the 2019 Herbrand Award for Distinguished Contributions to Automated Reasoning in recognition of their work in advancing theorem proving. They’re pictured with Professor Jürgen Giesl (right) of the award committee.

It’s not uncommon for us to hear that the Z3 theorem prover is magical (opens in new tab), but the frequency of such complimentary feedback doesn’t make it any less unexpected—or humbling. When we began work on Z3 (opens in new tab) in 2006, the design was motivated by two emerging use cases: program verification and dynamic symbolic execution. Research projects around program verification and dynamic symbolic execution, such as the verification-oriented programming language Dafny (opens in new tab), automatic test generation (opens in new tab), and fuzz testing (opens in new tab), had created an immense appetite for scalable and efficient solvers. While Z3, which is a satisfiability modulo theories (SMT) solver, was intentionally designed with a general interface that would allow easy incorporation into other types of software development and analysis tools, we couldn’t possibly have dreamed up the kind of uses we’ve seen (opens in new tab), from biological computation analysis (opens in new tab) to solving pebbling games for quantum computers (opens in new tab).

Z3 has more than 5,000 citations since 2008 (opens in new tab) and has received numerous honors, including the 2018 Test of Time Award (opens in new tab) from the European Joint Conferences on Theory & Practice of Software (ETAPS). In August, we were profoundly honored to receive the 2019 Herbrand Award for Distinguished Contributions to Automated Reasoning (opens in new tab) in recognition of our work in advancing theorem proving. We attribute the software verification and SMT communities’ embrace of Z3 to two valuable characteristics, its usability and scalability.

Spotlight: Blog post

Eureka: Evaluating and understanding progress in AI

How can we rigorously evaluate and understand state-of-the-art progress in AI? Eureka is an open-source framework for standardizing evaluations of large foundation models, beyond single-score reporting and rankings. Learn more about the extended findings. 

Translating different language constructs into Z3 is easier than with theorem provers for pure first-order logic, as it offers one-to-one mapping from the data domain of the program being tested to the data domain of Z3, and it can handle larger, complex tasks often quickly. Saying it can significantly reduce wait time for results would be an understatement, as Microsoft Software Engineer Andrew Helwer (opens in new tab) discovered. In sharing his experience working with Z3 to update Microsoft Azure firewalls (opens in new tab), a task he described as the thorniest technical problem he had encountered, he noted the theorem prover and constraint solver—Z3 is capable of both checking whether a solution exists and, if so, providing one—guaranteed the new firewalls offered the same high levels of security and availability as the old in less than a second. By Helwer’s estimate, the same job would have taken a computer millions of years.

This efficiency and raw power to crunch numbers and formulas is a result of what we consider to be the key insight that has allowed us to engineer the tool in several dimensions: a model-based approach to SMT solving.

Model-based methods in Z3

SAT solving and first-order theorem proving have long been dominated by two methods, search and saturation. In search, you assign values one by one to variables and then backtrack when faced with contradictions. In saturation, you learn new facts from old facts. A paradigm shift occurred in SAT solvers when these two methods were combined using conflict-driven clause learning (opens in new tab). In a few steps, this new approach could eliminate entire classes of dead-ends that would require many more steps if they were enumerated explicitly, essentially making mistakes less costly. The development inspired us to explore executing the approach for infinite domains: How do you do it in the context of reasoning with integers, real numbers, or bit vectors?

Understanding how to execute model-based methods in SMT solving began for us with the realization that the model being built during search could be used to explore far fewer cases when implementing the Nelson-Oppen combination method (opens in new tab): To combine results from two solvers, it’s sufficient to reconcile only equalities that have to hold in the current candidate model. Consequently, the method is called model-based theory combination (opens in new tab). Instead of relying on elaborate mechanisms for extracting all possible equalities, the procedure asks the current candidate model for what equalities it assumes.

This proved to be just an initial indication of several additional ways to exploit models. When solving formulas over arrays, Z3 uses the current candidate model to limit the number of times definitions for arrays are expanded and exposed to the search process. In a setting that generalizes the domain of arrays, when instantiating quantified formulas, Z3 uses the current candidate model to search for instances where the model is repaired to satisfy the quantifier in a method known as model-based quantifier instantiation (opens in new tab). Model-based techniques are also used to completely eliminate quantifiers over integers and reals; solve Horn clauses in the SPACER procedure (opens in new tab); and reason about integers and Tarski’s logic of real numbers in standalone procedures. The idea even generalizes to a combination of other theories in the model-constructing satisfiability (mcSAT) procedure (opens in new tab). Every instance in which model-based techniques have been integrated in Z3 has led to increased speeds orders of magnitude greater than previous techniques. For Z3, model-based techniques are the difference-maker.

Even though the high-level idea of using models is relatively straightforward, the machinery and sophistication that goes into using models for guiding saturation is substantial. To integrate them well requires turning the failed search branches into strong, irredundant search inferences. There is a deep connection between the process of creating these inferences and deducing a special form of Craig interpolants (opens in new tab). An inference that originates from a good interpolant eliminates not only the particular bad choice, but a maximal set of potential bad choices. A crucial part of engineering this integration is to ensure the interpolants contain enough information from the partial assignment still in play to separate it from extensions that are infeasible. The smarter and more efficiently you can compute interpolants, the better control you have over the search space.

A robust SMT community

Carnegie Mellon University professor Jeremy Avigad (opens in new tab), in nominating Z3 for the SIGPLAN Programming Languages Software Award (opens in new tab) it received in 2015, attributed the tool’s seemingly “supernatural abilities” to “extensive empirical research, supported by careful thought, clever insight, and a deep theoretical understanding.” The award was presented to us and Christoph Wintersteiger (opens in new tab), from Microsoft Research Cambridge, who is a substantial contributor to Z3.

Employing model-based techniques is the clever insight, and it is the open standards created by the SMT community that has cultivated the empirical research essential to engineering quality tools. The Satisfiability Modulo Theories Library (SMT-LIB) initiative (opens in new tab) and the TPTP Problem Library for Automated Theorem Proving (opens in new tab) have without a doubt had a tremendous impact on advancing automated reasoning, allowing researchers to share benchmarks from their driving scenarios and their challenge problems.

In its first 10-plus years, Z3 has surpassed our initial vision as a tool for program verification and dynamic symbolic execution thanks to the SMT community, and we look forward to what we hope will be many more advances, such as significantly more powerful arithmetical reasoning by our colleague Lev Nachmanson (opens in new tab), and many more use cases over the next 10 years. For more information on Z3 and details on variations of the model-based methods we’ve taken with it, visit our publication page (opens in new tab).

Related publications

Continue reading

See all blog posts