Probabilistic Inference Using Program Analysis

Though probabilistic model checkers such as PRISM [10] have been built to analyze a variety of probabilistic models, including discrete- and continuous-time Markov chains, they primarily check whether these models satisfy probabilistic temporal logic formulas (such as PCTL or PLTL formulas), which has largely confined their applications to the formal methods community and to the analysis of system reliability and performance. In contrast, the machine learning community at large has focused on computing quantities such as maximum likelihood estimates or posterior probabilities for models, and these inferences have applications in domains as varied as information retrieval, speech recognition, computer vision, coding theory, and biology. In this paper, we argue that static and dynamic program analysis techniques can indeed be used to infer such quantities of interest to the machine learning community, thereby providing a new and interesting application domain for program analysis.
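To make the contrast concrete, consider an illustrative PCTL query of our own devising (not one drawn from [10]): the formula $P_{\geq 0.98}[\Diamond\, \mathit{delivered}]$ asks whether a message is eventually delivered with probability at least 0.98, a yes/no question about a probability bound, whereas a machine learning query would instead ask for the posterior probability of delivery itself, or for the model parameters that best explain observed deliveries.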

Probabilistic models, particularly those with causal dependencies, can be written succinctly as probabilistic programs. Recent years have seen a proliferation of languages for writing such probabilistic programs, as well as tools and techniques for performing inference over these programs [5, 6, 8, 9, 11, 13]. Inference approaches can be broadly classified as static or dynamic. Static approaches compile the probabilistic program to a probabilistic model, such as a graphical model, and then perform inference over the graphical model [8, 9, 11], exploiting its structure. Dynamic approaches, on the other hand, run the program several times, using sampling to generate values, and perform inference by computing statistics over the results of these runs [6, 13]; a minimal sketch of this style of inference appears below. We believe that ideas from the static and dynamic analysis of programs can be profitably applied to inference over probabilistic programs, and can hence find applications in inference for machine learning.
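As a concrete, if simplified, illustration of the dynamic approach, consider the following sketch. The model, its parameters, and the query are hypothetical examples of our own, not taken from any of the cited systems: a probabilistic program draws a burglary and an earthquake at random, either of which triggers an alarm, and we estimate the posterior probability of a burglary given that the alarm went off by rejection sampling, that is, by running the program many times and keeping only the runs consistent with the observation.

    import random

    def model():
        # Hypothetical generative model: a burglary or an earthquake
        # can each trigger the alarm. The priors are illustrative only.
        burglary = random.random() < 0.01
        earthquake = random.random() < 0.02
        alarm = burglary or earthquake
        return burglary, alarm

    def posterior_burglary_given_alarm(num_runs=100000):
        # Dynamic inference: run the program many times and compute
        # statistics over the runs. Rejection sampling keeps only the
        # runs consistent with the observation "the alarm went off"
        # and estimates P(burglary | alarm) by counting.
        accepted, burglaries = 0, 0
        for _ in range(num_runs):
            burglary, alarm = model()
            if alarm:
                accepted += 1
                burglaries += burglary
        return burglaries / accepted if accepted else float("nan")

    print(posterior_burglary_given_alarm())
    # Converges to P(burglary | alarm) = 0.01 / 0.0298, roughly 0.34.

A static approach would instead exploit the program's structure, for example by compiling the two independent draws and the deterministic alarm into a three-node graphical model and performing exact inference over it; the exact answer here is 0.01/0.0298 (about 0.336), which the sampler approaches as the number of runs grows.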