DoWhy: An End-to-End Library for Causal Inference

Causal Data Science Meeting (https://causalscience.org/)

In addition to efficient statistical estimators of a treatment's effect, successful application of causal inference requires specifying assumptions about the mechanisms underlying observed data and testing whether they are valid, and to what extent. However, most libraries for causal inference focus only on the task of providing powerful statistical estimators. We describe DoWhy, an open-source Python library that is built with causal assumptions as its first-class citizens, based on the formal framework of causal graphs to specify and test causal assumptions. DoWhy presents an API for the four steps common to any causal analysis: 1) modeling the data using a causal graph and structural assumptions, 2) identifying whether the desired effect is estimable under the causal model, 3) estimating the effect using statistical estimators, and finally 4) refuting the obtained estimate through robustness checks and sensitivity analyses. In particular, DoWhy implements a number of robustness checks including placebo tests, bootstrap tests, and tests for unobserved confounding. DoWhy is an extensible library that supports interoperability with other implementations, such as EconML and CausalML for the estimation step.
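The four steps can be illustrated with a minimal, standard-library-only Python sketch. This is not DoWhy's API: every name below (`GRAPH`, `simulate`, `common_causes`, `naive_difference`, `backdoor_estimate`, `placebo_refute`) is hypothetical, the data are simulated, and the identification step is reduced to finding variables that are direct parents of both treatment and outcome, which is a simplification of graph-based identification.

```python
import random
from statistics import mean

# Step 1 (model): a hypothetical causal graph as a set of directed edges.
# w is a confounder that affects both the treatment t and the outcome y.
GRAPH = {("w", "t"), ("w", "y"), ("t", "y")}


def simulate(n=20000, seed=0):
    """Simulated rows (w, t, y); the true effect of t on y is 2.0 by construction."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        w = 1 if rng.random() < 0.5 else 0
        t = 1 if rng.random() < (0.8 if w else 0.2) else 0
        y = 2.0 * t + 3.0 * w + rng.gauss(0, 1)
        rows.append((w, t, y))
    return rows


def common_causes(graph, treatment, outcome):
    """Step 2 (identify), simplified: direct parents shared by treatment and outcome."""
    def parents(node):
        return {src for src, dst in graph if dst == node}
    return parents(treatment) & parents(outcome)


def naive_difference(rows):
    """Difference in mean outcome between treated and untreated rows (no adjustment)."""
    treated = [y for _, t, y in rows if t == 1]
    control = [y for _, t, y in rows if t == 0]
    return mean(treated) - mean(control)


def backdoor_estimate(rows):
    """Step 3 (estimate): stratify on w and weight each stratum's difference by its size."""
    total = len(rows)
    effect = 0.0
    for level in (0, 1):
        stratum = [row for row in rows if row[0] == level]
        effect += len(stratum) / total * naive_difference(stratum)
    return effect


def placebo_refute(rows, seed=1):
    """Step 4 (refute): shuffle the treatment column; the estimate should fall near zero."""
    rng = random.Random(seed)
    shuffled = [t for _, t, _ in rows]
    rng.shuffle(shuffled)
    placebo_rows = [(w, t, y) for (w, _, y), t in zip(rows, shuffled)]
    return backdoor_estimate(placebo_rows)


if __name__ == "__main__":
    data = simulate()
    print("adjustment set:", common_causes(GRAPH, "t", "y"))
    print("naive difference:", round(naive_difference(data), 2))
    print("backdoor estimate:", round(backdoor_estimate(data), 2))
    print("placebo estimate:", round(placebo_refute(data), 2))
```

In this simulation the unadjusted difference is biased upward by the confounder, the stratified estimate should land close to the true effect of 2.0, and the placebo estimate should be close to zero. A placebo estimate far from zero would suggest that the estimation procedure is picking up something other than the treatment's effect.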


Publication Downloads

DoWhy: A library for causal inference

May 11, 2021

As computing systems intervene more frequently and more actively in societally critical domains such as healthcare, education, and governance, it is critical to correctly predict and understand the causal effects of these interventions. Without an A/B test, conventional machine learning methods, built on pattern recognition and correlational analyses, are insufficient for causal reasoning. Much as machine learning libraries have done for prediction, DoWhy is a Python library that aims to spark causal thinking and analysis. DoWhy provides a unified interface for causal inference methods and automatically tests many assumptions, making causal inference accessible to non-experts.

Foundations of causal inference and its impacts on machine learning webinar

Many key data science tasks are about decision-making. They require understanding the causes of an event and how to take action to improve future outcomes. Machine learning (ML) models rely on correlational patterns to predict the answer to a question but often fail at these decision-making tasks, as the very decisions and actions they drive change the patterns they rely on. Causal inference methods, in contrast, are designed to rely on patterns generated by stable and robust causal mechanisms, even as decisions and actions change. With insights gained from causal methods, the new, growing field of causal machine learning promises to address fundamental ML challenges in generalizability, interpretability, bias, and privacy. In this webinar, join Microsoft researchers Amit Sharma and Emre Kıcıman to learn about the fundamentals…