The simplest installation is through pip or conda:
.. tab-set-code:: .. code-block:: pip pip install dowhy .. code-block:: conda conda install -c conda-forge dowhy
Further installation scenarios and instructions can be found at :doc:`install`.
In this section, we will show the "Hello world" version of DoWhy. DoWhy is based on a simple unifying language for causal inference, unifying two powerful frameworks, namely graphical causal models (GCM) and potential outcomes (PO). It uses graph-based criteria and do-calculus for modeling assumptions and identifying a non-parametric causal effect.
To get you started, we introduce two features out of a large variety of features DoWhy offers.
For effect estimation, DoWhy switches to methods based primarily on potential outcomes. To do it, DoWhy offers a simple 4-step recipe consisting of modeling a causal model, identification, estimation, and refutation:
from dowhy import CausalModel
import dowhy.datasets
# Generate some sample data
data = dowhy.datasets.linear_dataset(
beta=10,
num_common_causes=5,
num_instruments=2,
num_samples=10000)
# Step 1: Create a causal model from the data and given graph.
model = CausalModel(
data=data["df"],
treatment=data["treatment_name"],
outcome=data["outcome_name"],
graph=data["gml_graph"])
# Step 2: Identify causal effect and return target estimands
identified_estimand = model.identify_effect()
# Step 3: Estimate the target estimand using a statistical method.
estimate = model.estimate_effect(identified_estimand,
method_name="backdoor.propensity_score_matching")
# Step 4: Refute the obtained estimate using multiple robustness checks.
refute_results = model.refute_estimate(identified_estimand, estimate,
method_name="random_common_cause")
To understand what these four steps mean (and why we need four steps), the best place to learn more is the user guide's :doc:`../user_guide/causal_tasks/estimating_causal_effects/index` chapter. Alternatively, you can dive into the code and explore basic features in :doc:`../example_notebooks/dowhy_simple_example`.
For estimation of conditional effects, you can also use methods from EconML using the same API, refer to :doc:`../example_notebooks/dowhy-conditional-treatment-effects`.
For features like root cause analysis, point-wise counterfactual inference, structural analysis and similar, DoWhy uses graphical causal models. The language of graphical causal models again offers a variety of causal questions that can be answered. DoWhy's API to answer these causal questions follows a simple 3-step recipe as follows:
import networkx as nx, numpy as np, pandas as pd
from dowhy import gcm
# Let's generate some "normal" data we assume we're given from our problem domain:
X = np.random.normal(loc=0, scale=1, size=1000)
Y = 2 * X + np.random.normal(loc=0, scale=1, size=1000)
Z = 3 * Y + np.random.normal(loc=0, scale=1, size=1000)
data = pd.DataFrame(dict(X=X, Y=Y, Z=Z))
# Step 1: Model our system:
causal_model = gcm.StructuralCausalModel(nx.DiGraph([('X', 'Y'), ('Y', 'Z')]))
gcm.auto.assign_causal_mechanisms(causal_model, data)
# Step 2: Train our causal model with the data from above:
gcm.fit(causal_model, data)
# Step 3: Perform a causal analysis. For instance, root cause analysis, where we observe
anomalous_sample = pd.DataFrame(dict(X=[0.1], Y=[6.2], Z=[19])) # Here, Y is the root cause.
# ... and would like to answer the question:
# "Which node is the root cause of the anomaly in Z?":
anomaly_attribution = gcm.attribute_anomalies(causal_model, "Z", anomalous_sample)
If you want to learn more about this and other GCM features, we recommend starting with :doc:`../user_guide/modeling_gcm/index` in the user guide or check out :doc:`../example_notebooks/gcm_basic_example`.
There are further resources available:
- An introductory tutorial on causal inference
- A comprehensive tutorial on Causal Inference and Counterfactual Reasoning at the ACM Knowledge Discovery and Data Mining 2018 conference
- A video introduction to the four steps of causal inference and its implications for machine learning from Microsoft Research: Foundations of causal inference and its impacts on machine learning
- The PDF book Elements of Causal Inference
- Draft chapters of an upcoming book: Causal reasoning: Fundamentals and machine learning applications
- A blog post describing one of DoWhy's root cause analysis algorithms via graphical causal models: New method identifies the root causes of statistical outliers