DoWhy is hosted on GitHub.
You can browse the code in a html-friendly format here.
- [Major] Faster backdoor identification with support for minimal adjustment, maximal adjustment or exhaustive search. More test coverage for identification.
- [Major] Added new functionality of causal discovery [Experimental]. DoWhy now supports discovery algorithms from external libraries like CDT. [Example notebook]
- [Major] Implemented ID algorithm for causal identification. [Experimental]
- Added friendly text-based interpretation for DoWhy's effect estimate.
- Added a new estimation method, distance matching that relies on a distance metrics between inputs.
- Heuristics to infer default parameters for refuters.
- Inferring default strata automatically for propensity score stratification.
- Added support for custom propensity models in propensity-based estimation methods.
- Bug fixes for confidence intervals for linear regression. Better version of bootstrap method.
- Allow effect estimation without need to refit the model for econml estimators
Big thanks to @AndrewC19, @ha2trinh, @siddhanthaldar, and @vojavocni
- [Major] Placebo refuter also works for IV methods
- [Major] Moved matplotlib to an optional dependency. Can be installed using pip install dowhy[plotting]
- [Major] A new method for generating unobserved confounder for refutation
- Update to align with EconML's new API
- All refuters now support control and treatment values for continuous treatments
- Better logging configuration
- Dummyoutcomerefuter supports unobserved confounder
A big thanks to @arshiaarya, @n8sty, @moprescu and @vojavocni
Installation
- DoWhy can be installed on Conda now!
Code
- Support for identification by mediation formula
- Support for the front-door criterion
- Linear estimation methods for mediation
- Generalized backdoor criterion implementation using paths and d-separation
- Added GLM estimators, including logistic regression
- New API for interpreting causal models, estimates and refuters. First interpreter by @ErikHambardzumyan visualizes how the distribution of confounder changes
- Friendlier error messages for propensity score stratification estimator when there is not enough data in a bin
- Enhancements to the dummy outcome refuter with machine learned components--now can simulate non-zero effects too. Ready for alpha testing
Docs
- New case studies using DoWhy on hotel booking cancellations and membership rewards programs
- New notebook on using DoWhy+EconML for estimating effect of multiple treatments
- A tutorial on causal inference using DoWhy and EconML
- Better organization of docs and notebooks on the documentation website.
Community
- Created a contributors page with guidelines for contributing
- Added allcontributors bot so that new contributors can added just after their pull requests are merged
A big thanks to @Tanmay-Kulkarni101, @ErikHambardzumyan, @Sid-darthvader for their contributions.
- DummyOutcomeRefuter now includes machine learning functions to increase power of the refutation.
- In addition to generating a random dummy outcome, now you can generate a dummyOutcome that is an arbitrary function of confounders but always independent of treatment, and then test whether the estimated treatment effect is zero. This is inspired by ideas from the T-learner.
- We also provide default machine learning-based methods to estimate such a dummyOutcome based on confounders. Of course, you can specify any custom ML method.
- Added a new BootstrapRefuter that simulates the issue of measurement error with confounders. Rather than a simple bootstrap, you can generate bootstrap samples with noise on the values of the confounders and check how sensitive the estimate is.
- The refuter supports custom selection of the confounders to add noise to.
- All refuters now provide confidence intervals and a significance value.
- Better support for heterogeneous effect libraries like EconML and CausalML
- All CausalML methods can be called directly from DoWhy, in addition to all methods from EconML.
- [Change to naming scheme for estimators] To achieve a consistent naming scheme for estimators, we suggest to prepend internal dowhy estimators with the string "dowhy". For example, "backdoor.dowhy.propensity_score_matching". Not a breaking change, so you can keep using the old naming scheme too.
- EconML-specific: Since EconML assumes that effect modifiers are a subset of confounders, a warning is issued if a user specifies effect modifiers outside of confounders and tries to use EconML methods.
- CI and Standard errors: Added bootstrap-based confidence intervals and standard errors for all methods. For linear regression estimator, also implemented the corresponding parametric forms.
- Convenience functions for getting confidence intervals, standard errors and conditional treatment effects (CATE), that can be called after fitting the estimator if needed
- Better coverage for tests. Also, tests are now seeded with a random seed, so more dependable tests.
Thanks to @Tanmay-Kulkarni101 and @Arshiaarya for their contributions!
This release includes many major updates:
- (BREAKING CHANGE) The CausalModel import is now simpler: "from dowhy import CausalModel"
- Multivariate treatments are now supported.
- Conditional Average Treatment Effects (CATE) can be estimated for any subset of the data. Includes integration with EconML--any method from EconML can be called using DoWhy through the estimate_effect method (see example notebook).
- Other than CATE, specific target estimands like ATT and ATC are also supported for many of the estimation methods.
- For reproducibility, you can specify a random seed for all refutation methods.
- Multiple bug fixes and updates to the documentation.
Includes contributions from @j-chou, @ktmud, @jrfiedler, @shounak112358, @Lnk2past. Thank you all!
This is the first release of the library.