.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/example1.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_example1.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_example1.py:


Apply and compare several estimators to conduct a causal mediation analysis
===========================================================================

Establish identifiability
*************************

**med_bench** implements several estimators for the natural direct and
indirect causal effects. Before moving forward with estimation, the
investigator should ensure that those causal effects are identified by
discussing the plausibility of the identification assumptions:

* SUTVA (Stable Unit Treatment Values Assumption)
* Sequential ignorability of the treatment and the mediator(s), which requires
  selecting an adequate set of confounding variables to adjust for
* Positivity of the treatment and the mediator.

In this example we take those assumptions as given and simulate a dataset
from the following data generating process, built from simple linear models.

.. math::

    X \sim \mathcal{N}(0, I_p)

.. math::

    T|X \sim \mbox{Bernoulli}(\mbox{expit}(a_0 + X^t a_X))

.. math::

    M|X, T \sim b_0 + X^t b_X + b_T T + \mathcal{N}(0, \sigma_M^2)

.. math::

    Y|X, T, M \sim c_0 + c_T T + X^t c_X + c_M M + \mathcal{N}(0, \sigma_Y^2)

We use the function :func:`simulate_data` to obtain a full simulated dataset
for mediation analysis.

.. GENERATED FROM PYTHON SOURCE LINES 26-51

.. code-block:: Python

    from med_bench.get_simulated_data import simulate_data
    from med_bench.estimation.mediation_coefficient_product import CoefficientProduct

    from numpy.random import default_rng
    import matplotlib.pyplot as plt
    import seaborn as sns
    import pandas as pd

    # Seeded random generator, so that the simulation is reproducible
    rg = default_rng(42)

    # Simulated covariates x, treatment t, mediator m and outcome y, followed
    # by the true effects and the true propensities
    (x, t, m, y, total, theta_1, theta_0, delta_1, delta_0, p_t, th_p_t_mx) = \
        simulate_data(n=500,
                      rg=rg,
                      mis_spec_m=False,
                      mis_spec_y=False,
                      dim_x=5,
                      dim_m=1,
                      seed=5,
                      type_m='continuous',
                      sigma_y=0.5,
                      sigma_m=0.5,
                      beta_t_factor=0.2,
                      beta_m_factor=5)

.. GENERATED FROM PYTHON SOURCE LINES 52-53

We can check the true values of the effects.

.. GENERATED FROM PYTHON SOURCE LINES 54-60

.. code-block:: Python

    # Template reused later to print the estimated effects
    print_effects = ('total effect: {:.2f}\n'
                     'direct effect: {:.2f}\n'
                     'indirect effect: {:.2f}')
    print('True effects')
    print(print_effects.format(total, theta_1, delta_0))

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    True effects
    total effect: 1.70
    direct effect: 1.20
    indirect effect: 0.50

.. GENERATED FROM PYTHON SOURCE LINES 61-62

Unlike the sequential ignorability assumption, the positivity assumption can
be checked empirically (such a check is not a guarantee, but it is a good
indication). We plot the distribution of :math:`P(T=1|X, M)` for the treated
and the untreated.

.. GENERATED FROM PYTHON SOURCE LINES 63-67

.. code-block:: Python

    # One row per individual: true propensity P(T=1|X, M) and treatment received
    th_df = pd.DataFrame(zip(th_p_t_mx, t.ravel()), columns=['th_p_t_mx', 't'])
    sns.displot(data=th_df, x='th_p_t_mx', hue='t')
    plt.show()

.. image-sg:: /auto_examples/images/sphx_glr_example1_001.png
   :alt: example1
   :srcset: /auto_examples/images/sphx_glr_example1_001.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 68-71

Apply a baseline causal mediation estimator to your data: the coefficient product
*********************************************************************************

We see that there are individuals from both treatment groups for all
propensity values, which supports (but does not prove) the positivity
assumption.
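
The visual check can be complemented with a few numbers. The sketch below is
not part of **med_bench**: ``overlap_summary`` and its ``eps`` threshold are
hypothetical names introduced here for illustration only. For each treatment
group it reports the range of the propensity values and the share of values
closer than ``eps`` to 0 or 1, together with the interval of propensities
covered by both groups.

```python
def overlap_summary(propensity, treatment, eps=0.05):
    """Summarise propensity overlap between the two treatment groups.

    `eps` is an illustrative threshold, not a value taken from the example.
    """
    summary = {}
    for group in (0, 1):
        # Propensity values of the individuals in this treatment group
        scores = [float(p) for p, t in zip(propensity, treatment)
                  if int(t) == group]
        if not scores:
            raise ValueError(f"no observation with treatment == {group}")
        # Count the values closer than eps to 0 or to 1
        extreme = sum(1 for p in scores if p < eps or p > 1 - eps)
        summary[group] = {
            "min": min(scores),
            "max": max(scores),
            "share_extreme": extreme / len(scores),
        }
    # Interval of propensity values covered by both groups
    low = max(summary[0]["min"], summary[1]["min"])
    high = min(summary[0]["max"], summary[1]["max"])
    summary["common_support"] = (low, high)
    return summary


# Toy values, made up for illustration only
toy_scores = [0.10, 0.30, 0.50, 0.70, 0.20, 0.60, 0.80, 0.97]
toy_treatment = [0, 0, 0, 0, 1, 1, 1, 1]
toy_summary = overlap_summary(toy_scores, toy_treatment)
print(toy_summary["common_support"])    # (0.2, 0.7)
print(toy_summary[1]["share_extreme"])  # 0.25
```

Assuming ``th_p_t_mx`` is one-dimensional, the same function could be applied
to the simulated data with ``overlap_summary(th_p_t_mx, t.ravel())``. A narrow
common support or a large share of extreme values would call for caution. As
with the plot, a reassuring summary does not prove positivity.
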
Let's now proceed with estimation. We begin with the simple coefficient
product approach.

.. GENERATED FROM PYTHON SOURCE LINES 72-80

.. code-block:: Python

    # Coefficient product estimator, without regularization
    estimator = CoefficientProduct(regularize=False)
    estimator.fit(t, m, x, y)
    # The estimated effects are returned as a dictionary
    causal_effects = estimator.estimate(t.ravel(), m, x, y.ravel())

    print('Estimated effects with the coefficient product')
    print(print_effects.format(causal_effects["total_effect"],
                               causal_effects["direct_effect_treated"],
                               causal_effects["indirect_effect_control"]))

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Nuisance models fitted
    Estimated effects with the coefficient product
    total effect: 1.78
    direct effect: 1.23
    indirect effect: 0.54

.. GENERATED FROM PYTHON SOURCE LINES 81-84

Comparison with the other estimators
************************************

Upcoming.

.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 0.325 seconds)


.. _sphx_glr_download_auto_examples_example1.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: example1.ipynb <example1.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: example1.py <example1.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: example1.zip <example1.zip>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_