.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/example1.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_example1.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_example1.py:


Apply and compare several estimators to conduct a causal mediation analysis
================================================================================

Establish identifiability 
*************************

**med_bench** implements several estimators for the natural direct and indirect causal effects. Before moving forward with estimation, the investigator should ensure that those causal effects are identified by discussing the plausibility of the identification assumptions:

* SUTVA (Stable Unit Treatment Value Assumption).
* Sequential ignorability of the treatment and the mediator(s), obtained by selecting an adequate set of confounding variables to adjust for.
* Positivity of the treatment and the mediator.

In this example we take these assumptions as given and simulate a dataset from the following data generating process, which uses simple linear models.

.. math::     X  \sim \mathcal{N}(0, I_p) 
.. math::     T|X  \sim \mbox{Bernoulli} (\mbox{expit}( a_0 + X^t a_X ))
.. math::     M|X, T  \sim b_0 + X^t b_X +  b_T T + \mathcal{N}(0, \sigma_M^2)
.. math::     Y|X, T, M  \sim c_0 + c_T T+ X^t c_X + c_M M + \mathcal{N}(0, \sigma_Y^2)
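
To make the four equations above concrete, here is a minimal standalone sketch of the same data generating process in plain NumPy. It is not the implementation of ``simulate_data``; the function name ``sketch_linear_mediation`` and all coefficient values are hypothetical and chosen only for illustration.

.. code-block:: Python

    import numpy as np


    def sketch_linear_mediation(n, p, rng):
        """Draw (X, T, M, Y) from a linear mediation model (illustrative only)."""
        # Hypothetical coefficients, not those used by med_bench
        a_0, b_0, c_0 = 0.0, 0.0, 0.0
        a_x = np.full(p, 0.3)
        b_x = np.full(p, 0.5)
        c_x = np.full(p, 0.5)
        b_t, c_t, c_m = 1.0, 1.2, 0.5
        sigma_m, sigma_y = 0.5, 0.5

        # X ~ N(0, I_p)
        x = rng.standard_normal((n, p))
        # T | X ~ Bernoulli(expit(a_0 + X^t a_X))
        p_t = 1.0 / (1.0 + np.exp(-(a_0 + x @ a_x)))
        t = rng.binomial(1, p_t)
        # M | X, T: linear in X and T, with Gaussian noise
        m = b_0 + x @ b_x + b_t * t + sigma_m * rng.standard_normal(n)
        # Y | X, T, M: linear in T, X and M, with Gaussian noise
        y = c_0 + c_t * t + x @ c_x + c_m * m + sigma_y * rng.standard_normal(n)
        return x, t, m, y


    x_sk, t_sk, m_sk, y_sk = sketch_linear_mediation(
        n=500, p=5, rng=np.random.default_rng(0))

The treatment depends on the covariates only, the mediator on the covariates and the treatment, and the outcome on all three, which is the ordering that sequential ignorability relies on.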

We use the function :func:`simulate_data <med_bench.get_simulated_data.simulate_data>` to obtain a full simulated dataset for mediation analysis.

.. GENERATED FROM PYTHON SOURCE LINES 26-51

.. code-block:: Python

    from med_bench.get_simulated_data import simulate_data
    from med_bench.estimation.mediation_coefficient_product import CoefficientProduct

    from numpy.random import default_rng
    import matplotlib.pyplot as plt
    import seaborn as sns
    import pandas as pd

    rg = default_rng(42)

    (x, t, m, y, total, theta_1, theta_0,
     delta_1, delta_0, p_t, th_p_t_mx) = \
        simulate_data(n=500,
                      rg=rg,
                      mis_spec_m=False,
                      mis_spec_y=False,
                      dim_x=5,
                      dim_m=1,
                      seed=5,
                      type_m='continuous',
                      sigma_y=0.5,
                      sigma_m=0.5,
                      beta_t_factor=0.2,
                      beta_m_factor=5)


.. GENERATED FROM PYTHON SOURCE LINES 52-53

We can check the true values of the effects.

.. GENERATED FROM PYTHON SOURCE LINES 54-60

.. code-block:: Python

    print_effects = ('total effect: {:.2f}\n'
                    'direct effect: {:.2f}\n'
                    'indirect effect: {:.2f}')
    print('True effects')
    print(print_effects.format(total, theta_1, delta_0))


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    True effects
    total effect: 1.70
    direct effect: 1.20
    indirect effect: 0.50
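
These values add up as expected: 1.20 + 0.50 = 1.70. As a brief reminder, and assuming that ``theta_1`` and ``delta_0`` follow the common potential-outcome convention (an assumption, since the notation is not defined in this example), the total effect decomposes as

.. math::

    \tau = E[Y(1, M(1)) - Y(0, M(0))]
         = \underbrace{E[Y(1, M(1)) - Y(0, M(1))]}_{\theta(1)}
         + \underbrace{E[Y(0, M(1)) - Y(0, M(0))]}_{\delta(0)}

Under the linear model written above, which has no treatment-mediator interaction, this gives :math:`\theta(1) = c_T` and :math:`\delta(0) = c_M b_T`.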


.. GENERATED FROM PYTHON SOURCE LINES 61-62

Unlike the sequential ignorability assumption, the positivity assumption can be checked empirically (this is not a guarantee, but it is a good indication). We plot the distribution of :math:`P(T=1|X, M)` for the treated and the untreated.
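
With real data the true propensities are unknown and have to be estimated. The following is a minimal sketch of one way to do so, using a logistic regression from scikit-learn on synthetic arrays. The variable names (``x_ov``, ``t_ov``, ``m_ov``), the synthetic data and the threshold ``eps`` are assumptions made for illustration, and scikit-learn is assumed to be installed.

.. code-block:: Python

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    # Synthetic data, for illustration only
    rng_ov = np.random.default_rng(0)
    n_ov = 1000
    x_ov = rng_ov.standard_normal((n_ov, 3))
    t_ov = rng_ov.binomial(1, 1.0 / (1.0 + np.exp(-x_ov[:, 0])))
    m_ov = 0.5 * x_ov[:, 1] + 1.0 * t_ov + 0.5 * rng_ov.standard_normal(n_ov)

    # Estimate P(T=1 | X, M) with a logistic regression on (X, M)
    features = np.column_stack([x_ov, m_ov])
    clf = LogisticRegression().fit(features, t_ov)
    p_hat = clf.predict_proba(features)[:, 1]

    # Range of the estimated propensities within each treatment group
    for group in (0, 1):
        p_group = p_hat[t_ov == group]
        print(f"t={group}: min={p_group.min():.3f}, max={p_group.max():.3f}")

    # Share of units with propensities very close to 0 or 1
    eps = 0.01
    share_extreme = np.mean((p_hat < eps) | (p_hat > 1 - eps))
    print(f"share of extreme propensities: {share_extreme:.3f}")

A large share of propensities near 0 or 1, or groups whose ranges barely overlap, would cast doubt on positivity. In this example the true propensities are available from the simulation, so they are plotted directly.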

.. GENERATED FROM PYTHON SOURCE LINES 63-67

.. code-block:: Python

    th_df = pd.DataFrame(zip(th_p_t_mx, t.ravel()), columns=['th_p_t_mx', 't'])
    sns.displot(data=th_df, x='th_p_t_mx', hue='t')
    plt.show()


.. image-sg:: /auto_examples/images/sphx_glr_example1_001.png
   :alt: example1
   :srcset: /auto_examples/images/sphx_glr_example1_001.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 68-71

We see that there are individuals from both treatment groups for all propensity values, which supports (but does not prove) the positivity assumption.

Apply a baseline causal mediation estimator to your data: the coefficient product
******************************************************************************************
Let's now proceed with estimation. We begin with the simple coefficient product approach.

.. GENERATED FROM PYTHON SOURCE LINES 72-80

.. code-block:: Python

    estimator = CoefficientProduct(regularize=False)
    estimator.fit(t, m, x, y)
    causal_effects = estimator.estimate(t.ravel(), m, x, y.ravel())
    print('Estimated effects with the coefficient product')
    print(print_effects.format(causal_effects["total_effect"],
                               causal_effects["direct_effect_treated"],
                               causal_effects["indirect_effect_control"]))


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Nuisance models fitted
    Estimated effects with the coefficient product
    total effect: 1.78
    direct effect: 1.23
    indirect effect: 0.54
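
For intuition, the classical product-of-coefficients idea can be sketched with two ordinary least squares fits: one for the mediator and one for the outcome. This is a standalone illustration on synthetic data, not the code of ``CoefficientProduct``; all variable names and coefficient values are hypothetical.

.. code-block:: Python

    import numpy as np

    # Synthetic linear mediation data with known (hypothetical) coefficients
    rng_cp = np.random.default_rng(0)
    n_cp, p_cp = 5000, 3
    b_t, c_t, c_m = 1.0, 1.2, 0.5
    x_cp = rng_cp.standard_normal((n_cp, p_cp))
    t_cp = rng_cp.binomial(1, 0.5, size=n_cp)
    m_cp = (0.5 * x_cp.sum(axis=1) + b_t * t_cp
            + 0.5 * rng_cp.standard_normal(n_cp))
    y_cp = (c_t * t_cp + 0.5 * x_cp.sum(axis=1) + c_m * m_cp
            + 0.5 * rng_cp.standard_normal(n_cp))

    ones = np.ones((n_cp, 1))

    # Mediator model: M ~ 1 + T + X, keep the coefficient of T
    design_m = np.column_stack([ones, t_cp, x_cp])
    coef_m, *_ = np.linalg.lstsq(design_m, m_cp, rcond=None)
    b_t_hat = coef_m[1]

    # Outcome model: Y ~ 1 + T + M + X, keep the coefficients of T and M
    design_y = np.column_stack([ones, t_cp, m_cp, x_cp])
    coef_y, *_ = np.linalg.lstsq(design_y, y_cp, rcond=None)
    c_t_hat, c_m_hat = coef_y[1], coef_y[2]

    # Direct effect: coefficient of T in the outcome model
    direct_hat = c_t_hat
    # Indirect effect: product of the T -> M and M -> Y coefficients
    indirect_hat = b_t_hat * c_m_hat
    total_hat = direct_hat + indirect_hat

    print(f"direct: {direct_hat:.2f}, indirect: {indirect_hat:.2f}, "
          f"total: {total_hat:.2f}")

This approach relies on both linear models being correctly specified and on the absence of a treatment-mediator interaction, which holds by construction in the simulation used here.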


.. GENERATED FROM PYTHON SOURCE LINES 81-84

Comparison with the other estimators
***************************************
Upcoming.


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 0.325 seconds)


.. _sphx_glr_download_auto_examples_example1.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: example1.ipynb <example1.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: example1.py <example1.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: example1.zip <example1.zip>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_