Causal mediation analysis estimating interventional effects mapped to a target trial
The R package medRCT
for causal mediation analysis supports the estimation of interventional effects (VanderWeele, Vansteelandt, and Robins 2014), specifically interventional effects that are defined such that they map explicitly to a “target trial” (Hernán and Robins 2016), as recently proposed by Moreno-Betancur et al. (2021). In the target trial, the treatment strategies are specified to reflect hypothetical interventions targeting and thus shifting the joint distribution of the mediators. medRCT
can accommodate any number of potentially correlated mediators, including mediators that are not of primary interest but that are intermediate (exposure-induced) mediator-outcome confounders.
Statement of need:
Causal mediation analysis generally seeks to investigate the extent to which the causal effect of an exposure on an outcome is mediated through intermediate variables. Natural (in)direct effects (Robins and Greenland 1992; Pearl 2001) were initially proposed as the estimands of interest in these analyses. Natural effects are defined based on cross-world counterfactuals (Robins and Richardson 2011) and their identifiability relies on a cross-world independence assumption. Given their reliance on cross-world counterfactuals, these effects have been criticized for not capturing the effects of interventions or policy measures that could be conducted in the real world (Naimi, Kaufman, and MacLehose 2014). Further, the independence assumption required can never be guaranteed, even in an experiment (Robins and Richardson 2011; Didelez, Dawid, and Geneletti 2006), and it renders the estimands unidentifiable in the common settings of exposure-induced mediator-outcome confounding and multiple mediators (Avin, Shpitser, and Pearl 2005; VanderWeele, Vansteelandt, and Robins 2014; Vansteelandt and VanderWeele 2012). However, in the context of multiple mediators, certain path-specific natural effects, also defined in terms of cross-world counterfactuals, can still be identified and may be of substantive interest (VanderWeele and Vansteelandt 2014).
Interventional effects have been proposed as an alternative to address these limitations. Firstly, these effects can be shown to map to a hypothetical randomized trial that evaluates the impact of hypothetical interventions shifting the distribution of the mediators (Moreno-Betancur and Carlin 2018). Secondly, interventional effects remain identifiable in the presence of exposure-induced mediator-outcome confounding and multiple interrelated mediators of interest.
The medRCT
package implements the estimation of interventional effects that are defined explicitly as effects in a hypothetical randomized trial (the target trial) , as proposed by Moreno-Betancur et al. (2021). This assists with clarifying the research question and ensuring that the study findings are meaningful and relevant to policy and practice. In the target trial, the treatment strategies are specified to reflect hypothetical interventions targeting and thus shifting the joint mediator distribution. The medRCT
package implements the estimation of interventional effects that correspond to effects of hypothetical interventions which:
shift the joint distribution of all mediators under exposure to that under no exposure,
shift the distribution of a specific mediator under exposure, given confounders, to match the corresponding distribution under no exposure, independent of and without considering flow-on effects on other mediators,
shift the distribution of a specific mediator under exposure, given confounders, to match the corresponding distribution under no exposure, while considering flow-on effects on causally descendant mediators.
medRCT
estimates these interventional effects using a Monte Carlo simulation-based g-computation approach. It should be noted that this method can be computationally intensive and is sensitive to model misspecification, as all nuisance parameters are estimated via restrictive parametric models.
Researchers should consider using medRCT
when their ultimate goal for conducting mediation analysis is to examine the effects of hypothetical interventions targeting multiple, potentially interdependent mediators.
Installation
The medRCT
package is not yet available on CRAN. You can install the latest stable version from GitHub using the following command:
remotes::install_github("T0ngChen/medRCT")
Example
Using a simulated dataset based on a published case study from the Longitudinal Study of Australian Children (Goldfeld et al. 2023), we illustrate how to use medRCT
to estimate the interventional effects that emulate a target trial. Specifically, we aim to estimate the difference in expected outcome (risk of child mental health problems) under exposure (low family socioeconomic position) with versus without a hypothetical intervention that individually shifts the distribution of each mediator (parental mental health and preschool attendance) to the levels in the unexposed (high family socioeconomic position), while accounting for baseline confounders, an intermediate (exposure-induced) mediator-outcome confounder (family stressful life events), and correlations amongst mediators.
We begin by loading the library and dataset, and defining the confounder vector.
# Load the medRCT package
library(medRCT)
# Set a seed for reproducibility
set.seed(2025)
# Display the first few rows of the dataset
head(LSACdata)
#> child_sex child_atsi mat_cob mat_engl mat_age sep fam_stress parent_mh
#> 1 0 1 0 0 1 0 0 0
#> 2 NA 0 0 0 NA 0 NA NA
#> 3 NA 0 0 0 NA 0 NA NA
#> 4 NA 0 0 0 NA 0 NA NA
#> 5 1 0 0 0 1 1 0 0
#> 6 1 0 0 0 1 0 1 1
#> preschool_att child_mh child_SDQscore
#> 1 1 0 8.924660
#> 2 0 0 7.349826
#> 3 0 1 12.824643
#> 4 0 0 6.611369
#> 5 0 1 10.329341
#> 6 0 1 13.552515
# Define confounders for the analysis
confounders <- c("child_sex", "child_atsi", "mat_cob", "mat_engl", "mat_age")
Next we run the analyses, estimating interventional effects for a hypothetical intervention that shifts the distribution of each mediator individually. Note 1: the dataset has missing data. Incomplete records are by default deleted before the analysis. Note 2: It is recommended to perform the analysis with at least 200 Monte Carlo simulations by setting mcsim = 200
. For illustration purposes, we use mcsim = 50
, which takes approximately 90 seconds to run.
# Estimate interventional effects for a hypothetical intervention
# that shifts the distribution of each mediator individually
med_res <- medRCT(
dat = LSACdata,
exposure = "sep",
outcome = "child_mh",
mediators = c("parent_mh", "preschool_att"),
intermediate_confs = "fam_stress", # intermediate confounders
confounders = confounders,
bootstrap = TRUE,
intervention_type = "shift_k",
mcsim = 50
)
#> Conducting complete case analysis, 2499 observations were excluded due to missing data.
#> Note: It is recommended to run analysis with no fewer than 200 Monte Carlo simulations.
# Summarise the results
summary(med_res)
#>
#> Estimated interventional indirect effect:
#>
#> Estimate Std. Error CI Lower CI Upper p-value
#> IIE_1 (p_trt - p_1) 0.011155 0.004181 0.002814 0.019203 0.0076
#> IIE_2 (p_trt - p_2) -0.000763 0.002501 -0.005443 0.004362 0.7604
#> TCE (p_trt - p_ctr) 0.128669 0.024554 0.082420 0.178668 1.6e-07
#>
#> Estimated interventional direct effect:
#>
#> Estimate Std. Error CI Lower CI Upper p-value
#> IDE_1 (p_1 - p_ctr) 0.1175 0.0247 0.0712 0.1679 1.9e-06
#> IDE_2 (p_2 - p_ctr) 0.1294 0.0244 0.0833 0.1789 1.1e-07
#>
#> Estimated expected outcome in each trial arm:
#>
#> Estimate Std. Error CI Lower CI Upper p-value
#> p_1 0.3302 0.0225 0.2872 0.3755 <2e-16
#> p_2 0.3421 0.0221 0.2995 0.3862 <2e-16
#> p_ctr 0.2127 0.0100 0.1922 0.2315 <2e-16
#> p_trt 0.3413 0.0223 0.2987 0.3860 <2e-16
#>
#> Sample Size: 2608
#>
#> Simulations: 50
#>
#> Effect Measure: Risk Difference
#> Results are based on all 100 bootstrap samples.
Based on the estimated interventional effect (IIE_1), a hypothetical intervention improving the mental health of parents of children from families with low socioeconomic position to the levels of those from families with high socioeconomic position could potentially prevent 1 per 100 cases of child mental health problems. Meanwhile, the effect of a hypothetical intervention on preschool attendance (IIE_2) is negligible.
For detailed guidance on using the package to handle more complex scenarios, please refer to the vignette.
Citation
For work involving the medRCT
R package, please cite the following:
@article{Chen2025medRCT,
author = {Tong Chen and S. Ghazaleh Dashti and Margarita Moreno-Betancur},
title = {{medRCT}: Causal mediation analysis estimating interventional effects mapped to a target trial in {R}},
year = {2025},
doi = {10.21105/joss.08063},
url = {https://doi.org/10.21105/joss.08063},
journal = {Journal of Open Source Software},
volume = {10},
number = {110},
pages = {8063},
publisher = {The Open Journal}
}
@article{Moreno2021Mediation,
author={Margarita Moreno-Betancur and Paul Moran and Denise Becker and George C Patton and John B Carlin},
title={Mediation effects that emulate a target randomised trial: Simulation-based evaluation of ill-defined interventions on multiple mediators},
journal={Statistical Methods in Medical Research},
volume={30},
number={6},
pages={1395--1412},
year={2021},
URL={https://doi.org/10.1177/0962280221998409},
doi={10.1177/0962280221998409},
publisher={SAGE Publications Ltd}
}