Information

W01: Continuous Treatment Effects with Right-Censored Data: Uncertainty Quantification and Sensitivity Analysis

Author: Yiqi Wu, School of Data Sciences, Zhejiang University of Finance and Economics, China

Abstract

In precision medicine, robust uncertainty quantification for individual treatment effects (ITEs) under continuous treatments with right-censored outcomes remains a key challenge. We extend weighted conformal prediction to this setting via a propensity score-based weighting scheme, handling both known and unknown propensity scores. The proposed method constructs valid prediction intervals for potential outcomes with proven finite-sample coverage guarantees under standard causal assumptions. A central innovation is a sensitivity analysis framework quantifying ITE interval robustness to unconfoundedness violations. This allows practitioners to assess the impact of hidden confounding in real-world applications. Empirical evaluations on synthetic and real-world datasets demonstrate that our approach maintains reliable coverage probabilities and yields practically informative interval lengths across various censoring and confounding scenarios, bridging rigorous causal inference with distribution-free uncertainty quantification. Keywords: conformal prediction; continuous treatment effects; right-censored data; sensitivity analysis; precision medicine

Coauthors: Ji Luo(School of Data Sciences, Zhejiang University of Finance and Economics, China)

W02: Patient-Level Time–Varying Treatment Effects for Right-Censored Outcomes

Author: Alex Fernandes, Université Paris Cité and Université Sorbonne Paris Nord, Inserm, INRAE, Centre for Research in Epidemiology and Statistics (CRESS), 75004, Paris, France

Abstract

Surv-CATE addresses individualized, time-varying conditional average treatment effect (CATE) estimation for right-censored time-to-event outcomes. We formalize three patient-level causal targets: (i) a treatment-effect curve over follow-up time (ii) a time-varying conditional restricted mean survival time difference (CRMST difference) and (iii) time‑varying conditional probability of effect (CPOE). We establish identifiability results for the two first and identifiable bounds for the last. Moreover, we propose Surv–CATE, an estimator for the effect curve based an inverse-probability-of-censoring weighting adaptation of the R-learner. To capture heterogeneous and non-proportional treatment dynamics, Surv–CATE uses a covariate-gated mixture representation of survival trajectories. We evaluate Surv–CATE in simulations calibrated to a stroke trial and on a real-world coronary revascularization cohort comparing coronary artery bypass grafting with percutaneous coronary intervention.

Coauthors: François Grolleau, Department of Biomedical Data Science, Stanford University

W03: Average Mixed Derivative: A Nonparametric Framework of Interactivity

Author: Alissa Gordon, Division of Biostatistics, University of California Berkeley, USA

Abstract

There are many ways that multiple theoretically intervenable variables can synergize or antagonize to affect an outcome. We review several definitions of “interaction” and propose a new estimand called the average mixed derivative (AMD) that captures a scalar notion of interactivity within a nonparametric framework. The AMD generalizes the traditional regression-based interaction product term in the same way that the average derivative effect generalizes a main-effect coefficient. We identify the AMD under standard causal assumptions and develop regression, weighting, and doubly-robust type estimators for the AMD, focusing on continuous exposures and outcomes. We leverage recent advancements in Riesz regression for the use of off-the-shelf machine learning methods and demonstrate favorable finite sample properties through simulation study. This approach will allow researchers to reliably assess interaction effects of interventions and make informed decisions when combining multiple exposures.

Coauthors: Alejandro Schuler (Division of Biostatistics, University of California Berkeley, USA); Oliver Hines (Department of Biostatistics, Columbia University, USA)

W04: Robust Causal Structure Discovery Under Mixed Data Types and Latent Confounding: A Comparative Evaluation Using Bootstrap Stability and Subset Agreement Metrics

Author: Alix Pelletier-Roland, Master’s student, ETH Zürich, Switzerland/ Compo Inria Aix-Marseille Université

Abstract

Recovering causal structure from observational biomedical data is difficult because variables are mixed type, missingness is common, relations are nonlinear and latent confounding may be present. I compare major causal discovery families on pre treatment clinical and biomarker data: constraint based algorithms, score based search, information theoretic methods, functional causal models and continuous optimization approaches. To assess robustness without ground truth, I use two perturbation strategies. First, bootstrap resampling of patients and comparison of each bootstrap graph with the full data graph using structural Hamming distance, skeleton F1 and edge F1. Second, exhaustive analysis of all three variable subsets, where each method is compared to the latent projection of its full graph on the same subset. Methods that account for latent variables or multivariate information, such as FCI, GFCI and iMIIC, show the most stable and interpretable structures. Optimization and deep learning methods capture nonlinear effects but tend to be less stable or overly dense. The results (with the associated causal app created) provide guidance for selecting reliable causal discovery methods in noisy observational settings.

W05: Learning Directed Graph Structure from Mixed Continuous–Discrete Data with Predictive Coding

Author: Amine M’Charrak, University of Oxford, Computer Science, United Kingdom

Abstract

Causal discovery from observational data with both continuous and discrete variables remains difficult in practice. Constraint-based pipelines depend on mixed-type conditional-independence tests, while score-based methods often require specialised likelihoods or separate handling of node types, complicating optimisation and downstream effect estimation. We propose PredCoM, a predictive-coding framework for learning sparse weighted directed acyclic graphs from mixed data by minimising a single regularised negative log-likelihood built from node-wise exponential-family conditionals. Continuous nodes use Gaussian losses and discrete nodes use logistic losses, yielding one differentiable objective over a shared weight matrix. PredCoM optimises this objective via predictive-coding dynamics that update edge weights from local prediction errors, together with a smooth acyclicity penalty that supports continuous optimisation without discrete graph search. The learned weighted graph can be used directly for structure evaluations and for estimating total causal effects under standard adjustment. Across mixed-confounding benchmarks and random graph families with varying discrete proportions, PredCoM is competitive with established baselines.

Coauthors: Tommaso Salvatori (Verses AI), Bayar Menzat (Vienna University of Technology), Thomas Lukasiewicz (University of Oxford, Vienna University of Technology)

W06: Generalising estimates from selectively sampled data: an application to psoriatic arthritis

Author: Annie Russell, University of Bath

Abstract

When key effect modifiers influence both study participation and treatment response, observational estimates may not generalise to the target population. In such settings, standard analyses recover effects for the analysis cohort rather than the population of interest, limiting clinical interpretability. We illustrate this problem using the SEQUENCE study, a novel observational comparison of treatment response to early- versus later-line immunotherapy in psoriatic arthritis (PsA). In SEQUENCE, patients receiving later-line therapies were purposively over-sampled to balance treatment groups. Because selection depended on the number of prior therapy lines—an effect modifier of treatment response—naïve estimates of response to therapy are biased. Using simulated data reflecting the SEQUENCE sampling mechanism, we propose inverse probability of participation weighting within a logistic generalised estimating equations framework to reweight the analysis sample so that the distribution of prior therapy lines matches a prespecified target PsA population. This approach yields interpretable population-level contrasts and demonstrates how generalisability methods can mitigate selection bias in real-world studies of sequential therapies.

W07: On the role of the Potential Outcome Association Structure for Principal Causal Effects

Author: Arianna Nuti, Department of Statistics, Computer Science, Applications, University of Florence

Abstract

In clinical trials, subgroup analyses based on biomarkers that may lie on the causal pathway between treatment and outcome can provide valuable clinical insight. Assessing causal effects among patients who respond to treatment by biomarker level, namely those with a sufficient biomarker reduction, may help physicians tailor therapy early. We formalize this question using principal stratification, recognized in the ICH E9(R1) addendum as an estimand strategy for post-treatment variables. The target estimand is the principal average causal effect for responders, defined as patients whose biomarker under treatment falls below a pre-specified threshold. We study how this effect depends on the association between the potential outcomes for the biomarker and the primary endpoint. Assuming a multivariate normal joint distribution of all potential outcomes, we explicitly express the causal effect as a function of the association parameters. We derive a closed-form formula showing sensitivity to non-identifiable association parameters. Our results clarify their role and show how causal conclusions depend on assumptions about them. These findings may extend beyond the normal setting to assess the plausibility of such assumptions.

Coauthors: Alessandra Mattei, Department of Statistics, Computer Science, Applications, University of Florence; Bjoern Bornkamp, Novartis Pharma AG, Basel, Switzerland; Tianmeng Lyu, Novartis Pharmaceuticals Corporation, East Hanover, New Jersey, USA

W08: Powerful Multivariate Sensitivity Analysis in an Observational Study of the Effects of Poverty on Cardiovascular Disease Risk Factors

Author: William Bekerman, Department of Statistics and Data Science, University of Pennsylvania, USA

Abstract

When assessing the causal effect of an exposure on two or more outcomes in an observational study, a linear combination of outcomes may lessen the sensitivity of a test of the global null hypothesis to potential unmeasured biases. While all linear combinations of scored outcomes can be considered using Scheff'e projections, or constrained variants thereof, finding the contrast that minimizes sensitivity to unmeasured biases requires corrections for multiple testing which can erode power, especially when many outcomes are of interest. To mitigate this issue, we propose splitting the sample into a planning sample to identify the optimal contrast and an analysis sample to conduct inference. We introduce a novel minimax theorem for this problem and find that the design sensitivity on the whole sample equals the design sensitivity when using split samples. We also conduct extensive simulation studies demonstrating enhanced power in finite samples. Finally, we apply our method to investigate the effects of low family income on the emergence of risk factors for cardiovascular disease in children and adolescents.

Coauthors: Anurag Mehta; Division of Cardiology, Emory University School of Medicine, USA Dylan S. Small; Department of Statistics and Data Science, University of Pennsylvania, USA Colin B. Fogarty; Department of Statistics, University of Michigan, USA

W09: Interpretable Radiomic Survival Analysis in Lung Cancer Using Deep Structural Causal Models

Author: Charlie Johnson, Department of Surgery and Cancer, Imperial College London, United Kingdom

Abstract

Radiomics enables quantitative characterization of tumors for precision care, yet most radiomic biomarkers lack causal justification. We propose a deep structural causal model (DSCM) that couples (i) a learned directed acyclic graph (DAG) over radiomic and clinical covariates with (ii) a causal autoregressive covariate generator trained via rectified-flow matching, and (iii) a discrete-time survival head that outputs counterfactual survival curves. Counterfactual queries follow abduction→intervention→prediction: we infer subject-specific latents, apply do(·) to a target feature, propagate effects to descendants under the DAG, and predict the resulting survival distribution. In simulated data with time-varying confounding and known ground-truth mechanisms, feature-level ATE and CATE estimates and counterfactual survival curves closely match ground truth (MSE<0.05). We then identified 1,173 adults with primary lung cancer and screened 1,409 features from CT tumor regions of interest. Applying the framework to seven survival-enriched candidates, the learned DAG constrains feasible interventions while the DSCM transports induced feature changes into counterfactual survival curves, supporting interpretable radiomic effect assessment.

Coauthors: Mitchell Chen, Department of Surgery and Cancer, Imperial College London Ainkaran Santhirasekaram, Institute of Clinical Sciences, Imperial College London

W10: Joint Modeling of Intervention Assignment and Outcomes: A Bayesian Causal Factor Approach

Author: Constantin Schmidt, MCR Biostatistics Unit, University of Cambridge

Abstract

We propose a novel Bayesian causal factor model for evaluating non-randomised interventions using time-series cross-sectional data. By directly imputing counterfactual outcomes for post-intervention observations, we estimate finite-sample estimands, such as individual and average treatment effects on the treated. Unlike existing approaches that rely solely on pre-intervention outcomes, our method jointly models the intervention assignment process alongside both pre- and post-intervention outcomes. This yields three distinct advantages. First, it improves the coverage of credible intervals, which we demonstrate via a simulation study. Second, it increases estimation efficiency for shared parameters. Third, it enables modelling the joint distribution of outcomes. To account for uncertainty in the joint distribution of potential outcomes, we introduce a novel, copula-based approach. Our method handles binary, count, and continuous data. We apply our method to evaluate an intervention promoting adherence to World Health Organisation antibiotic guidelines in a large Vietnamese hospital. We find evidence that the intervention reduced the number of antibiotic doses prescribed, but not the duration of treatment.

Coauthors: Pantelis Samartsidis (University of Cambridge); Tu Nguyen Thi Cam (Oxford University Clinical Research Unit, Hanoi); Huong Vu Thi Lan (Oxford University Clinical Research Unit, Hanoi); Sonia Lewycka (Oxford University Clinical Research Unit, Hanoi); Shaun R. Seaman (University of Cambridge); and Daniela De Angelis (University of Cambridge).

W11: Partial identification of causal effects in the presence of multiple versions of treatment

Author: Chang Wei, Erasmus MC, Department of Epidemiology

Abstract

To well define a causal question and estimate the causal effect, the treatment variation irrelevance assumption (TVIA) is required: all versions of the treatment are assumed to have the same effect on the outcome for all units. However, violations of TVIA are common in practice. Although the framework for multiple versions of treatment has been proposed, the corresponding target estimand represents an overall effect of the compound treatment. In contrast, in applied settings where only specific treatment versions are to be administered, version-specific effects are of primary interest. When individual-level information on treatment versions is unavailable, in the context of studies with a dichotomized treatment with multiple versions within one treatment arm, we derive large-sample bounds for version-specific treatment effects. We show that, for a given treatment version, the bounds can be obtained using knowledge of a lower bound on the proportion of that version in the study population, which may be obtained from external data or domain expertise. These bounds can be further narrowed under additional nonparametric assumptions.

Coauthors: Jeremy A. Labrecque

W12: DoCluster: From Algebraic Mutations to Causal Representation Learning in Computational Biology

Author: Aditya Raj Dash, Department of Mathematics and Computer Science, Freie University of Berlin, Germany

Abstract

Causal Inference primarily reasons about cause–effect relationships. These relationships rely on interventions and are commonly formalized via Judea Pearl’s do-operator. However, the do-calculus framework assumes acyclicity and identifiability, which may limit its applicability in complex domains such as biology, where feedback loops and regulatory interactions are prevalent. We propose a novel bridge between cluster algebras and causal modeling. Cluster algebras are commutative rings with a set of generators that possess a remarkable combinatorial structure. We introduce the idea of Cluster Causal Mutations (CCM), where algebraic mutation operations serve as a generalized extension of the standard do-operator. We show that CCMs preserve the Laurent phenomenon. This preservation ensures identifiability in structural equation models by expressing variables as Laurent polynomials of the initial seed variables. We provide empirical results on synthetic data and preliminary experiments on gene regulatory networks, demonstrating that CCMs capture richer causal structures and semantics than do-calculus in systems with feedback. The framework opens a new algebraic perspective on causal representation learning & uncover latent structures.

W13: Temporal Regime Causal Discovery for Electricity Price Estimation

Author: Dennis Thumm, National University of Singapore

Abstract

We present a novel supervised causal discovery method in electricity markets. Traditional machine learning approaches to electricity price modeling provide accurate predictions but lack interpretability and causal semantics. We address this gap by introducing a target constraint that restricts the response variable to be a sink node in the learned causal graph, enabling simultaneous causal structure learning and prediction. Our method handles non-Gaussian noise distributions, instantaneous effects, and autonomous regime detection—capabilities not jointly available in existing approaches. Applied to German and French electricity markets, our approach achieves Spearman correlations of 0.484 and 0.296 respectively, representing 205% and 86% improvements over linear baselines. The discovered causal structures reveal interpretable drivers: wind power and residual load dominate instantaneous effects, while coal returns emerge as the strongest lagged predictor. Regime detection identifies distinct market phases with carbon-driven versus renewable-driven causal mechanisms. Our framework bridges causal discovery and supervised learning, providing a foundation for interventional policy analysis in energy markets.

W14: Target trial emulation of the FIRST-line support for Assistance in Breathing in Children (FIRST-ABC) step-up randomised clinical trial using Paediatric Intensive Care Audit Network (PICANet) routine data: a cautionary tale

Author: Elisa Giallongo, Intensive Care National Audit & Research Centre (ICNARC), London, UK

Abstract

The target trial methodology addresses the need for stronger causal inference from observational studies as alternatives to RCTs. This study is part of a wider project integrating information from a target trial emulation (TTE) and its emulated RCT using Bayesian dynamic borrowing. As a motivating example, we emulated the FIRST-ABC step-up trial using high-quality routinely collected PICANet data, comparing two modes of noninvasive respiratory support on time to liberation. Although we carefully designed the study with clinical expertise, adhered closely to the target trial framework, and applied advanced statistical methods, the primary effect estimate differed markedly from the RCT findings. To unpick why this happened, we critically appraised the emulation against its benchmarking trial using the ROBINS-I tool in each target trial framework component, unravelling sources of bias from eligibility criteria, outcome measurement, and unobserved confounders. The assessment revealed the challenges of replicating an ideal trial from routine data, uncovering hidden pitfalls at every stage of the emulation process. We offer a transparent walkthrough and lessons learned, hoping to help future researchers in designing and conducting TTEs.

Coauthors: Dr Orlagh Carroll PhD², Prof Padmanabhan Ramnarayan MD³, Dr Rebecca Mitting MRCPCH⁴, Dr Sarah E Seaton PhD⁵, Mr Dermot Shortt BE⁶, Dr Alexina J Mason PhD¹, Prof David A Harrison PhD¹, Prof Richard Grieve PhD⁷ Affiliations ¹Intensive Care National Audit & Research Centre (ICNARC), London, UK ²Department of Infectious Disease Epidemiology and International Health, London School of Hygiene & Tropical Medicine, London, UK ³Department of Surgery and Cancer, Imperial College London, London, UK ⁴Paediatric Intensive Care Unit, Imperial College Healthcare NHS Trust, London, UK ⁵Department of Population Health Sciences, University of Leicester, Leicester, UK ⁶Patient representative, c/o Clinical Trials Unit, Intensive Care National Audit & Research Centre (ICNARC), London, UK ⁷Department of Health Services Research and Policy, London School of Hygiene & Tropical Medicine, London, UK

W15: Are cash reallocations effective amid multiple shocks? A double selection approach

Author: Ellestina Jumbe, Department of Economics and Finance, University of Rome Tor Vergata, Italy

Abstract

In the literature, there is limited evidence on how cash reallocations respond to unanticipated shocks during the project implementation phase. In practice, reprogramming tends to be rigid due to logistical constraints, despite the need for flexibility in fragile environments. This paper employs a causal machine learning approach to evaluate whether cash reallocations targeting pregnant and lactating women in Myanmar were effective amid COVID-19 and armed conflict shocks. Results show an increase of 5 to 6 points in food consumption score among households located farther from conflict zones while no gains are observed among those within 4 km of conflict areas. Households that did not report any shock saw an increase in per capita food expenditure between 6 to 7 USD. These effects are statistically significant only when machine learning methods are used and remain insignificant under conventional estimation. The findings highlight: (i) the need for adaptive social protection mechanisms in fragile settings during unanticipated shocks; and (ii) the value of machine learning techniques to achieve more precise estimates by controlling for high-dimensional confounders, enhancing the reliability of impact evaluations.

W16: Comparison of prevalent new-user and sequential trial designs for studying the real-world effects of treatment discontinuation in cystic fibrosis

Author: Elliot McClenaghan, Department of Medical Statistics, London School of Hygiene & Tropical Medicine (LSHTM), UK

Abstract

Background: Optimal application of target trial emulation to estimate causal effects of treatment discontinuation remains unclear, particularly given challenges such as defining time zero. Objectives: To compare estimands, implementation, and interpretation of two designs for studying discontinuation: (i) prevalent new-user design and (ii) sequential trials design. Methods: We present a cohort study in cystic fibrosis, based on the CF-WISE randomised controlled trial. Using the UK CF Registry, we estimate the effect of discontinuing inhaled corticosteroids on time to first pulmonary exacerbation. We explicitly contrast the estimands and analytical implications of each design. Results: The PNUD targets the average treatment effect in the treated (ATT); how outcomes among discontinuers would have differed on average, had they continued. The sequential trials design targets the average treatment effect in the population (ATE); how outcomes would differ if all individuals discontinued versus continued treatment. Estimating effects of discontinuation involves additional complexities compared with treatment initiation. Clear specification of target trial protocols is essential to ensure emulations address a relevant causal question.

Coauthors: Julie Rouette (Global Epidemiology, GSK, Montreal, Canada). Emily Granger (Department of Medical Statistics, LSHTM, London, UK). Gwyneth Davies (Population, Policy & Practice Dept, UCL Great Ormond Street Institute of Child Health, UCL Great Ormond Street Institute of Child Health, London, UK). Ruth Keogh (Department of Medical Statistics, LSHTM, London, UK). John Tazare (Department of Medical Statistics, LSHTM, London, UK).

W17: On the estimation of principal causal effects with survival intermediate variables under principal ignorability

Author: Emma Torrini, University of Florence, Italy

Abstract

Randomized clinical trials often involve survival variables, which typically represent the endpoint of interest. Post-randomization events may complicate the statistical analysis, and principal stratification has been proposed as a framework to address such situations. However, its applicability when post-randomization variables are themselves of a survival nature remains underexplored. We study the identifiability and estimation of principal causal effects in the presence of both an intermediate survival variable and a survival outcome, adopting a novel version of the principal ignorability assumption. We first explore connections with the principal score literature, discussing how commonly invoked assumptions are insufficient to nonparametrically identify principal causal effects when both the intermediate and outcome variables are subject to censoring. We then propose to address the inferential challenges using a likelihood-based approach which exploits finite mixture models. We maintain a principal ignorability assumption but we do not impose any monotonicity assumption, and we discuss how our approach enables investigating the plausibility of meaningful monotonicity assumptions.

Coauthors: Alessandra Mattei (University of Florence, Italy)

W18: Causal ICM: A Gaussian Process Framework for Sensitivity Analysis in Data Fusion

Author: Evangelos Dimitriou, Department of Statistical Science, University College London

Abstract

Assessing the robustness of causal conclusions to unobserved confounding is central in causal inference, particularly when combining small but unbiased randomised controlled trials (RCTs) with large but potentially biased real-world data (RWD). We propose the Causal Intrinsic Coregionalization Model (Causal-ICM), a Bayesian framework based on multi-task Gaussian processes that jointly models outcomes from experimental and observational studies. A key component is a sensitivity parameter, ρ, which captures the correlation between latent outcome functions across domains and controls information sharing, preventing observational data from dominating the RCT signal. We study both the Conditional Average Treatment Effect (CATE) and the Average Treatment Effect (ATE), characterising their posterior distribution as functions of ρ. We derive bounds on changes in posterior mean and variance induced by incorporating observational data and show that the posterior variance remains strictly positive even with infinite observational sample size. We demonstrate the performance of Causal-ICM through simulations and a real-world RCT, highlighting its robust performance in data fusion and sensitivity analysis.

Coauthors: Edwin Fong, Department of Statistics and Actuarial Science, University of Hong Kong Jens Magelund Tarp, Novo Nordisk A/S Karla Diaz-Ordaz, Department of Statistical Science, University College London Brieuc Lehmann, Department of Statistical Science, University College London

W19: Towards Proximal Reinforcement Learning

Author: Ewan Burns, Computer Science, UCL, United Kingdom

Abstract

In this work we explore causal world models under unobserved confounding.

In model-based RL we usually aim to learn a transition model P(st+1∣st,at) which we use for planning and counterfactual reasoning.

However, in real-world settings there are latent variables influencing both actions and transitions.

Proximal causal inference could be used for identification of optimal policies using proxies and negative control variables.

Furthermore, partial identification could provide a way to give bounds in settings where full identification is not possible.

Our initial work focuses on SCMs as simple world models with latent confounders and observed proxies.

Our contribution is to explore under what proximal assumptions we can identify counterfactuals. We also study settings where assumptions are too strong where causal bounds on those quantities can help with optimal decision making.

Finally, we attempt initial empirical illustrations via toy environments.

Coauthors: Jakob Zeitler (Department of Statistics, Oxford, United Kingdom)

W20: A distance metric approach for valid IV selection in two-sample Mendelian randomisation

Author: Fatima Kasenally, Department of Statistics, University of Oxford, United Kingdom

Abstract

Selecting valid instruments is a central challenge in two-sample Mendelian randomisation (MR) when a sub-set of genetic variants are subject to horizontal pleiotropy. We introduce two asymptotically consistent and computationally efficient selection procedures that recover the largest homogeneous set of valid instruments under the plurality rule. Both procedures are built around a pairwise distance metric between per-variant Wald ratio estimators and employ Cochran’s Q test as an explicit stopping and validation mechanism. The first procedure uses a greedy clustering algorithm with upwards selection, while the second adopts a voting-based approach with downwards elimination. Compared to existing outlier detection methods, including MR-PRESSO, our framework offers improved computational scalability and a unified statistical justification for instrument selection in two-sample MR.

Coauthors: Frank Windmeijer, Department of Statistics, University of Oxford, United Kingdom

W21: Subsidizing Public Transport: Short-Term Gains, Long-Term Impacts?

Author: Giovanni Rodriguez-Fernandez, Complutense University of Madrid, Spain

Abstract

This paper provides causal evidence on the short- and long-term effects of a long-duration public transport on travel behavior. Using detailed individual-level mobility data from the 2015 quasi-experiment in Madrid for 25-year-olds and a regression discontinuity design, I analyze the impact on public transport, car usage, and walking both during and after the subsidy period. The subsidy increases public transport use by 6.3 percentage points and reduces car use by 6.5 percentage points in the short term. The increase in public transport use declines to 4.5 percentage points six months after eligibility ends and persists thereafter only among prior users whose choices are driven by external factors such as price, time, or traffic. Our results show that subsidies generate substantial temporary modal shifts but fail to durably change travel habits, highlighting the limits of temporary incentives and providing actionable insights for designing targeted and sustainable urban mobility policies.

W22: Assessing Short-Term Effects of Air Pollution on Health: a kernel Matching Estimator for Continuous Exposures

Author: Giulio Biscardi, Department of Informatics, Statistics and Application (DISIA), Florence

Abstract

In studies of the short-term effects of environmental pollutants on health outcomes, the shape of the average dose-response function (aDRF) has important regulatory implications, and it is essential for quantifying the health impact of exposures. Within the potential outcomes framework for continuous exposures, and under local unconfoundedness and overlap assumptions, we propose a novel local generalized propensity score (GPS) kernel matching approach. The method imputes potential outcomes at selected exposure levels using kernel-weighted averages based on both GPS and exposure proximity, and then recovers the aDRF through a flexible outcome–exposure model. In simulations, we find that our method outperforms existing approaches, including the GPS matching estimator recently developed by Wu et al. (2024), our closest competitor. We apply the method to evaluate the short-term effects of PM10 on natural deaths from all causes in the metropolitan area of Milan (Italy, 2003-2006). The estimated aDRF shows a nonlinear shape and a positive slope indicating that higher PM10 concentrations result in progressively worse health outcomes, as reflected in higher mortality.

Coauthors: Michela Baccini Department of Informatics, Statistics and Application (DISIA), Alessandra Mattei Department of Informatics, Statistics and Application (DISIA), Florence

W23: Possibilistic Instrumental Variable Regression

Author: Gregor Steiner, University of Warwick

Abstract

Instrumental variable regression is a common approach for causal inference in the presence of unobserved confounding. However, identifying valid instruments is often difficult in practice. We propose a novel method based on possibility theory that performs posterior inference on the treatment effect, conditional on a user-specified set of potential violations of the exogeneity assumption. Our method can provide valid results even when only a single, potentially invalid, instrument is available. A theoretical coverage guarantee for our uncertainty intervals holds if the violation set contains the true value. Simulation experiments and real-data applications indicate strong performance of the proposed approach.

Coauthors: Jeremie Houssineau, Nanyang Technological University Mark Steel, University of Warwick

W24: Partial Identification of Conditional Average Treatment Effects under Hidden Confounding and Transportability Violations

Author: Guruvignesh Balaji, Statistical Science, University College London, United Kingdom

Abstract

Recent sensitivity analysis methods leverage a “gold standard’’ randomised controlled trial (RCT) to provide a lower bound on hidden confounding bias in an observational study (OS). However, these methods rely on the strong assumption of transportability, or conditional exchangeability for study selection. In practice, while well-executed RCTs are internally valid by design, they are often not representative of the broader population due to strict eligibility criteria, threatening their external validity. Conversely, an OS is typically more representative of the target population but, due to unmeasured confounding, may lack internal validity. Here, we explore how discrepancies between the conditional average treatment effect of an RCT and an OS can arise not only due to hidden confounding but also through transportability violations. We examine how different sensitivity models can be employed to obtain bounds for these discrepancies, and investigate empirically how the smoothness of the outcome model dictates the tightness and meaningfulness of these bounds through synthetic experiments. Finally, we suggest ways in which the RCT and OS data can be combined to obtain bounds on sensitivity parameters.

Coauthors: Brieuc Lehmann, Statistical Science, University College London & Karla Diaz-Ordaz, Statistical Science, University College London

W25: Causal Graphs for Conditional Parallel Trends

Author: Henri Pfleiderer, Data Science in Economics, University of Tübingen, Germany

Abstract

Difference-in-Differences is a common research design for causal inference that often relies on a conditional parallel trends (CPT) assumption. In contrast to other designs where conditional independence assumptions can be motivated by causal graphs, graphical tools for deriving valid adjustment sets that justify CPT are missing. We introduce transformed Single World Intervention Graphs (SWIGs), the ∆-SWIGs, and prove that they allow to read off conditional independencies implying CPT via d-separation, thereby identifying valid adjustment sets (good control variables). Using ∆-SWIGs, we derive sufficient conditions for CPT in complex settings with multiple time-periods and time-varying control variables, while previous approaches focus on necessary conditions and/or more simple settings. We establish different sources of biases, depending on the choice of control variables and dynamics between different variables. For example, in settings where control variables are affected by the treatment, short-term effects are identified, while long-term effects are biased. However, standard checks for parallel pre-trends would not detect a CPT violation, emphasizing the limitations of purely empirical justifications of CPT.

Coauthors: Michael C. Knaus, University of Tübingen

W26: Instrumental Variables for Multi-Treatment Trials: A Decision-Theoretic Approach

Author: Hongruyu Holly Chen, Epidemiology, Biostatistics and Prevention Institute, University of Zurich, Switzerland

Abstract

Non-compliance in randomized trials complicates the estimation of treatment effects. Instrumental variable (IV) methods provide an approach to estimating per-protocol effects without requiring full adjustment for unmeasured confounders. However, IV methods are most commonly developed for binary treatment settings, and key identifying assumptions—such as monotonicity—are often difficult to justify in practice. In this work, we studied a randomized encouragement design, comprising two treatment options and one control. We introduced a decision-theoretic approach to support IV assumptions in real-world clinical trials, enabling identification of local average treatment effects among compliers in multi-treatment settings. Using data from a colorectal cancer screening trial comparing colonoscopy and faecal immunochemical test with no screening, we estimated 10-year risk differences for colorectal cancer mortality, colorectal cancer incidence, and all-cause mortality. This work extends IV methodology to randomized trials with multiple treatment options and provides a novel decision-theoretic approach for key identifying assumptions, broadening the applicability of IV methods in complex clinical trials.

Coauthors: Hongruyu Holly Chen(1), Helena Aebersold(1), Milo Alan Puhan(1), Miquel Serra-Burriel(1) 1 Epidemiology, Biostatistics and Prevention Institute, University of Zurich, Switzerland

W27: Using causal inference techniques to formally justify classification methods

Author: Ignacio Gonzalez Perez, Institute of Mathematics, EPFL, Switzerland

Abstract

In many applied fields, individuals are classified using numeric scores. For example, psychologists and physicians use screening tests to classify individuals as having cognitive impairment. Such scores are often “corrected” for fixed covariates, such as age, education and other demographics, before applying a diagnostic threshold. While such corrections are widespread, their motivation is “intuitive” rather than formal, and there is an ongoing debate in applied communities about whether corrections are desirable. This work aims to formally justify when corrections are desirable, and thus bring clarity to debates in psychology and medicine. Using tools from causal inference, yet without relying on causal models, parametric assumptions, or favourable empirical results, we show that raw scores can outperform corrected scores. Specifically, we derive interpretable sufficient conditions under which raw scores are guaranteed to dominate common corrections. Furthermore, we study whether these corrections are fair, tailoring causal notions fairness to the classification problem. Throughout, we use the age and education correction of the Mini Mental State Examination as motivation and to illustrate our results on real data.

Coauthors: Marco Piccininni, Institute of Mathematics, EPFL, Switzerland; Mats Julius Stensrud, Institute of Mathematics, EPFL, Switzerland

W28: Causal inference when treatment affects outcome ascertainment

Author: Isaac Núñez, CAUSALab, Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, United States

Abstract

When outcome ascertainment is performed differentially by treatment level, a conventional causal analysis can yield effects on the detected outcome (Y) that systematically differ from those on the true outcome (Y). Measurement error literature addresses this using the diagnostic precision of a (possibly) hypothetical “test”, which can be used to algebraically adjust Y to obtain the effect on Y. We argue that this ascertainment problem is better framed as one of causal mediation, where the total effect on Y* is mediated through ascertainment procedures (M) and through Y. The controlled direct effect of treatment (A) on Y* under an intervention that fixes ascertainment procedures may be close to the effect of A on Y that is generally of interest. Further, a decomposition of A (e.g. in a factorial 2x2 trial) may sometimes be realistic, and yields the separable effect of A on Y* not through M (i.e. through Y). We give conditions for the identification of both types of effects. We formalize conditions for identifying the treatment effects on Y* while accounting for outcome ascertainment to reflect effects mediated by the true outcome, including the time-varying setting with treatment-confounder-feedback.

Coauthors: Vanessa Didelez; Leibniz Institute for Prevention Research and Epidemiology - BIPS, Bremen, Germany Department of Mathematics and Computer Science, University of Bremen, Bremen, Germany

Miguel A. Hernán; CAUSALab, Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, United States Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, United States

Mats J. Stensrud; Institute of Mathematics, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland

W29: Decomposing and Interpreting the OLS Estimand

Author: Jacqueline Gut, Statistics, Econometrics and Empirical Economics, University of Tübingen, Germany

Abstract

Ordinary Least Squares (OLS) regression is widely used to estimate the effect of a binary treatment. We decompose the OLS estimand under SUTVA and unconfoundedness without functional form and/or overlap assumptions into i) non-standard Weighted Average Treatment Effects (WATE) with potentially negative weights plus misspecification bias, or ii) standard WATEs (e.g., ATE, ATT, ATU, ATO) plus misspecification bias and effect heterogeneity bias. Conceptually, we discuss canonical and novel assumptions under which the bias terms disappear. Practically, we show that the bias terms are identified under (one-sided) overlap and can be estimated via Double/Debiased Machine Learning. This allows to test whether the OLS estimand corresponds at least to one proper causal effect in a particular application. We test the relevance of the biases in replications of influential empirical studies. Additionally, we revisit and refine the result of Słoczyński (2022) that the OLS estimand lies between ATT and the ATU under a specific outcome model. We show that this does not generalize to the case without functional form assumptions and provide more general conditions under which the result of Słoczyński (2022) still holds.

Coauthors: Michael C. Knaus, University of Tübingen

W30: Outcome-Aware Spectral Feature Learning for Instrumental Variable Regression

Author: Jakub Wornbard, Gatsby Computational Neuroscience Unit, UCL

Abstract

We address the problem of causal effect estimation in the presence of hidden confounders using nonparametric instrumental variable (IV) regression. An established approach is to use estimators based on learned spectral features, that is, features spanning the top singular subspaces of the operator linking treatments to instruments. While powerful, such features are agnostic to the outcome variable. Consequently, the method can fail when the true causal function is poorly represented by these dominant singular functions. To mitigate, we introduce Augmented Spectral Feature Learning , a framework that makes the feature learning process outcome-aware. Our method learns features by minimizing a novel contrastive loss derived from an augmented operator that incorporates information from the outcome. By learning these task-specific features, our approach remains effective even under spectral misalignment. We provide a theoretical analysis of this framework and validate our approach on challenging benchmarks.

Coauthors: Dimitri Meunier : GCNU, UCL Vladimir R. Kostic : CSML, IIT Genoa & Faculty of Science, University of Novi Sad Antoine Moulin : Universitat Pompeu Fabra Alek Fröhlich : CSML, IIT Genoa Karim Lounici : CMAP-Ecole Polytechnique, Massimiliano Pontil: CSML, IIT Genoa & AI Centre, UCL Arthur Gretton : GCNU, UCL

W31: Difference-in-Differences in the Presence of Unknown Interference

Author: Javier Viviens, European Economics Institute

Abstract

The stable unit treatment value (SUTVA) is a crucial assumption in the Difference-in-Differences (DiD) research design. It rules out hidden versions of treatment and any sort of interference and spillover effects across units. Even if this is a strong assumption, it has not received much attention from DiD practitioners and, in many cases, it is not even explicitly stated as an assumption, especially the no-interference assumption. In this technical note, we investigate what the DiD estimand identifies in the presence of unknown interference. We show that the DiD estimand identifies a contrast of causal effects, but it is not informative on any of these causal effects separately, without invoking further assumptions. Then, we explore different sets of assumptions under which the DiD estimand becomes informative about specific causal effects. We illustrate these results by revisiting the seminal paper on minimum wages and employment by Card and Krueger (1994).

Coauthors: Fabrizia Mealli, European University Institute

W32: In Defense of the Pre-Test: Valid Inference when Testing Violations of Parallel Trends for Difference-in-Differences

Author: Jonas Mikhaeil, Department of Statistics, Columbia University, USA

Abstract

The difference-in-differences (DID) design is a key identification strategy which allows to estimate causal effects under the parallel trends assumption. While the parallel trends assumption is counterfactual and cannot be tested directly, researchers often examine pre-treatment periods to check whether the time trends are parallel before treatment is administered. Recently, researchers have been cautioned against using preliminary tests which aim to detect violations of parallel trends in the pre-treatment period. In this paper, we argue that preliminary testing should play an important role within the DID research design. We propose a new and more substantively appropriate conditional extrapolation assumption, which requires an analyst to conduct a preliminary test to determine whether the severity of pre-treatment parallel trend violations falls below an acceptable level before extrapolation to the post-treatment period is justified. Under mild assumptions, we provide a consistent preliminary test as well confidence intervals which are valid when conditioned on the result of the test. The conditional coverage of these intervals overcomes a common critique made against the use of preliminary testing for DID.

Coauthors: Christopher Harshaw (Department of Statistics, Columbia University, USA)

W33: Partial Identification Learning with Categorical Treatments for Individualised Treatment Rules

Author: Johannes Hruza, HEADS and AI, University of Copenhagen, Denmark

Abstract

We develop a partial identification learning framework for individualized treatment rules (ITRs) with categorical treatments, outcomes, and instrumental variables. Rather than relying on strong causal assumptions required for point identification, our framework uses causal bounds to find the optimal treatment decision. Existing methods are restricted to binary settings and the canonical instrumental variable design introduced by Balke and Pearl. We extend this framework to accommodate a broader class of causal structures as well as scenarios with categorical treatment, outcome, and instrumental variables. We introduce a generalized minimax regret criterion for treatment selection from among more than 2 options, which minimizes the maximum possible difference between the chosen and the optimal treatment based on partial identification bounds. To construct the ITR, we use a symmetric simplex embedding to avoid geometric inconsistencies of one-vs-rest approaches and derive a Fisher-consistent, differentiable surrogate loss. We establish finite sample convergence rates via an oracle inequality for a kernel-based implementation. Experiments show that our framework yields ITRs closer to the oracle ITR compared to existing alternatives.

Coauthors: Pawel Morzywolek Section of Biostatistics, University of Copenhagen, Copenhagen, Denmark Jakob Zeitler, Department of Statistics, University of Oxford, UK Samir Bhatt, HEADS and AI, University of Copenhage, Denmark Michael Sachs, Section of Biostatistics, University of Copenhagen, Copenhagen, Denmark

W34: A framework for sequential causal prediction

Author: Julia Whitman, Section of Biostatistics, University of Copenhagen, Denmark

Abstract

Many clinical decisions, such as starting or changing treatment, are based on predicted risk. However, standard prediction models are often unfit because they ignore the causal structure (e.g., confounders, colliders, mediators).

Interest in causal prediction (“prediction under interventions”) has grown rapidly, accompanied by substantial advances in methods for the development and validation of these models. To date, the literature on causal prediction and heterogeneous treatment effects has focused mainly on single time-point settings. We adapt this to sequential settings to aid long-term clinical decision-making that is adaptive and reflects point-in-time treatment decisions based on predicted risk.

In this work, we propose a framework for sequential causal prediction tailored to common complex diseases. Our approach combines a landmarking procedure to estimate individualized risks under interventions and compares the potential outcomes with other popular sequential decision-making approaches such as Q-learning. We study the properties of our framework through extensive simulations, including the influence of irregularly distributed time-points, and demonstrate its potential utility in a real-world data application.

Coauthors: Paweł Morzywolek - Section of Biostatistics, University of Copenhagen, Denmark Ruth Keogh - Department of Medical Statistics, London School of Hygiene and Tropical Medicine, United Kingdom Michael Sachs - Section of Biostatistics, University of Copenhagen, Denmark

W35: Finding treatment regimes with optimality guarantees of practical interest

Author: Julien David Laurendeau, Chair of Biostatistics, Institute of Mathematics, Swiss Federal Institute of Technology at Lausanne, Switzerland

Abstract

Finding optimal treatment regimes is a widely studied problem in the causal inference literature, with broad practical interest. However, there are two important issues, common to most existing approaches: First, assumptions grow stronger as we include more covariates and aim for finer personalization. Second, obtaining guarantees conditional on covariates is difficult and, even when such guarantees are given, the power of detecting significant effects is often low.

Here we suggest a way to handle these problems. Instead of targeting the conventional optimal regime, we aim for rigorous inference, with statisticial guarantees, on customized group-level effects, interpretable as coarsened versions of conventional optimal regimes. We give explicit frequentist guarantees that these groups differ in their effects. Finally, we show that, in realistic settings, these group-based strategies can outperform state-of-the-art optimal-regime methods, even when those methods are correctly implemented with modern doubly robust machine-learning estimators.

W36: Improving Longitudinal Targeted Maximum Likelihood Estimation in Target Trial Emulation using Joint Calibrated Weights

Author: Juliette Limozin, MRC Biostatistics Unit, University of Cambridge

Abstract

In target trial emulation (TTE), marginal structural models (MSMs) can be used to characterise per-protocol treatment effects over time. The MSM parameters are often estimated by inverse probability weighting (IPW), with weights estimated by maximum likelihood. However, the resulting MSM parameter estimates may be unstable in finite samples or when the weight model is misspecified. An alternative method for estimating the MSM parameters is targeted maximum likelihood estimation (LTMLE). This is double robust and potentially more efficient than IPW. However, it also relies on IP weights. We propose joint calibrated LTMLE, which integrates LTMLE with joint calibrated weights adapted for per-protocol effect estimation. This calibration of weights improves finite-sample performance by enforcing covariate balance for the treatment and censoring processes. Simulations show that the proposed method has improved efficiency and robustness to weight model misspecification, compared to standard LTMLE. We illustrate the method using a case study to evaluate the effect of highly active antiretroviral therapy on CD4 cell count among HIV-positive women.

Coauthors: Shaun R. Seaman & Li Su (MRC Biostatistics Unit, University of Cambridge)

W37: A Weighted Resampling Framework for Causal Transportability

Author: Jianqiao Mao, School of Computer Science, University of Birmingham

Abstract

Transferring well-studied causal knowledge to new environments and populations can accelerate scientific advancement by enabling reusable findings, i.e., the causal transportability. Recent work on causal transportability characterises when causal effects can be transferred across environments, but it emphasises the graphical criterion and symbolic transport formulas whose practical use requires nontrivial numerical computation and estimation, especially under high-dimensional variables and nested expressions. To bridge this gap, we propose transportable causal sampling, a general weighted resampling-based framework that compiles transport formulas into actionable target-domain deconfounding procedures. We derive causal transport weights from known causal relationships and partial observations of key variables in the new environment, enabling a family of weighted samplers to generate synthetic samples that approximate target-domain interventional distributions. This provides a practical pathway from transport formulas to sample-level intervention simulation and downstream applications such as causal machine learning model training. Empirically, we demonstrate effectiveness on synthetic and semi-synthetic benchmarks.

Coauthors: Jianqiao Mao, School of Computer Science, University of Birmingham; Max A. Little, School of Computer Science, University of Birmingham

W38: RieszBoost: Gradient Boosting for Riesz Regression

Author: Kaitlyn J. Lee, UC Berkeley, Division of Biostatistics

Abstract

Answering causal questions often involves estimating linear functionals of conditional expectations, such as average treatment effects or effects of longitudinal modified treatment policies. By the Riesz representation theorem, these functionals can be expressed as the expected product of the conditional expectation of the outcome and the Riesz representer, a key component in doubly robust estimation methods. Existing approaches typically estimate the Riesz representer indirectly by deriving its analytical form and substituting estimated components, a process that can be challenging and sensitive to practical positivity violations, resulting in higher variance and wider confidence intervals. We propose a novel gradient boosting algorithm that directly estimates the Riesz representer without requiring its explicit analytical form. The proposed method is well suited for tabular data and provides a flexible, nonparametric, and computationally efficient alternative to existing approaches. Simulation studies demonstrate that our method performs comparably to or better than indirect estimation techniques across a range of causal functionals, offering a robust and user-friendly solution for estimating causal quantities.

Coauthors: Alejandro Schuler (UC Berkeley, Division of Biostatistics)

W39: Extending inferences from a randomized trial to its nested population: a methodological illustration of Coronary Thrombus Aspiration

Author: Alessandro Bosi, Unit of Epidemiology, Institute of Environmental Medicine, Karolinska Institutet, Sweden

Abstract

Background: Methods to extend trial findings to target populations have focused on intention-to-treat effects under full adherence. We illustrate extension of both estimands in the presence of treatment non-adherence.Methods: We used data from the TASTE randomized trial conducted within the Swedish SWEDEHEART registry. The trial enrolled 6,788 STEMI patients comparing thrombus aspiration during PCI versus PCI alone. The registry identified 4,115 additional eligible non-participants, creating a target population of 10,903 individuals. We estimated effects using augmented inverse probability weighting adjusted for clinical and socioeconomic variables. The outcome was 1-year mortality.Results: Intention-to-treat estimates were consistent across populations (trial risk difference: -0.08%, 95% CI: -1.12, 0.84; target: -0.14%, -1.26, 0.86). Per-protocol estimates also showed consistency (trial: 0.25%, -0.81, 1.21; target: 0.20%, -0.88, 1.24).Conclusions: Both intention-to-treat and per-protocol inferences can be extended to target populations. While intention-to-treat effects are protected by randomization, per-protocol effects require stronger assumptions but may be more transportable across populations with different adherence rates.

Coauthors: Alessandro Bosi, Unit of Epidemiology, Institute of Environmental Medicine, Karolinska Institutet, Stockholm, Sweden; Conor J. MacDonald, Unit of Epidemiology, Institute of Environmental Medicine, Karolinska Institutet, Stockholm, Sweden; Miguel A. Hernán, CAUSALab, Department of Epidemiology, Harvard T.H. Chan School of Public Health, and Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA; Anthony A. Matthews, Unit of Epidemiology, Institute of Environmental Medicine, Karolinska Institutet, Stockholm, Sweden; Sarah Robertson, Department of Epidemiology, Geisel School of Medicine at Dartmouth, Hanover, NH, USA; Ole Fröbert, Faculty of Health, Department of Cardiology, Örebro University, Sweden; Stefan James, Department of Medical Sciences, Uppsala Clinical Research Center, Uppsala University, Uppsala, Sweden; Juho Härkönen, Department of Sociology, Stockholm University, Stockholm, Sweden; Anna B.C. Humphreys, Unit of Epidemiology, Institute of Environmental Medicine, Karolinska Institutet, Stockholm, Sweden; David Erlinge, Division of Cardiology, Department of Clinical Sciences Lund, Lund University, Lund, Sweden; Bertil Lindahl, Department of Medical Sciences, Uppsala University, Uppsala, Sweden; Tomas Jernberg, Department of Clinical Sciences, Danderyd Hospital, Division of Cardiovascular Medicine, Karolinska Institutet, Stockholm, Sweden; Maria Feychting, Unit of Epidemiology, Institute of Environmental Medicine, Karolinska Institutet, Stockholm, Sweden; Anita Berglund, Unit of Epidemiology, Institute of Environmental Medicine, Karolinska Institutet, Stockholm, Sweden; Camila Olarte Parra, Unit of Epidemiology, Institute of Environmental Medicine, Karolinska Institutet, Stockholm, Sweden; Issa Dahabreh, CAUSALab, Department of Epidemiology, Harvard T.H. Chan School of Public Health, Department of Biostatistics, Harvard T.H. Chan School of Public Health, and Richard A. and Susan F. Smith Center for Outcomes Research, Beth Israel Deaconess Medical Center, Boston, MA, USA.

W40: Causal Assessment of Interactions with High-Dimensional, Continuous-Valued Exposures Under Time-Varying Confounding

Author: Allison Codi, Department of Biostatistics and Bioinformatics, Emory University, USA

Abstract

Enteric infections are a major contributor to childhood illness and death, particularly in low-resource settings where children are exposed to multiple pathogens. Understanding whether multiple pathogens interact to produce worse clinical outcomes is critical for prioritizing prevention strategies. Evaluating such interactions is challenging as pathogen burdens are continuous-valued, high-dimensional, and can be confounded by antibiotic treatment decisions—a major determinant of outcomes for bacterial infections. To address these challenges, we propose a framework to characterize pathogen interactions by mimicking factorial challenge trials. Our estimands are defined by interventions that draw quantities for each pathogen from a specified distribution and compare regimes in which one or more pathogens are shifted to a high-level distribution while remaining pathogens are shifted to a low-level distribution. The estimands allow the use of doubly robust, machine-learning–based estimators that flexibly learn interaction effects. We apply the methods to enteric infection data from five large multi-site studies and discuss implications for intervention targeting and clinical management.

Coauthors: Elizabeth Rogawski McQuade, Department of Epidemiology, Emory University, USA David Benkeser, Department of Biostatistics and Bioinformatics, Emory University, USA

W42: Selection Bias due to Causal Effect Heterogeneity in Instrumental Variable Analyses

Author: Apostolos Gkatzionis, MRC Integrative Epidemiology Unit, University of Bristol, United Kingdom

Abstract

Selection bias is a common concern in causal inference. Two mechanisms can give rise to selection bias: stratifying on a collider of the exposure and outcome, and stratifying on an effect modifier of the exposure-outcome causal effect. Research into the impact of selection bias on instrumental variable (IV) analyses has focused primarily on collider bias.

Here, we show that IV estimates of the average causal effect (ACE) are biased if selection (or a cause of selection) is an effect modifier for the ACE. This can happen even if the “no simultaneous heterogeneity” assumption, which is required to identify the ACE, is satisfied. In a simulation study, we investigate the size and direction of the bias when estimating effects on the mean difference or odds ratio scale, assuming effect modification by either a confounder of the IV exposure and outcome or an unrelated variable. We also investigate bias due to heterogeneity in the instrument-exposure relationship and show that it does not affect the standard two-stage least squares estimator due to its robustness to misspecification of the first-stage model. Finally, we discuss diagnostics and methods to deal with the bias in practice, including inverse probability weighting.

Coauthors: Fernando Pires Hartwig (Federal University of Pelotas, Brazil), Kate Tilling (MRC Integrative Epidemiology Unit, University of Bristol, UK)

W43: Value-of-information for causal questions: a bridge between randomised controlled trials and real-world evidence

Author: Brieuc Lehmann, Statistical Science, UCL, UK

Abstract

Randomised controlled trials (RCTs) and real-world evidence (RWE) have complementary strengths and limitations for causal inference, with RCTs offering internal validity through randomisation and RWE typically providing larger and more representative samples at lower cost. This work explores how a Bayesian decision theoretic perspective can serve as a bridge between these two evidence paradigms. Value-of-Information (VoI) quantifies the expected benefit of reducing uncertainty and has been widely adopted in health economics to inform whether, and how, additional data collection should occur. The same principles can be extended to causal questions, allowing formal assessment of the information yield and decision value of both randomised and observational study designs. Here, we investigate how VoI can be used to optimise aspects of RCT design, such as the distribution of effect modifiers in the study population, and to inform the design and prioritisation of RWE studies under varying degrees of confounding and effect heterogeneity. We show using simulation-based examples how this unified approach enables explicit, quantitative comparisons of RCT and RWE designs in terms of their expected contribution to causal decision-making.

W44: High dimensional exposure and positivity violation: benefits of stochastic counterfactual scenarios and extrapolation using marginal structural models

Author: Benoît Lepage, CERPOP (EQUITY team), University of Toulouse, Inserm UMR1295, France

Abstract

Identifiability of causal effects can be limited in longitudinal studies with repeated exposures, especially in case of high dimensionality of the exposures and confounders, resulting in positivity violation and decreased performances of inverse probability of treatment weighting (IPTW) or targeted maximum likelihood (TMLE) estimators. As an example, we try to estimate a controlled direct effect in the British NCDS 58 cohort: the direct effect of adverse childhood experiences on the risk of death at age 55, for controlled values of mediators (obesity, smoking, alcohol) throughout life. We aim to assess the performances of IPTW and TMLE estimators by applying the following strategy: (1) Describe the causal estimand in terms of contrasts between several “stochastic” regimes (varying the probability of exposure to the mediators) and use parameters of a marginal structural model (MSM) to express the relationships between the potential outcomes and the regimes. (2) Estimate the MSM parameters using the scenarios best supported by the data. (3) Extrapolate the counterfactual results for scenarios with weak data support using the MSM parameters. This strategy is explored using simulations and is illustrated within the NCDS 58 cohort.

Coauthors: Camille Joly (1), Lola Neufcourt (1), Cyrille Delpierre (1) ; 1. CERPOP (EQUITY team), University of Toulouse, Inserm UMR1295, France

W45: The Generalised Kernel Covariance Measure for Causal Discovery with Mixed Data

Author: Luca Bergen, Leibniz Institute for Prevention Research and Epidemiology, Bremen, Germany

Abstract

Constraint-based methods for causal discovery rely on conditional independence (CI) tests. To be useful, CI tests should be broadly applicable, e.g., to mixed variable types, with good Type I error control and power. The Generalised Covariance Measure (GCM; Shah and Peters, 2020) has uniform level guarantees and handles mixed data but lacks power, detecting only alternatives with E[Cov(X,Y | Z)] =/ 0.

We present the Generalised Kernel Covariance Measure (GKCM), a non-parametric CI test using kernel feature maps and arbitrary output-kernel regression models. GKCM is defined as a Generalised Hilbertian Covariance Measure (Lundborg et al., 2022), a high-dimensional analogue of GCM. We state conditions for uniform level guarantees for GKCM, unlike existing kernel CI tests (e.g., Zhang et al., 2011). GKCM tests if E[Cov(f(X), g(Y) | Z)] = 0 for large sets of transformations f,g, thus improving power over GCM. Suitable kernels also allow for mixed data. This makes GKCM a versatile and reliable method for constraint-based Causal Discovery. We illustrate its performance against state-of-the art methods in simulations with mixed data and non-linear dependencies.

Coauthors: Vanessa Didelez, Leibniz Institute for Prevention Research and Epidemiology, Bremen, Germany

W46: Empirical Calibration using Negative Control Outcomes to improve causal estimation: Replication of the ProtecT trial using UK linked real world data from primary care, hospitals, and cancer registry

Author: Cecilia Campanile, NDORMS, University of Oxford, Oxford, UK.

Abstract

Evaluation of replication fidelity of target trial emulations remains limited. We emulated the Prostate Testing for Cancer and Treatment (ProtecT) trial comparing prostatectomy and radiotherapy for localised, early-stage prostate cancer using linked UK primary care, hospital, cancer registry data. The emulated trial aligned eligibility criteria, time zero, follow-up, and estimands with ProtecT. Exchangeability was approximated by matching patients 1:1 on propensity scores (caliper 0.2), year, age, and cancer stage. Hazard Ratios (HR) were estimated using Cox regression. We applied negative control outcome-based empirical calibration, using outcomes unrelated to treatment to assess residual confounding and adjust confidence intervals (CI). Replication was evaluated by comparing calibrated and uncalibrated HR with those reported in ProtecT for all-cause mortality, prostate cancer mortality, and metastasis. Agreement was assessed using metrics proposed in the RCT-Duplicate initiative: estimate agreement, log-scale bias, and standardised differences in HR. Empirical calibration didn’t improve estimate agreement but reduced absolute log-scale bias and standardised differences, with the greatest improvement seen for all-cause mortality.

Coauthors: Eng Hooi Tan1; Marti Catala Sabate1; Danielle Newby1; Antonella Delmestri1; Nora Tabea Sibert3; Rossella Nicoletti4,5; Peter-Paul Willemse6; Daan Nieboer 7; Alvar Rosello Serrano8; Daniel Prieto-Alhambra1,2; OPTIMA consortium 1Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford, UK. 2Department of Medical Informatics, Erasmus University Medical Centre, Rotterdam, The Netherlands. 3Health Services Research Department, German Cancer Society, Berlin, Germany + Oncological Health Services Research, University Clinic Düsseldorf, Heinrich-Heine University Düsseldorf, Düsseldorf, Germany 4Department of Experimental and Clinical Biomedical Science, University of Florence, Florence, Italy. 5S.H.Ho Urology Centre, Department of Surgery, The Chinese University of Hong Kong, Hong Kong, China. 6Department of Urology, Cancer Center, UMC Utrecht, The Netherlands. 7Erasmus University Medical Centre, Cancer Institute, Department of Urology, Rotterdam, The Netherlands. 8Institut Català d’Oncologia, Hospital Universitari Dr Josep Trueta, Girona, Spain

W47: Do doubly robust estimation methods preserve their doubly robust property under multivariate missingness?

Author: Christoph Wiederkehr, Department of Statistics, LMU, Germany

Abstract

Doubly robust (DR) estimators for the Average Treatment Effect (ATE), such as AIPW and TMLE, remain consistent in fully observed settings when either the outcome or the treatment mechanism are estimated consistently. When covariates, exposure, and outcome are jointly missing, recoverability of the ATE becomes central. Using three missingness-directed acyclic graphs (m-DAGs), we show that DR properties extend to this setting if and only if both the outcome regression and the marginal distribution of the covariates, the two components of the g-formula, are recoverable from the observed-data distribution. Consequently, the conditional treatment mechanism is not required to be recoverable. Even when the treatment assignment induces its own missingness (i.e., a MNAR case), and the treatment mechanism is itself non-recoverable, consistent estimation of the treatment model alone suffices for consistency of DR estimators of the ATE. In contrast, when the outcome induces its own missingness, recoverability fails and DR estimators exhibit persistent bias. A simulation study confirms these theoretical findings and highlights recoverability as the key condition for preserving double robustness under multivariate missingness.

Coauthors: Michael Schomaker: Department of Statistics, LMU, Germany and Centre for Integrated Data and Epidemiological Research, Cape Town, South Africa

W48: A Bayesian State-Space Approach with Dynamic Covariates for Disentangling Anticipatory and Intervention Effects

Author: Damiano Baldaccini, University of Florence (Italy)

Abstract

Evaluating public policies is particularly challenging when their nationwide scope precludes the existence of a suitable control group and when advance announcements generate anticipatory effects that must be disentangled from the effects of the policy itself. We address these challenges within the potential outcomes framework. We formally define the causal estimands of interest, explicitly distinguishing between the effects of the policy announcement and those of the policy implementation. We then introduce a set of identifying assumptions under which these causal effects can be estimated using a Bayesian state-space model that exploits the time-series structure of the data. The proposed model is estimated over the full time horizon and incorporates a dynamic treatment variable that captures the exogenous shocks induced by both the policy announcement and the subsequent intervention. Through a series of simulation studies, we assess the performance of our approach and compare it with standard counterfactual forecasting methods for causal inference in the absence of a control group. Finally, we apply the proposed methodology to evaluate an Italian transportation policy that incentives green vehicles and penalizes polluting ones.

Coauthors: Alessandra Mattei (University of Florence, Italy), Fiammetta Menchetti (University of Florence, Italy)

W101: Bayesian Proximal Causal Inference: Geometry and Identifiability

Author: Francesco Di Giuseppe, DIAM-Statistics, TU Delft, The Netherlands

Abstract

Proximal inference enables identification of causal parameters by imposing conditional independence assumptions, which under parametric models translate into geometric constraints on the parameter space. Although the full parameter may not be identifiable due to missing data, depending on the dimensions of the variables involved, the observational parameters may be restricted to subsets of the parameter space. We study these constraints in the case that all variables are discrete and consider Bayesian inference on parameters describing the joint distribution of the observed and latent variables. The Bayesian procedure automatically restricts the observable model parameters to satisfy the geometric constraints. We prove that in the large sample limit the posterior distribution for the full parameter settles to a nondegenerate, prior-dependent distribution. For the identifiable components, including the interventional distribution and causal mean, the induced posterior satisfies a Bernstein-von Mises theorem. The procedure allows for automatic uncertainty quantification for the identifiable parameters. Simulation studies using a Gibbs sampler accompany the theoretical results.

Coauthors: Francesco Gili (DIAM-Statistics, TU Delft), Aad van der Vaart (DIAM-Statistics, TU Delft)

W102: When Methods Matter: Estimands and Design in Target Trial Emulation

Author: Fang Li, Department of Psychiatry, University of Oxford, UK

Abstract

Target trial emulation (TTE) is widely used for causal inference from observational data, yet the sensitivity of results to design and analytic choices remains underexplored. We emulated a target trial comparing first-line depression treatments using electronic health records, selecting a clinical contrast with expected near-null effects to isolate methodological variation.

We varied covariate adjustment, missing data handling, weighting strategy (entropy balancing and stabilized IPTW targeting the average treatment effect [ATE] versus propensity score matching targeting the average treatment effect among the treated [ATT]), and outcome modeling. Across 627 estimates were broadly consistent, weighting and matching choices introduced substantial variability, partly due to differences in effective sample size and estimand definition, whereas modeling and missing data choices had smaller effects.

Together, these results show that even when true treatment effects are expected to be small, reasonable analytic choices within a TTE can materially influence treatment effects, underscoring the need for transparent reporting and systematic sensitivity analyses.

Coauthors: Sophie Shang, Fang Li, Max Taquet *denotes co-first authors

W103: Many Experiments, Few Repetitions, and Unpaired Data: Is Causal Inference Possible?

Author: Felix Schur, Seminar for Statistics, ETH Zurich, Switzerland

Abstract

We study the problem of estimating causal effects in the following unpaired data setting: we observe some covariates \(X\) and an outcome \(Y\) under different experimental conditions but do not observe them jointly – we either observe \(X\) or \(Y\). Under appropriate regularity conditions, the problem can be cast as an instrumental variable (IV) regression with the environment acting as a (possibly high-dimensional) instrument. When there are many environments but only a few observations per environment, standard two-sample IV estimators fail to be consistent. We propose a GMM-type estimator based on cross-fold sample splitting of the instrument–covariate sample and prove that it is consistent as the number of instruments grows with sample size but the sample size per environment remains constant. We further extend the method to sparse causal effects via \(\ell_1\)-regularized estimation and post-selection refitting. Experiments on synthetic data demonstrate that our approach corrects the measurement-error bias and yields accurate estimation and valid uncertainty quantification in unpaired settings.

Coauthors: Jonas Peters (ETH Zurich), Niklas Pfister (Lakera AI), Peng Ding (UC Berkeley), Sach Mukherjee (DZNE Bonn)

W104: Balancing Spatial Confounding and Spillovers in Synthetic Control Designs

Author: Giulio Grossi, University of Florence

Abstract

In policy evaluations involving administrative units like cities or regions, interventions often generate spillovers to connected areas, violating the no-interference assumption. Simultaneously, spatially correlated unobserved confounders frequently affect both treatment and outcomes. Synthetic control methods thus face a fundamental trade-off: using distant donors induces spatial confounding bias, while relying on nearby donors risks spillover contamination. We propose a design-based framework to diagnose and quantify this tension. Adopting an “inclusive by default” strategy, we retain all potential donors rather than assuming the existence of pure controls, introducing indices to measure the spillover exposure of synthetic counterfactuals. We treat exposure mappings, based on geography or networks, as objects of weak identification, assessed through pre-treatment diagnostics. Using simulations, we characterize regimes where spatial confounding or spillover bias dominates, demonstrating how design choices critically shape causal conclusions. We illustrate the framework by evaluating the impact of the renewal of an ancient pilgrimage road in Tuscany on tourist flows.

W105: Causal Mediation Analysis Under Effect Heterogeneity

Author: Hanna Kim, Psychology Department, University of California, Santa Cruz, United States

Abstract

When a treatment operates through an indirect pathway by influencing a mediator, individuals may differ in the mediation effects they experience. Despite this, standard causal mediation analyses typically marginalize over individual-level mediation effects, assuming no heterogeneity or only coarse differences across observed groups. We examine how mediation effects vary across observed groups and unobserved subpopulations, and assess the consequences of misspecifying this heterogeneity. Using a simulation study, we evaluate two existing mediation approaches and show that ignoring mediation effect heterogeneity can lead to substantial bias, with estimates that are not arithmetic averages of heterogeneous individual effects. Allowing for heterogeneity when mediation effects are homogeneous yields nearly unbiased estimates with modest efficiency loss. In contrast, modeling heterogeneity across observed groups in the presence of latent heterogeneity produces group-specific effects that do not correspond to effects in latent subpopulations. We conclude by discussing diagnostic and data-driven strategies for detecting and modeling mediation effects under unobserved heterogeneity.

Coauthors: Jee-Seon Kim, University of Wisconsin-Madison

W106: Identifying Causal Effects using Proximal Time Series: Correcting for the Past in VARs with Proxies

Author: Hubert Drazkowski, University of Copenhagen, Denmark

Abstract

Proximal variables (PV) estimates effects from observational data in the presence of unobserved confounding. We consider a PV time series analogue in which only a single trajectory of VAR is observed. We allow all variables to be continuous time-varying and multivariate. In this setting, confounding propagates through autoregressive dynamics: shared history induces memory effects that invalidate a direct i.i.d. application of PV. We introduce PVAR, a structural restriction on VAR coefficient matrices that formalises proxies in multivariate time series. For PVAR, we derive two families of identifying equations that account for temporal dependence. The first estimator corrects for the most recent past via appropriate lagged conditioning. The second allows to reach back further in time, however, gives rise to a setting with hinges-autoregressive variables that act as collider-mediators and violate cross-side exclusion. We deal with their presence by adapting Nuisance PV to the time-series setting. In addition, we provide explicit dynamic identifiability conditions that parallel recent results for IV time series. We establish consistency of the corresponding estimators and study their robustness with experiments.

W107: Randomization inference for stepped-wedge designs with noncompliance with application to a palliative care pragmatic trial

Author: Jeffrey Zhang, Data Science Institute, University of Chicago, United States

Abstract

While palliative care is increasingly commonly delivered to hospitalized patients with serious illnesses, few studies have estimated its causal effects. Courtright et al. (2016) adopted a cluster-randomized stepped-wedge design to assess the effect of palliative care on a patient-centered outcome. The randomized intervention was a nudge to administer palliative care but did not guarantee receipt of palliative care, resulting in noncompliance (compliance rate ≈ 30%). A subsequent analysis using methods suited for standard trial designs produced statistically anomalous results, as an intention-to-treat analysis found no effect while an instrumental variable analysis did (Courtright et al., 2024). This highlights the need for a more principled approach to address noncompliance in stepped-wedge designs. We provide a formal causal inference framework for the stepped-wedge design with noncompliance by introducing a relevant causal estimand and corresponding estimators and inferential procedures. Through simulation, we compare estimators across a range of stepped-wedge designs and provide practical guidance in choosing an analysis method. Finally, we apply our recommended methods to reanalyze the trial of Courtright et al. (2016).

Coauthors: Zhe Chen, University of Pennsylvania; Katherine Courtright, University of Pennsylvania; Scott Halpern, University of Pennsylvania; Michael Harhay, University of Pennsylvania; Dylan Small, University of Pennsylvania; Fan Li, Yale University

W108: Estimation of the interventional absolute risk under treatment-assigned-at-visit interventions in continuous-time

Author: Johan Sebastian Ohlendorff, Section of Biostatistics, Department of Public Health, University of Copenhagen, Denmark

Abstract

In medical research, causal effects of time‐varying treatments are often defined using an emulated target trial. We focus on estimands defined as contrasts of absolute risk of an outcome occurring before a fixed time horizon under prespecified treatment regimens. Most estimators for observational data rely on discretizing time but are sensitive to the chosen scale and may not target the desired causal effect. A recently proposed continuous‐time framework (Rytgaard et al., 2022) preserves exact event times. In this talk, I will discuss a computationally feasible sequential regression-type estimator for the continuous-time framework, which estimates the nuisance parameter models by backtracking through the number of events. This estimator allows for efficient, single-step targeting using machine learning methods from survival analysis and regression, and regresses multi-type event outcomes on the available history at the 𝑛’th event time per person, enabling robust continuous-time causal effect estimation. Finally, we illustrate the applicability of our method using electronic health records data.

Coauthors: Johan Sebastian Ohlendorff (Section of Biostatistics, Department of Public Health, University of Copenhagen, Denmark), Anders Munch (Section of Biostatistics, Department of Public Health, University of Copenhagen, Denmark), Thomas Alexander Gerds (Section of Biostatistics, Department of Public Health, University of Copenhagen, Denmark)

W109: Identifying Biased Beliefs as a Source of Disparate Decisions

Author: John Körtner, Collegio Carlo Alberto, Italy

Abstract

Bureaucrats often allocate services disparately across groups. To understand such disparities for service provisions where eligibility is uncertain, it is indispensable to not only disentangle preference-based from statistical discrimination but also to identify biased beliefs about eligibility to parse out inaccurate statistical discrimination. In this article, I clarify the role of biased beliefs and provide an empirical test using administrative data from the Swiss unemployment insurance system, where caseworkers were required to record subjective assessments of claimants’ employability. I estimate belief bias as the difference in assessments across nationality groups that cannot be justified by actual re-employment outcomes. To solve the challenge that re-employment outcomes are contaminated by caseworkers’ decisions, I develop a novel strategy based on debiased machine learning, combining augmented inverse probability weighting with categorical boosting. I find systematic belief bias against non-Swiss claimants, particularly those from the MENA region and Sub-Saharan Africa. Accounting for the contamination by caseworkers’ program assignment decisions reduces the initial bias but only explains a small part of it.

Coauthors: -

W110: Price Optimization Combining Conjoint Data and Purchase History: A Causal Modeling Approach

Author: Juha Karvanen, Department of Mathematics and Statistics, University of JYväskylä, Finland

Abstract

Pricing decisions of companies require an understanding of the causal effect of a price change on the demand. When real-life pricing experiments are infeasible, data-driven decision-making must be based on alternative data sources such as purchase history (sales data) and conjoint studies where a group of customers is asked to make imaginary purchases in an artificial setup. We present an approach for price optimization that combines population statistics, purchase history, and conjoint data in a systematic way. We build on the recent advances in causal inference to identify and quantify the effect of price on the purchase probability at the customer level. The identification task is a transportability problem whose solution requires a parametric assumption on the differences between the conjoint study and real purchases. The causal effect is estimated using Bayesian methods that take into account the uncertainty of the data sources. The pricing decision is made by comparing the estimated posterior distributions of gross profit for different prices. The approach is demonstrated with simulated data resembling the features of real-world data.

Coauthors: Lauri Valkonen (University of Jyväskylä), Santtu Tikka (University of Jyväskylä), Jouni Helske (University of Jyväskylä and University of Turku)

W111: Causal mediation analysis for a survival outcome with longitudinal mediators, time-varying confounders and left truncation

Author: Katherine Holdsworth, Medical Statistics, London School of Hygiene & Tropical Medicine, London, United Kingdom

Abstract

Females with cystic fibrosis (CF) have worse survival than males, but the mechanisms behind this are not well understood. We used causal mediation analyses to investigate the extent to which the link between sex and survival is mediated through trajectories of disease progression markers. We used data from the UK CF Registry, which has longitudinal health status measures and mortality data on over 11,000 people since 1996. Some people are observed from birth, however a specific challenge was left-truncation due to people joining the data at older ages. We show how left-truncated data can be used in longitudinal mediation analysis for survival outcomes. The focus was on estimating the survival curve for females if their longitudinal mediator trajectory were shifted to be similar to that of males. We adapted the path-specific effects approach of Vansteelandt et al. 2019 to handle left-truncation, including by controlling for birth year and allowing the mortality hazard to depend on mediator and confounder histories only via more recent values. With lung health trajectory as the mediator we found that the female survival probability at age 30 would be 3.3 percentage points higher if their trajectory had been shifted to that of males.

Coauthors: Stijn Vansteelandt, Applied Mathematics, Computer Science and Statistics, Ghent University, Ghent, Belgium Diana Bilton, National Heart & Lung Institute, Imperial College, London, United Kingdom Nicholas Simmonds, Adult Cystic Fibrosis Centre, Royal Brompton Hospital & Imperial College, London, United Kingdom Ruth Keogh, Medical Statistics, London School of Hygiene & Tropical Medicine, London, United Kingdom

W113: A L-infinity Norm Counterfactual and Synthetic Control Approach

Author: Le Wang, AAEC/Virginia Tech, USA

Abstract

This paper reinterprets the Synthetic Control (SC) framework through the lens of weighting philosophy, arguing that the contrast between traditional SC and Difference-in-Differences (DID) reflects two distinct modeling mindsets: sparse versus dense weighting schemes. Rather than viewing sparsity as inherently superior, we treat it as a modeling choice - simple but potentially fragile. We propose an L-infinity regularized SC method that combines the strengths of both approaches. Like DID, it employs a denser weighting scheme that distributes weights more evenly across control units, enhancing robustness and reducing overreliance on a few control units. Like traditional SC, it remains flexible and data-driven, increasing the likelihood of satisfying the parallel trends assumption while preserving interpretability. We develop an interior point algorithm for efficient computation, derive asymptotic theory under weak dependence, and demonstrate strong finite-sample performance through simulations and real-world applications.

Coauthors: Youhui Ye, Xin Xing (Virginia Tech)

W114: Challenges in Emulating a Target Trial with Time-to-Event Outcome in the Federalized Privacy-Preserving Infrastructure DataSHIELD: A Case Study evaluating Screening Colonoscopy

Author: Nadja Lendle, Leibniz Institute for Prevention Research and Epidemiology – BIPS

Abstract

Access to distributed observational data in epidemiology offers opportunities for federated analysis which increase sample size and power while ensuring a high level of data security. No Target Trial Emulation (TTE) with time-to-event outcome using sequential trials has yet been conducted for addressing causal questions in a federated setting. In particular, neither parametric nor non-parametric federated weighted Aalen–Johansen estimators have been applied. The challenge in implementing a federated TTE is that individual level data is protected and only aggregated information can be used centrally to calculate pooled results. We provide suitable methodological approaches for federated TTE with time-to-event outcome in a privacy-preserving federated way with little loss to efficiency or bias. In addition, we show that the required implementation of a TTE is feasible in a privacy-preserving way using DataSHIELD. This R-based architecture for federated analysis allows to set conditions under which data are protected individually for each analyst. For illustration of a federated TTE, we evaluate the effectiveness of screening colonoscopy using health claims data stored on four separate servers but analyzed jointly.

Coauthors: Timm Intemann (Leibniz Institute for Prevention Research and Epidemiology – BIPS), Vanessa Didelez (Leibniz Institute for Prevention Research and Epidemiology – BIPS and Faculty of Mathematics & Computer Science, University of Bremen)

W115: Bayesian latent factor models with time-varying loadings for causal inference in the presence of unobserved confounding

Author: Luke Hardcastle, MRC Biostatistics Unit, University of Cambridge, UK

Abstract

Factor models are a popular tool in causal inference. When multiple units are measured at multiple time points they allow for unmeasured confounding to be accounted for via the product of a vector of time specific factors and a vector of unit specific factor loadings. A key assumption of this approach is that the unit specific loadings remain constant in time, implying the same between-unit covariance structure at all time points. When this assumption fails, estimates of causal effects will be biased. We propose a novel causal factor analysis model that incorporates a time-varying factor loadings matrix, allowing the between-unit covariance structure to vary with time. We utilise tensor decomposition techniques to retain a parsimonious and low-dimensional representation of the time-varying factor loadings, with B-splines used to ensure the evolution in time is smooth. Further, an extension of the Dirichlet-Laplace shrinkage prior is proposed that jointly infers both the number of factors in the model and the rank of the tensor decomposition. An efficient Gibbs sampler is derived for posterior computation and the method is used to assess the impact of German reunification on per-capita GDP.

Coauthors: Arkaprava Roy, Department of Biostatistics, University of Florida; Pantelis Samartsidis, MRC Biostatistics Unit, University of Cambridge

W116: Key Challenges in Partial Identification and Causal Bounds: The past, the future and the present

Author: Jakob Zeitler, Department of Statistics

Abstract

Partial identification (causal bounding) is an emerging method of sensitivity analysis in causal inference. Instead of aiming for full identification, partial identification provides a lower and upper bound on the effect of interest. These bounds are based on a weaker set of assumptions in comparison to a full identification procedure, making them in general easier to justify.

Despite recent progress, challenges remain. Partial identification has not been studied sufficiently in continuous spaces and does not scale well to larger causal graphs. This translates to limited uptake in applications where the value to the practitioner is often still opaque. For example, in many randomized trials with non-compliance, bounds often straddle zero, i.e. [−0.15, +0.35]. What should a physician or policymaker do? Deny? Approve? Defer?

We discuss the latest theoretical challenges, and how they relate to adoption challenges of partial identification. Partial identification has the potential to significantly improve sensitivity analysis of causal inferences, but this will require increased efforts in lowering the barriers to access, across theory, applications, software and case studies.

W121: Refining the Notion of No Anticipation in Difference-in-Differences Studies

Author: Marco Piccininni, Institute of Mathematics, École Polytechnique Fédérale de Lausanne, Switzerland

Abstract

We address an ambiguity in identification strategies using difference-in-differences, which are widely applied in empirical research, particularly in economics. The assumption commonly referred to as the “no-anticipation assumption” states that treatment has no effect on outcomes before its implementation. However, because standard causal models rely on a temporal structure in which causes precede effects, such an assumption seems to be inherently satisfied. This raises the question of whether the assumption is repeatedly stated out of redundancy or because the formal statements fail to capture the intended subject-matter interpretation. We argue that confusion surrounding the no-anticipation assumption arises from ambiguity in the intervention considered and that current formulations of the assumption are ambiguous. Therefore, new definitions and identification results are proposed.

Coauthors: Eric J. Tchetgen Tchetgen (Department of Statistics and Data Science, The Wharton School, University of Pennsylvania, USA); Mats J. Stensrud (Institute of Mathematics, École Polytechnique Fédérale de Lausanne, Switzerland)

W122: Nonparametric efficient estimation of the longitudinal front-door functional

Author: Marie Breum, Section of Biostatistics, University of Copenhagen, Denmark

Abstract

The front-door criterion is an identification strategy for the intervention-specific mean outcome in settings where the standard back-door criterion fails due to unmeasured exposure-outcome confounders, but an intermediate variable exists that completely mediates the effect of exposure on the outcome and is not affected by unmeasured confounding. The front-door criterion has been extended to the longitudinal setting, where exposure and mediator are measured repeatedly over time. However, to the best of our knowledge, applications of the longitudinal front-door criterion remain unexplored. This may reflect both limited awareness of the method and the absence of suitable estimation techniques. In this work, we propose nonparametric efficient estimators of the longitudinal front-door functional. The estimators accommodate high-dimensional mediators, are multiply robust, and allow for the use of data-adaptive methods for estimating nuisance functions while still providing valid inference. The theoretical properties of the estimators are showcased in a simulation study, and we apply the estimators to a trial of peanut allergy in infants.

Coauthors: Helene CW Rytgaard(a), Torben Martinussen(a), and Erin E Gabriel(a, b) (a) Section of Biostatistics, University of Copenhagen, Denmark (b) The Pioneer Centre for SMARTbiomed, University of Copenhagen, Denmark

W123: Targeted learning for estimating heterogeneous treatment effects using time-to-event data with competing risks

Author: Matthew Pryce, Dept of Statistical Science, UCL, England

Abstract

When exploring heterogeneous treatment effects, causal machine learning has grown in popularity, allowing researchers to use flexible estimation techniques and learn from complex, high-dimensional data. Major advances have been made in developing causal machine learning estimators for time-to-event outcomes, including causal survival forests and, more recently, surv-iTMLE, a targeted learning based estimator. However, important practical gaps remain, particularly their ability to appropriately account for competing events. Competing events arise when alternative outcomes preclude the event of interest from occurring, and if not accounted for, can mislead causal conclusions. In this work we show how surv-iTMLE, which estimates the differences in conditional survival probabilities, can be extended to handle competing events. We outline the derivation of this extension, and present simulations demonstrating its performance relative to naïve approaches that ignore competing events. We also emphasise the practical implementation of this estimator, providing guidance on modelling choices required to fit the estimator, as well as the choices required to obtain interpretable estimates.

Coauthors: Karla Diaz Ordaz, Dept of Statistical Science, UCL, England

W124: Individual-level uncertainty of causal predictions: introducing the causal effective sample size

Author: Doranne Thomassen, Department of Biomedical Data Sciences, Leiden University Medical Center, The Netherlands

Abstract

Causal prediction algorithms have been proposed to support individual decision making. In view of algorithm trustworthiness, uncertainty around these predictions and the plausibility of underlying assumptions should be evaluated; and made transparent to end users.

We developed the causal effective sample size (CESS), expressing causal prediction uncertainty as the number of similar individuals that the predicted outcome is effectively based on. We explored connections between CESS and the assumptions of exchangeability and positivity, and developed methods to estimate CESS for causal predictions derived from several causal learners. Our methods were applied to a medical dataset, leading to a prototype of CESS in diabetes care.

Adjustment variables affected CESS, thereby relating to the causal assumptions. Between-learner differences in CESS pointed to tradeoffs between modeling assumptions and positivity. We observed large between-individual differences in CESS in the diabetes prototype.

CESS contributes to an individual-level evaluation of prediction uncertainty and causal assumptions, improving the transparency of causal predictions.

Funding: Dutch Research Council (NWO) Grant ID 10.61686/DFECP93059.

Coauthors: Daniala Weir[2] Marleen Kunneman[3,4] Nan van Geloven[1]

[1] Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, The Netherlands [2] Division of Pharmacoepidemiology and Clinical Pharmacology, Department of Pharmaceutical Sciences, Utrecht University, Utrecht, The Netherlands [3] Department of Public Health and Primary Care, Leiden University Medical Center, Leiden, The Netherlands [4] Knowledge and Evaluation Research Unit, Mayo Clinic Rochester, Rochester, MN, USA

W125: Algorithmic syntactic causal identification

Author: Dhurim Cakiqi, Kodamai

Abstract

Causal identification in causal Bayes nets (CBNs) is an important tool in causal inference allowing the derivation of interventional distributions from observational distributions where this is possible in principle. However, most existing formulations of causal identification using techniques such as d-separation and do-calculus are expressed within the mathematical language of classical probability theory on CBNs. However, there are many causal settings where probability theory and hence current causal identification techniques are inapplicable such as relational databases, dataflow programs such as hardware description languages, distributed systems and most modern machine learning algorithms. We show that this restriction can be lifted by replacing the use of classical probability theory with the alternative axiomatic foundation of symmetric monoidal categories. This allows a purely syntactic algorithmic description of general causal identification by a translation of recent formulations of the general ID algorithm through fixing.

Coauthors: Max A Little, University of Birmingham

W126: Causal Discovery Grounding and the Naturalness Assumption

Author: Ricardo Silva, Statistical Science, UCL

Abstract

Consider the case where data is collected across multiple sites, with a subset of them being targeted for a randomised trial. The task is to estimate causal effects where trials did not take place, using observational data. We would like to correct erroneous implications of assumptions that exploit observational data by using randomised sites. While a correction function could be learned, site-to-site variability compromise how effective this transfer can be. We leverage causal discovery to propose a complementary alternative: when applying causal discovery to the observational sample from a randomised site to estimate a causal adjustment set, we provide a method to quantify how violations of faithfulness might explain possible mismatches. We then transfer effect bounds implied by violations of faithfulness to get conservative bounds of causal effects on test sites. Assuming an implicit distribution of problems across sites, this relaxes measure-theorical justifications of faithfulness into an empirically calibrated “naturalness” assumption to describe what it means for faithfulness to fail. We discuss how this can lead to more conservative claims on the validity of causal discovery and its uses in causal effect estimation.

Coauthors: Jake Fawkes (UCL), Zikai Shen (UCL), Andreas Koukorinis (UCL), Jordan Penn (KCL), David Watson (KCL)

T01: Rescue when causal inference fails: formalizing causal triangulation as a reasoning process based on formal logical systems and causal assumptions

Author: Keling Wang, Department of Epidemiology, Erasmus MC University Medical Center Rotterdam, The Netherlands

Abstract

Causal triangulation aims to learn about a target causal estimand by comparing multiple studies targeting the same or similar estimands with distinct identification strategies. It is useful when all studies have questionable assumptions, and allows us to learn more about the target than had we considered each study in isolation.

We formalize causal triangulation as a reasoning process that relies on formal logical systems and takes causal identification studies and falsification strategies as inputs. This reasoning aims to answer a triangulation query with specific goals, and requires different types of logical systems independent of questionable assumptions.

We discuss two types of queries, their related propositional and probabilistic logical systems, and the reasoning based on them: (1) estimation, that only requires a system with deductive rules and unverifiable premises; and (2) explanation, that requires a system allowing abductive reasoning and often probabilistic. This formalization works under different causal models, provided they are complete and sound. We provide two querying and reasoning approaches that lead to (1) multi-source partial identification and (2) explanation of study estimate discordance or concordance.

Coauthors: Jeremy Labrecque. Department of Epidemiology, Erasmus MC University Medical Center Rotterdam, The Netherlands

T02: The effect of antibiotic escalation in critically-ill hematologic and oncologic patients: A target trial emulation using electronic health records data

Author: Kevin Kopp, Laboratory for Clinical Research and Real-World Evidence, Department of Hematology & Stem Cell Transplantation - West German Cancer Center - University Hospital Essen, Germany

Abstract

Sepsis is the most common cause of death in hospitalized patients. Its cornerstone is the use of broad-spectrum antibiotics, yet their excessive use has resulted in antimicrobial resistance. Guidance on optimal antibiotic treatment for cancer patients with non-improvement to first line therapy remains limited. Thus we aim to determine the causal effect of various commonly used antibiotic treatment escalations on hospital mortality for cancer patients. Study base were patients admitted to a cancer ward of the University Hospital Essen between 2020-25. We compared whether (a) the switch to a carbapenem or (b) addition of a gram-positive targeting antibiotic to a broad-spectrum penicillin resulted in better survival. To address time-dependent treatment decisions and delayed treatment initiations, we applied a clone-censor-weighting approach in order to emulate a target trial. Preliminary results (N=3219) indicate a significant increase in hospital mortality for a switch to a carbapenem (OR=1.60, 95%-CI 1.30-1.96). In contrast, the addition of gram-positive antibiotics was associated with no increased mortality during the hospital stay (OR=1.15, 95%-CI 0.87-1.54). Our results question the current antibiotic escalation strategy.

Coauthors: Gernot Pucher (1,2); Aman Deep (1,2); Rodrigo Huerta Gutiérrez de Velasco (3); Till Rostalski (2); Felix Nensa (2); Christian Reinhardt (1); Jens Kleesiek (2); Christopher Martin Sauer (1,2); 1: Laboratory of Clinical Research and Real-World Evidence, Department of Hematology & Stem Cell Transplantation, West German Cancer Center, University Hospital Essen, Essen, Germany; 2: Institute for Artificial Intelligence in Medicine, University Hospital Essen, Essen, Germany; 3: Institute for Public Health, Charité, Berlin, Germany

T03: Longitudinal Targeted Maximum Likelihood Estimation (LTMLE) for Two-Stage Designs

Author: Kirsten Landsiedel, Graduate Group in Biostatistics, University of California, Berkeley, United States

Abstract

We consider longitudinal survival studies where individuals are followed until failure or random end-of-study and some are lost to follow-up (LTFU) before either event is recorded. When LTFU is informative, artificial censoring can bias estimation of marginal survival or causal effects on survival. Resampling designs address this by taking an additional sample of those LTFU (and not censored) and retrospectively collecting their outcome, representing a special case of two-stage designs. Exploiting this connection, we extend inverse weighted targeted minimum loss-based estimation (IPCW-TMLE) to longitudinal resampling designs, introducing IPCW-LTMLE, and show that estimating and targeting known sampling probabilities can further reduce variance by 30-45%. This problem can also be viewed as a causal inference data structure with intervention nodes at the end-of-study and sampling indicator and propose a novel LTMLE. In simulations, both estimators have 30-45% lower variance than the commonly used weighted Kaplan-Meier. Since the observed data arise from a bivariate censored version of an underlying full data process, fully efficient closed-form estimators do not exist, motivating our highly efficient but closed-form approaches.

Coauthors: Maya Petersen (Graduate Group in Biostatistics, University of California, Berkeley, United States), Mark van der Laan (Graduate Group in Biostatistics, University of California, Berkeley, United States)

T04: A Bayesian, Graph-Free Approach to Identifying Transportable Causal Effects

Author: Konstantina Lelova, Department of Mathematics and Applied Mathematics, University of Crete, Greece

Abstract

Transporting causal information across populations is a critical challenge in clinical decision-making. Causal modeling provides criteria for identifiability and transportability, but these require knowledge of the causal graph, which rarely holds in practice. We propose a Bayesian method that combines observational data from the target domain with experimental data from a different domain to assess the identifiability and transportability of Z-specific (conditional) causal effects by identifying s-admissible backdoor sets, without requiring knowledge of the causal graph. We prove that if such a set exists, we can always find one within the Markov boundary of the outcome, narrowing the search space, and we establish asymptotic convergence guarantees for our method. We develop a greedy algorithm that reframes transportability as a feature selection problem, selecting conditioning sets that maximize the marginal likelihood of experimental data given observational data. In simulated and semi-synthetic data, our method correctly identifies transportable causal effects and improves causal effect estimation.

Coauthors: Sofia Triantafillou - Department of Mathematics and Applied Mathematics, University of Crete, Greece | Gregory F. Cooper - Department of Biomedical Informatics, University of Pittsburgh, USA

T05: Mapping the Interplay of Clinical and Genetic Factors in Intracranial Aneurysms Using Additive Bayesian Networks

Author: Lea Bührer, Centre for Computational Health, Institute of Computational Life Sciences, Zurich University of Applied Sciences, Wädenswil, Switzerland/ Department of Mathematical Modeling and Machine Learning, University of Zurich, Zürich, Switzerland

Abstract

Unruptured intracranial aneurysms (UIAs) affect 3–5% of the population and are increasingly detected through widespread imaging. Although rupture is rare, the consequences are severe, and both invasive and conservative management carry risks. This underscores the need for quantitative tools that integrate individual risk profiles and the interplay of phenotypic and genotypic factors.

We aim to clarify the causal structure underlying UIA aetiology, progression, and management. Using additive Bayesian networks (ABNs) implemented in the R package abn, we model probabilistic dependencies among clinical, demographic, and genetic variables, complemented by regression analyses. This approach has been applied to two clinical IA cohorts.

Analyses demonstrate that ABNs can integrate longitudinal data, multiple outcomes, and causal-like dependencies in an interpretable way. Current extensions incorporate genetic risk scores to refine estimates of susceptibility and rupture risk. As a next step, we are looking at a Swiss reference cohort without aneurysms to quantify and correct selection bias in the clinic-based samples, improving inference and model generalizability.

Coauthors: Mark K. Bakker: Department of Neurology and Neurosurgery, University Medical Center Utrecht Brain Center, Utrecht University, Utrecht, the Netherlands

Abiram Sandralegar: Division of Neurosurgery, Geneva University Hospitals, Geneva, Switzerland

Sandrine Morel: Division of Neurosurgery, Geneva University Hospitals, Geneva, Switzerland/ HUG NeuroCentre, Geneva University Hospitals, Geneva, Switzerland

Philippe Bijlenga: Division of Neurosurgery, Geneva University Hospitals, Geneva, Switzerland

Reinhard Furrer: Department of Mathematical Modeling and Machine Learning, University of Zurich, Zürich, Switzerland

Sven Hirsch: Centre for Computational Health, Institute of Computational Life Sciences, Zurich University of Applied Sciences, Wädenswil, Switzerland

Georg R. Spinner: Centre for Computational Health, Institute of Computational Life Sciences, Zurich University of Applied Sciences, Wädenswil, Switzerland

T06: Causal Variance Decompositions for Measuring Health Inequalities

Author: Lin Yu, Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada

Abstract

In hospital profiling, comparing hospital performance across sociodemographic groups may reveal inequalities/disparities in healthcare delivery or outcomes. Such disparities may result from causal factors, such as unequal hospital access or differential care provided/effect modification. Recent literature has proposed effect decomposition methods to quantify such inequalities, but these approaches are typically limited to pairwise comparisons. We consider polytomous exposure and group membership variables and decompose the observed outcome variance as the quantity of interest. We formulate a new causal variance decomposition framework, which attributes the observed variation to eight components including new terms characterizing modification of the hospital effect by group membership, hospital access, and correlation between the two. We address exposure-induced mediator-outcome confounding via path-specific and interventional effects. We adapt the decomposition to survival outcomes via restricted mean survival time. We discuss the causal interpretation, formulate parametric and nonparametric estimators and evaluate these through simulation. Finally, we illustrate our method using cancer care delivery data from the SEER database.

Coauthors: Zhihui (Amy) Liu, Department of Biostatistics, Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada;

Kathy Han, Department of Radiation Oncology, Princess Margaret Hospital, University Health Network, Toronto, ON, Canada;

Olli Saarela, Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada

T07: Empirical Investigations of the Causal Notion of Unidimensionality in Psychometric Testing

Author: Lorenzo Gasparollo, EPFL

Abstract

Notions of causality are foundational to psychometric testing in the social sciences because they determine how the latent construct that a test purports to measure (e.g., well-being) relates to the observed item scores that constitute the test (e.g., a score of 4 on the item “How satisfied are you with your life?”, measured on a 1–5 scale). Yet, conventional procedures in test design do not explicitly incorporate causal notions.

In this talk, I reinterpret unidimensionality of latent constructs as a causal assumption, a historically important property that ensures all items in a given test relate to the same latent construct of interest. Using data from the Health and Retirement Study, I analyze multiple tests derived from the Satisfaction With Life Scale and, building on VanderWeele et al. (2022), show how canonical psychometric properties are insufficient to both distinguish and guide the development of tests endowed with a unidimensional causal interpretation compatible with that of prevailing frameworks for causal inference.

Thus, these findings empirically demonstrate the challenging nature of, and offer new directions for, methodological work at the interface of causal inference and psychometrics.

Coauthors: Lena Vogel (University of Geneva)

T08: A new sensitivity analysis and improved bounds for the average treatment effect with instrumental variables

Author: Luca Locher, University of Zurich (Epidemiology, Biostatistics and Prevention Institute)

Abstract

Policy decisions often depend on whether the average treatment effect (ATE) exceeds a specific cutoff. In instrumental variable (IV) settings, point identification of this parameter requires strong assumptions that are often implausible. In such cases, researchers may target bounds of the ATE or shift their attention to the local (L) ATE, which is identifiable under weaker assumptions. Both approaches have well-known drawbacks: assumption-free bounds are often too wide to be informative, and the substantive relevance of the LATE is frequently questioned. We propose two new strategies that allow researchers to maintain their focus on the ATE: A simple sensitivity analysis with easily interpretable parameters and improved bounds that enable identification of the sign of the ATE under reasonable assumptions. Conceptually, we derive these results by finding thresholds of the latent quantities that prevent point identification of the ATE. Using our results, researchers can decide, in a principled manner, whether the sign is inferable under assumptions they deem credible. The proposed strategies are easy to implement, and we illustrate their value with several real-data examples from economics and medicine.

Coauthors: Mats J. Stensrud (EPFL)

T09: Bayesian Supervised Causal Clustering for Heterogeneous Treatment Effects

Author: Luwei Wang, University of Edinburgh, School of Informatics

Abstract

Finding patient subgroups with similar characteristics is crucial for personalized decision-making in various disciplines such as healthcare and policy evaluation. While most existing approaches rely on unsupervised clustering methods, there is a growing trend toward using supervised clustering methods that identify operationalizable subgroups in the context of a specific outcome of interest. We propose Bayesian Supervised Causal Clustering or BaSiCCs, with treatment effect as outcome to guide the clustering process under the assumptions of unconfoundedness and randomization. BaSiCCs identifies homogenous subgroups of individuals who are similar in their covariate profiles as well as their treatment effects. We validate and compare BaSiCCs on simulated datasets to demonstrate its ability in handling multiple mixed covariates and nonlinear relationships between covariates and potential outcomes. We also apply BaSiCCs to a real-world dataset from the third International Stroke Trial to assess the practical usefulness of the framework.

T10: Pitfalls of inappropriate auxiliary variable selection for inverse probability weighting in causal inference

Author: Liping Wen, MRC Integrative Epidemiology Unit, Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, United Kingdom.

Abstract

Inverse probability weighting (IPW) is used to adjust for selection bias in causal inference. IPW model guidance is to include auxiliary variables related to both outcome and missingness, to make missing at random (MAR) more plausible. We investigated how auxiliary variables affect bias of IPW via a simulation study. We simulated three scenarios: (1) auxiliary is a collider for causes of outcome and missingness, outcome MAR; (2) auxiliary predicts both missingness and outcome, outcome MNAR, (3) auxiliary predicts only missingness, outcome MNAR. In scenario 1, bias is induced when including a collider in IPW. In scenario 2, including the auxiliary can reduce or increase bias, depending on size and direction of associations with other variables. In scenario 3, including an auxiliary variable related only to missingness amplifies the existing bias, with greater bias when this variable is more strongly related to missingness. We recommend excluding any auxiliary variable that is related to missingness but not to the outcome. Whether a potential auxiliary variable causes missingness and outcome or is a collider for causes of missingness and outcome, cannot be ascertained from the observed data. Therefore, drawing a DAG is crucial.

Coauthors: Full names: Apostolos Gkatzionis, Kate Tilling and Rachael Hughes. Affiliation: MRC Integrative Epidemiology Unit, Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, United Kingdom.

T11: Estimating Local and Spillover Effects in the Presence of Interference and Latent Spatial Confounders

Author: Mafalda Batalha, European University Institute

Abstract

Causal inference in spatial settings is often complicated by unmeasured spatial confounding and interference across units. To address these issues, this paper adapts the framework of Papadogeorgou&Samanta(2024) to a panel-data setting. I define causal estimands capturing direct and spillover effects, clarify the assumptions needed for identification and explore a Bayesian approach for estimation. By allowing outcomes to depend on local and neighbouring treatment levels and exploiting spatial structure, the framework separates causal effects from latent spatial confounders.The methodology is illustrated using short-term rental expansion and its effects on residential migration across neighbourhoods in Florence.Latent neighbourhood characteristics jointly shape rental activity and migration patterns, while activity in one area may affect outcomes in nearby areas through spillovers.These features generate strong spatial dependence in both treatment intensity and outcomes, making standard panel regression approaches inadequate.The application shows how the proposed framework isolates causal variation that is orthogonal to unobserved spatial factors, offering a principled approach to causal inference in spatial social science settings.

T12: Causal-Audit: A Framework for Risk Assessment of Assumption Violations in Time-Series Causal Discovery

Author: Marco A. Ruiz, Institute for Systems and Robotics, Instituto Superior Técnico - Lisbon, Portugal

Abstract

Time-series causal discovery methods assume requirements such as stationarity, regular sampling, and causal sufficiency. When those are violated, structure leaning results are confident but misleading. False positive rates have been shown to exceed 90% for critical datasets. We introduce Causal-Audit, an open-source platform for pre-discovery assumption-checking as calibrated risk assessment. The software computes effect-size diagnostics across five assumption families, aggregates them into calibrated risk scores with credible intervals, and implements an abstention-aware decision policy that recommends methods (PCMCI+, Granger, LPCMCI, Transfer Entropy) only when evidence supports reliable inference. Risk bounds explicitly quantify robustness to assumption violations, determining when methods are safe to apply.

Evaluation on three benchmarks demonstrates that Causal-Audit effectively prevents unreliable causal discovery: near-perfect risk calibration, 62% false positive reduction when methods are recommended, and 83% abstention precision, identifying when violations preclude reliable inference. We release the benchmark dataset and open-source implementation, operationalizing assumption reporting for causal discovery.

Coauthors: Miguel Arana-Catania (Digital Scholarship at Oxford, University of Oxford); David R. Ardila (Jet Propulsion Laboratory, California Institute of Technology); Rodrigo Ventura (Institute for Systems and Robotics, Instituto Superior Técnico, University of Lisbon)

T13: Identifying heterogeneous treatment effects in antidepressant response using causal machine learning in UK Biobank

Author: Margaux Törnqvist, Theremia & NeuroDiderot, Inserm - Université Paris Cité, France

Abstract

In major depressive disorder, the high rate of non-response to antidepressants (AD) reflects inter-individual heterogeneity in treatment benefit-risk profiles. Despite this clinical challenge, causal machine learning (ML) has been rarely applied to large-scale observational data, due to complex confounding structure, outcome definition, and data quality limitations.

We investigated the potential and limitations of causal ML to identify and validate heterogeneous treatment effects in AD response using UK Biobank data from first-time AD users. Treatment response was defined as no AD switch within six months of treatment initiation. Causal forests were trained to estimate conditional average treatment effects (CATE) and to stratify patients into high and low responder subgroups. Effect modifiers were explored using variable importance and their relation with CATEs, while heterogeneity was assessed using best linear predictor tests.

Neuroticism, adiposity, age, and BMI emerged as potential effect modifiers of differential AD response. Beyond these findings, we highlighted key challenges in applying causal ML to observational data and illustrated its promise and current limitations for advancing personalised treatment strategies.

Coauthors: Mathieu Even (Premedical, Inria - Inserm, France) Chloé Geoffroy (Theremia, France)

T14: Identification of stochastic interventions in multi-state models

Author: Mark Bech Knudsen, Section of Biostatistics, Department of Public Health, University of Copenhagen, Denmark

Abstract

Many causal questions can be framed in terms of interventions on the transition intensities of a multi-state model, but despite being well established tools in classic time-to-event analysis, multi-state models are rarely used for causal inference. As an example, patients with esophageal cancer might develop dysphagia at a random time post-baseline, in which case some invasive procedure is required. One might then be interested in the effect of intervening on the choice of treatment. We give a novel graphical criterion, based on the state transition graph, for identification of the effect of such interventions in the presence of unmeasured variables possible affecting the intensities. This works for general multi-state models. We argue that this provides a more practical understanding of what confounding means in a multi-state setting. We illustrate application of the criterion in multiple concrete models.

Coauthors: Helene Charlotte Wiese Rytgaard, Section of Biostatistics, Department of Public Health, University of Copenhagen, Denmark Torben Martinussen, Section of Biostatistics, Department of Public Health, University of Copenhagen, Denmark

T15: A General Framework for Comparative Simulation Studies in Causal Inference

Author: Mathilde Dicaire-Cartier, Institute for Medical Information Processing, Biometry, and Epidemiology, Faculty of Medicine, LMU Munich, Germany and Munich Center for Machine Learning, Munich, Germany

Abstract

With the rapid growth of estimation methods in causal inference, it is becoming increasingly challenging to identify those that are truly the most appropriate for a given situation. To evaluate these methods, researchers often rely on simulation studies. However, these studies are often designed to promote newly proposed methods and use scenarios that highlight the strengths of the method being evaluated. In particular, simulation settings are generally tailored to situations where the proposed method excels and competing methods are less appropriate, resulting in over-optimistic statements regarding its performance.

We propose a general framework for designing neutral comparative simulation studies that assess the performance of estimation methods in (longitudinal) causal inference. Unlike existing comparative simulation frameworks that focus on specific data applications, our approach is designed to compare methods from a methodological point of view. To this end, we first present a review of recent literature on simulation practices. Then, we use the Causal Roadmap to introduce components to consider in comparative analyses, such as the different types of simulations, causal scenarios, estimands, and performance measures.

Coauthors: Michael Schomaker (Department of Statistics, LMU Munich, Germany, Centre for Infectious Disease Epidemiology and Research, University of Cape Town, Cape Town, South Africa, Munich Center for Machine Learning, Munich, Germany) and Anne-Laure Boulesteix ( Institute for Medical Information Processing, Biometry, and Epidemiology, Faculty of Medicine, LMU Munich, Germany and Munich Center for Machine Learning, Munich, Germany)

T16: Causal Discovery in Dynamic Networks

Author: Melania Lembo, Faculty of Informatics, Università della Svizzera italiana, Switzerland

Abstract

In network science, as in other applied research, key questions often require distinguishing causal mechanisms from context-dependent associations. One formalization of causality uses invariance: a relationship between a target variable and a set of covariates is causal if it remains stable across diverse environments or interventions. Originally developed for linear models, this approach has been extended to more flexible frameworks, including generalized linear models, where causal invariance is expressed through the Pearson risk. In this talk, we extend causal invariance to dynamic networks, focusing on Relational Event Models (REMs), which describe interactions over time between social actors. Because the target—the dynamic network—and potential causal drivers are stochastic processes, dependences among them are characterized using conditional local independence. Leveraging a connection between REMs and logistic regression, we extend Pearson risk invariance to this dynamic setting, producing a causal discovery algorithm that only requires data from a single observational environment. We show how this approach can identify, and distinguish, important causal mechanisms in network science, such as social influence and homophily.

Coauthors: Veronica Vinciotti (University of Trento), Ernst C. Wit (Università della Svizzera italiana)

T17: Integrating Compositional Oral Microbiome Data into Causal Inference Frameworks: Estimating Conditional Average Treatment Effects for Personalised Head and Neck Cancer Therapy

Author: Michele Salerno, CMON Lab, Data Science Unit, Fondazione IRCCS Istituto Nazionale dei Tumori, Milan, Italy

Abstract

Mounting evidence suggests that the oral microbiome can influence treatment response in head and neck cancer (HNC) yet integrating microbiome data into causal frameworks remains challenging due to their high dimensionality, sparsity, and compositional nature. We developed a framework combining unsupervised clustering of baseline oral microbiota to estimate Conditional Average Treatment Effects (CATE) of chemoradiotherapy versus radiotherapy on progression-free survival in 181 HNC patients, for subgroups with different microbial profiles identified using hierarchical clustering. There is some suggestion that treatment effects differ by baseline oral microbiota composition: patients with high microbial diversity showed superior outcomes with radiotherapy alone, suggesting microbiota profiling could guide treatment de-escalation and spare chemotherapy toxicities. Our work addresses methodological gaps in integrating compositional high-dimensional data into causal inference frameworks, highlighting necessary extensions for handling compositional covariates in treatment effect estimation and demonstrating the feasibility of microbiota-based treatment personalization.

Coauthors: Alfieri Salvatore (Medical Oncology, Fondazione IRCCS Istituto Nazionale dei Tumori, Milan, Italy); Iacovelli Nicola Alessandro (Radiation Oncology, Fondazione IRCCS Istituto Nazionale dei Tumori, Milan, Italy); Franceschini Marzia (Radiation Oncology, Fondazione IRCCS Istituto Nazionale dei Tumori, Milan, Italy); Rancati Tiziana (CMON Lab, Data Science Unit, Fondazione IRCCS Istituto Nazionale dei Tumori, Milan, Italy); De Cecco Loris (Experimental Oncology, Fondazione IRCCS Istituto Nazionale dei Tumori, Milan, Italy); Keogh Ruth (Department of Medical Statistics, London School of Hygiene & Tropical Medicine, United Kingdom); Iacovacci Jacopo (CMON Lab, Data Science Unit, Fondazione IRCCS Istituto Nazionale dei Tumori, Milan, Italy)

T18: Combining data from randomised controlled trials and real-world data to optimise radiotherapy outcomes

Author: Miriam Ngarega, Division of Informatics, Imaging & Data Sciences, School of Health Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Oxford Road, Manchester M13 9PL, United Kingdom

Abstract

Causal methods estimate effects beyond randomised controlled trials (RCTs) using real world data (RWD) amid biases e.g. confounding and selection. A new frontier combines RCTs and RWD to use their complementary strengths. I will share my PhD’s aim to contribute to the field and elicit feedback, toward application in radiotherapy (RT), where RCTs and RWD alone fall short. Narrow eligibility limits the external validity of RT RCTs. While generalisability and transportability methods can reduce this bias, intervention application differs between trial and target populations. RT RCTs are done in specialised centres, with learning curves for new techniques, causing differences in RT delivery between RCTs and routine care, termed trial engagement. My PhD will examine how RCTs can be generalised in the presence of trial engagement. Combining RCTs and RWD can help study RT’s progressive, irreversible long-term (late) effects, which occur long after treatment and are costly to evaluate in RCTs. Late effects are observed irregularly and informatively in RWD. RCTs can inform confounding structures for application in RWD. My PhD will examine how RCTs and RWD can be combined to study late effects with irregularly and informatively observed RWD

Coauthors: Eliana M. Vasquez Osorio, PhD - Division of Cancer Sciences, School of Medical Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Oxford Road, Manchester M13 9PL, United Kingdom

Gareth Price, PhD - Division of Cancer Sciences, School of Medical Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Oxford Road, Manchester M13 9PL, United Kingdom and Department of Clinical Oncology, The Christie NHS Foundation Trust, Wilmslow Road, Manchester M20 4BX, United Kingdom

Matthew Sperrin, PhD - Division of Informatics, Imaging & Data Sciences, School of Health Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Oxford Road, Manchester M13 9PL, United Kingdom

T19: Neural Variance-Efficient Adjustment for High-Dimensional Causal Inference

Author: Nadja Rutsch, VU Amsterdam

Abstract

Variable selection for causal inference typically prioritizes unbiasedness by identifying a valid adjustment set. However, the conditions required for unbiasedness (e.g. ignorability) are untestable and often infeasible in practice, particularly when data is high-dimensional or confounders are unmeasured. Furthermore, strictly valid adjustment sets can be suboptimal in terms of treatment effect Mean Squared Error (MSE), as they ignore the bias-variance trade-off. We propose VERA (Variance-Efficient Relaxed Adjustment), a modular variable selection component for differentiable causal estimators. VERA replaces hard variable selection with a differentiable stochastic gating mechanism, learning inclusion probabilities for each covariate. The selection is optimized to minimize the variance of the Efficient Influence Function (EIF), which naturally prioritizes variables explaining outcome variation. This implicitly controls certain confounding bias while explicitly minimizing variance. We demonstrate that VERA consistently improves the finite-sample estimation error of various base estimators (e.g., TARNet, R-Learner, S-Learner) in settings where traditional methods fail, such as high-dimensional and small-sample regimes.

Coauthors: Sara Magliacane, Stéphanie van der Pas

T20: Causal Identification via DAG and ADMG Simplification in Wearable Parkinson’s Disease Studies

Author: Nawfal Zakar, School of Computer Science, University of Birmingham, Birmingham, UK.

Abstract

Parkinson’s disease (PD) is a neurodegenerative condition that is characterised by both motor and non-motor symptoms. During the last few years, wearable devices have started to be used in the clinical practice for monitoring patients’ PD-related motor symptoms. However, most studies to date rely on associations that may be confounded and clinically misleading when causal structure is ignored. The purpose of this study is to introduce non-parametric causal modelling using Directed Acyclic Graphs (DAGs) and Acyclic Directed Mixed Graphs (ADMGs) to identify causal effects of interest using large-scale population data. Although many standard techniques causal effect estimators exist, such as inverse probability weighting, causal bootstrapping, and average treatment effect estimators, their use depends on the structure of the graph. We demonstrate how complex causal graphs involving latent confounding and redundant variables can be simplified using latent projection and graphical reduction while preserving the causal effect of interest. For PD, we demonstrate a plausible model which results in a DAG satisfying the backdoor criterion, thereby establishing the ignorability assumption and enabling the use of standard adjustment methods.

Coauthors: MaxA.Little / School of Computer Science, University of Birmingham, Birmingham, UK. Abdulrahman Aloyayri / School of Computer Science, University of Birmingham, Birmingham, UK.

T21: Unraveling time-varying causal effects of multiple exposures: integrating Functional Data Analysis with Multivariable Mendelian Randomization

Author: Nicole Fontana, MOX, Department of Mathematics, Politecnico di Milano, Italy / Health Data Science Research Centre, Human Technopole, Italy

Abstract

Traditional Mendelian Randomization (MR) assumes constant causal effects, ignoring that many risk factors exhibit dynamic effects across the life course. Moreover, multiple exposures often act simultaneously or influence each other through mediation pathways. We introduce Multivariable Functional Mendelian Randomization (MV-FMR)[1], a novel framework that extends univariable[2] to jointly model multiple time-varying exposures by integrating functional data analysis within the MR framework. Through extensive simulations examining horizontal pleiotropy and mediation pathways, MV-FMR consistently recovered true time-varying effects, outperforming univariable approaches. We applied the method to UK Biobank data, to investigate systolic blood pressure (SBP) and body mass index (BMI) effects on coronary artery disease across ages 50-70. Both exposures showed strongest effects at ages 50-60, attenuating beyond age 60. BMI effects attenuated in multivariable versus univariable analysis, confirming partial mediation through BP pathways. This framework enables precision prevention strategies by revealing how multiple risk factors jointly influence disease risk across the lifespan. [1] arxiv.org/abs/2512.19064 [2] doi.org/10.1002/sim.10222

Coauthors: Francesca Ieva (MOX, Department of Mathematics, Politecnico di Milano, Italy / Health Data Science Research Centre, Human Technopole, Italy), Luisa Zuccolo (Health Data Science Research Centre, Human Technopole, Italy), Emanuele Di Angelantonio (Health Data Science Research Centre, Human Technopole, Italy) and Piercesare Secchi (MOX, Department of Mathematics, Politecnico di Milano, Italy)

T22: Nonparametric Bounds in Competing Risks Settings

Author: Nils Leitzinger, Department of Biostatistics, University of Oslo, Norway

Abstract

In competing risk settings, point-identification of causal effects often fails due to unmeasured confounding. When point-identification is not possible, symbolic bounds can sometimes be derived which, when estimated, provide a range of causal effect values consistent with observed data. We derive nonparametric symbolic bounds for causal differences in the cause-specific cumulative incidences in randomised experiments with confounded competing events, for controlled direct, natural direct and indirect, and separable direct and indirect effects. Using linear programming, we obtain closed-form bounds that explicitly incorporate a competing risk constraint. We compare the width analytically and coverage of the causal null via simulation between these novel bounds and previously derived bounds. We demonstrate the practical utility of our approach in a prostate cancer trial where prostate cancer death and cardiovascular death are competing events that are confounded by important unmeasured lifestyle and health factors such as current and past smoking status.

Coauthors: Erin Gabriel and Michael Sachs, Section of Biostatistics, University of Copenhagen, Denmark

T23: A super learner for heterogeneous net treatment effects

Author: Eva-Maria Oeß, University of Cologne, Department of Economics, Germany

Abstract

We introduce a super learner for estimating heterogeneous net treatment effects under unit-varying outcome and cost effects. Our approach is designed for optimal assignment of a binary treatment that induces a cost–benefit trade-off. The net effect and its underlying outcome and cost components are characterized by unknown functional complexity, which our ensemble explores in a data-driven manner. Directly targeting the net effect performs well when the estimand is simpler than the outcome and cost effects individually. Separately learning both effects and subsequently aggregating them into the net effect is advantageous when the underlying components share comparable functional complexity that translates into a similarly complex target estimand. A hybrid learning strategy succeeds in intermediate settings. Our ensemble nests all approaches and selects the winner by minimizing empirical risk. In a simulation study, we consider scenarios in which each approach dominates the others and show that the ensemble improves precision across most settings. Finally, we use data from a nonprofit organization to analyze the net effect of a fundraising campaign aimed at increasing pledge payments while mitigating donor attrition.

Coauthors: Lennard Maßmann, University of Duisburg-Essen

T24: Clustering and Pruning in Causal Data Fusion

Author: Otto Tabell, University of Jyväskylä

Abstract

Data fusion, the process of combining observational and experimental data, can enable the identification of causal effects that would otherwise remain non-identifiable. Although identification algorithms have been developed for specific scenarios, do-calculus remains the only general-purpose tool for causal data fusion, particularly when variables are present in some data sources but not others. However, such approaches may encounter computational challenges as the causal graph grows in complexity. Consequently, there exists a need to reduce the size of such models while preserving the essential features. For this purpose, we propose pruning and clustering as preprocessing operations for causal data fusion. We generalize earlier results on a single data source and derive conditions for applying pruning and clustering in the case of multiple data sources. We give sufficient conditions for inferring the identifiability or non-identifiability of a causal effect in a larger graph based on a smaller graph and show how to obtain the corresponding identifying functional for identifiable causal effects. Examples from epidemiology and social science demonstrate the use of the results.

Coauthors: Santtu Tikka (University of Jyväskylä), Juha Karvanen (University of Jyväskylä)

T25: Quantifying Practical Overlap in Causal Inference via KL Projections

Author: Park Geondo, Department of Statistics, Seoul National University, South Korea

Abstract

Assessing overlap is an essential task in causal inference, as limited overlap undermines identifiability and leads to unstable estimators. In practice, overlap is most often evaluated through visual inspection of propensity score distributions. Although overlap is frequently discussed in connection with methods that improve estimation, such as trimming or overlap weighting, there is limited work on directly measuring how much overlap is present. We propose a likelihood-based framework for quantifying practical overlap using Kullback-Leibler projections. The approach defines a common component as the distribution that best approximates the treated covariate distribution while remaining representable as a mixture component of the control population. This construction yields a smooth, distribution-level characterization of overlap that avoids explicit density estimation. An overlap parameter is defined as the largest fraction of the control population that can be retained while maintaining sufficient distributional proximity to the treated group. Simulations and empirical examples with known overlap challenges illustrate how the proposed measure provides a principled early-stage diagnostic prior to causal effect estimation.

Coauthors: Juyeon Kim and Kwonsang Lee, Department of Statistics, Seoul National University, South Korea

T26: Quadratic Optimal Matching: A Design-Based Framework for Control Subset Geometry

Author: Park Sangyung, Department of Statistics, Seoul National University, South Korea; Quantum·AI research center, Korea Quantum Computing Co., Ltd., South Korea

Abstract

Matching is a core design tool in causal inference, yet many existing methods encode comparability indirectly—either through local pairwise assignments or through positive semidefinite (PSD) objectives tied to RKHS. As a result, the geometry of the selected control group is largely implicit, limiting explicit control over overlap and stability. We propose Quadratic Optimal Matching (QOM), a design-based framework that formulates matching as a quadratic optimization problem over control subset indicators. QOM directly specifies control–control geometry through a quadratic form, while treated–control proximity enters through linear terms and constraints. Kernel optimal matching arises as a special case when the quadratic objective is PSD; allowing indefinite quadratic forms expands the design space beyond kernel-based formulations and enables explicit subset-level design control. The resulting problems admit a quadratic unconstrained binary optimization (QUBO) representation, making QOM well suited for quantum annealing and hybrid quantum–classical solvers. We show that non-PSD objectives yield qualitatively different matched designs, highlighting the role of subset geometry and computation in design-based causal inference.

Coauthors: Hajoung Lee - Institute for Data Innovation in Science, Seoul National University, South Korea Kwonsang Lee - Department of Statistics, Seoul National University, South Korea

T27: Learning long-term outcomes of new interventions in a target population using short-term outcomes from randomized trials

Author: Qingyang Shi, Groningen Research Institute of Pharmacy

Abstract

New methods have been proposed to learn long-term outcomes by combining short-term outcomes from a randomized trial and confounded long-term outcomes from a observational source. Those methods require using treatment information on long-term outcomes in the observational source. However, we consider the treatment information is not always available in the observational sources, especially seeking for a long-term decision for any new interventions or medications proved by trials (e.g., health technology assessment and health economics evaluation). In this paper, we study identifying the long-term potential outcome means (or treatment effects) in the observational source without any treatment information involved (i.e., uniform use of control). We analyzed the situations with and without unmeasured confounding between short-term and long-term outcomes. We proposed the estimation methods including iterative regression, forward simulation, and influence-function estimating equation. A simulation study was performed for the finite sample performance, and an application was illustrated.

Coauthors: Issa Dahabreh, Harvard T.H. Chan School of Public Health

T28: Non-parametric recovery of causal diffusion mechanisms from equilibrium observations

Author: Richard Schwank, Statistics group, Technical University of Munich, Germany

Abstract

When studying a causal system, it can be beneficial to consider its time evolution, particularly if there are feedback loops that are expected to unfold at small timescales. However, in many practical scenarios (e.g., for single-cell gene expression data), observations can only be obtained at a single point in time. We present methodology to recover the system’s time-infinitesimal transition mechanism from such cross-sectional data. Precisely, we assume the system follows a time-homogeneous diffusion process that has reached an equilibrium distribution at observation time. Further, we assume the causal mechanism is fully described by the diffusion drift, is acyclic, and its causal graph is known. In this setting, we show that the full causal mechanism, i.e., the drift function, can be non-parametrically identified under mild conditions, and we provide a practical algorithm to solve this challenging inverse problem.

Coauthors: Mathias Drton, Statistics group, Technical University of Munich, Germany

T29: Intervention-Centric Causal Discovery in Business Processes

Author: Romain Bérard, Research Center for Information Systems Engineering (LIRIS), Faculty of Economics and Business, KU Leuven, Belgium

Abstract

Business processes record sequences of activities and their attributes in event logs. Identifying which decisions and variables causally influence downstream KPIs (e.g. throughput time) and how these drivers differ across decision points requires causal discovery. Event logs challenge standard causal discovery methods due to irregular, decision-rich control flow over a high number of activity sequences, mixed data types, alongside potential unobserved organizational factors which obscure past interventions. We propose an intervention-centric approach for process mining: at each intervention point, local causal relations among state variables, treatment, and KPI are learned. Control-flow based background knowledge is encoded to constrain and stabilize discovery. We benchmark mixed-data techniques spanning constraint-, score-, functional-, and hybrid-based families under selection bias and latent confounding. Experiments use two policy-driven simulators with ground truth (SimBank, SimBPIC17), varying sample size, noise variables, and partially randomize decisions to weaken policy-driven confounding. We assess structural quality and runtime, highlighting which assumptions and method classes are most reliable under this setting.

Coauthors: Johannes De Smedt: Research Center for Information Systems Engineering (LIRIS), Faculty of Economics and Business, KU Leuven, Belgium Jochen De Weerdt: Research Center for Information Systems Engineering (LIRIS), Faculty of Economics and Business, KU Leuven, Belgium

T30: Identifiability in continuous Lyapunov models with heterogeneous noise

Author: Sarah Lumpp, Chair of Mathematical Statistics, Technical University of Munich, Germany

Abstract

Stationary distributions of multivariate diffusion processes provide a new approach to probabilistic modelling of causal systems in statistics and machine learning. By assuming each observation to arise as a one-time cross-sectional snapshot of a temporal process in equilibrium, they allow one to naturally model dependence structures that may include feedback loops. Specifically, the graphical continuous Lyapunov model consists of Gaussian stationary distributions of multivariate Ornstein-Uhlenbeck processes. Their covariance matrices are parametrized as solutions to the continuous Lyapunov equation, with sparsity assumptions on the processes’ drift matrices represented by a directed graph. Allowing for coordinate-wise different scaling of the driving noise process, we show generic parameter identifiability for certain classes of directed graphs, including tree-structured graphs. Additionally, we prove sufficient conditions for generic non-identifiability and fully characterize the generically identifiable structures in small acyclic graphs up to 6 nodes.

T31: Difference-in-differences for mediation analysis using double machine learning

Author: Sarina Joy Oberhänsli, Department of Economics, University of Fribourg, Switzerland

Abstract

We propose a difference-in-differences framework with mediation for possibly multivalued discrete or continuous treatments and mediators, aimed at identifying the direct effect of the treatment on the outcome (net of effects operating through the mediator), the indirect effect via the mediator, and the joint effects of treatment and mediator, consistent with the framework of dynamic treatment effects. Identification relies on a conditional parallel trends assumption imposed on the mean potential outcome across treatment and mediator states, or (depending on the causal parameter) additionally on the mean potential outcomes and potential mediator distributions across treatment states. We propose ATET estimators for repeated cross sections and panel data within the double/debiased machine learning framework, which allows for data-driven control of covariates, and we establish their asymptotic normality under standard regularity conditions. We assess finite-sample performance in a simulation study and illustrate our approach in an empirical application to the US National Longitudinal Survey of Youth, estimating the direct effect of health insurance coverage on general health and the indirect effect operating through routine checkups.

Coauthors: Martin Huber, Department of Economics, University of Fribourg, Switzerland

T32: Randomization Tests for Model Selection in Causal Inference Under Network Interference

Author: Supriya Tiwari, Indian School of Business, Hyderabad, India

Abstract

Analysis of experimental data becomes challenging when the underlying population is connected by a network. An exposure mapping is a common tool used in the literature to define and estimate spillover effects. These mappings reduce the complexity of the estimand to a lower dimension, facilitating identifiability. It is assumed that this mapping is correctly specified, leaving the choice of the exposure mapping to the analyst. This makes estimators of the spillover effect, for example, the Horvitz-Thompson estimator, vulnerable to model-misspecification biases. While it has been demonstrated that these estimators are robust to certain controlled forms of misspecification, there is a lack of methodological advancements in empirically investigating appropriate exposure mappings. In this paper, we propose a novel design-based model specification framework for causal inference. Building on this, we develop a randomization testing procedure that can test for the correct specification of an exposure mapping model under network interference. We provide theoretical guarantees for the asymptotic validity of the proposed testing procedure. We establish the favorable power properties of our method via an extensive simulation study.

Coauthors: Supriya Tiwari and Pallavi Basu (Affiliation both- Indian School of Business, Hyderabad, India)

T34: A Mendelian Randomization Study of Alcohol Consumption, Socioeconomic Determinents, and Oropharyngeal Cancer

Author: Tareq Al-Ahdal, Section for Oral Health, Heidelberg Institute of Global Health, Heidelberg University, Germany

Abstract

Alcohol consumption is an established risk factor for oropharyngeal cancer, yet heterogeneity in risk across different drinking patterns remains poorly understood. We conducted Mendelian randomization analyses using genetic instruments from large-scale GWAS studies to investigate causal effects of distinct alcohol consumption patterns on risks of oropharyngeal cancers. We found that Higher alcohol intake frequency showed consistent positive associations with oral and pharyngeal cancer risk. In contrast, alcohol consumption with meals appeared protective; however, this association was completely attenuated after adjusting for household income, suggesting the apparent benefit reflects higher socioeconomic status rather than a true protective effect. The frequency-risk relationship remained robust after socioeconomic adjustment, supporting drinking frequency as an independent causal factor. Prevention efforts should prioritize reducing drinking frequency while recognizing that socioeconomic factors substantially confound observed associations, moving beyond total consumption toward pattern-specific guidance

Coauthors: Brenda Cabrera-Mendoza3,4, Stefan Listl¹, Renato Polimanti3,4,5,6,7

¹Section for Oral Health, Heidelberg Institute of Global Health, Heidelberg University Hospital, Medical Faculty Heidelberg, Heidelberg University, Heidelberg, Germany

3Department of Psychiatry, Yale University School of Medicine, New Haven, CT, USA 4Veteran Affairs Connecticut Healthcare System, West Haven, CT, USA 5Department of Chronic Disease Epidemiology, Yale University School of Public Health, New Haven, CT, USA 6Department of Biomedical Informatics & Data Science, Yale University School of Medicine, New Haven, CT, USA 7Wu Tsai Institute, Yale University, New Haven, CT, USA

T35: SLOPE and designing robust studies for generalization

Author: Xinran Miao, Department of Statistics, University of Wisconsin-Madison, United States

Abstract

A popular task in generalization is to learn about a new target population based on data from a source population. This task relies on conditional exchangeability, which assumes that the difference between the source and target populations is fully captured by observed variables. However, this assumption is often violated in practice and cannot be verified with data. Both concerns warrant the need for robust study designs that are inherently less sensitive to violation of the assumption. We propose SLOPE (Sensitivity of LOcal Perturbations from Exchangeability), a simple, intuitive measure that quantifies the sensitivity of a statistical functional to local violation of conditional exchangeability. SLOPE combines ideas from sensitivity analysis and derivative-based robustness measure from Hampel (1974). It quantifies the sensitivity of a design, which is composed of an estimand and a distribution, thereby guiding robust choices of estimands and source/target datasets. Further, SLOPE is shown to be the projection of the influence function onto the residual space, offering an accessible way to derive SLOPE. We illustrate the role of SLOPE in informing robust designs for generalization with a re-analysis of a multi-site experiment.

Coauthors: Jiwei Zhao (University of Wisconsin-Madison), Hyunseung Kang (University of Wisconsin-Madison)

T36: Personalized Treatment Decision-making Framework using Randomized Trials under Transportability with an Application to Semaglutide and Cardiovascular Events in Diabetes

Author: Yiling Zhou, Department of Epidemiology, University of Groningen, University Medical Center Groningen, the Groningen, Netherlands

Abstract

Decision-making aims to maximize expected welfare in a well-defined target population. However, in practice, target population is often ill-defined, or decisions informed by indirect evidence from RCTs or other sources, leading to suboptimal decisions. A well-defined target population is in representative observtional data, where the direct evidence depend on strong causal assumptions. We develop a causal decision-making framework under transportability that formulizes treatment choice as a causal optimization: among a rule set, choose the rule that maximizes the expected potential welfare of target population. We formulize causal assumptions and identify the optimal rule with maximum expected potential welfare by leveraging RCTs evidence. We provide estimation and inference methods for various data avaiablity settings. We show the robust decision-making under causal, model, and sampling uncertainty. As case study, four treatment rules—treat all, treat none, treat based on the guideline, and treat using a linear parametric rule—are compared in terms of population welfare for the use of thrombolytics (alteplase) in patients with acute ischemic stroke within 6 hours of onset.

Coauthors: Qingyang Shi, Groningen Research Institute of Pharmacy, University of Groningen. Issa J Dahabreh, Harvard T.H. Chan School of Public Health.

T37: Causal Representation Learning for Robust Treatment Classification from Multimodal Single-Cell Data

Author: Yasin Ibrahim, University of Oxford

Abstract

Understanding the effects of chemical and genetic perturbations is essential to elucidating how disease manifests and how it can be treated. While imaging captures morphological changes and gene expression measures transcriptional responses, current methods struggle to integrate these complementary data types. Existing multimodal approaches fail to generalize across experimental batches, suggesting they conflate treatment effects with technical artifacts. We extend the sparse variational autoencoder framework to separate treatment-induced effects into shared factors that appear in both imaging and transcriptomics, and modality-specific factors unique to each measurement type.

Using the scGeneScope dataset of transcriptomes and single-cell images across 28 chemical treatments, we learn causal representations where shared latent factors correspond to core biological mechanisms that manifest across both modalities. By focusing on shared causal factors, our model achieves robust performance and generalizes better to new experimental batches than standard fusion approaches. Our framework provides a principled approach to mechanism-of-action discovery that explicitly accounts for the multimodal structure of cellular responses.

Coauthors: Yasin Ibrahim, Luka Kovačević

T38: Likelihood-Based Nonparametric Causal Discovery under Latent Confounding

Author: Yurou Liang, Chair of Mathematical Statistics, Technical University of Munich, Germany

Abstract

Causal discovery with latent confounding amounts to learning an acyclic directed mixed graph (ADMG) over observed variables and unmeasured confounders. Existing approaches often rely on discrete combinatorial search, which becomes computationally prohibitive for large-scale problems. Recent methods alleviate this challenge by introducing a differentiable acyclicity and bow-freeness constraint. However, these methods either assume linear structural equations or model nonlinear causal relationships involving both observed variables and confounders. The latter approximates intractable posteriors via variational inference, resulting in complex objectives and making performance more challenging. In this work, we propose a nonlinear causal model with correlated errors that encode latent confounding. We establish structural identifiability of the proposed model under bow-free graphs and parameter identifiability under ancestral graphs. This model yields a simple maximum-likelihood objective. We further develop a differentiable optimization scheme incorporating constraints for ADMG discovery. Experiments on synthetic and real-world datasets demonstrate that our method achieves competitive performance compared to relevant baselines.

Coauthors: Nils Sturma, Chair of Biostatistics, École Polytechnique Fédérale de Lausanne (EPFL), Switzerland Mathias Drton, Chair of Mathematical Statistics, Technical University of Munich, Germany

T39: Design-Based Inference for Attribution to Causal Interaction

Author: Zion Lee, Department of Statistics, College of Natural Sciences, Seoul National University, Republic of Korea

Abstract

Understanding causal interaction between multiple treatments is central to many scientific questions, yet standard practice typically relies on regression-based interaction terms whose interpretation depends on modeling assumptions. We propose a design-based framework for attributing outcomes specifically to causal interaction, without imposing sharp null hypotheses or outcome regression models. Our estimand, the interaction-attributable effect, counts the number of jointly treated units whose observed outcomes are attributable to the interaction between treatments, extending Rosenbaum’s attributable-effect perspective. We develop novel randomization-based tests that account for uncertainty arising from unobserved potential outcomes. The proposed tests combine multiple contingency-table–based statistics using Wald-type procedures with analytically derived covariance structures, yielding valid inference under complete randomization and matched designs. Simulation studies demonstrate that our approach achieves higher power than conventional regression-based interaction tests. Our framework provides a principled design-based alternative for causal interaction analysis in studies with binary outcomes.

Coauthors: Kwonsang Lee, Department of Statistics, College of Natural Sciences, Seoul National University, Republic of Korea

T41: Weak instrumental variables due to nonlinearities in panel data: A Super Learner Control Function estimator

Author: Monika Avila Marquez, University of Geneva

Abstract

A triangular structural panel data model with additive separable individual-specific effects is used to model the causal effect of a covariate on an outcome variable when there are unobservable confounders with some of them time-invariant. In this setup, a linear reduced-form equation might be problematic when the conditional mean of the endogenous covariate and the instrumental variables is nonlinear. The reason is that ignoring the nonlinearity could lead to weak instruments. As a solution, we propose a panel data model composed of a linear structural equation with a nonlinear reduced form equation. Identification is obtained using instrumental variables and a control function approach. We propose an estimator that we call Super Learner Control Function estimator (SLCFE). The estimation procedure is composed of two steps and sample splitting. First, we estimate the control function using a super learner. In the second step, we use the estimated control function to control for endogeneity in the structural equation. The estimator is consistent and asymptotically normal with a parametric convergence rate. The SLCFE differs from both the plug-in IV estimator and a naive plug-in 2SLS estimator.

T42: A Balanced Framework for Heterogeneous Effects in Difference-in-Differences: Identification and Semiparametric Estimation of Group-Specific Treatment Effects

Author: Nadja van ’t Hoff, University of Amsterdam

Abstract

Understanding how treatment effects vary across groups is central to policy evaluation. In difference-in-differences designs, subgroup and triple-interaction analyses are common but often lack a clear target estimand or yield only conservative inference on group differences. Moreover, comparisons of group-specific average treatment effects on the treated (ATTs) can be misleading when groups differ in covariate composition. We propose the Balanced Group Average Treatment Effect on the Treated (BGATT), a new estimand that isolates treatment effect heterogeneity from differences in covariate distributions under standard parallel trends assumptions. BGATT enables meaningful comparisons of group-specific treatment effects. We derive an influence-function-based estimator with an efficient score that accommodates flexible machine-learning methods and is square-root-n consistent and asymptotically normal, allowing valid inference on both group-specific BGATTs and their differences. Simulation studies show good finite-sample performance. An empirical application to the U.S. Medicaid expansion illustrates how BGATT clarifies whether observed differences across gender reflect true treatment effect heterogeneity or covariate composition.

Coauthors: Nora Bearth (Sunrise) and Torben S. D. Johansen (University of Southern Denmark)

T43: Identifiability and Estimation in Linear Non-Gaussian Steady-State Models

Author: Niels Richard Hansen, Department of Mathematical Sciences, University of Copenhagen, Denmark

Abstract

Recent results have demonstrated the usefulness of stochastic dynamic models as a framework for learning causal structures with cycles. We introduce the class of Linear Non-Gaussian Steady-State Models (LiNG-SSMs) as a dynamic alternative to Linear Non-Gaussian Acyclic Models (LiNGAMs). LiNG-SSMs generate multivariate observational and interventional distributions known as operator self-decomposable (OSD) distributions, which can be interpreted as cross-sectional steady-state distributions from a stochastic dynamical system. The dynamical system allows for a more natural interpretation of cycles than for LiNGAMs and related linear structural equation models. We characterize observational and interventional cumulants for LiNG-SSMs, which are given by higher-order continuous Lyapunov equations. They are notably different from the well known LiNGAM cumulants. Using tools from algebraic geometry, we prove generic identifiability for all connected graphs. Finally, we showcase their practical usage through simulations and applications to real data for example gene regulatory networks.

Coauthors: Cecilie Olesen Recke, Alex Markham (University of Copenhagen), Jeffrey Adams (University of Copenhagen)

T44: Trek-Based Parameter Identification for Linear Causal Models With Arbitrarily Structured Latent Variables

Author: Nils Sturma, Chair of Biostatistics, EPFL, Switzerland

Abstract

In this talk, I will present a novel criterion to certify whether causal effects are identifiable in linear structural equation models with latent variables. In contrast to previous identification methods, we do not restrict ourselves to settings where the latent variables constitute independent latent factors (i.e., to source nodes in the graphical representation of the model). Our novel latent-subgraph criterion is a purely graphical condition that is sufficient for identifiability of causal effects by rational formulas in the covariance matrix. While it targets effects involving observed variables, our new criterion is also useful for identifying effects between latent variables, as it allows one to transform the given model into a simpler measurement model for which other existing tools become applicable. This is joint work with Mathias Drton.

Coauthors: Mathias Drton (Chair of Mathematical Statistics, Technical University of Munich, Germany)

T45: Regression Discontinuity Design for Time-to-Event Outcomes and Informative Censoring

Author: Noah Noah Stegehuis, Division of Primary Care and Population Health, Department of Medicine, Stanford University, USA

Abstract

In medical observational studies, time-to-event outcomes (such as survival times) are frequently used to assess the effect of medical interventions. Two challenges often arise in this setting: (i) treatment guidelines are based on diagnostic test results (e.g., blood pressure above a certain threshold), creating a regression discontinuity design (RDD); and (ii) outcomes are not always fully observed, since some individuals are censored due to drop out or competing events. The time of censoring can be correlated with factors influencing survival times. While RD methods provide a principled framework for causal identification under (i), standard survival models typically fail to address (ii). We propose a novel RD approach for time-to-event outcomes using an accelerated failure time (AFT) model with a copula to explicitly capture dependence between survival and censoring. This framework enables consistent estimation of treatment effects on survival outcomes in the presence of informative censoring. In a simulation study we demonstrate that the method yields unbiased causal effect estimates, offering a flexible tool for causal inference in survival settings where threshold-based treatment assignment and dependent censoring coexist.

T46: Development and validation of a risk score to predict risk of adverse events associated with systemic anti-cancer treatment in late-stage lung cancer: LUng Cancer Improved Decisions (LUCID)

Author: Ofran Almossawi, Great Ormond Street Hospital / UCL Institute of Child Health UK

Abstract

Patients with advanced non-small cell lung cancer (ANSCLC) will suffer serious adverse events caused by systemic anticancer therapy (SACT) resulting in hospital attendance. Estimates of a person’s risk of adverse events under different treatment options could help guide treatment decisions. However, developing such predictions from observational data is challenging due to confounding.

We developed and internally validated a prediction under interventions model for emergency/unplanned admission within 30 days of SACT initiation in ANSCLC adult patients diagnosed from 2016-2019, using linked UK data. Four treatment regimens were considered: A (double agent chemotherapy), B (chemotherapy + immunotherapy), C (immunotherapy), D (single-agent chemotherapy). Prediction models enabling estimation of risk under each treatment were fitted using logistic regression, with control for confounders. Predictive performance was evaluated using K-fold cross validation.

There were 18,000 patients, of whom 36% had the outcome. Regimens A and C tended to give the lowest risk under our model. Predictive performance was modest, and further work will explore alternative methods for development of prediction under interventions models in this setting.

Coauthors: Ofran Almossawi(1,2), Luke Steventon (3,7), Ruth Keogh (4), Karla Diaz-Ordaz (5), Zhe Wang (6), Andrew Challenger (6), David Dodwell (6), Martin Forster (7), Kenneth KC Man (3), Li Wei (3), Sebastian Masento (7), Adam Januszewski (8), Pinkie Chambers (3,7)

1 Great Ormond Street Hospital, London 2 UCL Institute of Child Health, London 3 Research Department of Practice and Policy, UCL School of Pharmacy, London 4 Medical Statistics Department (Faculty of Epidemiology and Population Health), London School of Hygeine & Tropical Medicine, London 5 Department of Statistical Science, UCL, London 6 Nuffield Department of Population Health, University of Oxford 7 University College London Hospital NHS Foundation Trust, London 8 Barts Cancer Centre, St Bartholomew’s Hospital, London

T47: Are Bayesian Networks Typically Faithful?

Author: Philip Boeken, Department of Mathematics, VU Amsterdam

Abstract

Faithfulness is a common assumption in causal inference, often motivated by the fact that the faithful parameters of linear Gaussian and discrete Bayesian networks are typical, and the folklore belief that this should also hold for other classes of Bayesian networks. We address this open question by showing that among all Bayesian networks over a given DAG, the faithful Bayesian networks are indeed `typical’: they constitute a dense, open set with respect to the total variation metric. For Bayesian networks parametrised by conditional exponential families, we show that under mild regularity conditions, the faithful parameters constitute a dense, open set and the unfaithful parameters have Lebesgue measure zero, extending the existing results for linear Gaussian and discrete Bayesian networks. For certain classes of Bayesian networks with uniformly equicontinuous and uniformly bounded densities, the faithful Bayesian networks are open or dense in the weak topology. The topological properties of conditional independence and faithfulness imply the existence of a constraint-based causal discovery algorithm which is consistent on an open and dense, maximal set of Bayesian networks.

Coauthors: Patrick Forré (University of Amsterdam), Joris Mooij (University of Amsterdam)

T48: Recovering Counterfactually Fair Rankings under Latent Mediators

Author: Philip A. Boustani, Department of Statistics, LMU Munich, Munich Center for Machine Learning (MCML), Germany

Abstract

Automated decision-making systems trained on historical data can preserve or amplify inequalities affecting disadvantaged groups. Bias-transforming approaches seek to enforce substantive equality in line with emerging regulatory requirements under EU non-discrimination law, yet their definitions are inherently causal and edge-specific. From a causal perspective, we show that existing bias-transforming methods implicitly assume full observability of the underlying DAG, an assumption rarely met in practice. We therefore study substantive equality when a causal mediator is unobserved. Since exact counterfactuals are not identifiable, we derive Posterior Rank Preservation, a Bayesian procedure for constructing fair counterfactual quantile predictions that preserve individuals’ latent ranks under interventions on the protected attribute. We investigate deep generative proxy-variable frameworks for recovering counterfactually fair rankings in this latent-mediator setting and derive uncertainty quantification via confidence bands with near-nominal frequentist coverage. Under specific conditions, we prove that our estimators yield Bayes-optimal predictions for fair counterfactual quantiles.

Coauthors: Ludwig Bothmann (LMU Munich, MCML)

T101: Policy relevance of causal quantities in networks

Author: Sahil Loomba, Department of Mathematics, Imperial College London, United Kingdom

Abstract

Under network interference, a wide range of causal estimands have been proposed using exposure mappings. We show that many of these estimands can be expressed through one of two distinct orders of averaging over units and treatment assignments, and that these operations generally do not commute. The more common ordering yields estimands resembling averages of outcomes by exposure level, whose contrasts admit interpretations as unit-level causal effects. We argue, however, that such estimands are irrelevant, or at least insufficient, for optimal policy choice over treatment assignments. The alternative ordering produces estimands that do not always admit an interpretation as summaries of unit-level causal effects, but are sufficient for evaluating and optimizing policies. This reveals a fundamental tension between individual-level causal contrasts and policy-relevant social objectives in the presence of interference. We show that the expected average outcome — and its contrast across policies — admits a unique representation under both orderings. As a result, it reconciles individual-level causal reasoning with policy optimization, providing a unifying framework for causal inference and decision-making in networked interventions.

Coauthors: Dean Eckles, MIT Sloan School of Management, United States

T102: Reasoning about Fairness under Latent Selection

Author: Sourbh Bhadane, University of Amsterdam

Abstract

Selection bias in observational data can often lead naive causal reasoning conclusions astray. In this work, we examine the question of discrimination, as defined by causal fairness notions, in a given selected subpopulation where we assume that selection happens through truncation. With minimal assumptions about selection mechanisms, we apply recent techniques that model latent selection and analyze existing causal fairness notions that are based on graphical, counterfactual and interventional queries, under selection bias. In addition, we propose alternate causal fairness notions that are more easily interpreted and study the logical relations between these notions. For all notions that we consider we propose statistical tests that use only (conditional) observational data. Finally, we apply this selection-bias-aware analysis to a part of the COMPAS dataset and map out the conclusions under different possibilities of selection.

Coauthors: Joris M. Mooij, University of Amsterdam, Onno Zoeter, Booking.com

T103: Monotone Missing Data: A Blessing and a Curse

Author: Santtu Tikka, University of Jyväskylä

Abstract

Monotone missingness is commonly encountered in practice when a missing measurement compels another measurement to be missing. Because of the simpler missing data pattern, monotone missing data is often viewed as beneficial from the perspective of practical data analysis. However, in graphical missing data models, monotonicity has implications for the identifiability of the full law, i.e., the joint distribution of actual variables and response indicators. In the general nonmonotone case, the full law is known to be nonparametrically identifiable if and only if specific graphical structures are not present. We show that while monotonicity may enable the identification of the full law despite some of these structures, it also prevents the identification in certain cases that are identifiable without monotonicity. The results emphasize the importance of proper treatment of monotone missingness in the analysis of incomplete data.

Coauthors: Juha Karvanen, University of Jyväskylä

T104: A Unified Framework for Target Trial Emulation: Combining Sequential Trials and Proximal Causal Inference for Time-to-Event Outcomes

Author: Sarwar Mozumder, Statistics and Data Science Innovation Hub, Biostatistics R&D, GSK, London, UK

Abstract

Evidence has emerged on vaccines improving outcomes of secondary diseases. For example, an association has been identified between herpes zoster vaccination and time-to-dementia onset. However, this association could be confounded by health-seeking behaviour (HSB). Interest in this case, is in the effect of vaccine administration, rather than the effect of eligibility through roll-out of a national vaccine program. Therefore, immortal-time bias may arise due to delays between vaccine eligibility and uptake. Furthermore, the assumption of conditional exchangeability is threatened by unmeasured confounding, such as HSB, which can bias effect estimates. Proximal causal inference (PCI) corrects for unmeasured confounding in time-to-event analyses using negative control exposures and outcomes in a two-stage process. The sequential trials approach recasts the time-varying problem into a multiple point-exposure setting by creating a series of “mini-trials”. This enables seamless unification of both methods and the application of PCI to a time-varying exposure problem for survival endpoints using additive hazards models. We illustrate this approach using the above case study as motivation.

Coauthors: Thomas Drury (Statistics and Data Science Innovation Hub, Biostatistics R&D, GSK, London, UK) Adrian Mander (Statistics and Data Science Innovation Hub, Biostatistics R&D, GSK, London, UK)

T105: Simulating from marginal structural models for hazards, cause-specific hazards and subdistribution hazards using general copulas

Author: Shaun Seaman, MRC Biostatistics Unit, University of Cambridge, UK

Abstract

Seaman and Keogh (Biometrical Journal 2024) proposed a method for simulating data compatible with a marginal structural model (MSM) for the hazard of a survival time outcome. Here, I propose two extensions of this method. First, Seaman and Keogh favoured the use of a Gaussian copula, because this enables the function of the confounder history through which the hazard of failure depends on confounders to be interpreted as a risk score. Here, I describe how this interpretation can be preserved even when a non-Gaussian copula is used. Second, I extend Seaman and Keogh’s method to allow simulation of data compatible with a MSM for a cause-specific or subdistribution hazard of failure in the presence of a competing event.

Reference: SR Seaman and RH Keogh. Simulating data from marginal structural models for a survival time outcome. Biometrical Journal, 66:e70010, 2024

T106: Simulating Data for Marginal Structural Survival Time Models with a Markov Assumption on Confounder Effects on the Outcome

Author: Shizhe Li, MRC Biostatistics Unit, University in Cambridge, UK

Abstract

Simulation studies are a key component of the evaluation of statistical methods. Simulating data that are compatible with marginal structural models is not straightforward (Evans and Didelez, 2024, JRSS-B). For understandability of the data simulating model in epidemiological settings, it may be desirable to assume the outcome depends only on the most recent time-varying confounders. A recently proposed simulation approach uses a pair-copula construction (Lin et al., 2025). This method is computationally efficient but makes such a Markov assumption difficult to impose. It has been briefly conjectured that this could be achieved by using Gaussian pair-copulas. We demonstrate that using a Gaussian pair-copula construction alone is insufficient in general to impose the Markov property. We show, however, that by imposing a linear Gaussian structural equation model for the data-generating mechanism of time-varying confounders, the Markov property can be imposed. This work provides a better understanding of this conjecture.

Coauthors: MRC Biostatistics Unit, University in Cambridge, UK

T108: Verifying the Precision of Effect Size Estimators from Cluster Randomised and Multisite Trials: A Simulation Study Using the Educational Platform Trial Simulator, epts

Author: Sungkyung Kang, Department of Mathematical Sciences, Durham University, UK

Abstract

To assess the effectiveness of educational interventions, randomised controlled trials are typically run across multiple schools. Multilevel models are fitted to account for pupil nesting within schools and the trial design – mostly cluster randomised or multisite trials – from which effect size estimates are obtained. However, several conventions exist for estimating effect sizes (total/within and unconditional/conditional) as well as for model fitting (ANCOVA-type or longitudinal models, standardised or non-standardised outcomes). This research examines the sensitivity, precision and robustness of effect size estimates to these choices.

We conduct a simulation study using the R package epts, an educational trial simulator that generates synthetic datasets for randomised controlled trials using customised parameters. The simulation settings are motivated by design parameters corresponding to published Education Endowment Foundation reports. This approach aims to identify optimal trial design characteristics that improve the precision and magnitude of estimated effect sizes for an educational outcome such as pupil attainment. The findings are expected to provide practical reference for the design of future educational trials.

Coauthors: Sungkyung Kang (Durham University), Germaine Uwimpuhwe (Durham University), Akansha Singh (Durham University and Newcastle University), Mohammad Sayari (Durham University), Nasima Akhter (Teesside University), and Jochen Einbeck (Durham University)

T109: Identifying Heterogeneous Treatment Effects When Treatment Timing Is Latent: A Two-Stage Approach Using Change-Point Detection and Staggered Difference-in-Differences

Author: Tom Hackenberg, Mathematics and Computer Science, TU Eindhoven, The Netherlands

Abstract

Standard diff-in-diff assumes treatment timing is administratively determined. When feature launches trigger gradual adoption, conflating rollout with unit-specific treatment onset generates severe attenuation bias that masks heterogeneous treatment effects. We formalize this as a latent treatment timing problem: users adopt at heterogeneous rates, but standard DiD assigns treatment uniformly at launch. We develop a two-stage identification strategy. First, change-point detection anchored to the exogenous launch identifies unit-specific adoption timing from consumption patterns, recovering latent onset Gi. Second, Callaway-Sant’Anna staggered DiD uses estimated treatment dates to estimate group-specific effects while preserving parallel trends. Applying this to article pageviews from a Dutch media platform’s content vertical launch, naive DiD produces a null result (ATT = -1.24, p=0.51), obscuring all heterogeneity. Our approach reveals variation across content verticals, with substitution effects from LATE = -2.3 to -12.8 pageviews/day. This framework generalises to contexts where treatment diffuses endogenously: digital platform features, technology adoption, policy rollouts, or behavioural interventions with gradual diffusion.

T110: Harnessing hybrid control designs to fill knowledge gaps in drug development: applications of transportability and longitudinal modeling

Author: Tim Morris, Novartis Pharma AG/Statistics Methodology, Switzerland

Abstract

In drug development, randomized clinical trials are designed to answer comparative questions on clinical outcomes essential for internal or regulatory decision making. Nevertheless, additional questions arise in internal decision making or health technology assessment dossiers that cannot be answered using the trial data or main randomized period alone. Those questions relate to the treatment effect for longer term outcomes than what was observed in the main trial period, or for a broader population than what was recruited in the comparator arm in the trial. Hybrid control designs, augmenting internal controls in the randomized trial with external data, have the potential to help answer some of these questions. Causal inference methods can formalize and help address these questions.

This presentation will motivate this problem with a few case studies from drug development. Then, review causal inference methods in transportability and longitudinal modeling to answer comparative questions beyond the main trial period or population. The presentation will also discuss practical considerations for implementation, planning, feasibility, and reporting using these methods and interpreting the results.

Coauthors: Izem, Rima (Novartis), Dunn, Robin (Novartis); Lyu, Tianmeng (Novartis); Tian, Yuan (Novartis)

T111: Randomization Inference with Concentration Inequalities

Author: Tobias Freidling, Institute of Mathematics, École polytechnique fédérale de Lausanne, Switzerland

Abstract

Randomization inference is one of the most reliable methods of analysing experimental data as it does not require any potentially erroneous modelling assumptions. Yet, two common criticisms hinder a more widespread adoption: (1) The classical Fisher’s null hypothesis of “no effect whatsoever” is often deemed of little interest; (2) Randomization inference does not provide external validity beyond the investigated units.

In this work, we address both criticisms via the use of concentration inequalities and thus do not need to rely on asymptotic arguments. To estimate the sample average treatment effect (SATE), we employ the Horvitz-Thompson estimator and construct confidence intervals by inverting novel Hoeffding- or Bernstein-type concentration inequalities. For external validity, we extend this approach to the population average treatment effect (PATE) where we need to additionally account for the randomness in the recruitment/sampling of units from a finite super-population. In this work, we primarily focus on (stratified) Bernoulli trials and completely randomized experiments but also discuss extensions to adaptive treatment assignment schemes and confidence sequences.

T112: A Stable and Bounded Estimator for Generalizing Conditional Average Treatment Effects via i-TMLE

Author: Vanessa Rodriguez, Statistical Science, UCL, UK

Abstract

Unrepresentative trial samples can bias Conditional Average Treatment Effect (CATE) estimates when generalizing to target populations. Current strategies, such as the DR-learner, regress inverse-weighted pseudo-outcomes, often yielding estimates outside the valid bounds of the treatment effect, especially with small trial participation probabilities.

We extend a targeted learning estimator (i-TMLE) that creates a pseudo-outcome which lies within the space of the outcome. This estimator is oracle efficient under reasonable conditions and its stability is not affected by small sampling probabilities. We utilize the Highly Adaptive Lasso (HAL) for the second-stage regression, a nonparametric estimator whose fast convergence rates ensure pointwise asymptotic normality. This enables the construction of valid confidence intervals for non-pathwise differentiable parameters such as the CATE.

We validate finite-sample performance via simulations involving overlap violations between the covariates of the sample and the target population. We then illustrate the method by generalizing cardiovascular outcomes from the PIONEER 6 trial to the SOUL population. Results show that i-TMLE achieves greater stability than the DR-Learner.

Coauthors: Brieuc Lehmann, Karla Diaz-Ordaz

T114: Confounder-adjusted statistical association representation through partial copulas

Author: Vinícius Litvinoff Justus, Statistics department, Carlos III University of Madrid, Spain

Abstract

In this work, we revisit the notion of partial copula, originally introduced to test conditional independence, highlighting its capability to represent the dependence between two random variables after removing their dependence with a covariate. In particular, we discuss the case where this covariate is a confounder. We give a formal proof for an analytical representation of the partial copula previously discussed in the literature, and then we use it to prove new results showing how the dependence between a treatment and an outcome (conditionally to a confounder) constrain the form of the partial copula. Finally, a simulation study is conducted to illustrate the results and to show the potential of partial copula as a way to describe covariate-adjusted statistical dependence. This highlights the potential of the method to remove confounding effects and to recover the true signal of a causal effect.

Coauthors: Felipe Fontana Vieira (Ghent University, Belgium)

T115: Individualised treatment effects estimation with composite treatments and composite outcomes

Author: Vinod Kumar Chauhan, Computer and Information Sciences, University of Strathclyde, UK

Abstract

Estimating individualised treatment effect (ITE) - that is the causal effect of a set of variables, referred to as composite treatments, on a set of outcome variables of interest, referred to as composite outcomes, for a unit from observational data - remains a fundamental problem in causal inference with applications across disciplines. Previous work in ITE estimation is limited to simple settings, like single treatments and single outcomes. This hinders their use in complex real-world scenarios; for example, consider studying the effect of different ICU interventions, such as beta-blockers and statins for a patient admitted for heart surgery, on different outcomes of interest such as atrial fibrillation and in-hospital mortality. The limited research into composite treatments and outcomes is primarily due to data scarcity for all treatments and outcomes. To address the above challenges, we propose a novel and innovative hypernetwork-based approach, called H-Learner, to solve ITE estimation under composite treatments and composite outcomes, which tackles the data scarcity issue by dynamically sharing information across treatments and outcomes. Our empirical analysis demonstrates the effectiveness of the proposed approach.

Coauthors: Lei Clifton, Gaurav Nigam and David A Clifton from the University of Oxford UK

T116: Calibrated Debiased Machine Learning for Parameters without Mixed Bias Property

Author: Willem Weyens, Ghent University

Abstract

Debiased machine learning procedures have become very popular for inferring treatment effects from randomized experiments and observational studies. Current state-of-the-art methods necessitate specific conditions on the convergence rates of nuisance parameter estimators used in these procedures. While these are relatively weak, they may fail to hold in practice. Calibrated debiased machine learning procedures (Van der Laan, Luedtke and Carone, 2025) seriously weaken these conditions, resulting in significant improvements in empirical behaviour. But their framework is limited to parameters with a so-called mixed-bias property' (for which a double robust estimator exists). We relax this limitation by developing calibrated debiased machine learning procedures for the exposure effect parameters indexing assumption-lean linear regression models (Vansteelandt and Dukes, 2022; Vansteelandt, 2025), which do not have amixed-bias property’.

Coauthors: Oliver Dukes, Máté Kormos and Stijn Vansteelandt, all from Ghent university

T121: Step 0: Data assembly for target trial emulation

Author: William Hulme, Bennett Institute for Applied Data Science, University of Oxford, UK

Abstract

Routinely-collected healthcare data are increasingly used to answer causal questions. The target trial framework guides researchers in reducing design-related biases when analyzing such observational data. However, even rigorously designed emulations may produce biased results if the foundational step of data assembly is flawed. In particular, the process of discretizing longitudinal data into person-intervals is often underemphasized, poorly documented, and susceptible to temporal misalignment - a source of avoidable bias.

To address this gap, we outline core principles for preserving the correct temporal ordering of covariates, treatments, and outcomes when mapping raw longitudinal records into a discrete-time analytic dataset. We then demonstrate this approach via a case study of metformin for the prevention of COVID-19-related hospitalization or death among individuals with type 2 diabetes, using electronic health records from England.

This work provides a practical framework, including step-by-step guidance and accompanying R code, for data assembly (the “Step 0” of executing a target trial emulation), thereby strengthening the validity of target trial emulation using routinely-collected health data.

Coauthors: Alain Amstutz1,2,3, Paul Madley-Dowd1, Will Hulme4, Venexia Walker1, Tom Palmer1, Michail Katsoulis5, Rohan Takhar6, Emma E. McGee7,8,9, Rachel Denholm1, Reecha Sofat5, Miguel A. Hernan7,8,9,10, Jonathan Sterne1, Barbra A. Dickerman7,8,9

1 Electronic Health Records Group, Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, United Kingdom 2 Oslo Centre for Biostatistics & Epidemiology, Oslo University Hospital, Oslo, Norway 3 Division of Clinical Epidemiology, Department of Clinical Research, University Hospital Basel and University of Basel, Basel, Switzerland 4 Bennett Institute for Applied Data Science, Nuffield Department of Primary Care Health Sciences, University of Oxford, Oxford, United Kingdom 5 University College London, London, United Kingdom 6 University of Liverpool, Liverpool, United Kingdom 7 CAUSALab, Harvard T.H. Chan School of Public health, Boston, MA, USA 8 Zhu Family Center for Global Cancer Prevention, Harvard T.H. Chan School of Public health, Boston, MA, USA 9 Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA 10 Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA

T123: Incorporating uncertainty about the validity of adjustment sets in the combination of observational and experimental data to estimate causal effects

Author: Zhongyi Hu, Vrije Universiteit

Abstract

Combining estimators from different data sets is a rising topic in causal inference. Often the estimator is either unbiased with high variance or has low variance but unknown bias. In this paper, we study how to combine an unbiased estimator with many possibly biased estimators while explicitly incorporating uncertainty about the bias. We provide theoretical upper bounds for the mean squared error (MSE), analysis on consistency when variance/covariance of estimators converge at different rates. Most of the literature has focused on one biased estimator, while we explore the possible advantages of using many of them. In addition, estimators often come from regression with different adjustment variables (sets). In a Bayesian setting, each adjustment set has a probability of being valid. We incorporate these probabilitiesinto our estimator and show the theoretical and practical advantage of our method. We also perform simulated experiments to make a comparison with other methods.

Coauthors: Zhong yi Hu; Stephanie van der Pas

T124: caugi - Causal Graph Interface

Author: Bjarke Hautop, Department of Biostatistics, University of Copenhagen, Denmark

Abstract

Causal graphs are fundamental tools in causal inference, yet their representation has historically relied on generic graph packages or ad-hoc solutions. These approaches lack native support for causal-specific edge types and graph classes, leading to verbose, error-prone code and ambiguous semantics.

We introduce caugi (Causal Graph Interface) – a causality-first R package with a high-performance Rust backend. caugi provides an intuitive syntax for constructing causal graphs and supports multiple graph classes including DAGs, PDAGs, ADMGs, AGs, PAGs, and undirected graphs with the possibility of defining new edge types for research.

Benchmarks demonstrate that caugi outperforms existing R packages (igraph, bnlearn, dagitty, ggm) for relational queries often by orders of magnitude in almost every case. The package provides comprehensive functionality including separation, adjustment, projection, marginalization, conditioning, simulation, relational queries, metrics, plotting, importing from and exporting to other formats.

caugi addresses a critical gap in the R ecosystem by providing a performant, intuitive, safe, and expressive interface for causal graphs. The package is freely available on CRAN.

Link: caugi.org

Coauthors: Frederik Fabricius-Bjerre, Johan Larsson and Michael Sachs

T125: Correct but Useless? When Valid Causal Estimates Fail to Inform Real-World Decisions

Author: Roman Torgovitsky, Independent Scientist / Veritas Derisk Executive Advisory LLC

Abstract

Benefiting real-world patients requires not only robust study design and valid causal inference methods, but also careful interpretation of the resulting effect estimates. In modern epidemiology, however, methodological rigor is often treated as an end in itself, under the implicit assumption that a valid causal estimand automatically yields useful insight into disease mechanisms or treatment effectiveness. Using depression as a motivating example, we illustrate how formally valid causal estimates may fail to support meaningful interpretation; how reducing psychiatric research to diagnostic labels can obscure clinically relevant heterogeneity and generate misleading conclusions; and how studying physical disease while ignoring the adaptive nature of the human organism and the role of systemic physiological context can slow scientific progress. We argue that methodological validity is necessary but not sufficient for producing actionable knowledge, and that greater attention to outcome construction, context, and interpretation is essential for translating causal estimates into real-world benefit.

T127: Deriving Complete Constraints in Causal Models with Hidden Variables

Author: Michael C Sachs, Section of Biostatistics and the Pioneer Centre for SMARTbiomed, University of Copenhagen

Abstract

Hidden variable graphical causal models can sometimes imply constraints on the observable distribution that are more complex than simple conditional independence relations. These observable constraints can allow for falsification of assumptions of the model that would otherwise be untestable due to the unobserved variables and can be used to constrain estimation procedures to improve efficiency. Knowing the complete set of observable constraints is thus ideal, but this can be difficult to determine in many settings. In models with categorical observed variables and a joint distribution that is completely characterized by linear relations to the unobservable response function variables, we develop a systematic method for deriving the complete set of observable constraints. The method applies to a broad class of causal models, and there are several transformations we can exploit to obtain necessary (but possibly not complete) constraints even in settings that are not linear. We illustrate the method in several new settings, including ones that imply both inequality and equality constraints.

Coauthors: Erin E Gabriel, Section of Biostatistics and the Pioneer Centre for SMARTbiomed, University of Copenhagen; Robin J Evans, Department of Statistics, University of Oxford; Arvid Sjölander, Department of Medical Epidemiology and Biostatistics, Karolinska Institute