Post hoc inference for genomics and neuroimaging

based on joint work with: G. Blanchard, E. Roquain
A. Blain, N. Enjalbert Courrech, B. Thirion

CNRS & Institut de Mathématiques de Toulouse

Foreword: PCI Stat ML and Computo

PCI Statistics and Machine Learning

  • Peer Community In (PCI): thematic platforms for preprint reviews
    • preprints are reviewed and a recommendation is published
    • highway to publication in PCJ or PCI-friendly journals
  • PCI Statistics and Machine Learning
    • a thematic PCI for our community, created in December 2025
    • contributions and recommenders welcome!

Computo: a journal in statistics and Machine Learning promoting reproducibility

  • embedded reproducibility: Quarto notebooks, rendered in HTML and PDF
  • welcomes “negative” results, benchmarks, case studies
  • Example: Gimenez et al. (2022)

Take home messages for this talk

  • Post hoc FDP bounds can be obtained for conditional inference using Knockoffs
  • Commonly used Knockoff generators are invalid

From multiple testing to post hoc inference

Ex1: detection of active brain regions in neuroimaging

Find voxels whose average activity differs between two groups of samples

  • “Word vs face” contrast: find regions with higher signal for word than face images
  • \(m=52,000\) voxels, \(n_0=n_1=25\) subjects

Ex2: differential expression in genomics

Find genes whose average activity differs between two groups of samples

  • Leukemia gene expression data (Chiaretti et al., Clin. Cancer Res., 2005)
  • \(m=12,625\) genes, \(n_0 = 42\) NEG patients vs \(n_1=37\) BCR/ABL patients

FDR control for large-scale multiple testing

Strategy: one test for each feature (gene/voxel) + choose significance threshold

State of the art: False Discovery Rate control (Benjamini and Hochberg 1995)

\(\mathrm{FDR} = \mathbb{E}(\mathrm{FDP})\), where the FDP is the (random) proportion of false discoveries
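The BH step-up procedure behind this control fits in a few lines of NumPy (a minimal sketch; the function name is ours, not taken from any package):

```python
import numpy as np

def benjamini_hochberg(pvals, alpha=0.05):
    """Indices rejected by the Benjamini-Hochberg step-up procedure at level alpha."""
    p = np.asarray(pvals)
    m = len(p)
    order = np.argsort(p)
    below = p[order] <= alpha * np.arange(1, m + 1) / m
    if not below.any():
        return np.array([], dtype=int)
    k = np.nonzero(below)[0].max()  # largest k with p_(k+1) <= alpha*(k+1)/m (0-indexed)
    return order[: k + 1]           # reject the k+1 smallest p-values
```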

FDR control is not FDP control

FDR: not robust to post hoc selection

Post hoc inference

  • \(\mathcal{H}=\{1, \dots m\}\) null hypotheses
  • \(\mathcal{H}_0 \subset \mathcal{H}\): true null hypotheses
  • For \(S \subset \mathcal{H}\), \(|S \cap \mathcal{H}_0|\) : number of “false positives” within \(S\)

Definition: post hoc bound

For a given \(\alpha\) in \((0,1)\), find \(V_\alpha\) such that

\[\mathbb{P} \left( \textcolor{red}{\forall S \subset \mathcal{H}},\:\:\: |S \cap \mathcal{H}_0| \leq V_\alpha(S) \right) \geq 1-\alpha\]

Important example: Simes bound (Simes 1986)

\[V_\alpha(S) = \min_{1\leq k \leq |S|} (k-1) + \sum_{i \in S} \mathbb{1}\{p_i \geq \alpha k /m\}\]

FDR control \(\approx\) post hoc inference with \(\alpha=1/2\)
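The Simes bound above can be evaluated directly from the p-values of a selected set \(S\) (a minimal sketch; the function name is illustrative):

```python
import numpy as np

def simes_posthoc_bound(p_S, m, alpha=0.05):
    """Simes upper bound V_alpha(S) on the number of false positives in S.
    p_S: p-values of the selected set S; m: total number of hypotheses."""
    p = np.asarray(p_S)
    if len(p) == 0:
        return 0
    ks = np.arange(1, len(p) + 1)
    # for each k: (k-1) + #{i in S : p_i >= alpha*k/m}, then take the min over k
    counts = np.array([(p >= alpha * k / m).sum() for k in ks])
    return int(np.min(ks - 1 + counts))
```

With \(\alpha=0.05\) and \(m=100\), a set containing two tiny p-values and one large one gets a bound of at most one false positive, i.e. at least two true discoveries.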

Neuroimaging data: glass brain plot (Simes/ARI)

True Discovery Proportion (TDP): TDP = 1-FDP

\[\mathrm{TDP} \geq 0.4\]

\[\mathrm{TDP} \geq 0.6\]

Genomic data: volcano plot (Simes/ARI)

Post hoc inference via Joint Error Rate control

Blanchard, Neuvial, and Roquain (2020)


\[\textrm{Goal: } V_\alpha \quad s.t. \quad\quad \mathbb{P} \left( \textcolor{red}{\forall S \subset \mathcal{H}},\:\:\: |S \cap \mathcal{H}_0| \leq V_\alpha(S) \right) \geq 1-\alpha\]

Joint Error Rate (JER)

Let \(\mathbf{t} = (t_k)_{k}\) be a non-decreasing sequence and \(R_k = \{i \in \mathcal{H} : p_i \leq t_k\}\)

\(\qquad\qquad\qquad\qquad JER(\mathbf{t}) := \mathbb{P} \big(\exists k \in\{1,\dots,p_0\} \::\: p_{(k:\mathcal{H}_0)} < t_k \big)\)

JER control yields valid post hoc bounds

  • \(JER(\mathbf{t}) \leq \alpha \quad\quad \Leftrightarrow \quad\quad \mathbb{P} \left( \textcolor{red}{\forall k},\:\:\: |R_k \cap \mathcal{H}_0| \leq k-1 \right) \geq 1-\alpha\)

  • yields \((1-\alpha)\)-level post hoc bound: \(V_\alpha(S) = \min_{1\leq k \leq |S|} (k-1) + \sum_{i \in S} \mathbb{1}\{p_i \geq t_k\}\)

Recovers Simes post hoc bound for \(t_k = \alpha k / m\)
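Under independence of the null p-values, the Simes template \(t_k = \alpha k/m\) controls the JER exactly at level \(\alpha\) (Simes 1986); a quick Monte Carlo check on a pure-null configuration (simulation parameters are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
m, alpha, B = 50, 0.2, 5000
t = alpha * np.arange(1, m + 1) / m            # Simes template t_k = alpha*k/m
P = np.sort(rng.uniform(size=(B, m)), axis=1)  # sorted i.i.d. uniform (null) p-values
jer_hat = np.mean((P < t).any(axis=1))         # estimates P(exists k : p_(k) < t_k)
# jer_hat should be close to alpha under independence
```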

JER calibration: highway to powerful post hoc inference

Find \(\mathbf{t} = (t_k)_{k}\) such that \(\mathbb{P}\left( \exists k, p_{(k:\mathcal{H}_0)} \leq t_k \right) \leq \alpha\)

Genomic data: volcano plot (Parametric Simes)

Genomic data: volcano plot (Semiparametric Simes)

IIDEA: R/shiny app for interactive differential analysis (Enjalbert Courrech and Neuvial 2022)

KOPI: FDP control for conditional inference

Blain et al. (2023)

Controlled variable selection

  • \(X_j \in \mathbb{R}^n\): brain activity data for each voxel \(j=1 \dots p\)
  • \(Y \in \mathbb{R}^n\): outcome of interest

Goal: select a subset of variables significantly associated with \(Y\).

How to quantify “association” between \(X_j\) and \(Y\)? Two possible null hypotheses:

  • marginal association: \(Y \perp\!\!\!\perp X_j\)
  • conditional association: \(Y \perp\!\!\!\perp X_j \mid X_{-j}\)

Standard approach to conditional association: Knockoffs (Candès et al. 2018)

Knockoffs in a nutshell

Knockoffs provide non-asymptotic FDR control

For irrelevant variables, the signs of the \(W_j\)’s are independent coin flips, conditional on \(|W|\)
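This coin-flip property is what drives the knockoff filter: the negative \(W_j\)'s serve to estimate the FDP among the positive ones. A sketch of the standard knockoff(+) threshold rule (function name is ours):

```python
import numpy as np

def knockoff_threshold(W, q=0.1, offset=1):
    """Knockoff+ threshold: smallest t with
    (offset + #{j : W_j <= -t}) / max(#{j : W_j >= t}, 1) <= q.
    Selected variables are then {j : W_j >= t}."""
    W = np.asarray(W)
    for t in np.sort(np.abs(W[W != 0])):
        fdp_hat = (offset + (W <= -t).sum()) / max((W >= t).sum(), 1)
        if fdp_hat <= q:
            return t
    return np.inf  # no feasible threshold: select nothing
```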

Knockoffs viewed by Candès

Source: https://web.stanford.edu/group/candes/knockoffs/

JER control in the knockoff framework

FDR control for KO in terms of “\(\pi\)-statistics” (Nguyen et al. 2020)

\(\pi_{j}= \frac{1+Z_j}{p} \mathbb{1}_{W_{j} > 0} + \mathbb{1}_{W_{j} \leq 0}\qquad \qquad\) where \(Z_j = \left\vert\left\{k: W_{k} \leq-W_{j}\right\}\right\vert\).

Associated JER: \(\quad \quad JER(\mathbf{t}) = \mathbb{P} \big(\exists k \in\{1,\dots,p_0\} \::\: \pi_{(k:\mathcal{H}_0)} < t_k \big)\)
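The \(\pi\)-statistics can be computed directly from the \(W\)'s (a sketch; variables with \(W_j \leq 0\) get \(\pi_j = 1\), as in the display above):

```python
import numpy as np

def pi_statistics(W):
    """pi_j = (1 + Z_j)/p if W_j > 0, else 1, with Z_j = #{k : W_k <= -W_j}."""
    W = np.asarray(W)
    p = len(W)
    Z = np.array([(W <= -w).sum() for w in W])
    return np.where(W > 0, (1 + Z) / p, 1.0)
```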

Key idea (already in Candès et al. (2018))

Stochastic domination of the joint distribution of \((\pi_j)_{1 \leq j \leq p}\) by a known distribution.

\(\pi^0_{j}= \frac{1+Z^0_j}{p} \mathbb{1}_{\chi^0_{j} = 1} + \mathbb{1}_{\chi^0_{j} = -1}\)

  • \(\chi^0=(\chi^0_j)_{1 \leq j \leq p}\): a collection of \(p\) i.i.d. Rademacher variables
  • \(Z^0_j = \left\vert\left\{k \in [j-1] : \chi^0_k = -1 \right\}\right\vert\)
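Sampling from this dominating distribution is cheap: draw \(p\) Rademacher signs and mimic the construction of the \(\pi_j\)'s (illustrative sketch, function name is ours):

```python
import numpy as np

def sample_pi0(p, rng):
    """One draw of (pi^0_j)_{1<=j<=p}: chi are i.i.d. Rademacher signs and
    Z0_j counts the -1's among chi_1, ..., chi_{j-1}."""
    chi = rng.choice([-1, 1], size=p)
    Z0 = np.concatenate(([0], np.cumsum(chi == -1)[:-1]))
    return np.where(chi == 1, (1 + Z0) / p, 1.0)
```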

Contributions (Blain et al. 2023)

  • Calibration: choose \(\mathbf{t}\) such that \(JER(\mathbf{t}) \leq JER^0(\mathbf{t}) \leq \alpha\)
  • Aggregation to mitigate the randomness associated with KO generation

Numerical results on semi-simulated data

  • neuroimaging data from Human Connectome Project (HCP): 7 contrasts/tasks
    • \(n=778\) subjects, \(p= 156,374\) reduced to \(1,000\) by Ward parcellation
  • \(\mathbf{Y} = \mathbf{X}\boldsymbol{\beta}^* + \sigma\boldsymbol{\epsilon}\), with \(X\) = dataset 1 and \(\boldsymbol{\beta}^*\) estimated from dataset 2

Empirical FDP and power for 42 contrast pairs (only CT aims for post hoc FDP control)

Results on HCP data

Detections for the SOCIAL contrast (Social motion vs random motion)


Only classical KO (FDR \(< 0.2\)) and KOPI (FDP \(< 0.2\) with probability 0.9) yield discoveries

Blind spot: knockoff generation

Blain et al. (2025)

Construction of valid knockoffs

Assuming that \(X \sim \mathcal{N}(0, \Sigma)\), Candès et al. (2018) provide a valid construction of knockoffs such that:

\[ [X, \tilde{X}] \sim \mathcal{N}(0, \mathbf{G}), \quad \text { where } \mathbf{G}=\left(\begin{array}{cc} \boldsymbol{\Sigma} & \boldsymbol{\Sigma}-\operatorname{diag}(s) \\ \boldsymbol{\Sigma}-\operatorname{diag}(s) & \boldsymbol{\Sigma} \end{array}\right) \]

This requires the knowledge of \(\Sigma\)

  • In practice \(\Sigma\) is unknown, and its estimation is a very difficult problem when \(p>n\)
  • R and Python implementations rely on the graphical lasso
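When \(\Sigma\) is known, the construction amounts to sampling \(\tilde{X} \mid X\) from an explicit Gaussian conditional. A minimal sketch with the equicorrelated choice of \(s\) (assuming \(\Sigma\) is a correlation matrix; function name is ours):

```python
import numpy as np

def gaussian_knockoffs(X, Sigma, rng):
    """Equicorrelated Gaussian knockoffs, assuming X ~ N(0, Sigma), Sigma known
    with unit diagonal. Samples X_tilde | X ~ N(X(I - Sigma^{-1}D), 2D - D Sigma^{-1} D)."""
    n, p = X.shape
    lam_min = np.linalg.eigvalsh(Sigma).min()
    s = min(1.0, 2 * lam_min) * np.ones(p)   # equicorrelated choice of s
    D = np.diag(s)
    Sinv_D = np.linalg.solve(Sigma, D)       # Sigma^{-1} D
    mu = X @ (np.eye(p) - Sinv_D)            # conditional mean of X_tilde given X
    C = 2 * D - D @ Sinv_D                   # conditional covariance
    w, V = np.linalg.eigh((C + C.T) / 2)     # symmetric square root of C
    L = V @ np.diag(np.sqrt(np.clip(w, 0, None))) @ V.T
    return mu + rng.standard_normal((n, p)) @ L
```

For \(\Sigma = I\) this reduces to drawing independent copies, as it should.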

Failure of Gaussian knockoffs

Simulation setup:

  • \(n=500\), \(p=500\), \(X\) Gaussian with spatial structure
  • \(Y = X \beta^* + \sigma \epsilon\) with \(\beta^* \in \{0,1\}^p\) and \(\Vert \beta^* \Vert_0=0.1 p\)
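This setup can be reproduced along the following lines (the \(\rho\) and \(\sigma\) values are illustrative, and the “spatial structure” is mimicked here by an AR(1) covariance):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, rho, sigma = 500, 500, 0.7, 1.0
# AR(1) covariance as a simple proxy for spatial correlation
Sigma = rho ** np.abs(np.subtract.outer(np.arange(p), np.arange(p)))
X = rng.multivariate_normal(np.zeros(p), Sigma, size=n)
beta = np.zeros(p)
beta[rng.choice(p, size=p // 10, replace=False)] = 1.0  # ||beta*||_0 = 0.1 p
Y = X @ beta + sigma * rng.standard_normal(n)
```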

Proposed approach for knockoff generation

Goals

  • avoid covariance matrix estimation
  • maintain low computational cost

Proposed algorithm (adapted from Candès et al. (2018))

Improved knockoff generation

No obvious violation of exchangeability for nonparametric knockoffs!

Diagnosing Knockoffs exchangeability

Tool: Classifier Two-Sample Test
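A bare-bones version of the idea: train a classifier (a hand-rolled logistic regression here, to stay dependency-free) to distinguish \([X, \tilde{X}]\) from the fully swapped \([\tilde{X}, X]\); held-out accuracy close to 0.5 is consistent with exchangeability. The function name and the simulation are illustrative, not the exact test of the paper:

```python
import numpy as np

def c2st_accuracy(A, B, rng, epochs=300, lr=0.5):
    """Held-out accuracy of a logistic-regression classifier trained to
    distinguish rows of A (label 0) from rows of B (label 1).
    Accuracy near 0.5 means the two samples are indistinguishable."""
    X = np.vstack([A, B])
    y = np.concatenate([np.zeros(len(A)), np.ones(len(B))])
    idx = rng.permutation(len(X))
    X, y = X[idx], y[idx]
    n_tr = len(X) // 2
    Xtr, ytr, Xte, yte = X[:n_tr], y[:n_tr], X[n_tr:], y[n_tr:]
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):                 # plain gradient descent
        z = np.clip(Xtr @ w + b, -30, 30)
        g = 1 / (1 + np.exp(-z)) - ytr      # gradient of the logistic loss
        w -= lr * Xtr.T @ g / n_tr
        b -= lr * g.mean()
    return ((Xte @ w + b > 0) == (yte == 1)).mean()

# exchangeability check: original [X, X_tilde] vs fully swapped [X_tilde, X]
rng = np.random.default_rng(0)
n, p = 2000, 5
X = rng.standard_normal((n, p))
X_tilde = rng.standard_normal((n, p))       # valid knockoffs when Sigma = I
acc = c2st_accuracy(np.hstack([X, X_tilde]), np.hstack([X_tilde, X]), rng)
```

Invalid knockoffs (e.g. built from a poorly estimated \(\Sigma\)) would push this accuracy well above chance level.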

Diagnosing Knockoffs exchangeability: results

Conclusion

  • Post hoc inference = powerful & interpretable alternative to FDR control
    • can be obtained by JER control…
    • … which can be obtained when joint null distribution of test statistics is known or can be sampled from
  • sanssouci: available R and Python implementations + shiny app

Some other developments on JER control

Ongoing applications

  • gene-set differential analysis of scRNA-seq (PhD thesis of Sara Fallet)
  • Hi-C data (PhD thesis of Elise Jorge)

Appendix

Computing the interpolation bound

\[V_\alpha(S) = \min_{1\leq k \leq |S|} (k-1) + \sum_{i \in S} \mathbb{1}\{ p_i \geq t_k \}\]

How to obtain sharper JER control?

Step 1: Semiparametric JER control

Given a family of functions \((t_k)_k\) (e.g. \(t_k(\lambda) = \lambda k/m\)), estimate from the data the largest \(\lambda\) such that

\[\mathbb{P}\left( \exists k, p_{(k:\mathcal{H}_0)} \leq t_k(\lambda) \right) \leq \alpha\]

Step 2: Nonparametric JER control

Estimate from the data the “largest” \(\mathbf{t} = (t_k)_{k}\) such that

\[\mathbb{P}\left( \exists k, p_{(k:\mathcal{H}_0)} \leq t_k \right) \leq \alpha\]

JER control by randomization: main result

Randomization assumption (Rand)

There exists a finite group of transformations \(\mathcal{G}\) such that

\[\forall g \in\mathcal{G},\:\: (p_{\mathcal{H}_0}(g'.X))_{g'\in \mathcal{G}} \sim (p_{\mathcal{H}_0}(g'.g.X))_{g'\in \mathcal{G}}\]

Notation

  • \(\Psi(X)= \min_{1\leq k \leq K}\left\{ t_k^{-1}\left(p_{(k)}(X)\right)\right\}\)
  • \(\lambda(\alpha) =\max\bigg\{ \lambda \geq 0\:: B^{-1}\sum_{j=1}^B 1\{ \Psi(g_j.X) < \lambda\} \leq \alpha \bigg\}\)

Theorem (Blanchard, Neuvial, and Roquain (2020))

Under (Rand), \(R_k := \{i: p_i \leq t_k(\lambda(\alpha))\}\) is a JER controlling family
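For the linear template \(t_k(\lambda) = \lambda k/m\), one has \(\Psi(X) = \min_k m\, p_{(k)}(X)/k\), and \(\lambda(\alpha)\) is an empirical quantile of the randomized \(\Psi\) values. A sketch (function name is ours; the input is assumed to be p-values computed on the \(B\) randomized datasets \(g_j.X\)):

```python
import numpy as np

def calibrate_lambda(p_rand, alpha):
    """Lambda-calibration for the linear template t_k(lambda) = lambda*k/m.
    p_rand: (B, m) p-values on B randomized datasets. Since
    Psi(X) = min_k m*p_(k)/k, we have Psi(X) < lambda iff
    exists k with p_(k) < t_k(lambda)."""
    B, m = p_rand.shape
    ks = np.arange(1, m + 1)
    psi = np.min(np.sort(p_rand, axis=1) * m / ks, axis=1)
    # largest lambda whose empirical exceedance frequency stays <= alpha
    return np.sort(psi)[int(np.floor(alpha * B))]
```

Under independence, \(\Psi\) is uniform on \((0,1)\) (Simes identity), so the calibrated \(\lambda(\alpha)\) recovers \(\approx \alpha\), i.e. the Simes template; under dependence it can be larger, hence more powerful.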

Benjamini, Yoav, and Yosef Hochberg. 1995. “Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing.” Journal of the Royal Statistical Society: Series B (Methodological) 57 (1): 289–300.
Benjamini, Yoav, and Daniel Yekutieli. 2001. “The Control of the False Discovery Rate in Multiple Testing Under Dependency.” Annals of Statistics, 1165–88.
Blain, Alexandre, Angel Reyero Lobo, Julia Linhart, Bertrand Thirion, and Pierre Neuvial. 2025. “When Knockoffs Fail: Diagnosing and Fixing Non-Exchangeability of Knockoffs.” https://doi.org/10.48550/arXiv.2407.06892.
Blain, Alexandre, Bertrand Thirion, Olivier Grisel, and Pierre Neuvial. 2023. “False Discovery Proportion Control for Aggregated Knockoffs.” Advances in Neural Information Processing Systems 36.
Blain, Alexandre, Bertrand Thirion, and Pierre Neuvial. 2022. “Notip: Non-Parametric True Discovery Proportion Control for Brain Imaging.” NeuroImage 260: 119492.
———. 2025. “False Coverage Proportion Control for Conformal Prediction.” In International Conference on Machine Learning. Vancouver (BC), Canada. https://hal.science/hal-05134749.
Blanchard, Gilles, Pierre Neuvial, and Etienne Roquain. 2020. “Post Hoc Confidence Bounds on False Positives Using Reference Families.” Annals of Statistics 48 (3): 1281–1303.
Candès, Emmanuel, Yingying Fan, Lucas Janson, and Jinchi Lv. 2018. “Panning for Gold: ‘Model-X’ Knockoffs for High Dimensional Controlled Variable Selection.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 80 (3): 551–77.
Davenport, Samuel, Bertrand Thirion, and Pierre Neuvial. 2025. “FDP Control in Mass-Univariate Linear Models Using the Residual Bootstrap.” Electron. J. Statist. 19 (1): 1313–36. https://doi.org/10.1214/25-EJS2354.
Enjalbert Courrech, Nicolas, and Pierre Neuvial. 2022. “Powerful and Interpretable Control of False Discoveries in Two-Group Differential Expression Studies.” Bioinformatics 38 (23): 5214–21.
Gimenez, Olivier, Maëlis Kervellec, Jean-Baptiste Fanjul, Anna Chaine, Lucile Marescot, Yoann Bollet, and Christophe Duchamp. 2022. “Trade-Off Between Deep Learning for Species Identification and Inference about Predator-Prey Co-Occurrence.” Computo, April. https://doi.org/10.57750/yfm2-5f45.
Goeman, Jelle J, and Aldo Solari. 2011. “Multiple Testing for Exploratory Research.” Statistical Science 26 (4): 584–97.
Gretton, Arthur, Karsten M Borgwardt, Malte J Rasch, Bernhard Schölkopf, and Alexander Smola. 2012. “A Kernel Two-Sample Test.” The Journal of Machine Learning Research 13 (1): 723–73.
Li, Jinzhou, Marloes H Maathuis, and Jelle J Goeman. 2024. “Simultaneous False Discovery Proportion Bounds via Knockoffs and Closed Testing.” Journal of the Royal Statistical Society: Series B (Statistical Methodology).
Nguyen, Tuan-Binh, Jérôme-Alexis Chevalier, Bertrand Thirion, and Sylvain Arlot. 2020. “Aggregation of Multiple Knockoffs.” In International Conference on Machine Learning, 7283–93. PMLR.
Rosenblatt, Jonathan D, Livio Finos, Wouter D Weeda, Aldo Solari, and Jelle J Goeman. 2018. “All-Resolutions Inference for Brain Imaging.” Neuroimage 181: 786–96.
Simes, R John. 1986. “An Improved Bonferroni Procedure for Multiple Tests of Significance.” Biometrika 73 (3): 751–54.