
Transductive Conformal Inference With Adaptive Scores: Additional Experiments

by Transduction University Papers, February 28th, 2024

Too Long; Didn't Read

Conformal inference is a fundamental and versatile tool that provides distribution-free guarantees for many machine learning tasks.

This paper is available on arXiv under a CC 4.0 license.

Authors:

(1) Ulysse Gazin, Université Paris Cité and Sorbonne Université, CNRS, Laboratoire de Probabilités, Statistique et Modélisation,

(2) Gilles Blanchard, Université Paris Saclay, Institut Mathématique d’Orsay,

(3) Etienne Roquain, Sorbonne Université and Université Paris Cité, CNRS, Laboratoire de Probabilités, Statistique et Modélisation.

H Additional experiments

In this section, we provide experiments to illustrate the FDP confidence bounds for AdaDetect, as mentioned in Remark 4.2 and Section G.


The two procedures used are of the AdaDetect type (48) but with two different score functions: the Random Forest classifier from Marandon et al. (2022) (adaptive score), and the one-class classifier Isolation Forest as in Bates et al. (2023) (non-adaptive score). The hyperparameters of these two machine learning algorithms are those given by Marandon et al. (2022).
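As a rough illustration of what a procedure of this type does once a score function is fixed, the sketch below assumes the usual recipe of Marandon et al. (2022): conformal p-values are computed by comparing each test score with the scores of a calibration sample of nulls, and the Benjamini-Hochberg step-up rule is applied to them at level α. Equation (48) is not reproduced here, so this is a sketch under that assumption rather than the authors' exact definition. The function names, the toy scores, and the convention that a larger score means "more novel" are all hypothetical; the scores stand in for the output of either classifier.

```python
from typing import List, Sequence


def conformal_p_values(calib_scores: Sequence[float],
                       test_scores: Sequence[float]) -> List[float]:
    """Conformal p-value of each test score against null calibration scores.

    Assumed convention: a larger score means the point looks more novel.
    """
    n = len(calib_scores)
    return [
        (1 + sum(1 for s in calib_scores if s >= t)) / (n + 1)
        for t in test_scores
    ]


def benjamini_hochberg(p_values: Sequence[float], alpha: float) -> List[int]:
    """Indices rejected by the Benjamini-Hochberg step-up rule at level alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0  # largest rank whose ordered p-value is below alpha * rank / m
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= alpha * rank / m:
            k = rank
    return sorted(order[:k])


# Toy data (hypothetical): 19 null calibration scores and 5 test scores.
calib = [i / 20 for i in range(1, 20)]
test = [2.0, 0.33, 1.5, 0.72, 3.0]

p_values = conformal_p_values(calib, test)
rejected = benjamini_hochberg(p_values, alpha=0.1)
```

With an adaptive score, the score function itself would be trained using the test sample as well, which this sketch deliberately leaves out; only the p-value and thresholding steps are shown.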


The FDP and the corresponding bounds are computed for the two procedures. The true discovery proportion is defined by
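For reference, the usual conventions (assumed here, since the paper's own display is not reproduced in this copy) are FDP = |R ∩ H0| / max(|R|, 1) and TDP = |R ∩ H1| / max(|H1|, 1), where R is the set of rejected test points, H0 the set of test inliers and H1 the set of true novelties. The symbols R, H0 and H1 are notation chosen for this sketch, and the helper below is a hypothetical illustration of how the two quantities could be computed when ground-truth labels are available.

```python
from typing import Iterable, Sequence, Tuple


def fdp_tdp(rejected: Iterable[int],
            is_novelty: Sequence[bool]) -> Tuple[float, float]:
    """Return (FDP, TDP) for a rejection set, given ground-truth labels.

    Assumed conventions: FDP = false discoveries / max(#rejections, 1),
    TDP = true discoveries / max(#novelties, 1).
    """
    rejected_set = set(rejected)
    true_disc = sum(1 for i in rejected_set if is_novelty[i])
    false_disc = len(rejected_set) - true_disc
    n_novelties = sum(1 for flag in is_novelty if flag)
    fdp = false_disc / max(len(rejected_set), 1)
    tdp = true_disc / max(n_novelties, 1)
    return fdp, tdp


# Toy example (hypothetical labels): 3 rejections among 5 test points.
fdp, tdp = fdp_tdp([0, 2, 4], [True, False, True, True, False])
```

In this toy case two of the three rejections are true novelties and one of the three novelties is missed, so the FDP is 1/3 and the TDP is 2/3.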



Following the numerical experiments of Marandon et al. (2022) and Bates et al. (2023), we consider three real datasets from the OpenML repository (Vanschoren et al., 2013), listed in Table 1.


The results are displayed in Figure 6, which compares adaptive and non-adaptive scores in terms of the different FDP confidence bounds and the TDP. In Figure 7, we focus on the adaptive scores and the corresponding FDP bounds only; we compare the effect on the bounds of demanding a more conservative error guarantee (δ = 0.05 versus δ = 0.2), as well as the effect of estimating m0 via (25) instead of simply using inequality (24) with m̂0 = m.


Table 1: Summary of datasets.

The high-level conclusions are the following:

• using adaptive scores rather than non-adaptive ones improves performance (a higher true discovery proportion for the same target FDR level);

• estimating m0 with the estimator m̂0 from (25) yields sharper bounds on the FDP and is therefore advantageous.