
Table 3. Policy evaluation: Mean Average Error (\(\mu\) ± \(\sigma\)) of inverse propensity scoring (IPS)-based estimators. Optdigits and Letter are two multiclass classification datasets from the UCI repository [30]. Abbreviations: LR, logistic regression; BNN, Bayesian neural network; NN, neural network.

From: Clinical decision making under uncertainty: a bootstrapped counterfactual inference approach

| Dataset | Expert Policy | IPS(\(h^{true}_0\)) | \(\hat{h}_0\) - NN: Vanilla IPS | \(\hat{h}_0\) - NN: NN Ensemble, IPS\(_{inv}\) | \(\hat{h}_0\) - NN: NN Ensemble, IPS\(_{avg}\) | \(\hat{h}_0\) - BNN: Vanilla IPS | \(\hat{h}_0\) - BNN: BNN (Variational Inf.), IPS\(_{inv}\) | \(\hat{h}_0\) - BNN: BNN (Variational Inf.), IPS\(_{avg}\) | \(\hat{h}_0\) - BNN: MC-Dropout, IPS\(_{avg}\) |
|---|---|---|---|---|---|---|---|---|---|
| UCI | OPTDIGITS (10 actions) | 4.7 ± 0.6 | 29.1 ± 11.9 | 21.2 ± 8.9 | 3.5 ± 0.4 | 8.8 ± 25.4 | 6.0 ± 4.9 | 5.7 ± 0.6 | 3.2 ± 0.7 |
| UCI | LETTER (26 actions) | 22.9 ± 0.5 | 2.0 ± 1.5 | 1.5 ± 1.0 | 3.9 ± 0.9 | 23.3 ± 8.2 | 24.8 ± 2.3 | 33.1 ± 1.2 | 516.3 ± 6.8 |
| Warfarin | LR (3 actions) | 28.3 ± 1.1 | 47.6 ± 0.9 | 46.3 ± 3.6 | 47.8 ± 0.8 | 228.6 ± 193.4 | 308.0 ± 79.1 | 43.6 ± 1.1 | 12.5 ± 1.1 |
| Warfarin | LR (5 actions) | 41.7 ± 0.9 | 66.4 ± 38.0 | 56.4 ± 11.9 | 62.8 ± 1.0 | 824.6 ± 419.2 | 547.9 ± 148.9 | 48.0 ± 17.6 | 17.8 ± 1.3 |
| Warfarin | PHARMA (3 actions) | 16.5 ± 1.8 | 13.7 ± 9.1 | 17.3 ± 5.0 | 20.7 ± 1.2 | 13.6 ± 4.8 | 43.9 ± 21.2 | 17.1 ± 2.4 | 4.0 ± 1.5 |
| Warfarin | PHARMA (5 actions) | 11.4 ± 2.7 | 14.3 ± 6.2 | 11.0 ± 5.1 | 19.7 ± 2.6 | 58.6 ± 111.6 | 13.0 ± 8.3 | 15.0 ± 9.2 | 12.1 ± 1.1 |
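
For readers unfamiliar with the estimators compared in this table, the following is a minimal sketch of the standard (vanilla) IPS estimate, which averages each logged reward weighted by the ratio of the target policy's probability to the logging policy's propensity for the logged action. The function name `ips_value` and the toy log are hypothetical and are not taken from the article. The sketch does not cover the IPS\(_{inv}\) or IPS\(_{avg}\) variants, whose definitions are given in the article itself.

```python
def ips_value(rewards, target_probs, logging_probs):
    """Vanilla IPS estimate of a target policy's value from logged data.

    rewards[i]       : observed reward for the i-th logged action
    target_probs[i]  : probability the target policy assigns to that action
    logging_probs[i] : propensity of the logging policy for that action
    """
    n = len(rewards)
    if n == 0:
        raise ValueError("need at least one logged sample")
    if not (n == len(target_probs) == len(logging_probs)):
        raise ValueError("all inputs must have the same length")
    if any(p <= 0 for p in logging_probs):
        raise ValueError("logging propensities must be strictly positive")

    # Importance weight = target probability / logging propensity.
    weighted = [r * pt / pl
                for r, pt, pl in zip(rewards, target_probs, logging_probs)]
    return sum(weighted) / n


# Hypothetical toy log of four interactions (illustrative numbers only).
rewards = [1.0, 0.0, 1.0, 1.0]
target_probs = [0.5, 0.2, 0.9, 0.1]
logging_probs = [0.5, 0.4, 0.3, 0.2]

estimate = ips_value(rewards, target_probs, logging_probs)
print(estimate)
```

Small propensities in the denominator inflate individual weights, which is one common reason IPS estimates can show large variance when the propensities are themselves estimated.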