Estimate Counterfactual Mean Squared Error

Estimates the mean squared error (MSE) of a prediction model under a hypothetical intervention where treatment is set to a specific level.

Usage

cf_mse(
  predictions,
  outcomes,
  treatment,
  covariates,
  treatment_level = 0,
  estimator = c("dr", "cl", "ipw", "naive"),
  propensity_model = NULL,
  outcome_model = NULL,
  outcome_type = c("auto", "binary", "continuous"),
  se_method = c("bootstrap", "influence", "none"),
  n_boot = 500,
  conf_level = 0.95,
  boot_ci_type = c("percentile", "normal", "basic"),
  cross_fit = FALSE,
  n_folds = 5,
  parallel = FALSE,
  ncores = NULL,
  ps_trim = NULL,
  ...
)

Arguments

predictions

Numeric vector of model predictions.

outcomes

Numeric vector of observed outcomes.

treatment

Numeric vector of treatment indicators (0/1).

covariates

A matrix or data frame of baseline covariates (confounders).

treatment_level

The counterfactual treatment level (default: 0).

estimator

Character string specifying the estimator:

"naive": Naive estimator (biased)
"cl": Conditional loss estimator
"ipw": Inverse probability weighting estimator
"dr": Doubly robust estimator (default)

propensity_model

Optional fitted propensity score model. If NULL, a logistic regression model is fit using the covariates.

outcome_model

Optional fitted outcome model. If NULL, a regression model is fit using the covariates among treated/untreated. For binary outcomes, this should be a model for E[Y|X,A] (binomial family). For continuous outcomes, this should be a model for E[L|X,A] (gaussian family).

outcome_type

Character string specifying the outcome type:

"auto": Auto-detect from data (default)
"binary": Binary outcome (0/1) - uses efficient transformation
"continuous": Continuous outcome - models loss directly

se_method

Method for standard error estimation:

"bootstrap": Bootstrap standard errors (default)
"influence": Influence function-based standard errors
"none": No standard error estimation

n_boot

Number of bootstrap replications (default: 500).

conf_level

Confidence level for intervals (default: 0.95).

boot_ci_type

Type of bootstrap confidence interval to compute:

"percentile": Percentile method (default)
"normal": Normal approximation using bootstrap SE
"basic": Basic bootstrap interval

cross_fit

Logical indicating whether to use cross-fitting for nuisance model estimation (default: FALSE). Cross-fitting enables valid inference when using flexible machine learning estimators.

n_folds

Number of folds for cross-fitting (default: 5).

parallel

Logical indicating whether to use parallel processing for bootstrap (default: FALSE).

ncores

Number of cores for parallel processing (default: NULL, which uses all available cores minus one).

ps_trim

Propensity score trimming specification. Controls how extreme propensity scores are handled. Can be:

NULL (default): Uses absolute bounds c(0.01, 0.99)
"none": No trimming applied
"quantile": Quantile-based trimming with default c(0.01, 0.99)
"absolute": Explicit absolute bounds with default c(0.01, 0.99)
A numeric vector of length 2: c(lower, upper) absolute bounds
A single numeric: Symmetric bounds c(x, 1-x)
A list with method ("absolute"/"quantile"/"none") and bounds

...

Additional arguments passed to internal functions.

Value

An object of class c("cf_mse", "cf_performance") containing:

estimate: Point estimate of counterfactual MSE
se: Standard error (if computed)
ci_lower: Lower confidence interval bound
ci_upper: Upper confidence interval bound
estimator: Estimator used
naive_estimate: Naive MSE for comparison
n_obs: Number of observations
treatment_level: Counterfactual treatment level

Details

The function implements four estimators for the counterfactual MSE:

Naive Estimator: Simply computes the empirical MSE using observed outcomes. This is biased for the counterfactual estimand when treatment affects outcomes.

Conditional Loss (CL) Estimator: Models the expected loss conditional on covariates and treatment, then marginalizes. Requires a correctly specified outcome model.

IPW Estimator: Weights observations by the inverse probability of receiving the counterfactual treatment. Requires a correctly specified propensity score model.

Doubly Robust (DR) Estimator: Combines CL and IPW approaches. Consistent if either the propensity or outcome model is correctly specified.

References

Boyer, C. B., Dahabreh, I. J., & Steingrimsson, J. A. (2025). "Estimating and evaluating counterfactual prediction models." Statistics in Medicine, 44(23-24), e70287. doi:10.1002/sim.70287

Examples

# Generate example data
set.seed(123)
n <- 500
x <- rnorm(n)
a <- rbinom(n, 1, plogis(-0.5 + 0.5 * x))
y <- rbinom(n, 1, plogis(-1 + x - 0.5 * a))
pred <- plogis(-1 + 0.8 * x)  # Predictions from some model

# Estimate counterfactual MSE under no treatment
result <- cf_mse(
  predictions = pred,
  outcomes = y,
  treatment = a,
  covariates = data.frame(x = x),
  treatment_level = 0,
  estimator = "dr",
  se_method = "none"  # Skip SE for speed in example
)
print(result)
#> 
#> Counterfactual MSE Estimation
#> ---------------------------------------- 
#> Estimator: dr 
#> Treatment level: 0 
#> N observations: 500 
#> 
#> Estimate: 0.1754
#> 
#> Naive estimate: 0.1689 
#>