cfperformance 0.5.0
Adds factual (non-counterfactual) prediction model transportability, configurable propensity score trimming, improved standard error methods, and critical bug fixes.
New Features
Factual Prediction Model Transportability
- All transportability functions (tr_mse(), tr_auc(), tr_sensitivity(), tr_specificity(), tr_fpr(), tr_roc(), tr_calibration()) now support factual mode when treatment = NULL.
- Factual mode estimates prediction model performance in the target population for observed (factual) outcomes, without requiring treatment/intervention data.
- This enables standard prediction model transportability analysis (covariate shift correction) without counterfactual assumptions.
- Print/summary methods display “Factual” or “Counterfactual” mode labels.
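The covariate shift correction behind factual mode can be illustrated with a minimal base R sketch. It is not the package's implementation: the simulated data, the variable names, and the choice of inverse-odds weights from a logistic model are all assumptions made for illustration.

```r
# Illustrative only: base R, not the cfperformance implementation.
set.seed(1)
n <- 2000
s <- rbinom(n, 1, 0.5)                       # 1 = source, 0 = target
x <- rnorm(n, mean = ifelse(s == 1, 0, 1))   # covariate shift between populations
y <- 1 + 2 * x + rnorm(n)                    # outcome (treated as observed in source only)
pred <- 1 + 1.5 * x                          # predictions from a fixed, misspecified model

# Model the probability of belonging to the source given covariates
sel_fit <- glm(s ~ x, family = binomial())
p_src <- fitted(sel_fit)

# Inverse-odds weights make source rows resemble the target population
w <- (1 - p_src) / p_src

sq_err <- (y - pred)^2
mse_source   <- mean(sq_err[s == 1])                      # unweighted source MSE
mse_weighted <- weighted.mean(sq_err[s == 1], w[s == 1])  # weighted toward target
mse_target   <- mean(sq_err[s == 0])                      # oracle, unavailable in practice

c(source = mse_source, weighted = mse_weighted, target = mse_target)
```

In this simulation the weighted estimate uses only source outcomes, while the oracle target value is shown purely for comparison.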
Propensity Score Trimming
- New ps_trim parameter for all cf_* and tr_* functions provides configurable propensity score trimming:
  - NULL (default): Absolute bounds c(0.01, 0.99)
  - "none": No trimming
  - "quantile": Quantile-based trimming
  - "absolute": Explicit absolute bounds
  - Numeric vector c(lower, upper) for custom bounds
  - List with method and bounds for full control
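As a rough illustration of what the trimming options mean, the sketch below clips propensity scores in base R. The helper trim_ps() is hypothetical and is not the package's internal code. In particular, it assumes that trimming clips scores to the bounds rather than dropping observations, and that quantile bounds are given as probabilities.

```r
# Hypothetical helper for illustration; not the package's internal code.
trim_ps <- function(ps, method = c("absolute", "quantile", "none"),
                    bounds = c(0.01, 0.99)) {
  method <- match.arg(method)
  if (method == "none") return(ps)
  if (method == "quantile") {
    # Interpret bounds as probabilities and convert them to data-driven cutoffs
    bounds <- quantile(ps, probs = bounds, names = FALSE)
  }
  # Clip scores to [lower, upper]
  pmin(pmax(ps, bounds[1]), bounds[2])
}

ps <- c(0.001, 0.2, 0.5, 0.8, 0.999)
trim_ps(ps)                                      # clipped to [0.01, 0.99]
trim_ps(ps, "quantile", bounds = c(0.25, 0.75))  # clipped to the 25th/75th percentiles
trim_ps(ps, "none")                              # returned unchanged
```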
Influence Function Standard Errors
- Exposed influence function SE via se_method = "influence" for:
- Requires cross_fit = TRUE for valid inference.
Bug Fixes
- Fixed bootstrap confidence interval coverage for all estimators. Bootstrap now correctly preserves user-specified model formulas (e.g., with quadratic terms like I(X^2)) when refitting models during resampling. Previously, bootstrap used Y ~ ., which fit main effects only and caused severe undercoverage of confidence intervals when models included non-linear terms.
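The effect of the formula on refitting can be seen in a small base R sketch. The data and the helper boot_once() are hypothetical and only show why Y ~ . loses the quadratic term. They do not reproduce the package's bootstrap code.

```r
set.seed(42)
n <- 500
d <- data.frame(X = runif(n, -2, 2))
d$Y <- 1 + d$X + 2 * d$X^2 + rnorm(n, sd = 0.5)

user_formula <- Y ~ X + I(X^2)

# Refit a linear model on one bootstrap resample and report the fitted terms
boot_once <- function(data, formula) {
  idx <- sample(nrow(data), replace = TRUE)
  fit <- lm(formula, data = data[idx, , drop = FALSE])
  names(coef(fit))
}

boot_once(d, Y ~ .)         # main effects only: the quadratic term is lost
boot_once(d, user_formula)  # quadratic term retained in the refit
```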
Documentation
- Renamed “Traditional” mode to “Factual” mode throughout documentation and print output. This better reflects the distinction between factual (observed) outcomes and counterfactual outcomes under hypothetical interventions.
- Added boot_ci_type parameter documentation for bootstrap CI method selection.
- Added comprehensive simulation study script for benchmarking.
- Added benchmark tests verifying estimators against standard implementations (WeightedROC, pROC).
cfperformance 0.4.0
Adds sensitivity, specificity, and ROC curve functions for both counterfactual and transportability settings.
New Features
Counterfactual Sensitivity/Specificity
- cf_sensitivity() - Counterfactual sensitivity (true positive rate)
- cf_specificity() - Counterfactual specificity (true negative rate)
- cf_fpr() - Counterfactual false positive rate (1 - specificity)
- cf_tpr() - Alias for cf_sensitivity()
- cf_tnr() - Alias for cf_specificity()
- Supports CL, IPW, DR, and naive estimators
- Vectorized threshold parameter for efficient ROC curve computation
Transportable Sensitivity/Specificity
- tr_sensitivity() - Transportable sensitivity for target population
- tr_specificity() - Transportable specificity for target population
- tr_fpr() - Transportable false positive rate
- tr_tpr() - Alias for tr_sensitivity()
- tr_tnr() - Alias for tr_specificity()
- Supports OM, IPW, DR, and naive estimators
- Works with both “transport” and “joint” analysis types
ROC Curves
- tr_roc() - Compute transportable ROC curve in target population
- cf_roc() - Compute counterfactual ROC curve
- plot.tr_roc() / plot.cf_roc() - Plot ROC curves with AUC in legend
- as.data.frame.tr_roc() / as.data.frame.cf_roc() - Convert to data frame for ggplot2
- AUC computed via trapezoidal integration
- Option to include naive ROC curve for comparison
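A minimal base R sketch of threshold-vectorized sensitivity/FPR and trapezoidal AUC follows. The helpers roc_points() and trapezoid_auc() are hypothetical, and the optional weights argument w only stands in for whatever weights an estimator would supply. This is not the package's code.

```r
# Hypothetical helper: (optionally weighted) sensitivity and FPR at several thresholds
roc_points <- function(pred, y, thresholds, w = rep(1, length(y))) {
  sens <- vapply(thresholds, function(t) {
    sum(w * (pred > t) * (y == 1)) / sum(w * (y == 1))
  }, numeric(1))
  fpr <- vapply(thresholds, function(t) {
    sum(w * (pred > t) * (y == 0)) / sum(w * (y == 0))
  }, numeric(1))
  data.frame(threshold = thresholds, sensitivity = sens, fpr = fpr)
}

# Hypothetical helper: area under the ROC points by the trapezoidal rule
trapezoid_auc <- function(fpr, sens) {
  # Add the (0, 0) and (1, 1) endpoints, sort by FPR, then integrate
  x <- c(0, fpr, 1)
  yv <- c(0, sens, 1)
  o <- order(x, yv)
  x <- x[o]
  yv <- yv[o]
  sum(diff(x) * (yv[-length(yv)] + yv[-1]) / 2)
}

pred <- c(0.1, 0.4, 0.35, 0.8)
y    <- c(0, 0, 1, 1)
roc  <- roc_points(pred, y, thresholds = c(0.2, 0.375, 0.6))
trapezoid_auc(roc$fpr, roc$sensitivity)
```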
cfperformance 0.3.0
Adds machine learning integration for flexible nuisance model estimation with automatic cross-fitting for valid inference.
New Features
ML Learner Interface
- ml_learner() - Specify ML methods for propensity score and outcome models
- Supports: ranger, xgboost, grf, glmnet, superlearner, and custom
- Automatic cross-fitting when ml_learner specs are detected
- Seamlessly integrates with existing propensity_model / outcome_model arguments
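For readers unfamiliar with cross-fitting, the following base R sketch shows the general idea: each observation's nuisance prediction comes from a model fit on the other folds. A logistic regression stands in for an ML learner here, and the fold count and variable names are assumptions. The package's own fold handling may differ.

```r
set.seed(3)
n <- 400
x <- rnorm(n)
a <- rbinom(n, 1, plogis(0.5 * x))   # treatment depends on the covariate
dat <- data.frame(a = a, x = x)

K <- 5
fold <- sample(rep(seq_len(K), length.out = n))  # random fold assignment
ps_hat <- numeric(n)

for (k in seq_len(K)) {
  # Fit the propensity model on all folds except k
  fit <- glm(a ~ x, family = binomial(), data = dat[fold != k, ])
  # Predict only for the held-out fold k
  ps_hat[fold == k] <- predict(fit, newdata = dat[fold == k, ], type = "response")
}

summary(ps_hat)
```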
Supported Learners
- ranger - Fast random forest implementation
- xgboost - Gradient boosting (XGBoost)
- grf - Generalized random forests with honest estimation
- glmnet - Elastic net regularization with CV-selected λ
- superlearner - Ensemble learning
- custom - User-supplied fit/predict functions
Usage Example
# MSE with ML learners
cf_mse(
  predictions = pred, outcomes = y, treatment = a, covariates = df,
  propensity_model = ml_learner("ranger", num.trees = 500),
  outcome_model = ml_learner("xgboost", nrounds = 100),
  cross_fit = TRUE
)

# AUC with ML learners
cf_auc(
  predictions = pred, outcomes = y, treatment = a, covariates = df,
  propensity_model = ml_learner("ranger", num.trees = 500),
  outcome_model = ml_learner("ranger", num.trees = 500),
  cross_fit = TRUE
)
References
Chernozhukov, V., et al. (2018). Double/debiased machine learning for treatment and structural parameters. The Econometrics Journal, 21(1), C1-C68.
Li, B., Gatsonis, C., Dahabreh, I. J., & Steingrimsson, J. A. (2022). Estimating the area under the ROC curve when transporting a prediction model to a target population. Biometrics, 79(3), 2343-2356.
cfperformance 0.2.0
Major release adding transportability estimators from Voter et al. (2025) for evaluating prediction model performance when transporting from a source population (e.g., RCT) to a target population.
New Features
Transportability Functions
- tr_mse() - Transportable MSE estimation with naive, om, ipw, dr estimators
- tr_auc() - Transportable AUC estimation with all estimators
- tr_calibration() - Transportable calibration curves with ICI, E50, E90, Emax
Analysis Modes
- Transport analysis (analysis = "transport"): Use source/RCT outcomes to estimate performance in target population
- Joint analysis (analysis = "joint"): Pool source and target data for potentially more efficient estimation
Inference
- Bootstrap standard errors with stratified sampling option (stratified_boot = TRUE) to preserve source/target ratio
- Influence function-based standard errors for tr_mse() (all estimators)
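Stratified resampling of the kind described above can be sketched in base R as follows. The helper stratified_indices() and the source indicator s are hypothetical, and the package may implement stratification differently.

```r
# Hypothetical helper: resample row indices with replacement within each group
stratified_indices <- function(s) {
  idx_by_group <- split(seq_along(s), s)
  unlist(
    lapply(idx_by_group, function(i) i[sample.int(length(i), replace = TRUE)]),
    use.names = FALSE
  )
}

set.seed(7)
s <- rep(c(1, 0), times = c(300, 700))  # 300 source rows, 700 target rows
idx <- stratified_indices(s)
table(s[idx])  # group counts in the resample match the original sample
```

Because each group is resampled separately, every bootstrap replicate keeps the original source/target split, whereas a plain bootstrap would let that ratio vary between replicates.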
S3 Methods for tr_* Functions
- print.tr_performance() - Print method for transportability results
- summary.tr_performance() - Detailed summary
- coef.tr_performance() - Extract point estimates
- confint.tr_performance() - Confidence intervals
- plot.tr_calibration() - Calibration curve visualization
References
Voter SR, et al. Transportability of machine learning-based counterfactual prediction models with application to CASS. Diagnostic and Prognostic Research. 2025; 9(4). doi:10.1186/s41512-025-00201-y
cfperformance 0.1.0
Initial release implementing methods from Boyer, Dahabreh & Steingrimsson (2025), “Estimating and evaluating counterfactual prediction models.”
Features
Performance Metrics
- cf_mse() - Counterfactual MSE/Brier score estimation
- cf_auc() - Counterfactual AUC estimation
- cf_calibration() - Counterfactual calibration curves with ICI, E50, E90, Emax
Estimators
- Naive estimator (subset-based)
- Conditional Loss (CL) / Outcome modeling estimator
- Inverse Probability Weighting (IPW) estimator
- Doubly Robust (DR) estimator
Inference
- Bootstrap standard errors with parallel support
- Influence function-based analytic standard errors
- Cross-fitting (sample splitting) for DR estimation
- Confidence intervals via normal approximation or percentile bootstrap
Model Selection
- cf_cv() - K-fold cross-validation with counterfactual metrics
- cf_compare() - Compare multiple prediction models
References
Boyer CB, Dahabreh IJ, Steingrimsson JA. Counterfactual prediction model performance. Statistics in Medicine. 2025; 44(23-24):e70287. doi:10.1002/sim.70287
