Title: | Prediction Model Tools |
---|---|
Description: | Provides additional functions for evaluating predictive models, including plotting calibration curves and model-based Receiver Operating Characteristic (mROC) based on Sadatsafavi et al (2021) <arXiv:2003.00316>. |
Authors: | Mohsen Sadatsafavi [aut, cph] , Amin Adibi [cre] , Abdollah Safari [aut], Tae Yoon Lee [aut] |
Maintainer: | Amin Adibi <[email protected]> |
License: | GPL |
Version: | 0.0.3 |
Built: | 2024-10-27 04:55:28 UTC |
Source: | https://github.com/resplab/predtools |
Calculates the absolute surface between the empirical and expected ROCs
calc_mROC_stats(y, p, ordered = FALSE, fast = TRUE)
y |
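Vector of binary responses |
p |
Vector of predicted probabilities (same length as y) |
ordered |
Whether the inputs are already ordered by ascending p. Defaults to FALSE. |
fast |
Whether to use the fast (C++) code instead of the slower R code. Defaults to TRUE. |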
Returns a list with the A (mean calibration) statistic and the B (mROC/ROC equality) statistic, as well as the direction of potential miscalibration (the sign of the difference between the actual and predicted mean risk).
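A minimal sketch of the quantity behind the reported direction, on simulated data. The data-generating choices below are hypothetical, and the structure of the object returned by calc_mROC_stats is not documented here, so the package call only prints it and is skipped when predtools is not installed.

```r
# Simulated data (hypothetical): true risks and a model that over-predicts.
set.seed(1)
n <- 500
p_true <- plogis(rnorm(n, mean = -1, sd = 1))
y <- rbinom(n, size = 1, prob = p_true)
p <- plogis(qlogis(p_true) + 0.5)   # predicted odds inflated by exp(0.5)

# Difference between actual and predicted mean risk, and its sign.
mean_diff <- mean(y) - mean(p)
direction <- sign(mean_diff)        # negative when the model over-predicts

# Package call, guarded so the sketch still runs without predtools.
if (requireNamespace("predtools", quietly = TRUE)) {
  stats <- predtools::calc_mROC_stats(y, p)
  str(stats)
}
```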
Calculates the first two moments of the bivariate distribution of NB_model and NB_all
calc_NB_moments(Y, pi, z, weights = NULL)
Y |
Vector of the binary response variable |
pi |
Vector of predicted risks |
z |
Decision threshold at which the NBs are calculated |
weights |
Optional. Observation weights |
Returns two means, two SDs, and one correlation coefficient. In each pair, the first element is for the model and the second is for treating all.
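For orientation, the two net benefits (NBs) whose moments are summarised can be sketched in base R using the usual decision-curve definition. The helper name net_benefits and the convention of treating when the predicted risk is at or above z are assumptions of this sketch, not the package's code.

```r
# Net benefit of using the model and of treating all, at threshold z.
# Hypothetical helper; "treat if pi >= z" is an assumed convention.
net_benefits <- function(Y, pi, z) {
  n <- length(Y)
  treat <- pi >= z                  # treated if predicted risk reaches z
  w <- z / (1 - z)                  # weight of a false positive
  tp <- sum(treat & Y == 1) / n     # true positives per person
  fp <- sum(treat & Y == 0) / n     # false positives per person
  nb_model <- tp - w * fp
  nb_all <- mean(Y) - w * (1 - mean(Y))
  c(NB_model = nb_model, NB_all = nb_all)
}

Y  <- c(1, 0, 1, 0, 0, 1, 0, 0)
pi <- c(0.8, 0.1, 0.6, 0.3, 0.2, 0.4, 0.7, 0.05)
nb <- net_benefits(Y, pi, z = 0.5)
nb
```

calc_NB_moments describes the sampling distribution of this pair (means, SDs, and correlation) rather than the point values alone.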
Create a calibration plot based on observed and predicted outcomes.
calibration_plot( data, obs, follow_up = NULL, pred, group = NULL, nTiles = 10, legendPosition = "right", title = NULL, x_lim = NULL, y_lim = NULL, xlab = "Prediction", ylab = "Observation", points_col_list = NULL, data_summary = FALSE )
data |
Data that include observed and predicted outcomes. |
obs |
Name of observed outcome in the input data. |
follow_up |
Name of follow-up time (if applicable) in the input data. |
pred |
Name of first predicted outcome in the input data. |
group |
Name of grouping column (if applicable) in the input data. |
nTiles |
Number of tiles (e.g., 10 for deciles) in the calibration plot. |
legendPosition |
Legend position on the calibration plot. |
title |
Title on the calibration plot. |
x_lim |
Limits of x-axis on the calibration plot. |
y_lim |
Limits of y-axis on the calibration plot. |
xlab |
Label of x-axis on the calibration plot. |
ylab |
Label of y-axis on the calibration plot. |
points_col_list |
Points' color on the calibration plot. |
data_summary |
Logical. Whether a summary of the predicted and observed outcomes should be included in the output. |
Returns a calibration plot (a ggplot object) and, if data_summary is TRUE, a dataset of summary statistics of the predicted and observed outcomes.
library(predtools)
library(dplyr)
x <- rnorm(100, 10, 2)
y <- x + rnorm(100, 0, 1)
data <- data.frame(x, y)
calibration_plot(data, obs = "x", pred = "y")
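The grouping that nTiles refers to can be sketched in base R for a binary outcome: observations are split into equal-sized groups by predicted risk, and the mean prediction and mean observation are computed per group. The simulated data and the rank-based tiling are assumptions of this sketch and need not match how calibration_plot forms its groups.

```r
set.seed(2)
n <- 1000
pred <- runif(n)                          # hypothetical predicted risks
obs  <- rbinom(n, size = 1, prob = pred)  # outcomes drawn from those risks

# Assign each observation to one of nTiles equal-sized groups by rank.
nTiles <- 10
tile <- ceiling(rank(pred, ties.method = "first") * nTiles / n)

# Per-group mean prediction and mean observed outcome.
calib <- data.frame(
  tile      = 1:nTiles,
  mean_pred = as.numeric(tapply(pred, tile, mean)),
  mean_obs  = as.numeric(tapply(obs, tile, mean))
)
calib
```

Plotting mean_obs against mean_pred gives the points of a decile-based calibration plot; points near the diagonal indicate good calibration.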
A dataset containing sample model development data
A data frame with 500 rows and 5 variables:
age | age of the patient |
severity | whether or not the disease was severe |
sex | binary sex variable, 1 for female and 0 for male |
comorbidity | whether or not comorbidities are present |
y | response variable |
Simulated
Calculates the EVPI (Expected Value of Perfect Information) for the validation of a prediction model, given observed binary responses and predicted risks.
evpi_val( Y, pi, method = c("bootstrap", "bayesian_bootstrap", "asymptotic"), n_sim = 1000, zs = (0:99)/100, weights = NULL )
Y |
Binary response variable |
pi |
Vector of predicted risks |
method |
EVPI calculation method |
n_sim |
Number of Monte Carlo simulations (for bootstrap-based methods) |
zs |
vector of risk thresholds at which EVPI is to be calculated |
weights |
(optional) observation weights |
Returns a data frame containing thresholds, EVPIs, and some auxiliary output.
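A rough sketch of a bootstrap-style EVPI at a single threshold, assuming the common value-of-information definition: the expected net benefit of the best strategy under each resample minus the net benefit of the strategy that is best on average, across the three strategies treat none, use the model, and treat all. The helper names nb_pair and evpi_bootstrap are hypothetical, and this is not necessarily how evpi_val is implemented.

```r
# Net benefits of the model and of treating all (hypothetical helper).
nb_pair <- function(Y, pi, z) {
  w <- z / (1 - z)
  treat <- pi >= z
  c(model = mean(treat & Y == 1) - w * mean(treat & Y == 0),
    all   = mean(Y) - w * (1 - mean(Y)))
}

# Bootstrap EVPI at threshold z (assumed definition, see lead-in).
evpi_bootstrap <- function(Y, pi, z, n_sim = 1000) {
  n <- length(Y)
  nb <- t(replicate(n_sim, {
    i <- sample.int(n, n, replace = TRUE)   # resample rows with replacement
    nb_pair(Y[i], pi[i], z)
  }))
  nb <- cbind(none = 0, nb)                 # treating no one has NB = 0
  mean(apply(nb, 1, max)) - max(colMeans(nb))
}

set.seed(3)
pi <- plogis(rnorm(300, mean = -1.5))       # simulated predicted risks
Y  <- rbinom(300, size = 1, prob = pi)      # simulated outcomes
evpi <- evpi_bootstrap(Y, pi, z = 0.2, n_sim = 500)
evpi
```

The result is non-negative by construction, since the row-wise maximum is never below any single column.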
A dataset containing anonymized data from the gusto trial
A data frame with 40830 rows and 29 variables:
day30 | whether death happened by day 30 after intervention |
sho | whether cardiac shock was present |
hig | whether the patient had high blood pressure |
dia | whether the patient had diabetes |
hrt | whether the patient was on hormone replacement therapies |
Internet
Takes an mROC object and calculates the area under the curve.
mAUC(mROC_obj)
mROC_obj |
An object of class mROC |
Returns the area under the mROC curve
Calculates the mROC from a vector of predicted risks. Takes a vector of probabilities and returns the mROC values (true positive and false positive rates) in an object of class mROC.
mROC(p, ordered = FALSE)
p |
A numeric vector of probabilities. |
ordered |
Optional. Whether the vector p is already ordered from smallest to largest. If FALSE, the function sorts it; passing TRUE for pre-sorted input speeds up computation. |
This function returns an object of class mROC. It has three vectors: thresholds on predicted risks (the ordered vector of input probabilities), false positive rates (FPs), and true positive rates (TPs). You can call the plot function directly on this object to draw the mROC.
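The idea can be sketched in base R under the assumption that the model is correct: each observation contributes p to the expected events and 1 - p to the expected non-events, so the expected rates at a threshold are weighted shares of the observations at or above it. The helpers mroc_sketch and auc_trapezoid are hypothetical illustrations, not the package's code; the guarded call at the end uses the documented mROC and mAUC functions.

```r
# Model-based ROC points from predicted risks alone (hypothetical helper).
mroc_sketch <- function(p) {
  p <- sort(p)                                 # thresholds, ascending
  # Expected share of events / non-events with risk at or above each threshold
  tp <- rev(cumsum(rev(p))) / sum(p)
  fp <- rev(cumsum(rev(1 - p))) / sum(1 - p)
  list(thresholds = p, FPs = fp, TPs = tp)
}

# Area under the curve by the trapezoid rule (hypothetical helper).
auc_trapezoid <- function(fp, tp) {
  x <- c(fp, 0); y <- c(tp, 0)                 # add the (0, 0) end point
  o <- order(x, y)
  x <- x[o]; y <- y[o]
  sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)
}

m <- mroc_sketch(c(0.9, 0.1, 0.5, 0.2))
auc <- auc_trapezoid(m$FPs, m$TPs)
auc

# Package equivalents, skipped when predtools is not installed.
if (requireNamespace("predtools", quietly = TRUE)) {
  m_pkg <- predtools::mROC(c(0.9, 0.1, 0.5, 0.2))
  print(predtools::mAUC(m_pkg))
}
```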
Main mROC analysis that plots the ROC and mROC curves.
mROC_analysis(y, p, inference = 0, n_sim, fast = TRUE)
y |
Vector of observed responses. |
p |
Vector of predicted probabilities (the same length as the observed responses). |
inference |
0 for no inference, 1 for p-value only, and 2 for p-value and 95 percent CI. |
n_sim |
number of simulations |
fast |
Whether to use the fast (C++) code instead of the slower R code. Defaults to TRUE. |
Returns a list containing the results of the mROC analysis.
Statistical inference for comparing empirical and expected ROCs. If CI = TRUE, also returns pointwise CIs.
mROC_inference(y, p, n_sim = 1e+05, CI = FALSE, aux = FALSE, fast = TRUE)
y |
vector of binary response values |
p |
vector of probabilities |
n_sim |
number of Monte Carlo simulations to calculate p-value |
CI |
optional. Whether confidence interval should be calculated for each point of mROC. Default is FALSE. |
aux |
Optional. Whether additional results (component-wise p-values etc.) should be written to the package's aux variable. Default is FALSE. |
fast |
Optional. Whether the fast code (C++) or the slow code (R) should be called. Default is TRUE (the R code will be slow unless the dataset is small). |
Returns an object of type mROC_inference containing the results of statistical inference for the mROC curve
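The role of n_sim can be illustrated with a simplified Monte Carlo test that looks only at mean calibration: outcomes are simulated from the predicted probabilities (the null hypothesis that the model is calibrated), and the observed statistic is compared with the simulated ones. The helper mc_mean_calibration_p is hypothetical and covers one component only; mROC_inference combines more than this.

```r
# Monte Carlo p-value for mean calibration only (hypothetical helper).
mc_mean_calibration_p <- function(y, p, n_sim = 2000) {
  observed <- abs(mean(y) - mean(p))
  simulated <- replicate(n_sim, {
    y_sim <- rbinom(length(p), size = 1, prob = p)  # outcomes under the null
    abs(mean(y_sim) - mean(p))
  })
  # Add-one correction keeps the p-value strictly positive.
  (1 + sum(simulated >= observed)) / (1 + n_sim)
}

set.seed(4)
p_true <- runif(1000, 0.1, 0.5)
y <- rbinom(1000, size = 1, prob = p_true)

p_val_good <- mc_mean_calibration_p(y, p_true)      # calibrated predictions
p_val_bad  <- mc_mean_calibration_p(y, p_true / 2)  # risks halved
c(calibrated = p_val_good, halved = p_val_bad)
```

A larger n_sim gives a finer p-value resolution, since the smallest attainable value here is 1 / (n_sim + 1).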
Calculates the expected value of the maximum of two random variables with a zero-truncated bivariate normal distribution. Takes the two means, the two SDs, and the correlation coefficient of the distribution.
mu_max_trunc_bvn( mu1, mu2, sigma1, sigma2, rho, precision = .Machine$double.eps )
mu1 |
Mean of the first distribution |
mu2 |
Mean of the second distribution |
sigma1 |
SD of the first distribution |
sigma2 |
SD of the second distribution |
rho |
Correlation coefficient of the two random variables |
precision |
Numerical precision value |
A scalar value for the expected value
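A Monte Carlo cross-check can help when reasoning about this quantity. The sketch below assumes that "zero-truncated" means the maximum is taken over zero and the two variables, that is E[max(0, X1, X2)]; this reading is an assumption and is not confirmed by the text above. The helper mc_mu_max is hypothetical.

```r
# Monte Carlo estimate of E[max(0, X1, X2)] for a bivariate normal pair.
# Assumed interpretation of "zero-truncated"; hypothetical helper.
mc_mu_max <- function(mu1, mu2, sigma1, sigma2, rho, n = 1e5) {
  z1 <- rnorm(n)
  z2 <- rnorm(n)
  x1 <- mu1 + sigma1 * z1
  # Build x2 so that cor(x1, x2) equals rho.
  x2 <- mu2 + sigma2 * (rho * z1 + sqrt(1 - rho^2) * z2)
  mean(pmax(0, x1, x2))
}

set.seed(6)
# With rho = 1 and identical margins, X1 = X2 and the value is E[max(0, Z)].
est <- mc_mu_max(mu1 = 0, mu2 = 0, sigma1 = 1, sigma2 = 1, rho = 1)
c(estimate = est, closed_form = 1 / sqrt(2 * pi))
```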
Update a prediction model for a binary outcome by multiplying the predicted odds by a fixed odds ratio.
odds_adjust(p0, p1, v)
p0 |
Mean of observed risk or predicted risk in development sample. |
p1 |
Mean of observed risk in target population. |
v |
Variance of predicted risk in development sample. |
Returns a correction factor that can be applied to the predicted odds in order to update the predictions for a new target population.
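Applying such a correction factor amounts to converting risks to odds, multiplying, and converting back. A minimal sketch follows; the helper apply_odds_ratio and the odds ratio of 2 are hypothetical, with the factor standing in for whatever odds_adjust returns.

```r
# Multiply predicted odds by a fixed odds ratio and return updated risks.
apply_odds_ratio <- function(p, or) {
  odds <- p / (1 - p)            # risk to odds
  new_odds <- or * odds          # apply the correction factor
  new_odds / (1 + new_odds)      # odds back to risk
}

p <- c(0.1, 0.5, 0.8)
p_updated <- apply_odds_ratio(p, or = 2)   # hypothetical correction factor
p_updated
```

An odds ratio of 1 leaves the predictions unchanged, and the updated risks always stay strictly between 0 and 1.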
Estimate the mean and variance of predictions based on model calibration output.
pred_summary_stat(calibVector)
calibVector |
Vector of predicted probability of risk per decile or percentile (e.g., from a calibration plot). |
Returns mean and variance of predictions based on the predicted probabilities.
A dataset containing sample model validation data
A data frame with 400 rows and 5 variables:
age | age of the patient |
severity | whether or not the disease was severe |
sex | binary sex variable, 1 for female and 0 for male |
comorbidity | whether or not comorbidities are present |
y | response variable |
Simulated
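Data of this shape are typically used by fitting a model on the development sample and evaluating its predictions on the validation sample. The sketch below uses simulated stand-ins with the same column names and row counts; the coefficients and the simulate_cohort helper are hypothetical, and these are not the package's datasets.

```r
set.seed(5)

# Simulated stand-in with the documented columns (hypothetical coefficients).
simulate_cohort <- function(n) {
  age <- round(rnorm(n, mean = 55, sd = 10))
  severity <- rbinom(n, 1, 0.3)
  sex <- rbinom(n, 1, 0.5)
  comorbidity <- rbinom(n, 1, 0.4)
  lp <- -4 + 0.05 * age + 0.8 * severity + 0.5 * comorbidity
  data.frame(age, severity, sex, comorbidity, y = rbinom(n, 1, plogis(lp)))
}

dev <- simulate_cohort(500)   # development sample
val <- simulate_cohort(400)   # validation sample

# Fit on the development data, predict on the validation data.
fit <- glm(y ~ age + severity + sex + comorbidity,
           family = binomial(), data = dev)
val$pred <- predict(fit, newdata = val, type = "response")

# Observed versus predicted mean risk in the validation sample.
c(observed = mean(val$y), predicted = mean(val$pred))
```

The resulting y and pred columns are the kind of input that functions such as calibration_plot, mROC_inference, and evpi_val expect.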