After a long campaign of data analysis, one model was able to conquer the rest: Naive Bayes!
With 1,514 training set samples and 56 predictors, our amazing model has an area under the ROC curve of 0.86!
What could go wrong?
Most of the predictions are near zero or one.
In a lot of cases, we are confidently incorrect.
This seems… bad.
The model is able to separate the classes, but the probabilities are not realistic.
They aren't well-calibrated.
library(tidymodels)
library(discrim)    # engine support for naive_Bayes()
library(probably)   # calibration tools used below

set.seed(8928)
split <- initial_split(all_data, strata = class)
data_tr <- training(split)
data_te <- testing(split)
data_rs <- vfold_cv(data_tr, strata = class)
bayes_wflow <-
  workflow() %>%
  add_formula(class ~ .) %>%
  add_model(naive_Bayes())
# Metrics: area under the ROC curve and the Brier score
cls_met <- metric_set(roc_auc, brier_class)
# Save the out-of-sample predictions so we can assess calibration later
ctrl <- control_resamples(save_pred = TRUE)
# The resampling results from 10-fold cross-validation:
bayes_res <-
  bayes_wflow %>%
  fit_resamples(data_rs, metrics = cls_met, control = ctrl)
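To see the problem noted earlier (predictions piling up at zero and one), we can look at the distribution of the out-of-sample probabilities. A minimal sketch; .pred_one is a placeholder for the event-class probability column, whose real name depends on your factor levels:

collect_predictions(bayes_res) %>%
  ggplot(aes(x = .pred_one)) +        # placeholder column name
  geom_histogram(bins = 40, color = "white") +
  labs(x = "Out-of-sample probability of the event class")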
The probably package has functions for post-processing model results, including (in the most recent version) calibration tools.
We'll look at the calibration tools today. There are several tools for assessing calibration.
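One of them is the calibration plot: it bins the resampled predictions and compares the observed event rate in each bin to the bin midpoint (a well-calibrated model follows the diagonal). A minimal sketch using the resampling results saved above:

cal_plot_breaks(bayes_res)

# Related plots in probably:
# cal_plot_windowed(bayes_res)
# cal_plot_logistic(bayes_res)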
If we don't have a model with better separation and calibration, we can post-process the predictions.
These models can estimate the trends and "un-bork" the predictions.
Ideally, we would reserve some data to estimate the mis-calibration patterns.
If not, we could use the holdout predictions from resampling (or a validation set). This is a little risky but doable.
The Brier Score is a nice performance metric that can measure effectiveness and calibration.
For 2 classes:
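For reference, the standard two-class definition is:

$$\text{Brier} = \frac{1}{N} \sum_{i=1}^{N} \left( \hat{p}_i - y_i \right)^2$$

where $y_i$ is 1 when sample $i$ is the event class (0 otherwise) and $\hat{p}_i$ is the predicted probability of the event. Smaller is better: a perfect model scores 0.0, and an uninformative model that always predicts 0.5 scores 0.25.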
# Validate a beta calibration using the resampled predictions
cal_validate_beta(bayes_res, metrics = cls_met) %>%
  collect_metrics() %>%
  arrange(.metric)
#> # A tibble: 4 × 7
#>   .metric     .type        .estimator  mean     n std_err .config
#>   <chr>       <chr>        <chr>      <dbl> <int>   <dbl> <chr>
#> 1 brier_class uncalibrated binary     0.201    10 0.0102  config
#> 2 brier_class calibrated   binary     0.145    10 0.00450 config
#> 3 roc_auc     uncalibrated binary     0.857    10 0.00945 config
#> 4 roc_auc     calibrated   binary     0.857    10 0.00942 config
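Beta calibration is not the only option; the other estimators in probably can be validated the same way to see which helps most. A sketch (assuming purrr >= 1.0 for list_rbind()):

# Compare calibration methods on the same resampled predictions
list(
  beta     = cal_validate_beta(bayes_res, metrics = cls_met),
  logistic = cal_validate_logistic(bayes_res, metrics = cls_met),
  isotonic = cal_validate_isotonic(bayes_res, metrics = cls_met)
) %>%
  map(collect_metrics) %>%
  list_rbind(names_to = "method") %>%
  filter(.metric == "brier_class", .type == "calibrated") %>%
  arrange(mean)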
# Estimate the beta calibration model from the out-of-sample predictions
beta_cal <- cal_estimate_beta(bayes_res)
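Once estimated, the calibrator can be applied to new predictions with cal_apply(). A minimal sketch that fits the workflow on the training set, predicts the test set, and then calibrates; the .pred_one column in the metric call is a placeholder for your event level:

bayes_fit <- fit(bayes_wflow, data_tr)

test_pred <-
  augment(bayes_fit, new_data = data_te) %>%
  cal_apply(beta_cal)

# ".pred_one" is a placeholder for the event-class probability column
cls_met(test_pred, truth = class, .pred_one)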
We will be updating workflow objects with post-processors towards the end of the year.
This means that we will be able to call:
predict(workflow, new_data)
and get calibrated predictions without a separate post-processing step.
Again, Edgar Ruiz did the majority of the work on the calibration methods!