Predictions using the parametric model are calculated using the method of Quinlan (1992). If neighbors is greater than zero, these predictions are adjusted by nearby training set instances using the approach of Quinlan (1993).
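The instance-based adjustment can be illustrated with a rough base R sketch. This is not Cubist's implementation. It assumes that lm() stands in for the rule-based model, that neighbors are found by Manhattan distance on scaled predictors, and that neighbors are weighted equally. The function name adjust_with_neighbors() is hypothetical. Each neighbor's observed outcome is shifted by the difference between the model prediction for the new case and the model prediction for that neighbor, and the shifted values are averaged.

```r
# Simplified illustration of an instance-based correction (not Cubist's code).
# Assumptions: lm() stands in for the rule-based model, neighbors are found
# by Manhattan distance on scaled predictors, and they are weighted equally.
adjust_with_neighbors <- function(fit, train_x, train_y, new_x, k = 5) {
  # Scale training and new predictors with the training means and SDs
  centers <- colMeans(train_x)
  scales <- apply(train_x, 2, sd)
  train_s <- scale(train_x, center = centers, scale = scales)
  new_s <- scale(new_x, center = centers, scale = scales)

  model_train <- predict(fit, train_x)
  model_new <- predict(fit, new_x)

  vapply(seq_len(nrow(new_x)), function(i) {
    # Manhattan distance from new case i to every training case
    d <- rowSums(abs(sweep(train_s, 2, new_s[i, ])))
    nn <- order(d)[seq_len(k)]
    # Each neighbor's observed outcome, shifted by the difference between
    # the model prediction for the new case and for that neighbor
    mean(train_y[nn] + (model_new[i] - model_train[nn]))
  }, numeric(1))
}

train <- mtcars[1:28, ]
test <- mtcars[29:32, ]
fit <- lm(mpg ~ wt + hp, data = train)
adjusted <- adjust_with_neighbors(
  fit, train[, c("wt", "hp")], train$mpg, test[, c("wt", "hp")], k = 5
)
```

With k = 0 neighbors there is nothing to average, which corresponds to returning the model prediction unchanged; the sketch above only covers k of at least 1.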
# S3 method for cubist
predict(object, newdata = NULL, neighbors = 0, ...)
object: an object of class cubist
newdata: a data frame of predictors, with columns in the same order as the original training data. Must have column names.
neighbors: an integer from 0 to 9 giving the number of training set instances used to correct the rule-based prediction
...: other options to pass through the function (not currently used)
a numeric vector is returned
Note that the predictions can fail for various reasons. For example, as shown in the examples, if the model uses a qualitative predictor and the prediction data has a new level of that predictor, the function will throw an error.
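One way to catch this situation before calling predict() is to compare the qualitative predictors in the new data against the training data. The helper below is a hypothetical base R sketch and is not part of Cubist; it assumes that both data frames use the same column names and treats factor and character columns as qualitative.

```r
# Hypothetical helper (not part of Cubist): list values of qualitative
# predictors that appear in `newdata` but not in the training data.
find_new_levels <- function(train, newdata) {
  is_qual <- vapply(train, function(col) is.factor(col) || is.character(col),
                    logical(1))
  qual_cols <- names(train)[is_qual]
  unseen <- lapply(qual_cols, function(nm) {
    setdiff(unique(as.character(newdata[[nm]])),
            unique(as.character(train[[nm]])))
  })
  names(unseen) <- qual_cols
  # Keep only the columns that actually have unseen values
  unseen[lengths(unseen) > 0]
}

iris_chr <- iris
iris_chr$Species <- as.character(iris_chr$Species)
new_levels <- find_new_levels(iris_chr[1:99, 2:5], iris_chr[100:150, 2:5])
# new_levels$Species is "virginica"
```

An empty result means no unseen values were found in the qualitative columns; it does not rule out the other reasons a prediction might fail.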
Quinlan. Learning with continuous classes. Proceedings of the 5th Australian Joint Conference On Artificial Intelligence (1992) pp. 343-348
Quinlan. Combining instance-based and model-based learning. Proceedings of the Tenth International Conference on Machine Learning (1993) pp. 236-243
Quinlan. C4.5: Programs For Machine Learning (1993) Morgan Kaufmann Publishers Inc. San Francisco, CA
cubist(), cubistControl(), summary.cubist(), predict.cubist(), dotplot.cubist()
library(Cubist)
library(mlbench)
data(BostonHousing)
## 1 committee and no instance-based correction, so just an M5 fit:
mod1 <- cubist(x = BostonHousing[, -14], y = BostonHousing$medv)
predict(mod1, BostonHousing[1:4, -14])
#> [1] 29.00493 22.19729 34.17934 32.97410
## now add instances
predict(mod1, BostonHousing[1:4, -14], neighbors = 5)
#> [1] 26.07685 21.67079 33.87119 34.64103
# Example error
iris_test <- iris
iris_test$Species <- as.character(iris_test$Species)
mod <- cubist(x = iris_test[1:99, 2:5],
              y = iris_test$Sepal.Length[1:99])
# predict(mod, iris_test[100:150, 2:5])
# Error:
# *** line 2 of `undefined.cases':
# bad value of 'virginica' for attribute 'Species'