Most of these values are discussed at length in http://rulequest.com/cubist-unix.html
cubistControl(
  unbiased = FALSE,
  rules = 100,
  extrapolation = 100,
  sample = 0,
  seed = sample.int(4096, size = 1) - 1L,
  label = "outcome"
)
unbiased
a logical: should unbiased rules be used?
rules
an integer (or NA): an explicit limit on the number of rules used (NA lets Cubist decide).
extrapolation
a number between 0 and 100: since Cubist uses linear models, predictions can fall outside the range seen in the training set. This parameter controls how much rule predictions are adjusted to be consistent with the training set.
sample
a number between 0 and 99.9: the percentage of the data set to be randomly selected for model building (not for out-of-bag type evaluation).
seed
an integer for the random seed (used in the C code).
label
a label for the outcome (used when printing rules).
A list containing the options.
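As a minimal sketch of how the returned list behaves (assuming the Cubist package is installed; the seed value 42 is an arbitrary choice for illustration), the snippet below shows that the default seed is drawn at random from 0 to 4095 on each call, that fixing the seed gives identical control objects, and that rules = NA is accepted:

```r
library(Cubist)

# The default seed expression from the usage section: a random
# integer between 0 and 4095, redrawn on every call.
default_seed <- sample.int(4096, size = 1) - 1L
default_seed >= 0L && default_seed <= 4095L

# Supplying the seed explicitly (42 is arbitrary) makes the
# control object reproducible across calls.
ctrl_a <- cubistControl(seed = 42)
ctrl_b <- cubistControl(seed = 42)
identical(ctrl_a, ctrl_b)

# rules = NA leaves the number of rules for Cubist to decide.
ctrl_na <- cubistControl(rules = NA)
is.na(ctrl_na$rules)

# The options are ordinary list elements.
names(ctrl_a)
```

Fixing the seed is mainly useful together with sample > 0, where the random selection of rows would otherwise change between runs.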
Quinlan. Learning with continuous classes. Proceedings of the 5th Australian Joint Conference On Artificial Intelligence (1992) pp. 343-348
Quinlan. Combining instance-based and model-based learning. Proceedings of the Tenth International Conference on Machine Learning (1993) pp. 236-243
Quinlan. C4.5: Programs For Machine Learning (1993) Morgan Kaufmann Publishers Inc. San Francisco, CA
cubistControl()
#> $unbiased
#> [1] FALSE
#>
#> $rules
#> [1] 100
#>
#> $extrapolation
#> [1] 1
#>
#> $sample
#> [1] 0
#>
#> $label
#> [1] "outcome"
#>
#> $seed
#> [1] 2507
#>
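The control list is meant to be handed to the model-fitting function. The following is a hedged sketch rather than a recommended configuration: it assumes the Cubist package is installed, uses the mtcars data that ships with R, and picks rules = 5, extrapolation = 50 and seed = 1 purely for illustration.

```r
library(Cubist)

# Illustrative settings only: cap the model at 5 rules, limit
# extrapolation, and fix the seed so the fit is repeatable.
ctrl <- cubistControl(rules = 5, extrapolation = 50, seed = 1)

# Predict mpg (column 1 of mtcars) from the remaining columns.
fit <- cubist(x = mtcars[, -1], y = mtcars$mpg, control = ctrl)

# Print the rules; the outcome is labelled with ctrl$label.
summary(fit)
```

Note that the printed default above shows $extrapolation as 1 although the argument default is 100, which suggests the percentage is stored as a fraction in the returned list.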