Various parameters that control aspects of the Cubist fit.

Most of these values are discussed at length in http://rulequest.com/cubist-unix.html

cubistControl(
  unbiased = FALSE,
  rules = 100,
  extrapolation = 100,
  sample = 0,
  seed = sample.int(4096, size = 1) - 1L,
  label = "outcome",
  strip_time_stamps = TRUE
)

Arguments

unbiased: a logical: should unbiased rules be used?
rules: an integer (or NA): define an explicit limit to the number of rules used (NA lets Cubist decide).
extrapolation: a number between 0 and 100: since Cubist uses linear models, predictions can fall outside of the range seen in the training set. This parameter controls how much rule predictions are adjusted to be consistent with the training set.
sample: a number between 0 and 99.9: this is the percentage of the data set to be randomly selected for model building (not for out-of-bag type evaluation).
seed: an integer for the random seed (in the C code)
label: a label for the outcome (when printing rules)
strip_time_stamps: a logical: should timestamps and timing information be removed from the output? Defaults to TRUE for reproducible output. Set to FALSE to include the Cubist version header and execution time.

Value

A list containing the options.
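
Since the return value is an ordinary list, it can be inspected with the usual list tools. The sketch below uses arbitrary illustrative settings. A stored value is not necessarily identical to the value supplied: in the printed example on this page, the default `extrapolation = 100` is reported as `1`, which suggests that the percentage is kept as a proportion.

```r
library(Cubist)

# Build a control object with a few non-default (illustrative) settings.
ctrl <- cubistControl(rules = 20, extrapolation = 50, seed = 42)

# The result is a plain list, so standard list operations apply.
is.list(ctrl)
names(ctrl)

# Individual options are read with `$`.
ctrl$rules
ctrl$extrapolation
```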

References

Quinlan. Learning with continuous classes. Proceedings of the 5th Australian Joint Conference On Artificial Intelligence (1992) pp. 343-348

Quinlan. Combining instance-based and model-based learning. Proceedings of the Tenth International Conference on Machine Learning (1993) pp. 236-243

Quinlan. C4.5: Programs For Machine Learning (1993) Morgan Kaufmann Publishers Inc. San Francisco, CA

http://rulequest.com/cubist-info.html

Author

Max Kuhn

Examples


cubistControl()
#> $unbiased
#> [1] FALSE
#> 
#> $rules
#> [1] 100
#> 
#> $extrapolation
#> [1] 1
#> 
#> $sample
#> [1] 0
#> 
#> $label
#> [1] "outcome"
#> 
#> $seed
#> [1] 3950
#> 
#> $strip_time_stamps
#> [1] TRUE
#>
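
The options only take effect once they are passed to a model fit. The following is a minimal sketch, assuming the `control` argument of `cubist()` and the built-in `mtcars` data. The choice of predictors and the specific option values are illustrative only, not recommendations.

```r
library(Cubist)

# Predictors and outcome from a built-in data set (illustrative choice).
x <- mtcars[, c("cyl", "disp", "hp", "wt")]
y <- mtcars$mpg

# Cap the number of rules, change the extrapolation setting, and fix the
# seed passed to the C code so repeated runs use the same value.
ctrl <- cubistControl(rules = 5, extrapolation = 10, seed = 1)

# Fit the model with the custom control object.
fit <- cubist(x = x, y = y, control = ctrl)

# Predictions for the first few rows of the training data.
predict(fit, head(x))
```

Building the control object separately from the fit makes it easy to reuse the same settings across several models.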