# 16 Miscellaneous Model Functions

## 16.1 Yet Another k-Nearest Neighbor Function

knn3 is a function for k-nearest neighbor classification. This particular implementation is a modification of the knn C code and returns the vote information for all of the classes (knn only returns the probability for the winning class). There is a formula interface via

knn3(formula, data)
## or by passing the training data directly
## x is a matrix or data frame, y is a factor vector
knn3(x, y)

There are also print and predict methods.
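As a minimal sketch of the two predict output types, the following fits a model on the built-in iris data. The data set, the split and the object names are assumptions chosen only to keep the example self-contained.

```r
library(caret)

## Assumption: iris is used only because it ships with R
set.seed(1)
in_train <- sample(nrow(iris), 100)
fit <- knn3(Species ~ ., data = iris[in_train, ], k = 11)

## Vote proportions for every class, one row per new sample
probs <- predict(fit, iris[-in_train, ], type = "prob")

## Hard class predictions for the same samples
cls <- predict(fit, iris[-in_train, ], type = "class")

head(probs)
table(cls, iris$Species[-in_train])
```

Each row of probs holds the proportion of neighbors voting for each class, which is the extra information that knn does not return.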

For the Sonar data in the mlbench package, we can fit an 11-nearest neighbor model:

library(caret)
library(mlbench)
data(Sonar)
set.seed(808)
## function(object, x) predict(object, x)$posterior

### 16.4.3 The aggregate Function

This should be a function that takes the predictions from the constituent models and converts them to a single prediction per sample.

Inputs:

• x: a list of objects returned by the pred module.
• type: an optional string that describes the type of output (e.g. “class”, “prob” etc.).

The output is either a numeric vector (for regression), a factor (or character) vector for classification, or a matrix/data frame of class probabilities. For the linear discriminant model above, we saved the matrix of class probabilities. To pool them across models and generate a class prediction, we could use:

function(x, type = "class") {
  ## The class probabilities come in as a list of matrices
  ## For each class, we pool them and take the median across models
  ## Pre-allocate space for the results
  pooled <- x[[1]] * NA
  n <- nrow(pooled)
  classes <- colnames(pooled)
  ## For each class probability, take the median across
  ## all the bagged model predictions
  for(i in 1:ncol(pooled)) {
    tmp <- lapply(x, function(y, col) y[,col], col = i)
    tmp <- do.call("rbind", tmp)
    pooled[,i] <- apply(tmp, 2, median)
  }
  ## Re-normalize to make sure they add to 1
  pooled <- apply(pooled, 1, function(x) x/sum(x))
  if(n != nrow(pooled)) pooled <- t(pooled)
  if(type == "class") {
    out <- factor(classes[apply(pooled, 1, which.max)],
                  levels = classes)
  } else out <- as.data.frame(pooled)
  out
}

For example, to bag a conditional inference tree (from the party package):

library(caret)
set.seed(998)
inTraining <- createDataPartition(Sonar$Class, p = .75, list = FALSE)
training <- Sonar[ inTraining,]
testing  <- Sonar[-inTraining,]
set.seed(825)
baggedCT <- bag(x = training[, names(training) != "Class"],
                y = training$Class,
                B = 50,
                bagControl = bagControl(fit = ctreeBag$fit,
                                        predict = ctreeBag$pred,
                                        aggregate = ctreeBag$aggregate))
summary(baggedCT)
##
## Call:
## bag.default(x = training[, names(training) != "Class"], y
##  = training$Class, B = 50, bagControl = bagControl(fit =
##  ctreeBag$fit, predict = ctreeBag$pred, aggregate = ctreeBag$aggregate))
##
## Out of bag statistics (B = 50):
##
##        Accuracy    Kappa
##   0.0%   0.4746 -0.04335
##   2.5%   0.5806  0.17971
##  25.0%   0.6681  0.32402
##  50.0%   0.7094  0.41815
##  75.0%   0.7606  0.51092
##  97.5%   0.8060  0.59901
## 100.0%   0.8077  0.61078
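The pooling step that an aggregate function performs (Section 16.4.3) can be checked on a toy input. The sketch below is a compact, hypothetical variant (pool_probs is not a caret function) that takes the median of the class probabilities across models and re-normalizes them. The three matrices stand in for the output of three bagged models on two samples and are made up for illustration.

```r
## Hypothetical helper, not part of caret: median-pool a list of
## class probability matrices (rows = samples, columns = classes)
pool_probs <- function(x, type = "class") {
  classes <- colnames(x[[1]])
  ## Stack the matrices into a samples x classes x models array
  arr <- array(unlist(x), dim = c(dim(x[[1]]), length(x)))
  ## Median across models for every sample/class cell
  pooled <- apply(arr, c(1, 2), median)
  colnames(pooled) <- classes
  ## Re-normalize each row so the probabilities sum to 1
  pooled <- pooled / rowSums(pooled)
  if (type == "class") {
    factor(classes[apply(pooled, 1, which.max)], levels = classes)
  } else {
    as.data.frame(pooled)
  }
}

## Made-up predictions from three models for two samples
mk <- function(...) matrix(c(...), nrow = 2, byrow = TRUE,
                           dimnames = list(NULL, c("M", "R")))
preds <- list(mk(0.90, 0.10, 0.40, 0.60),
              mk(0.80, 0.20, 0.30, 0.70),
              mk(0.20, 0.80, 0.45, 0.55))

pooled_class <- pool_probs(preds, type = "class")
pooled_prob  <- pool_probs(preds, type = "prob")
```

For the first sample the median probabilities are 0.8 (M) and 0.2 (R), so the pooled class is M even though one model strongly favors R.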

## 16.5 Model Averaged Neural Networks

The avNNet function fits multiple neural network models to the same data set and predicts using the average of the predictions coming from each constituent model. The models can differ either because different random number seeds are used to initialize the networks or because the models are fit on bootstrap samples of the original training set (i.e. bagging the neural network). For classification models, the class probabilities are averaged to produce the final class prediction (as opposed to voting from the individual class predictions).

As an example, the model can be fit via train:

set.seed(825)
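## Use only the predictor columns for x; the outcome is passed as y
avNnetFit <- train(x = training[, names(training) != "Class"],
                   y = training$Class,
                   method = "avNNet",
                   repeats = 15,
                   trace = FALSE)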
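The difference between averaging class probabilities and voting can be seen on a toy case. The numbers below are invented probabilities for one sample from three hypothetical networks. This is only an illustration of the two combination rules, not a description of avNNet's internal code.

```r
## Made-up class probabilities for ONE sample from three networks
probs <- rbind(net1 = c(M = 0.45, R = 0.55),
               net2 = c(M = 0.45, R = 0.55),
               net3 = c(M = 0.90, R = 0.10))

## Voting: each network casts one vote for its most likely class
votes <- colnames(probs)[apply(probs, 1, which.max)]
vote_winner <- names(which.max(table(votes)))

## Averaging: mean probability per class, then pick the largest
avg <- colMeans(probs)
avg_winner <- names(which.max(avg))

vote_winner
avg_winner
```

Two networks narrowly prefer R, so voting picks R, while the averaged probabilities (0.6 for M, 0.4 for R) pick M because the third network is far more confident.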

## 16.6 Neural Networks with a Principal Component Step

Neural networks can be affected by severe multicollinearity in the predictors. The function pcaNNet is a wrapper around the preProcess and nnet functions that runs principal component analysis on the predictors before using them as inputs into a neural network. The function keeps enough components to capture a pre-defined threshold on the cumulative proportion of variance (see the thresh argument). For new samples, the same transformation is applied to the new predictor values (based on the loadings from the training set). The function is available for both regression and classification.

This function is deprecated in favor of the train function using method = "nnet" and preProc = "pca".
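A minimal sketch of the train-based replacement on the Sonar data follows. The resampling scheme, the single tuning value (size = 3, decay = 0.1) and the 95% variance threshold are assumptions chosen to keep the run short, not recommended settings. The preprocessing argument is spelled out as preProcess, and the variance threshold is passed through the preProcOptions argument of trainControl.

```r
library(caret)
library(mlbench)
data(Sonar)

predictors <- Sonar[, names(Sonar) != "Class"]

## Assumption: 5-fold CV and a 95% cumulative variance threshold
ctrl <- trainControl(method = "cv", number = 5,
                     preProcOptions = list(thresh = 0.95))

set.seed(825)
pcaNnetFit <- train(x = predictors,
                    y = Sonar$Class,
                    method = "nnet",
                    preProcess = "pca",
                    trControl = ctrl,
                    ## Assumption: one arbitrary tuning combination
                    tuneGrid = expand.grid(size = 3, decay = 0.1),
                    trace = FALSE)

## The PCA loadings estimated on the training data are
## re-applied automatically to new samples
newPred <- predict(pcaNnetFit, head(predictors))
```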

## 16.7 Independent Component Regression

The icr function can be used to fit a model analogous to principal component regression (PCR), but using independent component analysis (ICA). The predictor data are centered and projected to the ICA components. These components are then regressed against the outcome. The user needs to specify the number of components to keep.

The model uses the preProcess function to compute the latent variables using the fastICA package.

As with PCR, there is no guarantee that the new latent variables will be correlated with the outcome.
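As a sketch of how icr might be called, the example below uses the BloodBrain data that ships with caret and keeps five components. The choice of data and of n.comp = 5 is an assumption for illustration, and the fastICA package must be installed.

```r
library(caret)
data(BloodBrain)

## n.comp (the number of ICA components to keep) must be supplied
set.seed(825)
icrFit <- icr(bbbDescr, logBBB, n.comp = 5)
icrFit

## Predictions for new samples reuse the training-set ICA projection
icrPred <- predict(icrFit, bbbDescr[1:5, ])
icrPred
```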