8 Models Clustered by Tag Similarity

This page shows a network diagram of all the models that can be accessed by train. See the Revolutions blog for details about how this visualization was made (this page has updated code that uses the networkD3 package). In summary, the package annotates each model with a set of tags (e.g. “Bagging”, “L1 Regularization”). Using these tags, we can cluster models that are similar to each other.
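As a rough illustration of how tags can drive a grouping, the base R sketch below builds a binary model-by-tag matrix and clusters it hierarchically. The model names and their tag assignments are hypothetical, and hierarchical clustering is a stand-in chosen for brevity; it is not the method used to produce the network diagram.

```r
# Hypothetical models and tag assignments, invented for illustration only.
tags <- list(
  model_a = c("Bagging", "Ensemble Model"),
  model_b = c("Bagging", "Ensemble Model", "Tree-Based Model"),
  model_c = c("L1 Regularization", "Linear Regression"),
  model_d = c("L1 Regularization", "Linear Regression", "L2 Regularization")
)

# Binary indicator matrix: one row per model, one column per tag.
all_tags <- sort(unique(unlist(tags)))
tag_matrix <- t(sapply(tags, function(x) as.integer(all_tags %in% x)))
colnames(tag_matrix) <- all_tags

# dist(method = "binary") returns the Jaccard distance,
# i.e. 1 minus the Jaccard similarity of the two tag sets.
d <- dist(tag_matrix, method = "binary")

# Average-linkage hierarchical clustering, cut into two groups.
groups <- cutree(hclust(d, method = "average"), k = 2)
print(groups)
```

With these toy tags, the two bagging models land in one group and the two regularized linear models in the other, because models in different groups share no tags.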

Green circles are models used only for regression, blue circles are classification only, and orange circles are “dual use”. Hover over a circle to see the model name and the model code used by the caret package. Refreshing the screen will reconfigure the layout. You may need to move a node to the left to see the whole name. The 43 models without connections are not shown in the graph.

The data used to create this graph can be found here.

The plot below shows the similarity matrix. Hover over a cell to see the pair of models and their Jaccard similarity. Darker colors indicate more similar models.
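For reference, the Jaccard similarity of two tag sets is the number of tags they share divided by the number of distinct tags across both. A minimal base R sketch follows; the `jaccard` helper and the two tag sets are hypothetical and not taken from the package.

```r
# Jaccard similarity: size of the intersection over size of the union.
jaccard <- function(a, b) {
  length(intersect(a, b)) / length(union(a, b))
}

# Hypothetical tag sets for two models.
tags_a <- c("Bagging", "Ensemble Model", "Tree-Based Model")
tags_b <- c("Boosting", "Ensemble Model", "Tree-Based Model")

# Two shared tags out of four distinct tags.
sim <- jaccard(tags_a, tags_b)
print(sim)  # 0.5
```

A value of 1 means the two models carry identical tags, and 0 means they have no tags in common.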