Title: | Interactive Visualization Tool for Random Forests |
---|---|
Description: | An interactive data visualization and exploration toolkit that implements Breiman and Cutler's original Java-based random forest visualization tools in R, for supervised and unsupervised classification and regression with the random forest algorithm. |
Authors: | Chris Kuchar [aut, cre] |
Maintainer: | Chris Kuchar <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.0.1 |
Built: | 2025-02-23 02:45:26 UTC |
Source: | https://github.com/chriskuchar/rfviz |
Rfviz is an interactive package and toolkit in R, using Tcl/Tk code on the backend, to help in viewing and interpreting the results of Random Forests for both supervised classification and regression in a user-friendly way.
Currently, rfviz implements the following three statistical graphs, with functions to view any combination of them:
1. The classic multidimensionally scaled proximities are plotted as a 3-D XYZ scatterplot.
2. The raw input data is plotted in a parallel coordinate plot.
3. The local importance scores of each observation are plotted in a parallel coordinate plot.
rfviz is built on the loon package on the backend, and uses the randomForest package to fit the random forests.
For detailed instructions on the use of these plots in this package, visit https://github.com/chriskuchar/rfviz/blob/master/Rfviz.md
For instructions on how to use randomForest, see ?randomForest. For more information on loon, see ?loon.
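The quantities behind the three plots can be computed directly with the randomForest package, which may help when interpreting them. The following is a minimal sketch and not rfviz's own code: it assumes the randomForest package is installed, and the names `rf`, `mds`, and `local_imp` are illustrative only.

```r
library(randomForest)

set.seed(42)  # random forests are stochastic; fix the seed for repeatability

# Fit a classification forest, keeping proximities and casewise importance
rf <- randomForest(x = iris[, 1:4], y = iris$Species,
                   proximity = TRUE, localImp = TRUE)

# Plot 1: classical MDS of the dissimilarities (1 - proximity), 3 dimensions
mds <- cmdscale(1 - rf$proximity, k = 3)

# Plot 2 uses the raw predictors themselves, here iris[, 1:4]

# Plot 3: local importance scores, transposed to one row per observation
local_imp <- t(rf$localImportance)

dim(mds)        # observations x 3 coordinates
dim(local_imp)  # observations x predictors
```

Each row of `mds` and `local_imp` corresponds to the same observation in the input data, which is what makes linked selection across the three plots meaningful.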
Chris Kuchar [email protected], based on original Java graphics by Leo Breiman and Adele Cutler.
Liaw A, Wiener M (2002). “Classification and Regression by randomForest.” _R News_, *2*(3), 18-22. https://CRAN.R-project.org/doc/Rnews/
Waddell A, Oldford R. Wayne (2018). "loon: Interactive Statistical Data Visualization" https://github.com/waddella/loon
Breiman, L. (2001), Random Forests, Machine Learning 45(1), 5-32.
Breiman, L (2002), “Manual On Setting Up, Using, And Understanding Random Forests V3.1”, https://www.stat.berkeley.edu/~breiman/Using_random_forests_V3.1.pdf
Breiman, L., Cutler, A., Random Forests Graphics. https://www.stat.berkeley.edu/~breiman/RandomForests/cc_graphics.htm
randomForest, rf_prep, rf_viz, l_plot3D, l_serialaxes
A dataset containing 6 types of glass, defined in terms of their oxide content.
glass
A data frame with 214 rows and 10 variables:
Refractive Index
Sodium (unit measurement: weight percent in corresponding oxide, as are attributes 4-10)
Magnesium
Aluminum
Silicon
Potassium
Calcium
Barium
Iron
Class Attribute
https://archive.ics.uci.edu/ml/datasets/glass+identification
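As a quick check of the description above, the data set can be loaded and summarized. This is a small sketch that assumes the rfviz package is installed; treating the last column as the class attribute is an assumption based on the order in which the variables are listed here.

```r
# Assumes the rfviz package is installed; only the data set is loaded here
data("glass", package = "rfviz")

dim(glass)  # rows and columns of the data frame

# Assumption: the class attribute is the last column, as listed above
class_col <- glass[[ncol(glass)]]
table(class_col)  # number of samples per glass type
```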
A function using Random Forests which outputs a list of the Random Forests output, the predictor variables data, and response variable data.
rf_prep(x, y = NULL, ...)
x | A data frame or a matrix of predictors. |
y | A response vector. If a factor, classification is assumed; otherwise regression is assumed. If omitted, randomForest will run in unsupervised mode. |
... | Optional parameters to be passed down to the randomForest function. Use ?randomForest to see the optional parameters. |
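The `...` argument can be illustrated with a short sketch. It assumes the rfviz package is installed; `ntree` and `mtry` are ordinary randomForest arguments, and the values chosen are arbitrary. The sketch deliberately does not pass `proximity` or `localImp`, on the assumption that rf_prep requests whatever the plots need itself.

```r
library(rfviz)

set.seed(1)  # fix the seed so repeated runs give the same forest

# ntree and mtry are randomForest arguments, forwarded through `...`
rfprep <- rf_prep(x = iris[, 1:4], y = iris$Species, ntree = 200, mtry = 2)

# Inspect the top-level structure of the returned list
str(rfprep, max.level = 1)
```

Using `str()` avoids relying on particular element names, which are not documented on this page.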
The parallel coordinate plots of the input data, the local importance scores, and the 3-D XYZ classic multidimensional scaling proximities from the output of the random forest algorithm.
For instructions on how to use randomForest, see ?randomForest. For more information on loon, see ?loon.
For detailed instructions on the use of these plots in this package, visit https://github.com/chriskuchar/rfviz/blob/master/Rfviz.md
Chris Kuchar [email protected], based on original Java graphics by Leo Breiman and Adele Cutler.
Liaw A, Wiener M (2002). “Classification and Regression by randomForest.” _R News_, *2*(3), 18-22. https://CRAN.R-project.org/doc/Rnews/
Waddell A, Oldford R. Wayne (2018). "loon: Interactive Statistical Data Visualization" https://github.com/waddella/loon
Breiman, L. (2001), Random Forests, Machine Learning 45(1), 5-32.
Breiman, L (2002), “Manual On Setting Up, Using, And Understanding Random Forests V3.1”, https://www.stat.berkeley.edu/~breiman/Using_random_forests_V3.1.pdf
Breiman, L., Cutler, A., Random Forests Graphics. https://www.stat.berkeley.edu/~breiman/RandomForests/cc_graphics.htm
randomForest, rf_viz, l_plot3D, l_serialaxes
# Preparation for classification with the iris data set
rfprep <- rf_prep(x = iris[, 1:4], y = iris$Species)

# Preparation for regression with the mtcars data set
rfprep <- rf_prep(x = mtcars[, -1], y = mtcars$mpg)

# Preparation for the unsupervised case with the iris data set
rfprep <- rf_prep(x = iris[, 1:4], y = NULL)
The Input Data, Local Importance Scores, and Classic Multidimensional Scaling Plots
rf_viz(rfprep, input = TRUE, imp = TRUE, cmd = TRUE, hl_color = "orange")
rfprep | A list of prepared Random Forests input data to be used in visualization, created using the function rf_prep. |
input | Should the Input Data Parallel Coordinate Plot be included in the visualization? |
imp | Should the Local Importance Scores Parallel Coordinate Plot be included in the visualization? |
cmd | Should the Classic Multidimensional Scaling Proximities 3-D XYZ Scatter Plot be included in the visualization? |
hl_color | The highlight color when you select points on the plot(s). |
Any combination of the parallel coordinate plots of the input data, the local importance scores, and the 3-D XYZ classic multidimensional scaling proximities from the output of the random forest algorithm.
For instructions on how to use randomForest, see ?randomForest. For more information on loon, see ?loon.
For detailed instructions on the use of these plots in this package, visit https://github.com/chriskuchar/rfviz/blob/master/Rfviz.md
Chris Kuchar [email protected], based on original Java graphics by Leo Breiman and Adele Cutler.
Liaw A, Wiener M (2002). “Classification and Regression by randomForest.” _R News_, *2*(3), 18-22. https://CRAN.R-project.org/doc/Rnews/
Waddell A, Oldford R. Wayne (2018). "loon: Interactive Statistical Data Visualization" https://github.com/waddella/loon
Breiman, L. (2001), Random Forests, Machine Learning 45(1), 5-32.
Breiman, L (2002), “Manual On Setting Up, Using, And Understanding Random Forests V3.1”, https://www.stat.berkeley.edu/~breiman/Using_random_forests_V3.1.pdf
Breiman, L., Cutler, A., Random Forests Graphics. https://www.stat.berkeley.edu/~breiman/RandomForests/cc_graphics.htm
randomForest, rf_prep, l_plot3D, l_serialaxes
# Classification with the iris data set
rfprep <- rf_prep(x = iris[, 1:4], y = iris$Species)

# View all three plots
Myrfplots <- rf_viz(rfprep, input = TRUE, imp = TRUE, cmd = TRUE, hl_color = 'orange')

# Select data on any of the plots, then run:
iris[Myrfplots$input['selected'], ]
iris[Myrfplots$imp['selected'], ]
iris[Myrfplots$cmd['selected'], ]

# Rotate the 3-D XYZ Scatterplot
# 1. Click on the 3-D XYZ Scatterplot
# 2. Press 'r' on the keyboard to enter rotation mode
# 3. Click and drag the mouse to rotate the plot
# 4. Press 'r' to leave rotation mode

# View only the Input Data and CMD Scaling Proximities Plots
Myrfplots <- rf_viz(rfprep, input = TRUE, imp = FALSE, cmd = TRUE, hl_color = 'orange')

# Regression with the mtcars data set
rfprep2 <- rf_prep(x = mtcars[, -1], y = mtcars$mpg)

# View all three plots
Myrfplots <- rf_viz(rfprep2, input = TRUE, imp = TRUE, cmd = TRUE, hl_color = 'orange')

# Unsupervised clustering with the iris data set
rfprep <- rf_prep(x = iris[, 1:4], y = NULL)

# View the Input Data and CMD Scaling Proximities Plots for the unsupervised case.
# (Importance Scores Plot not valid here)
Myrfplots <- rf_viz(rfprep, input = TRUE, imp = FALSE, cmd = TRUE, hl_color = 'orange')