50 Interactive 3D Visualization in R
Yaoyuan Zhang and Zimu An
50.1 Motivation
Although a 2D plot (or multiple of them) can contain enough information we need, a 3D plot can be more intuitive since the 3D space is where we reside. For example, we usually see the 2D projection of a map, or atlas, in textbooks or websites, but a globe is always more pleasant to use. In addition, when encoutering datasets with higher dimensions in statistics, a 3D visualization can help us with generalizing in higher dimensional spaces.
In this tutorial, we focus on interactive 3D visualization using packages including
plotly
and rgl
. We first cover the basic 3D scatter plots using the package rgl
,
then we try to add more useful tools for better visualization. We also introduce
how to produce interactive 3D surface graphs and 3D mesh plots in R
using the package plotly
We hope this tutorial can provide some insights and inspirations for people who wish to tackle high-dimensional data visualization problems.
50.2 3D visualization with rgl
package
The rgl
package is used to produce interactive 3-D plots. It contains
high-level graphics commands modeled loosely after classic R graphics,
but working in three dimensions. It also contains low level structure
inspired by (but incompatible with) the grid
package.
50.2.1 Preparing the Data
We’ll use the iris
data for this tutorial in the following examples.
iris
data set gives the measurements of the variables sepal length and
width, petal length and width, respectively, for 50 flowers from each of
3 species of iris. The species are Iris setosa, versicolor, and
virginica.
## Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1 5.1 3.5 1.4 0.2 setosa
## 2 4.9 3.0 1.4 0.2 setosa
## 3 4.7 3.2 1.3 0.2 setosa
## 4 4.6 3.1 1.5 0.2 setosa
## 5 5.0 3.6 1.4 0.2 setosa
## 6 5.4 3.9 1.7 0.4 setosa
x <- sep.l <- iris$Sepal.Length
y <- pet.l <- iris$Petal.Length
z <- sep.w <- iris$Sepal.Width
50.2.2 Starting and Closing the RGL Device
In order to make a 3D plot with RGL, we need to first start the RGL device in R. The functions below are used to manage RGL device:
- rgl.open(): Opens a new device
- rgl.close(): Closes the current device
- rgl.clear(): Clears the current device
- rgl.cur(): Returns the active device ID
- rgl.quit(): Shutdowns the RGL device system
50.2.3 3D Scatter Plot
50.2.3.1 Custom function to initialize RGL device
Since the default RGL device usually needs initialization, we can create a custom function for easier initialization process. The following function rgl_init() will create a new RGL device if there is no opened device:
#' @param new.device a logical value. If TRUE, creates a new device
#' @param bg the background color of the device
#' @param width the width of the device
rgl_init <- function(new.device = FALSE, bg = "white", width = 640) {
if( new.device | rgl.cur() == 0 ) {
rgl.open()
par3d(windowRect = 50 + c( 0, 0, width, width ) )
rgl.bg(color = bg )
}
rgl.clear(type = c("shapes", "bboxdeco"))
rgl.viewpoint(theta = 15, phi = 20, zoom = 0.7)
}
The functions used in the custom function that were not introduced previously includes:
- par3d(windowRect): set the window size
-
rgl.viewpoint(theta, phi, fov, zoom): set up viewpoint. The
arguments theta and phi are polar coordinates.
- theta and phi are polar coordinates. Default values are 0 and 15, respectively
- fov is the field-of-view angle in degrees. Default value is 60
- zoom is the zoom factor. Default value is 1
- rgl.bg(color): define the background color of the device
50.2.3.2 Basic 3D Plot with spheres
rgl_init()
rgl.spheres(x, y, z, r = 0.1, color = "yellow") # Scatter plot
Note that rgl
package also provides functions such as rgl.phoints
to represent data points in another format. For the purpose of
conformity, this tutorial use spheres to represent data points.
50.2.3.3 Add a Bounding Box decoration
The function rgl.bbox() is used:
rgl_init()
rgl.spheres(x, y, z, r = 0.1, color = "yellow")
# Add bounding box decoration
rgl.bbox(color=c("#333377","black"), emission="#333377",
specular="#3333FF", shininess=5, alpha=0.8 )
- color: a vector of colors. The first color is used for the background color of the bounding box. The second color is used for the tick mark labels.
- emission, specular, shininess: properties for lighting calculation
- alpha: value specifying the color transparency. The value should be between 0.0 (fully transparent) and 1.0 (opaque)
50.2.3.4 Add axis lines and labels
- The function rgl.lines(x, y = NULL, z = NULL, …) can be used to add axis lines.
- The function rgl.texts(x, y = NULL, z = NULL, text) is used to add axis labels.
For the convenience of plotting, a custom function rgl_add_axes() is implemented for the purpose of adding x, y, and z axes.
# x, y, z : numeric vectors corresponding to
# the coordinates of points
# axis.col : axis colors
# xlab, ylab, zlab: axis labels
# show.plane : add axis planes
# show.bbox : add the bounding box decoration
# bbox.col: the bounding box colors. The first color is the
# the background color; the second color is the color of tick marks
rgl_add_axes <- function(x, y, z, axis.col = "grey",
xlab = "", ylab="", zlab="", show.plane = TRUE,
show.bbox = FALSE, bbox.col = c("#333377","black"))
{
lim <- function(x){c(-max(abs(x)), max(abs(x))) * 1.1}
# Add axes
xlim <- lim(x); ylim <- lim(y); zlim <- lim(z)
rgl.lines(xlim, c(0, 0), c(0, 0), color = axis.col)
rgl.lines(c(0, 0), ylim, c(0, 0), color = axis.col)
rgl.lines(c(0, 0), c(0, 0), zlim, color = axis.col)
# Add a point at the end of each axes to specify the direction
axes <- rbind(c(xlim[2], 0, 0), c(0, ylim[2], 0),
c(0, 0, zlim[2]))
rgl.points(axes, color = axis.col, size = 3)
# Add axis labels
rgl.texts(axes, text = c(xlab, ylab, zlab), color = axis.col,
adj = c(0.5, -0.8), size = 2)
# Add plane
if(show.plane)
xlim <- xlim/1.1; zlim <- zlim /1.1
rgl.quads( x = rep(xlim, each = 2), y = c(0, 0, 0, 0),
z = c(zlim[1], zlim[2], zlim[2], zlim[1]))
# Add bounding box decoration
if(show.bbox){
rgl.bbox(color=c(bbox.col[1],bbox.col[2]), alpha = 0.5,
emission=bbox.col[1], specular=bbox.col[1], shininess=5,
xlen = 3, ylen = 3, zlen = 3)
}
}
- rgl.quads(x, y, z) is used to add planes. x, y and z are numeric vectors of length four specifying the coordinates of the four nodes of the quad.
rgl_init()
rgl.spheres(x, y, z, r = 0.1, color = "yellow")
rgl_add_axes(x, y, z, show.bbox = TRUE)
# Show tick marks
axis3d('x', pos=c( NA, 0, 0 ), col = "darkgrey")
axis3d('y', pos=c( 0, NA, 0 ), col = "darkgrey")
axis3d('z', pos=c( 0, 0, NA ), col = "darkgrey")
The function axis3d() is used in the previous block to show the scales of axes.
50.2.3.5 Set aspect ratios of the axes
The function aspect3d(x, y = NULL, z = NULL) can be used to set the apparent ratios of the x, y and z axes for the current plot.
x, y and z are the ratio for x, y and z axes, respectively. x can be a vector of length 3, specifying the ratio for the 3 axes.
50.2.3.6 Change color of points by groups
A helper function can be used to select automatically a color for each group:
#' Get colors for the different levels of
#' a factor variable
#'
#' @param groups a factor variable containing the groups
#' of observations
#' @param colors a vector containing the names of
# the default colors to be used
get_colors <- function(groups, group.col = palette()){
groups <- as.factor(groups)
ngrps <- length(levels(groups))
if(ngrps > length(group.col))
group.col <- rep(group.col, ngrps)
color <- group.col[as.numeric(groups)]
names(color) <- as.vector(groups)
return(color)
}
Change colors by groups:
cols <- get_colors(iris$Species, c("lightblue", "palegreen3", "slateblue3"))
rgl_init()
rgl.spheres(x, y, z, r = 0.1, color = cols)
rgl_add_axes(x, y, z, show.bbox = TRUE)
aspect3d(1,1,1)
50.2.4 Additional Analysis tools
50.2.4.1 Ellipse of Concentration
The function ellipse3d() is used to estimate the ellipse of concentration:
- x: the correlation or covariance matrix between x, y and z
- scale: If x is a correlation matrix, then the standard deviations of each parameter can be given in the scale parameter. This defaults to c(1, 1, 1), so no rescaling will be done.
- centre: The center of the ellipse will be at this position.
- level: The confidence level of a confidence region. This is used to control the size of the ellipsoid.
The function ellipse3d() returns an object of class mesh3d which can be drawn using the function shade3d() and/or wired3d()
Draw the ellipse using the function shade3d():
rgl_init()
rgl.spheres(x, y, z, r = 0.12, color = "#D95F02")
rgl_add_axes(x, y, z, show.bbox = TRUE)
# Compute and draw the ellipse of concentration
ellips <- ellipse3d(cov(cbind(x,y,z)),
centre=c(mean(x), mean(y), mean(z)), level = 0.95)
shade3d(ellips, col = "#D95F02", alpha = 0.1, lit = FALSE)
aspect3d(1,1,1)
Draw the ellipse using the function wired3d():
rgl_init()
rgl.spheres(x, y, z, r = 0.2, color = "#D95F02")
rgl_add_axes(x, y, z, show.bbox = TRUE)
# Compute and draw the ellipse of concentration
ellips <- ellipse3d(cov(cbind(x,y,z)),
centre=c(mean(x), mean(y), mean(z)), level = 0.95)
wire3d(ellips, col = "#D95F02", lit = FALSE)
aspect3d(1,1,1)
We can also add ellipse for each group, as such:
# Groups
groups <- iris$Species
levs <- levels(groups)
group.col <- c("red", "green", "blue")
# Plot observations
rgl_init()
rgl.spheres(x, y, z, r = 0.1,
color = group.col[as.numeric(groups)])
rgl_add_axes(x, y, z, show.bbox = FALSE)
# Compute ellipse for each group
for (i in 1:length(levs)) {
group <- levs[i]
selected <- groups == group
xx <- x[selected]; yy <- y[selected]; zz <- z[selected]
ellips <- ellipse3d(cov(cbind(xx,yy,zz)),
centre=c(mean(xx), mean(yy), mean(zz)), level = 0.95)
shade3d(ellips, col = group.col[i], alpha = 0.1, lit = FALSE)
# show group labels
texts3d(mean(xx),mean(yy), mean(zz), text = group,
col= group.col[i], cex = 1)
}
aspect3d(1,1,1)
50.2.4.2 Regression plane
To add a custom regression plane, we can take the following approach:
- Use the function lm() to compute a linear regression model.
- Use the argument rgl.surface() to add a regression surface.
rgl_init()
rgl.spheres(x, y, z, r = 0.1, color = "#D95F02")
rgl_add_axes(x, y, z, show.bbox = FALSE)
aspect3d(1,1,1)
# Compute the linear regression (y = ax + bz + d)
fit <- lm(y ~ x + z)
# predict values on regular xz grid
grid.lines = 26
x.pred <- seq(min(x), max(x), length.out = grid.lines)
z.pred <- seq(min(z), max(z), length.out = grid.lines)
xz <- expand.grid( x = x.pred, z = z.pred)
y.pred <- matrix(predict(fit, newdata = xz),
nrow = grid.lines, ncol = grid.lines)
# Add regression surface
rgl.surface(x.pred, z.pred, y.pred, color = "steelblue",
alpha = 0.5, lit = FALSE)
# Add grid lines
rgl.surface(x.pred, z.pred, y.pred, color = "black",
alpha = 0.5, lit = FALSE, front = "lines", back = "lines")
50.3 Interactive 3D surface plots with package plotly
50.3.1 3D Surface Plot
In this section, we will introduce how to use plotly to generate interactive 3D surface plots. 3D suface plots are extremely useful and can convey/visualize a lot more information than a traditional contour plot, especially towards topological, geographical data in a straight forward fashion.
The dataset we are using here is the ‘volcano’ (Topographic Information on Auckland’s Maunga Whau Volcano) dataset. Maunga Whau (Mt Eden) is one of about 50 volcanos in the Auckland volcanic field. This data set gives topographic information for Maunga Whau on a 10m by 10m grid. The dataset is a matrix with 87 rows and 61 columns, rows corresponding to grid lines running east to west and columns to grid lines running south to north. More importantly, since it’s a numeric matrix we could easily take it as an argument for method plot_ly which initializes the instance.
head(volcano)
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14]
## [1,] 100 100 101 101 101 101 101 100 100 100 101 101 102 102
## [2,] 101 101 102 102 102 102 102 101 101 101 102 102 103 103
## [3,] 102 102 103 103 103 103 103 102 102 102 103 103 104 104
## [4,] 103 103 104 104 104 104 104 103 103 103 103 104 104 104
## [5,] 104 104 105 105 105 105 105 104 104 103 104 104 105 105
## [6,] 105 105 105 106 106 106 106 105 105 104 104 105 105 106
## [,15] [,16] [,17] [,18] [,19] [,20] [,21] [,22] [,23] [,24] [,25] [,26]
## [1,] 102 102 103 104 103 102 101 101 102 103 104 104
## [2,] 103 103 104 105 104 103 102 102 103 105 106 106
## [3,] 104 104 105 106 105 104 104 105 106 107 108 110
## [4,] 105 105 106 107 106 106 106 107 108 110 111 114
## [5,] 105 106 107 108 108 108 109 110 112 114 115 118
## [6,] 106 107 109 110 110 112 113 115 116 118 119 121
## [,27] [,28] [,29] [,30] [,31] [,32] [,33] [,34] [,35] [,36] [,37] [,38]
## [1,] 105 107 107 107 108 108 110 110 110 110 110 110
## [2,] 107 109 110 110 110 110 111 112 113 114 116 115
## [3,] 111 113 114 115 114 115 116 118 119 119 121 121
## [4,] 117 118 117 119 120 121 122 124 125 126 127 127
## [5,] 121 122 121 123 128 131 129 130 131 131 132 132
## [6,] 124 126 126 129 134 137 137 136 136 135 136 136
## [,39] [,40] [,41] [,42] [,43] [,44] [,45] [,46] [,47] [,48] [,49] [,50]
## [1,] 110 110 108 108 108 107 107 108 108 108 108 108
## [2,] 114 112 110 110 110 109 108 109 109 109 109 108
## [3,] 120 118 116 114 112 111 110 110 110 110 109 109
## [4,] 126 124 122 120 117 116 113 111 110 110 110 109
## [5,] 131 130 128 126 122 119 115 114 112 110 110 110
## [6,] 136 135 133 129 126 122 118 116 115 113 111 110
## [,51] [,52] [,53] [,54] [,55] [,56] [,57] [,58] [,59] [,60] [,61]
## [1,] 107 107 107 107 106 106 105 105 104 104 103
## [2,] 108 108 108 107 107 106 106 105 105 104 104
## [3,] 109 109 108 108 107 107 106 106 105 105 104
## [4,] 109 109 109 108 108 107 107 106 106 105 105
## [5,] 110 110 109 109 108 107 107 107 106 106 105
## [6,] 110 110 110 109 108 108 108 107 107 106 106
fig <- plot_ly(z = ~volcano)
fig <- fig %>% add_surface()
fig
Just few simple lines, we can create a highly interactive 3D plot that illustrates the 3D representation of the volcano dataset. The lighter the color, the higher the altitude. By rolling the mouse, we can zoom in and out easily. Pointing the mouse at every vertex on this 3D graph, the user can easily see the x, y, z coordinates while also the sectional view. From the above, we can also see the contour lines clearly displayed.
Moreover, we can configure the parameters in add_surface(). Here, we will modify the contours argument to display contour lines at the top/below the 3D graph depending on the view of the user. By setting highlight color to be red and setting project, when we move our cursor to each contour line, it maps the contour directly to the graph.
We can approach almost any other numeric matrices that displays geographical information using this package. With the smooth transformation, animation, and easy-to-use convenience, it would largely improve users’ experience.
fig <- plot_ly(z = ~volcano) %>% add_surface(
# configuring contours mapping
contours = list(
z = list(
show=TRUE,
usecolormap=TRUE,
highlightcolor="red",
project=list(z=TRUE)
)
)
)
fig
Other than geographical data, we can also apply the approach on kernel density. Here we are using the ‘Old Faithful Geyser Data’, which is a version of the eruptions data from the ‘Old Faithful’ geyser in Yellowstone National Park, Wyoming. The dataset has 299 observations on 2 variables: duration (numeric eruption time in minutes) and waiting (numeric waiting time for this eruption).
We used kde2d to take duration as the x-coordinate, waiting as y-coordinate, and produce a Two-dimensional kernel density estimation, where z is the estimation.
kd <- with(MASS::geyser, MASS::kde2d(duration, waiting, n = 50))
# plot using plotly by taking the kernel density created
fig <- plot_ly(x = kd$x, y = kd$y, z = kd$z) %>% add_surface(
contours = list(
z = list(
show=TRUE,
usecolormap=TRUE,
highlightcolor="red",
project=list(z=TRUE)
)
)
) %>% layout(
# label the axis using layout
scene = list(
xaxis = list(title = "duration"),
yaxis = list(title = "waiting" ),
zaxis = list(title = "Estimate")
)
)
fig
50.3.2 Tri-surface Plot
A Tri-Surface Plot is a type of surface plot, created by triangulation of compact surfaces of finite number of triangles which cover the whole surface in a manner that each and every point on the surface is in triangle. The intersection of any two triangles results in void or a common edge or vertex. This type of plot is created where the evenly sampled grids are restrictive and inconvenient to plot.
50.3.2.1 Basic Tri-Surf plot using plotly
Since the tri-surface plot is constructed with triangular surfaces, we start with the simplest geometry object to plot in 3D - a regular tetrahedron:
50.3.2.2 More complex tri-surface plot
We can use use plotly
to create more complex tri-surf plot with proper data set.
The following is an example, using the dataset to plot maps of given countries
(North America for this specific example):
data(wrld_simpl)
map1 <- subset(wrld_simpl,
NAME %in% c("Canada", "Greenland", "Mexico", "United States"))
## DEL model (like TRI in silicate)
delmesh <- anglr::globe(anglr::DEL(map1, max_area = 0.5))
mesh <- as.mesh3d(delmesh)
# plot point cloud
x <- mesh$vb[1,]
y <- mesh$vb[2, ]
z <- mesh$vb[3,]
m <- matrix(c(x,y,z), ncol=3, dimnames=list(NULL,c("x","y","z")))
# colours in z don't make sense here, need to map object aesthetics above
zmean <- apply(t(mesh$it),MARGIN=1,function(row){mean(m[row,3])})
library(scales)
facecolor = colour_ramp(
brewer_pal(palette="RdBu")(9)
)(rescale(x=zmean))
suppressPlotlyMessage(plot_ly(
x = x, y = y, z = z,
i = mesh$it[1,]-1, j = mesh$it[2,]-1, k = mesh$it[3,]-1,
facecolor = facecolor,
type = "mesh3d"
))
50.4 Conclusion
In this tutorial, we introduced how to use the rgl package and its’ relative functions to create 3D scatter plots, using plotly to create interactive 3D surface plots and more complex tri-surface plots. We can see that plotting scattered plots in 3D can be very useful such that we can fit a regression plane for it for future analysis. Through plotly we can also create very smooth 3D interactive visualizations. However, it is not to say that 3D excels 2D plots, in most cases we need to makes choices based on the specific case and scenario. Sometimes 3D plots overcomplicates the pattern and information that could be directly shown in 2D plots.