A library of common geometric shapes can be used to train our brains for understanding data structure in high-dimensional Euclidean space. This article describes the methods for producing cubes, spheres, simplexes, and tori in multiple dimensions. It also describes new ways to define and generate high-dimensional tori. The algorithms are described, critical code chunks are given, and a large collection of generated data are provided. These are available in the R package geozoo, and selected movies and images, are available on the GeoZoo web site (http://schloerke.github.io/geozoo/).
This paper describes how to build a library of high-dimensional geometric shapes: cubes, spheres, simplexes, and tori. Data describing numerous 4D polytopes and polyhedra generated by other researchers are included in the library, a single location to describe the many different object structures. The purpose is to enable people to train their brains for understanding data structures residing in high-dimensional Euclidean space. This work extends the work described in (Cook 1997) which concentrated on samples from statistical distributions.
The geozoo package in R contains the code to create the geometric shapes. Code fragments, describing the key components of the algorithms for generating the shapes, are included in this paper. The shapes in the library are best viewed using the dynamic graphical method called a tour (Asimov 1985; Buja et al. 2005; Cook, E. K. Lee, A. Buja, and H. Wickham 2007), such as that available in GGobi (Swayne et al. 2003) and the tourr R package (Wickham et al. 2011).
The structure of the paper is that basic shapes are described first followed by more complex shapes, in this order, cubes, spheres, simplexes, polyhedra, polytopes, and tori.
Cubes are the first shape that a person should examine when starting to learn about higher dimensions. Cubes are relatively simple to understand: they have orthogonal, uniform length sides and are convex shapes. A 0-D cube is a single point. A 1-D cube is a line segment. A 2-D cube is a square and a 3-D cube is a box.
The 4-D cube may be hard to imagine, partly because we are accustomed to describing our physical world using only three dimensions. The leap to 4-D is more understandable after watching the movie “Flatland” (Martin 1965) or reading the novella of the same name (Abbott 1884). In “Flatland”, the world is 2-D and characters struggle with the concept of 3-D.
Working from this name, we might think of our world as “Boxland”: we live in 3-D and struggle with the concept of higher dimensions. Shadows created by light sources help perceive the third dimension. In Flatland, the inhabitants see only 1-D line segments. For a Flatlander who has never seen the world we live in, the third dimension is hard to understand. Similarly, for inhabitants of our world, it might seem daunting to imagine the fourth dimension. But it’s not that difficult!
Figure 1 shows the evolution of the cube from 2-D to 5-D. Each
figure is a 2-D projection of a wireframe cube from two to five
dimensions. To increase the dimension of a cube, replicate and shift a
cube of one dimension lower along a new orthogonal axis, connecting the
corresponding vertices. The 3-D cube grows from
2
|
|
|
|
Vertices of a high-dimensional cube can be considered as all
permutations of the binary digits (0 and 1) in
The two different ways to define a high-dimensional cubes leads to two
different methods to create a
1-D:
row # | 1 | edges |
---|---|---|
1 | 0 | 2 |
2 | 1 |
2-D:
row # | 1 | 2 | edges | |
---|---|---|---|---|
1 | 0 | 0 | 2 | 3 |
2 | 1 | 0 | 4 | |
3 | 0 | 1 | 4 | |
4 | 1 | 1 |
3-D:
row # | 1 | 2 | 3 | edges | ||
---|---|---|---|---|---|---|
1 | 0 | 0 | 0 | 2 | 3 | 5 |
2 | 1 | 0 | 0 | 4 | 6 | |
3 | 0 | 1 | 0 | 4 | 7 | |
4 | 1 | 1 | 0 | 8 | ||
5 | 0 | 0 | 1 | 6 | 7 | |
6 | 1 | 0 | 1 | 8 | ||
7 | 0 | 1 | 1 | 8 | ||
8 | 1 | 1 | 1 |
Method 1: Recursively double a lower-dimensional cube.
Using the standard coordinate system, the base is 0 and 1. After
establishing the base, we recursively double the base in the first
column(s), and add an additional column containing a 0 in the first
half of the rows and a 1 in the second half of the rows. The process
is repeated
cube_iterate <- function(p) {
if (p == 1) {
return(rbind(0, 1))
}
lower_dim_cube <- cube_iterate(p - 1)
rbind(
cbind(lower_dim_cube, 0),
cbind(lower_dim_cube, 1)
)
}
Method 2: Generate all permutations of
This method takes advantage of an existing function in R,
expand.grid
. It produces all permutations by generating the
Cartesian product of a set of vectors. For our purposes, the number
of columns is not fixed, so we use do.call
, to convert a function
call of the form x(a, b, c)
to do.call(x, list(a, b, c))
,
allowing specification of an arbitrary number of arguments.
cube_permute <- function(p) {
as.matrix(
do.call(
expand.grid,
rep(list(c(0, 1)), p)
)
)
}
The wire frame for a cube draws the edges of the cube, connecting all
points that differ in one of the values, e.g.
Method 1: Distance of 1.
The distances between all
cube_edges_length1 <- function(cube) {
p <- ncol(cube)
num_points <- 2 ^ p
from_to <- matrix(NA, nrow = num_points * p / 2, ncol = 2)
next_store_position <- 1
for (i in 1:(num_points - 1)) {
for (j in (i + 1):num_points) {
d1 <- sum((cube[i, ] - cube[j, ]) ^ 2)
if (d1 == 1) {
from_to[next_store_position, ] <- c(i,j)
next_store_position <- next_store_position + 1
}
}
}
from_to
}
Method 2: The binomial approach.
This is faster to compute than the first method because it involves
only a single loop over the cube vertices. For this approach to
work, the vertices of the cube need to have been created using the
methods described in Section [cube-vertices]. Each
vertex, that has
cube_edges_binomial <- function(cube) {
p <- ncol(cube)
num_points <- 2 ^ p
from_to <- matrix(NA, nrow = num_points * p / 2, ncol = 2)
next_store_position <- 1
for (i in 1:(num_points - 1)) {
for (j in 1:p) {
if (cube[i, j] == 0) {
from_to[next_store_position, ] <- c(i, 2 ^ (j - 1) + i)
next_store_position <- next_store_position + 1
}
}
}
from_to
}
Method 3: Binary relationships.
The final method is the most computationally efficient but the least
intuitive. Here we will use the fact that the vertices of the cube
can be represented as binary numbers, e.g.
The key insight to note is that edges connect vertices which have a
single bit flipped. For example,
library(bitops)
cube_edges_binary <- function(p) {
vertices <- 0:(2 ^ p - 1)
from_verts <- vertices[
rep(1:(2 ^ p), each = p)
]
from_to <- data.frame(
from = from_verts,
to = bitXor(from_verts, 2 ^ (0:(p - 1)))
)
from_to <- subset(from_to, from < to) + 1
from_to
}
A solid cube has points in the interior (Figure 2). It is easy to generate, using either random sampling or a fixed grid.
|
|
|
|
|
|
Method 1: Random uniform.
The R function runif
generates samples from a uniform distribution
between 0 and 1. Generating cube_solid_random
function, we use a base of 850
points, and the total number of points is capped at 50000 points for
speed of viewing.
cube_solid_random <- function(p, n = 850 * 2 ^ p) {
matrix(runif(n * p), ncol = p)
}
Method 2: Equidistant.
A solid cube can be generated using equidistant points. As with the
second vertex generation method, the expand.grid
function is used.
The input
cube_solid_grid <- function(p, n) {
grid <- list(seq(0, 1, length = n))
do.call(expand.grid, rep(grid, p))
}
There are advantages and disadvantages to the methods provided. The
first method, random uniform points, produces a solid cube that looks
more solid, but as
The “face” of a cube is a surface that is one dimension lower than that of the cube. For example, a face of a 3-D cube is a 2-D square and a face of a 4-D cube is a 3-D cube.
To generate points on the faces of a cube, points are created in all
dimensions except one. The remaining dimension is given the value
Method 1: Equidistant faces.
Equidistant faces may leverage the fact that each “face” contains
the same data. Therefore, we may calculate a single
cube_face_grid <- function(p, n = 10) {
face <- cube_solid_grid(p - 1, n)
face_n <- nrow(face)
faces <- do.call(data.frame, rep(list(X = rep(0:1, each = p * face_n)), p))
for(i in seq_len(p)) {
faces[(face_n * (i - 1) + 1):(face_n * i), -i] <- face
faces[(face_n * (i - 1) + 1):(face_n * i) + (p * face_n), -i] <- face
}
return(as.matrix(faces))
}
Method 2: Random uniform faces.
Naively creating a 3-D cube, the
|
|
|
|
|
|
|
Figure 3: Faces of a 3-D cube (top row) and a 4-D cube (bottom row), obtained by fixing one column of values to be 0 or 1, and allowing the other columns to vary freely between 0 and 1.
Unlike the equidistant faces of a cube, each random uniform face
must be different from every other face. To avoid repetitive
calculations, we leverage the fact that a
cube_face_random <- function(p, n = 850 * 2 ^ (p - 1)) {
faces <- cube_solid_random(p, 2 * p * n)
for (i in seq_len(p)) {
faces[(n * (i - 1) + 1):(n * i), i] <- 0
faces[(n * (i - 1) + 1):(n * i) + (p * face_n), i] <- 1
}
faces
}
|
|
|
|
---|
As the dimension increases, the shape of the cube appears more rounded
than square in a 2-D projection. Explaination is given in Diaconis and Freedman (1984) and is
related to the Central Limit Theorem. When we use a tour to visualize
high-dimensional data we examine low-dimensional projections. Consider
the axes for a
A sphere can be described as all points within a fixed radius (for
simplicity use 1) around a fixed point (for simplicity use
To generate points uniformly distributed on the surface of a sphere we use the following trick: first, we generate a random vector from a multivariate standard normal distribution and then normalize its length. The normalized point is now a random point on a unit sphere (Watson 1983 2.6). The top row of Figure 4 shows the results.
norm_vec <- function(x) {
x / sqrt(sum(x ^ 2))
}
sphere_hollow <- function(p, n = p * 500) {
x <- matrix(rnorm(n * p), ncol = p)
t(apply(x, 1, norm_vec))
}
|
|
|
|
|
|
Solid spheres can be generated in much the same way as solid cubes; use
random points to fill the object. While the solid cube fills the box,
the sphere’s points are inside a radius of 1 from the center. A simple
approach would be to use a rejection method: generate points in the
solid cube and discard those with radius more than 1. This is
problematic as
The approach we used is a minor modification to the method used to
generate a hollow sphere. Figure 5 illustrates the process. The
vector length is randomly sampled from a uniform distribution on
sphere_solid_random <- function(p, n = p * 500) {
sphere_hollow(p, n) * runif(n) ^ (1 / p)
}
|
|
|
Simplexes are one of the simplest objects to create and view. A
|
|
|
|
For example, a 2-D simplex has unprojected 3-D vertices at (1,0,0), (0,1,0), (0,0,1), which are reduced by the Helmert transformation to points in a 2-D equilateral triangle, (0.7071,0.4082), (-0.7071,0.4082), (0.0000, -0.8165).
helmert <- function(d) {
helmert_mat <- matrix(NA, nrow = d, ncol = d)
helmert_mat[1, ] <- rep(1 / sqrt(d), d)
for (i in 1:(d - 1)) {
helmert_mat[i + 1, ] <- c(
rep(1 / sqrt(i * (i + 1)), i),
-i / sqrt(i * (i + 1)),
rep(0, d - i - 1)
)
}
helmert_mat
}
simplex <- function(p) {
x <- diag(p)
# center simplex
x <- x - matrix(1 / p, p, p)
hm <- helmert(p)
final <- (x %*% t(hm))[, -1]
final
}
The wire frame for a simplex connects every point to every other point,
and can be computed in just two lines of code, following method 2 of the
cube vertices. simplex_wires
makes a list of all pairs of verticies
and then removes the edges that connect a vertex to itself.
simplex_wires <- function(simplex) {
wires <- do.call(
expand.grid,
list(
c(1:nrow(simplex)),
c(1:nrow(simplex))
)
)
wires[!wires[,1] == wires[,2],]
}
A polyhedron is a three dimensional object that contains straight edges and has flat faces. Our polyhedra data comes from George W. Hart’s website (Hart 2000). Hart’s website contains an extensive collection of polyhedra, ranging from Platonic Solids to Stellations of the Rhombic Triacontahedron. In our data sets, we used the information from Platonic Solids, Kepler-Poinsot Polyhedra, Archimedean Polyhedra and its duals, Prisms. The data was reformatted from VRML into XML. The vertices and wire frames from separate files were compiled into tables. Some reformatting of edges was also necessary.
Paul Bourke’s website (Bourke 1996) has equations for generating several famous objects, including the Mobius Strip, Steiner’s Roman Surface and the Klein Bottle (Figure 6). It is interesting to see how each object twists onto itself. R functions based on these equations were written to produce each object. For each shape, a set of random angles were generated to seed the equations, producing points on the surface. For some of the objects, to be displayed in its familiar form, the plot limits for each variable need to use the same global minimum and maximum values.
|
|
|
A polytope is a generalized term of a geometric shape in any
dimension. A polygon is a 2-D polytope. A polyhedron is a 3-D polytope.
A polychoron is a 4-D polytope. Beyond that, we typically use polytope
to refer to any
|
|
|
A “doughnut” torus is known as a ring torus. Paul Bourke’s website (Bourke 1990) on “The Torus and Super Torus” provides the inspiration. The website explains how the 3-D torus is made. It also contains information that we used to develop the process of building high-dimensional tori. Figure 8 shows this process: a smaller circle that follows a larger circle, creating a doughnut. The points for the torus are formed by polar coordinates.
|
|
|
|
|
|
To produce a 4-D torus, a 3-D torus is rotated around a circle into the fourth dimension (Figure 8 bottom row). This torus still has a hole in the center. The process can be thought of as a recursive circle system. For a 3-D torus, the smaller radius circle follows the larger radius circle. For a 4-D torus, a 3-D torus follows an even larger radius circle. That is, the lower-dimensional torus is shifted by a fixed distance and rotated about a new axis perpendicular to the axis of the hole. This algorithm for a ring torus has not been previously defined and will be explained in detail in the following section.
|
|
|
A ring torus has a hole in the center of the object and is generated recursively. A 2-D circle forms the base of the torus, which is defined by:
using one radius and one angle. The 3-D torus is defined by:
Notice that 3-D torus builds from the 2-D:
The steps to building a higher-dimensional torus are:
add a new dimension that equals
in all other dimensions, replace
repeat.
However, these steps are not easy to recurse. Replacing values in a formula after it has been realized is not simple. By rearranging the formulas, a better method to recurse is achieved:
The first step of the recursion, starting from the last dimension, is given as follows:
which translates to this R code:
torus<-c(
rep(cos(theta[p - 1]) * radius[p - 1], p - 1),
sin(theta[p - 1]) * radius[p - 1]
)
From this start, we recurse backwards from
for (i in (p - 1):2) {
for (j in (i - 1):1) {
torus[j] <- (torus[j] + radius[i - 1]) * cos(theta[i - 1])
}
torus[i] <- (torus[i] + radius[i - 1]) * sin(theta[i - 1])
}
The construction of a 4-D torus is shown below:
This code results in one row of data for one point on the 4-D torus for each value of angle. By varying the angle and binding the result, we get points over the surface of the torus:
finished <- rbind(finished, torus.row)
or
matrix(
do.call(rbind, as.list(
replicate(
n,
torus.row(radius, p)
)
)),
ncol = p, byrow = TRUE
)
The angles create the rings, and thus need to vary fully between 0 and
theta <- runif(p - 1, min = 0, max = 2 * pi)
The radii for the process are fixed at the start. To produce a hole in
each dimension the radii need to decrease with
radius <- 2 ^ ((p - 2):0)
This produces points fairly evenly but not uniformly spread on the surface of the torus. A different way to generate the torus is to produce the angles at set intervals, resulting in more circular patterns.
Another common hyper-torus is the flat torus. A flat torus is commonly seen expanding into infinity as a screen saver on some computers. The flat torus has multiple holes in it’s center and is easy to generate.
A flat torus is formed in pairs of dimensions, defined by a sine and
cosine of one angle, for example the circle is generated in 2-D by
|
|
|
|
|
|
|
These tori are all hollow. To create points in the interior of the tori one would randomly generate the radii length for each point.
This paper has described how geozoo generates several types of high-dimensional geometric shapes. The result is a library designed to help conceptualize objects in high-dimensional spaces. It has also led to some new geometric shape definitions.
This library, although seemingly removed from real high-dimensional data
has some strong connections. Much of our data analytic methods are based
in high-dimensional Euclidean space. Developing some visual insight into
this space can help to understand the methods that operate in higher
dimensions. Some data problems can be closely mapped to the geometric
shapes. For example, ranked data showing preferences for a fixed set of
objects, can be mapped to high-dimensional polytopes (Thompson 1993). Values in
a sample are commonly constrained to sum to a fixed number, for example,
100%, forming compositional data. This type of data lies inside a
We encourage the reader to look at the movies, and images, and download the data on the project web site, which can be accessed at http://schloerke.github.io/geozoo/. The geozoo package contains the R code to generate the geometric shapes and is available to download from CRAN (http://www.R-project.org). We especially encourage readers to experiment with creating new high-dimensional geometric shapes, or to contribute ideas and code back to this project.
This work was supported by National Science Foundation grant DMS0706949. Andreas Buja provided some helpful feedback on the work. Dianne Cook thanks the Isaac Newton Institute, Cambridge, UK for support, where she was visiting while drafting this paper.
This article is converted from a Legacy LaTeX article using the texor package. The pdf version is the official version. To report a problem with the html, refer to CONTRIBUTE on the R Journal homepage.
Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Schloerke, et al., "Escape from Boxland", The R Journal, 2016
BibTeX citation
@article{RJ-2016-044, author = {Schloerke, Barret and Wickham, Hadley and Cook, Dianne and Hofmann, Heike}, title = {Escape from Boxland}, journal = {The R Journal}, year = {2016}, note = {https://rjournal.github.io/}, volume = {8}, issue = {2}, issn = {2073-4859}, pages = {243-257} }