Vdgraph: A Package for Creating Variance Dispersion Graphs

John Lawson

1 Introduction

Response surface methods consist of (1) experimental designs for collecting data to fit an approximate relationship between the factors and the response, (2) regression analyses for fitting the model and (3) graphical and numerical techniques for examining the fitted model to identify the optimum. The model normally used in response surface analysis is a second order polynomial, as shown in Equation (1).

\[y=\beta_0+\sum_{i=1}^{k} \beta_i x_i +\sum_{i=1}^{k} \beta_{ii} x_i^2+\sum \sum_{i<j}^{k} \beta_{ij}x_ix_j \label{eq:eq1} \tag{1} \]

The fitted equation is examined in order to predict the factor coordinates of the maximum or minimum response within the experimental region, or to simply explore the relationship between the factors and response.

Since it is not known in advance what neighborhood will be of most interest in the design space, a desirable response surface design will be one that makes the variance of a predicted value as uniform as possible throughout the experimental region. Standard response surface designs, such as the uniform precision central composite design, are constructed so that the variance of a predicted value will be near constant within a coded radius of one from the center of the design.

One way to visualize the uniformity of the variance of a predicted value for designs with more than two factors is to use the variance dispersion graph proposed by Myers et al. (1992).

2 Variance of a predicted value

The variance of a predicted value at a point \((x_1,\cdots,x_k)\) in the experimental region is given by Equation (2)

\[\label{eq:eq2} \text{Var}[\hat{y}(\mathbf{x})]=\sigma^2 \mathbf{x}^{\prime}(\mathbf{X}^{\prime} \mathbf{X})^{-1} \mathbf{x} \tag{2} \]

where \(\mathbf{X}\) is the design matrix for the quadratic model in Equation (1), \(\sigma^2\) is the variance of the experimental error, and

\[\mathbf{x}=[1,x_1,\cdots,x_k,x_1^2,\cdots,x_k^2,x_1x_2,\cdots]\]

is a vector-valued function of the coordinates (of the point in the experimental region) whose elements correspond to the columns of the design matrix \(\mathbf{X}\).

Table 1: Face-Centered Cube Design or \(3^2\) Design
Run	\(x_1\)	\(x_2\)
1	-1	-1
2	1	-1
3	-1	1
4	1	1
5	-1	0
6	1	0
7	0	-1
8	0	1
9	0	0

For the face-centered cube design, or \(3^2\) design shown in Table 1, Figure 1 is a contour plot of the scaled variance of a predicted value in the range of \(-1.5\leq x_1 \leq1.5, -1.5\leq x_2 \leq1.5\). The scaled variance of a predicted value is \(N\text{Var}[\hat{y}(\mathbf{x})]/\sigma^2\), where \(N\) is the number of points in the experimental design.

Figure 1: Contour plot of \(N\text{Var}[\hat{y}(\mathbf{x})]/\sigma^2\)
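The scaled variance defined above can be evaluated directly in R. The following is a minimal sketch using base R only; the helper names `quad_terms` and `scaled_var` are hypothetical and are not part of the Vdgraph package. It builds the design of Table 1, expands it into the columns of the quadratic model in Equation (1), and evaluates \(N\mathbf{x}^{\prime}(\mathbf{X}^{\prime}\mathbf{X})^{-1}\mathbf{x}\) at the center, a face center and a corner of the design.

```r
# Hypothetical helper: expand points (rows) into the quadratic model terms
# intercept, linear, pure quadratic and two-factor interactions.
quad_terms <- function(pts) {
  pts <- as.matrix(pts)
  pairs <- combn(ncol(pts), 2)
  inter <- pts[, pairs[1, ], drop = FALSE] * pts[, pairs[2, ], drop = FALSE]
  cbind(1, pts, pts^2, inter)
}

# Hypothetical helper: scaled prediction variance N x'(X'X)^{-1} x
# at each row of `pts` for the given design.
scaled_var <- function(design, pts) {
  X <- quad_terms(design)
  XtX_inv <- solve(crossprod(X))
  Fx <- quad_terms(pts)
  nrow(X) * rowSums((Fx %*% XtX_inv) * Fx)
}

# The 3^2 design of Table 1 (run order does not affect the result).
design <- as.matrix(expand.grid(x1 = c(-1, 0, 1), x2 = c(-1, 0, 1)))

pts <- rbind(center = c(0, 0), face = c(1, 0), corner = c(1, 1))
sv <- scaled_var(design, pts)
print(sv)
```

For this design the three values are 5, 5 and 7.25, so the scaled variance is already higher at the corners of the design than at its center.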

As seen in Figure 1, the variance of a predicted value increases faster along the line \(x_2=0\) than along the line \(x_2=-x_1\). This is easy to visualize in two dimensions, but would be more difficult to see in higher dimensions.
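For this particular design the difference can be made explicit. Because the design is symmetric, \(\mathbf{X}^{\prime}\mathbf{X}\) is block diagonal and can be inverted by hand. The closed form below is a worked derivation under the quadratic model of Equation (1), not a result quoted from the cited references. Writing \(r^2=x_1^2+x_2^2\),

\[N\text{Var}[\hat{y}(\mathbf{x})]/\sigma^2 = 5-\tfrac{9}{2}r^2+\tfrac{9}{2}r^4-\tfrac{27}{4}x_1^2x_2^2\]

On the line \(x_2=0\) the last term vanishes, while on the line \(x_2=-x_1\) it equals \(-\tfrac{27}{16}r^4\), its most negative value at a given radius. At any radius \(r>0\) the scaled variance is therefore largest along the coordinate axes and smallest along the diagonals.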

3 Variance dispersion graphs

A variance dispersion graph allows one to visualize the uniformity of the scaled variance of a predicted value in multidimensional space. It consists of three curves: the maximum, the minimum and the average scaled variance of a predicted value on a hypersphere. Each value is plotted against the radius of the hypersphere. Figure 2 shows the variance dispersion graph of the design shown in Table 1.

In this figure it can be seen that the maximum scaled variance of a predicted value is near 14 at a radius of 1.4 in coded units, while the minimum scaled variance is less than 10 at the same radius. This is the same phenomenon that can be seen in the contour plot of the scaled variance of a predicted value shown in Figure 1. The path of the maximum and minimum variance through the design space will be determined by the design and may not follow straight lines as shown in the specific example in this contour plot.
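For a two-factor design the three curves can be approximated by brute force, which may help make the definition concrete. The sketch below uses base R only and evaluates the scaled variance at equally spaced angles on each circle; it is an illustration under that sampling assumption, not the search procedure used by any particular software, and the helper names `quad_terms` and `scaled_var` are hypothetical.

```r
# Hypothetical helper: quadratic model terms for each point (row).
quad_terms <- function(pts) {
  pts <- as.matrix(pts)
  pairs <- combn(ncol(pts), 2)
  inter <- pts[, pairs[1, ], drop = FALSE] * pts[, pairs[2, ], drop = FALSE]
  cbind(1, pts, pts^2, inter)
}

# Hypothetical helper: scaled prediction variance N x'(X'X)^{-1} x.
scaled_var <- function(design, pts) {
  X <- quad_terms(design)
  XtX_inv <- solve(crossprod(X))
  Fx <- quad_terms(pts)
  nrow(X) * rowSums((Fx %*% XtX_inv) * Fx)
}

# The 3^2 design of Table 1.
design <- as.matrix(expand.grid(x1 = c(-1, 0, 1), x2 = c(-1, 0, 1)))

# 360 equally spaced directions on each circle (1 degree apart).
theta <- seq(0, 2 * pi, length.out = 361)[-361]
radii <- seq(0, 1.4, by = 0.2)

# One row per radius: maximum, minimum and average over the sampled circle.
vdg <- t(sapply(radii, function(r) {
  v <- scaled_var(design, cbind(r * cos(theta), r * sin(theta)))
  c(Radius = r, Maximum = max(v), Minimum = min(v), Average = mean(v))
}))
print(round(vdg, 3))
```

At a radius of 1.4 this sampling gives a maximum of about 13.47 and a minimum of about 6.98, in line with the values read from the graph. In more than two dimensions a fixed grid of directions becomes expensive, which is why an optimizing search is normally used instead.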

Unlike the contour plot of the scaled prediction variance, the variance dispersion graph has the same format for a \(k\)-dimensional response surface design as for a two-dimensional design like the one in Table 1.

Recent textbooks such as Montgomery (2005), Myers et al. (2009), and Lawson (2010) illustrate variance dispersion graphs as a tool for judging the merits of a response surface design. These graphs can be produced in commercial software such as SAS ADX (see SAS Institute Inc. 2010) and, by using a downloadable macro (see Santiago 2009), Minitab (see Minitab Inc. 2010).

4 The Vdgraph package

Vining (1993a, 1993b) published FORTRAN code for creating variance dispersion graphs. Vining’s code obtains the maximum and minimum prediction variance on hyperspheres using a combination of a grid search and a Nelder-Mead search, as described by Cook and Nachtsheim (1980). The package Vdgraph (Lawson 2011) incorporates this code in R functions that make the graphs.

The package includes the function Vdgraph for making a variance dispersion graph of one design and the function Compare2Vdg for comparing the variance dispersion graphs of two designs on the same plot. The package also includes several minimum run response surface designs stored as matrices. These include Hartley’s small composite design for 2 to 6 factors, Draper and Lin’s small composite design for 5 factors, the hexagonal rotatable design for 2 factors and Roquemore’s hybrid designs for 3 to 6 factors.

5 Examples

The first example shown below illustrates the use of the R function Vdgraph to make a variance dispersion graph of a three-factor Box-Behnken design created by the bbd function in the R package rsm (see Lenth 2009).

> library(rsm)
> library(Vdgraph)
> BB.des3 <- bbd(3)
> Vdgraph(BB.des3)
number of design points= 16 
number of factors= 3 
          Radius   Maximum   Minimum  Average
 [1,] 0.00000000  4.000000  4.000000  4.00000
 [2,] 0.08660254  3.990100  3.990067  3.99008
 [3,] 0.17320508  3.961600  3.961067  3.96128
 [4,] 0.25980762  3.918100  3.915400  3.91648
 [5,] 0.34641016  3.865600  3.857067  3.86048
 [6,] 0.43301270  3.812500  3.791667  3.80000
 [7,] 0.51961524  3.769600  3.726400  3.74368
 [8,] 0.60621778  3.750100  3.670067  3.70208
 [9,] 0.69282032  3.769600  3.633067  3.68768
[10,] 0.77942286  3.846100  3.627400  3.71488
[11,] 0.86602540  4.000000  3.666667  3.80000
[12,] 0.95262794  4.254106  3.766067  3.96128
[13,] 1.03923048  4.633600  3.942400  4.21888
[14,] 1.12583302  5.166116  4.214067  4.59488
[15,] 1.21243557  5.881600  4.601067  5.11328
[16,] 1.29903811  6.812500  5.125000  5.80000
[17,] 1.38564065  7.993638  5.809067  6.68288
[18,] 1.47224319  9.462100  6.678067  7.79168
[19,] 1.55884573 11.257600  7.758400  9.15808
[20,] 1.64544827 13.422175  9.078067 10.81568
[21,] 1.73205081 16.000000 10.666667 12.80000

The result from the first example (shown above) includes a listing of the coordinates of the plot and the graph shown in Figure 3.
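Some entries of this listing can be checked independently with base R. The sketch below rebuilds the 16-run design by hand in the \(\pm 1\) coding (12 edge midpoints plus 4 center runs) rather than calling bbd, and the helper names `quad_terms` and `scaled_var` are hypothetical. The center value reproduces the first row. The other two points lie at radius \(\sqrt{2}\) in the \(\pm 1\) coding, where the design points themselves sit, and they reproduce the maximum and minimum in the last row; this suggests that the listed radius is measured after rescaling the design so that its outermost points lie at \(\sqrt{3}\), a reading inferred from the numbers rather than taken from the package documentation.

```r
# Hypothetical helper: quadratic model terms for each point (row).
quad_terms <- function(pts) {
  pts <- as.matrix(pts)
  pairs <- combn(ncol(pts), 2)
  inter <- pts[, pairs[1, ], drop = FALSE] * pts[, pairs[2, ], drop = FALSE]
  cbind(1, pts, pts^2, inter)
}

# Hypothetical helper: scaled prediction variance N x'(X'X)^{-1} x.
scaled_var <- function(design, pts) {
  X <- quad_terms(design)
  XtX_inv <- solve(crossprod(X))
  Fx <- quad_terms(pts)
  nrow(X) * rowSums((Fx %*% XtX_inv) * Fx)
}

# Three-factor Box-Behnken design built by hand: all (+/-1, +/-1) pairs
# for each pair of factors with the third factor at 0, plus 4 center runs.
pm <- unname(as.matrix(expand.grid(c(-1, 1), c(-1, 1))))
bbd3 <- rbind(cbind(pm, 0),
              cbind(pm[, 1], 0, pm[, 2]),
              cbind(0, pm),
              matrix(0, nrow = 4, ncol = 3))

# Center, a point on a coordinate axis and a point on the main diagonal;
# the last two both have radius sqrt(2) in the +/-1 coding.
pts <- rbind(center   = c(0, 0, 0),
             axis     = c(sqrt(2), 0, 0),
             diagonal = rep(sqrt(2 / 3), 3))
check <- scaled_var(bbd3, pts)
print(round(check, 4))
```

The three values are 4, 16 and 10.6667, matching the first row and the maximum and minimum of the last row of the listing.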

The second example illustrates the use of Compare2Vdg by comparing the variance dispersion graph of Draper and Lin’s small composite design for 5 factors (SCDDL5) (Draper and Lin 1990) with that of Hartley’s small composite design for 5 factors (SCDH5) (Hartley 1959). Hartley’s small composite design requires only 28 runs by utilizing a \(\frac{1}{2}\) fraction of the factorial portion of the design.

Table 2: Draper and Lin’s Small Composite Design for 5 Factors
Run	\(x_1\)	\(x_2\)	\(x_3\)	\(x_4\)	\(x_5\)
1	1	-1	1	1	1
2	1	1	-1	-1	-1
3	-1	1	1	-1	1
4	1	-1	1	-1	1
5	1	1	-1	1	1
6	1	1	1	-1	-1
7	-1	1	1	1	-1
8	-1	-1	1	1	-1
9	-1	-1	-1	-1	1
10	1	-1	-1	1	-1
11	-1	1	-1	1	1
12	-1	-1	-1	-1	-1
13	\(-\alpha\)	0	0	0	0
14	\(\alpha\)	0	0	0	0
15	0	\(-\alpha\)	0	0	0
16	0	\(\alpha\)	0	0	0
17	0	0	\(-\alpha\)	0	0
18	0	0	\(\alpha\)	0	0
19	0	0	0	\(-\alpha\)	0
20	0	0	0	\(\alpha\)	0
21	0	0	0	0	\(-\alpha\)
22	0	0	0	0	\(\alpha\)
23	0	0	0	0	0

Although Draper and Lin’s design (shown in Table 2 with \(\alpha=1.86121\)) further reduces the number of runs to 23 by substituting a 12-run Plackett-Burman design in the factorial portion, its variance dispersion graph reveals that the variance of a predicted value is not nearly as uniform as it is for Hartley’s design.

> data(SCDH5)
> data(SCDDL5)
> Compare2Vdg("Hartley's Small Composite-5 fac",SCDH5,
+ "Draper and Lin's Small Composite-5 fac",SCDDL5)

As seen in Figure 4, Hartley’s small composite design for 5 factors is rotatable since the three blue curves for the max, min and average scaled prediction variance coincide. The scaled variance of a predicted value for Hartley’s design is near the minimum scaled variance of a predicted value for Draper and Lin’s design throughout the experimental region.

6 Acknowledgement

I would like to thank the editor and two reviewers for helpful suggestions that improved this article and the package Vdgraph.

This article is converted from a Legacy LaTeX article using the texor package. The pdf version is the official version. To report a problem with the html, refer to CONTRIBUTE on the R Journal homepage.
