Altopt: an R Package for Optimal Experimental Design of Accelerated Life Testing

The R package ALTopt has been developed with the aim of creating and evaluating optimal experimental designs of censored accelerated life tests (ALTs). This package takes the generalized linear model approach to ALT planning, because this approach can easily handle censoring plans and derive information matrices for evaluating designs. Three types of optimality criteria are considered: D-optimality for model parameter estimation, U-optimality for reliability prediction at a single use condition, and I-optimality for reliability prediction over a region of use conditions. The Weibull distribution is assumed for failure time data and more than one stress factor can be specified in the package. Several graphical evaluation tools are also provided for the comparison of different ALT test plans.


Introduction
Accelerated life testing (ALT) is commonly used for obtaining a product's failure time data by subjecting it to elevated stress conditions, such as temperature, humidity, and voltage.As a result, the product fails in a shorter time period than would be expected under normal stress conditions.The failure data obtained from ALTs can then be extrapolated to the normal use stress level to estimate the product's lifetime distribution.Nelson (2005a,b) provide a comprehensive review of ALT papers up to 2005.
To avoid poor experimental results and to obtain more accurate inference on acceleration model and on reliability prediction, it is necessary to have an effective ALT test plan.A well-designed ALT test plan often aims to achieve some statistical optimality.However, conventional experimental designs (e.g., factorial designs) are not effective as ALT test plans because of the following features of ALTs: • Extrapolation -Test stress levels are typically higher than the normal use stress levels.As failure time data will be collected at these higher stress levels, extrapolating them to the normal use stress level is needed for reliability prediction.Nonlinear relationships between failure time and stress levels are also expected.
• Non-normal distributions of failure times -The failure time distribution is typically positively skewed, e.g., the Weibull distribution.
• Censoring of failure time data -Censoring occurs when the exact failure times of test units are not observed.There are several reasons for censoring.In some cases, test units do not fail by the end of the test period, in which case the data becomes right-censored.In other cases, test units are are periodically inspected, so the only information available is the time interval of failure, while the exact failure time is unknown.The latter case is called interval-censoring.
In this article, we introduce an R package, ALTopt (Seo and Pan, 2015), that constructs optimal test plans for ALTs with right-and interval-censored data.This package is based on the work done by Monroe et al. (2011) and Yang and Pan (2013), where generalized linear models (GLMs) were used to model censored ALT data.

Optimal designs of ALT Optimality criteria
The ALTopt package accommodates three optimality criteria: D-optimality, U-optimality and Ioptimality.A D-optimal design minimizes the generalized variance of parameter estimates, while a U-optimal or I-optimal design minimizes, respectively, the (average) variance of response prediction at a single use condition or over a region of use conditions.
In the GLM context, the D-optimal design is defined by The R Journal Vol.XX/YY, AAAA 20ZZ ISSN 2073-4859 Let n be the number of test units and p be the number of of model parameters.Matrix X(ξ) is the n × p model matrix constructed by expanding a design matrix to include all regression terms in the chosen model form, and matrix W is the n × n diagonal matrix of weights that depends on the GLM formulation used.
Using the same notation, the U-optimal and I-optimal designs can be defined, respectively, as and where x use is the single use condition, Ω is the region of use conditions, and S Ω is the area of use region.
For GLMs, the weights W are functions of the regression coefficients in linear predictor.Therefore, the information matrix contains unknown model parameter values, implying that the choice of these unknown values also affects the optimal design (see Johnson and Montgomery (2009) for more details).
In this article, we assume that these parameter values are pre-specified.They are referred to as the planning values by Meeker and Escobar (1998).

GLMs for ALT
A function that links failure time and stress variables is needed in order to extrapolate the results obtained in the test region to the use region.The GLM formulation for ALT is built upon the Cox's proportional hazard (PH) assumption.This section provides the derivation of these formulations for right-censored and interval-censored ALT data.

The Cox's proportional hazard model
The PH model assumes that, given the vector of explanatory variables x, the hazard function of failure time is given by h where h 0 (t) is called the baseline hazard function and β is a vector of regression coefficients.Note that the baseline hazard is a function of time only.From Eq. ( 1) we can derive that where H(t, x) is the cumulative hazard function and H 0 (t) is the baseline cumulative hazard function.
It is also easy to show that a reliability function is given by where R(t, x) is the reliability function and R 0 (t) = exp(−H 0 (t)) is the baseline reliability function.
The baseline hazard function of a Weibull distribution is given by h 0 (t) = λ 0 αt α−1 , where λ 0 is called the intrinsic failure rate and α is the shape parameter of Weibull distribution.By Eq. ( 1), the hazard function of Weibull distribution can be expressed as h(t, x; β) = λ 0 αt α−1 e x β and, by Eq. ( 2), its cumulative hazard function is as H(t, x) = λ 0 t α e x β .

GLM for right-censored failure time data
With the proportional hazard assumption, the failure density function is given by For a failure time data set that includes right-censored survival times, each observation can be expressed as a pair (t i , c i ), i = 1, 2, . . ., n, where t i is either a failure time or censoring time and c i is an indicator variable, which is 1 if the i th unit failed and 0 if it has not failed.Thus, the likelihood function is given by The R Journal Vol.XX/YY, AAAA 20ZZ ISSN 2073-4859 From Eqs.
(1) and (3) the log-likelihood function can be written as Note that the last two terms of the right hand side of Eq. ( 4) is the same as the kernel of the log-likelihood function of n independent Poisson distributed random variable with mean µ i .The first two terms do not depend on the parameter β.Therefore, the maximum likelihood estimator β of ( 4) is similar to the estimator that maximizes the log-likelihood function of Poisson distributions.If the indicator variable c i is treated as from a Poisson distribution with mean µ i , then the GLM formulation becomes • The response variables, c i 's, are independently sampled from Poisson(µ i ); • The linear predictor is η i = x i β; • The link function is given by ln µ i = η i + an offset term.This offset term in the link function is ln H 0 (t i ), the log transformation of the baseline cumulative hazard function.This GLM formulation is applicable for any failure time distribution with rightcensored data, as long as the PH assumption holds.
Since the log link function is the canonical link function for Poisson distribution, the asymptotic variance-covariance matrix of β is given by where W = diag σ 2 i and σ 2 i is the variance of Poisson distribution; i.e., σ 2 i = µ i = e x i β H 0 (t i ).We replace t i with its expectation, which is given by Ref. Monroe et al. (2011) has shown that where Φ is the failure time distribution and t c is the censoring time.
For a Weibull distribution with the known shape parameter α, it follows that where β 0 = ln λ 0 which plays a role of the intercept term in the linear predictor.

GLM for interval-censored failure time data
For interval-censored data, the whole testing period is divided into multiple time intervals such as ∞), and failures are expected to occur within one of these intervals.Define the failure probability of the i th test unit within the j th interval to be and the conditional probability of surviving at the beginning of the j th interval but failing within the j th interval as Define an indicator variable, r, such that r ij = 0 if the i th test unit does not fail within the j th interval and r ij = 1 if the i th test unit fails.Suppose that there are n items, then the number of observations is n × (k + 1) observations.For example, The R Journal Vol.XX/YY, AAAA 20ZZ ISSN 2073-4859 The likelihood function can be expressed as which is equivalent to where . Therefore, s ij = 1 if the failure of the i th test unit occurs at a time after the j th interval and s ij = 0 if the failure of the i th test unit occurs within or before the j th interval.
The likelihood function of Eq. ( 5) has the same form as the likelihood function of independent binomial random variables.We treat r ij as a binomial random variable with probability π ij and sample size m ij = r ij + s ij .The data set can be presented as a series of quadruplets: By Eq. ( 3) it becomes .
Applying the natural logarithm function twice yields The second term of the right hand side of Eq. ( 7) does not depend on the regression coefficient β, thus Eq. ( 7) is a complementary log-log link function with an offset term.We can treat r ij 's as independent random variables that follow a binomial distribution with the probability parameter π ij and sample size m ij , and the GLM formulation is written as • The response variables, r ij 's, are distributed as independent Binomial(m ij , π ij ); • The linear predictor is η i = x i β; • The link function is given by ln − ln(1 − π ij ) = η i + an offset term.
Since the log-log link is not a canonical link for the binomial distribution, we need to introduce ∆ = diag {dθ i /dη i } in the weight matrix where θ i is the natural location parameter of the binomial distribution; i.e., Then, the asymptotic variance-covariance matrix of β is given by where X * (ξ) = X(ξ) ⊗ 1 k+1 and V = diag{σ 2 ij }.Note that, instead of using X(ξ), the original model matrix, X * (ξ), which is a matrix of size n(k + 1) × p, is used.Each row of X(ξ) is repeated (k + 1) times in X * (ξ) because each test unit has (k + 1) intervals.
The R Journal Vol.XX/YY, AAAA 20ZZ ISSN 2073-4859 In a binomial distribution, , which includes the random variable m ij .Replacing m ij with its expectation, the weight matrix becomes Assuming a Weibull distribution for a product's lifetime, we have R(t, x) = exp(−H(t, x)) = exp(−λ 0 t α e x β ) = e −t α e β 0 +x β .Substituting it into (6) yields Assume all time intervals have the same length, ∆t.Then, We also have Substituting Eq. ( 9) and Eq. ( 10) into Eq.( 8) yields the weights matrix for the interval-censored Weibull failure time data.

Introduction to the package ALTopt
The main purpose of the ALTopt package is to construct D-, U-, and I-optimal ALT test plans.Two main functions, altopt.rcand altopt.ic,are developed respectively for the right-censoring and interval-censoring cases.The following assumptions are required for using this package: • Failure time data follows the Weibull distribution where the shape parameter is specified by the user.
• Log-linear functions are used to model the relationship between a failure time distribution parameter and the stress factors.
• For interval-censored data, all the intervals have the same length.
Lastly, the package can accommodate many stress factors, but 5 or fewer is recommended for computational efficiency.
This package also provides two functions for evaluating existing test plans -alteval.rcand alteval.ic.These functions can be used for comparing test plans generated by this ALTopt package or any other methods.Graphical displays of prediction variance are made available.These plotting features enhance the usefulness of this package for comparing and selecting test plans.
• N -The number of test units.
• nf -The number of stress factors.
• alpha -The value of the shape parameter of Weibull distribution.
• formula -The object of class 'formula' expressing the linear predictor model.
• coef -The numeric vector containing the coefficients of each term in formula.
• useCond -The numeric vector of use condition.It should be provided when optType is "U".The length of the vector should be same as the number of stress factors.
• useLower -The numeric vector of lower bound of use region in coded unit.It should be provided when optType is "I".The length of the vector should be the same as the number of stress factors.
• useUpper -The numeric vector of upper bound of use region in coded unit.It should be provided when optType is "I".The length of the vector should be the same as the number of stress factors.
• nOpt -The number of repetitions of optimization processes.The default value is 1.If it is larger than 1, each optimization process starts from randomly chosen design points in the design region and each solution may slightly differ.The output shows the best solution overall.
• nKM -The number of repetitions of k-means clustering, which is used to generate the optimal design clustered by kmeans.The default value is 30.
• nCls -The number of clusters used for k-means clustering.If not specified, it is set as the number of parameters in the linear predictor model.
We use the function stats::optim with the "L-BFGS-B" method to perform optimization.This function allows box constraints on design variables.In our case, we have a cuboidal design region where the levels of each stress factor are coded to be between 0 and 1.More details about the "L-BFGS-B" method are available in Byrd et al. (1995).
The output of these functions are given as a list with the following components: • call -The matched call.
• opt.design.rounded-The optimal design clustered by rounding.
• opt.value.rounded-The objective function value of opt.design.rounded.
• opt.design.kmeans-The optimal design clustered by k-means algorithm.
The procedure begins by generating an initial test plan with N design points, which are randomly selected from possible points in the design region.For example, if we have 100 test units and 2 stress factors the optimization process begins from 100 randomly chosen initial points, which spread out over the design region.Throughout the optimization procedure, each of these 100 points converges to its own optimal location.to create a practical test plan, it is sometimes necessary to enforce some clustering procedure to reduce the number of distinct design points.Two clustering methods are implemented in the package.When the design points are very close, the simple method of rounding (to the 3 rd decimal place) the stress values is effective and straightforward.When there are still too many design points, the alternative is to use k-means clustering, which requires to specify the number of clusters, nCls.By carefully selecting the number of clusters, it is possible to reduce the number of distinct design points without significantly affecting the value of the objective function.The final recommended test plans are provided by the elements, opt.design.roundedor opt.design.kmeans,presented by a table containing each design point location and the number of test units allocated at each design point.The corresponding objective function values of these plans are also shown from opt.value.roundedand opt.value.kmeans.
The R Journal Vol.XX/YY, AAAA 20ZZ ISSN 2073-4859 The design.plotfunction displays the recommended test plan as a bubble plot on a two-dimensional design region of user-specified stress factors.The size of each bubble represents the relative size of test unit allocations.The syntax and arguments of design.plotare as follows: design.plot(design,xAxis, yAxis) • design -A data frame containing the coordinates and the test unit allocation at each design point.
The elements, opt.design.roundedor opt.design.kmeans, of an output created by altopt.rcor altopt.iccan be given for this argument directly and any other design with the same form of those can also be given.
• xAxis -The name of the factor to be displayed in x axis.
• yAxis -The name of the factor to be displayed in y axis.

Evaluating ALT test plans
This package provides several methods to evaluate an ALT test plan.The first method is the numerical evaluation of a given test plan using alteval.rcor alteval.ic.These functions return the value of the objective function of the test plan.The syntax are as follows: alteval.rc (designTable, optType, tc, nf, alpha, formula, coef, useCond, useLower, useUpper) alteval.ic(designTable, optType, t, k, nf, alpha, formula, coef, useCond, useLower, useUpper) Similar to the design.plotfunction, the existing test plan is specified in the argument designTable.The other arguments of alteval.rc and alteval.icare similar to the arguments in altopt.rcand altopt.ic.
ALTopt also provides three different graphs for evaluating a test plan -the prediction variance (PV) contour plot, the fraction of use space (FUS) plot, and the variance dispersion of use space (VDUS) plot.These graphical tools are useful when visualizing the prediction variance throughout the entire use-space region (Myers et al., 2009, chap. 8).The PV contour plot displays the contours of the estimated prediction variance from the design region to the use region of a two-dimensional userspecified stress factor space.Functions pv.contour.rcand pv.contour.icgenerate the PV contour plot of ALT test plan with right and interval censoring, respectively.The FUS plot is an extension of the fraction of design space (FDS) proposed by Zahran et al. (2003).The vertical axis of FUS plot is the fraction of the use space region that has prediction variance less than or equal to the given values in the horizontal axis.Functions pv.fus.rc and pv.fus.iccreate the FUS plot of right and interval censoring ALT plans, respectively.In addition, the FUS curves of multiple designs can be overlaid on one graph by using compare.fus,so these designs can be compared graphically.The VDUS plot is an extension of the variance dispersion graphs (VDGs) of Giovannitti-Jensen and Myers (1989) to the cuboidal use space region.It shows plots of minimum, average and maximum prediction variance from the center to the boundary of the use region.The comparison of multiple VDUS is also available through compare.vdus.The syntax and arguments of these functions are omitted here, because they are similar to previously described functions.

An example with two stress factors and right censoring
In this section, we demonstrate the use of ALTopt using the right-censored ALT data set from Yang and Pan (2013).In this experiment, an ALT of 100 test units is conducted with two stress factorstemperature and humidity.The lowest and highest stress levels in the test region are (60 °C, 60 %) and (110 °C, 90 %), respectively.The normal use condition is (30 °C, 25 %), while the typical use region has the range from (20 °C, 20 %) to (40 °C, 30 %).The natural stress variables of these two factors are defined by S 1 = 11605/T, where T is the temperature in degrees Kelvin (i.e., temp °C+ 273.15), and S 2 = ln(h), where h is the relative humidity.These values are assigned to the following variables: The R Journal Vol.XX/YY, AAAA 20ZZ ISSN 2073-4859 Next, we apply a coding scheme on these natural variables so that the highest stress level becomes (0, 0) and the lowest stress level becomes (1, 1).
Here, x 1 and x 2 are the coded stress variables of S 1 and S 2 , respectively.The ALTopt package provides a utility function, convert.stress.level, to convert the natural stress level to the coded stress level, and vice versa.The use condition and the use stress region are accordingly coded as follows: R> library(ALTopt) R> (useCond <-as.We assume that the failure times follow an exponential distribution, i.e., alpha = 1, and the prespecified linear predictor is given by Suppose the total testing time is 30 time units.The D-optimal test plan is generated by the following lines of code:  While the formula does not include the intercept term explicitly, the value of the intercept parameter still needs to be specified (in this case, it is 0).From the final design output, we noticed that the designs generated by rounding and clustering are almost the same.
We can also generate the U-optimal and I-optimal designs using the following lines of code: The R Journal Vol.XX/YY, AAAA 20ZZ ISSN 2073-4859 From Figure 1, one can see that the Uand I-optimal test plans resemble each other, while the Doptimal test plan is very different from the other two.This is expected because the objective functions of Uand I-optimal designs involve the variance of reliability prediction, while the D-optimal design involves the variance of parameter estimation.From the Uand I-optimal test plans, it is noticeable that a large number of test units is allocated at the lowest stress level.This type of test unit allocation scheme is common in ALTs (e.g., Meeker and Nelson, 1975).
To compare the D-optimal test plan and the U-optimal test plan, the pv.contour.rcfunction generates the contour plot of prediction variance using the following lines of code: Figure 2 shows that the U-optimal test plan has lower prediction variance than the D-optimal test plan at the normal use condition.The FUS and VDUS plots can also be used for further comparison of these test plans.These plots are shown in Figure 3.

FUS of list(fusDR, fusUR)
Fraction of Use Space Prediction Variance  compare.vdus(vdusDR, vdusUR) Figure 3 shows that the U-optimal test plan performs better in majority of the use-space region with respect to the prediction variance.
Finally, the convert.stress.levelfunction is useful for converting the U-optimal test plan to the natural stress level conditions.R> convert.stress.level(NdesLow, NdesHigh, stand

Figure 1 :
Figure 1: Design plots of (a) D-optimal, (b) U-optimal and (c) I-optimal designs with right censoring drawn by design.plotfunction.

Figure 2 :
Figure 2: Prediction variance contour plots of (a) D-optimal and (b) U-optimal designs with right censoring drawn by pv.contour.rcfunction.

Figure 3 :
Figure 3: Comparison of D-optimal and U-optimal designs with right censoring using (a) FUS plot and (b) VDUS plot.