Exploring Interaction Effects in Two-Factor Studies using the hiddenf Package in R

In crossed, two-factor studies with one observation per factor-level combination, interaction effects between factors can be hard to detect and can make the choice of a suitable statistical model difficult. This article describes hiddenf, an R package that enables users to quantify and characterize a certain form of interaction in two-factor layouts. When effects of one factor (a) fall into two groups depending on the level of another factor, and (b) are constant within these groups, the interaction pattern is deemed "hidden additivity" because within groups, the effects of the two factors are additive, while between groups the factors are allowed to interact. The hiddenf software can be used to estimate, test, and report an appropriate factorial effects model corresponding to hidden additivity, which is intermediate between the unavailable full factorial model and the overly-simplistic additive model. Further, the software also conducts five statistical tests for interaction proposed between 1949 and 2014. A collection of 17 datasets is used for illustration.


Introduction
The crossed, two-factor layout with one observation per factor-level combination is a simple and widely used plan.These designs arise in the context of (a) completely randomized two-factor experiments, (b) randomized complete block experiments that block on a source of nuisance variability, and (c) observational studies with two factors.If the effects of the two factors do not interact, then this design permits inference for both factors simultaneously.The following model is commonly used for data collected using these designs: where i = 1, . . ., c and j = 1, . . ., r are indices for levels of the two factors C and R, µ represents the overall mean of the outcome y ij , and the τ i and β j terms quantify the effects of factors C and R. The errors are frequently assumed independent and identically normally distributed, with constant error variance σ 2 .Since data from these designs are easy to display using two-way tables, frequently the levels of factors R and C are called "rows" and "columns," respectively.In this report (and the hiddenf manual), Factor R is referred to as the "row" or "grouped" factor, while factor C is called the "column" factor.
Model (1) assumes that the effect of the row factor on the response y ij does not depend on the level of the column factor, and vice versa.In statistical parlance this is known as the assumption of "additivity." This assumption is sometimes made out of necessity as the alternative of including the interaction term from the full factorial effects model precludes statistical inference.After computing the interaction mean square, zero degrees of freedom remain with which to estimate the error variance, σ 2 .Data arising from (1) are shown in Figure 1 panel A.
If the effect of factor C varies across levels of the factor R, then there is statistical interaction.In this case (1) is misspecified.Graphical assessment of additivity is commonly made using an interaction plot.Figure 1 panel B is an interaction plot for the cjejuni.mtxdata set included in the hiddenf package (Qiu, 2013).
Since the data in Figure 1 panel B do not exhibit parallelism, there is graphical evidence to suggest that (1) may not be a suitable model.This is problematic because inferences conducted under (1) will be incorrect if (1) fails to describe the true nature of the variables under investigation.Fortunately, many methods have been devised to assess non-additivity in these designs, including those in Tukey (1949), Anscombe and Tukey (1963), Mandel (1961), Mandel (1971), Johnson and Graybill (1972), Hirotsu (1982), Kharrati-Kopaei and Sadooghi-Alvandi (2007), Tusell (1990), Boik (1993a), Franck et al. (2013), and Malik et al. (2014).Alin and Kurt (2006) provide a review of methods available up to 2006.We return to the analysis of these and other data in subsequent sections of the paper.

The hiddenf package
This paper describes the R package hiddenf, which makes available several tools for the analysis of non-additivity in unreplicated two-way layouts.This package contributes to available computational resources by (a) providing statistical and graphical diagnostic tools within the context of hidden additivity that enable the user to go beyond overall tests of additivity towards the inferential goal of characterizing and quantifying the magnitude of interaction, and (b) providing p-value computations for five methods to detect non-additivity.Supported tests include those proposed in Tukey (1949), Mandel (1961), Kharrati-Kopaei and Sadooghi-Alvandi (2007), Franck et al. (2013), andMalik et al. (2014).The latter three are newly available in an open-source repository via the hiddenf package, which is available from the Comprehensive R Archive Network (CRAN).
Hidden additivity occurs when the levels of factor R fall into a smaller number of groups, such that within these groups the effect of factor C is constant across levels of factor R, but there is C × group interaction (Franck et al., 2013).Group membership of levels of R can be regarded as latent, unobservable random variables.Inspection of Figure 1 panel B provides an example.If one group is formed from the two lines (processing plants) that reach their peak in year 3, and another from the two lines that reach their peak in year 4, then the year and plant effects are roughly additive within these groups.
The HiddenF() function accepts an r × c data matrix as input and returns an object of the HiddenF class.Objects of this class store the p-value from the all-configurations maximum interaction F (ACMIF) test (Franck et al., 2013) for hidden additivity, the number of configurations under consideration, the configuration that achieves the maximum interaction F score (or maximum hidden additivity), and the data as a list.Several generic functions, including print(), anova(), summary() and plot(), can be applied to objects of the HiddenF class to produce output useful for the quantification and characterization of interaction.By default, the HiddenF()  The remainder of the paper is organized as follows.The C. jejuni data are first described as a relevant example.The following section reviews the testing procedure for hidden additivity.Functionality to detect hidden additivity using hiddenf is then detailed.Four other supported methods to detect non-additivity are then reviewed.Seventeen data sets are described and explored using the methods supported by hiddenf.The article concludes with a summary.

Example
We now return to the data from Figure 1 panel B. The vertical axis in this plot is the proportion of disease-resistant bacteria samples of the C. jejuni strain taken from turkey processing facilities.These data were collected over a five year period across four processing plants in North Carolina.Each line corresponds to a plant, and the year labels correspond to 2008-2012.The matrix cjejuni.mtxcontains the fractions of bacteria that are classified as the strain C. jejuni.

Testing for hidden additivity
Hidden additivity is a useful concept for the analysis of many types of data because the structure is easy to visualize and interpret.The ACMIF test (Franck et al., 2013) to detect hidden additivity assigns classical significance testing-based p-values.A combined testing and plotting approach allows the researcher to interpret the way in which column effects change across groups of the rows.
The ACMIF test works by (1) considering each placement of factor R levels into two nonempty groups, (2) testing factor C × group interaction, and (3) considering the test that provides the most evidence of interaction (and greatest additivity of C and R within groups) after correcting for multiplicity using a Bonferroni adjustment.Simulation suggests that the Bonferroni correction is not overly conservative for r ≤ 7 despite the magnitude of the correction factor 2 r−1 − 1, because the Bonferroni adjusted thresholds are remarkably similar to simulated critical thresholds for a variety of c and r values (Franck et al., 2013).If g is an index for the latent group membership such that g = 1, 2, then the factorial effects model corresponding to the hidden additivity structure is given by where µ is the overall mean, τ i and γ g are the main effects of factor C and the grouping variable, (τγ) ig represents factor C × group interaction, and β j(g) is the effect of factor R, nested within group.The ACMIF test is based on the largest F-ratio corresponding to H 0 : (τγ) ig = 0 among all assignments of R factor levels into two non-empty groups.

Characterizing hidden additivity
Objects in the HiddenF class can be used to characterize hidden additivity further with application of the generic plot(), print(), summary(), and anova() functions.The following code produces an enhanced interaction plot to help visualize hidden additivity (Figure 3): > cjejuni.out<-HiddenF(cjejuni.mtx)> plot(cjejuni.out,main="Hidden Additivity of time effects across plants", + rfactor="plant", cfactor="year", colorvec=c("blue", "red"), lwd=3, legendx=TRUE) The enhanced interaction plot in Figure 3 displays levels of factor C (year) on the horizontal axis, levels of factor R (plant) using lines that are visually distinguished by line type, and the outcome values on the vertical axis.Color is used to display which levels of factor R are assigned to which groups based on the maximal interaction F statistic among all possible configurations and (2).Note that the grouping colors appear whether or not the ACMIF p-value (test for C × group) to detect hidden additivity is significant.The default colors are black and red for the two groups, but they can be customized by supplying an argument called colorvec which is a vector of length two giving color names.Another optional argument, legendx, may be set to TRUE in order to provide a legend whose location will be determined according to where on the plot the user clicks.
Because of the ellipsis(. . .), arguments such as main (for a title) or lwd (for specifying line widths) can be passed to matplot or legend.The arguments lty, type, ylab, and xlab are utilized by the plot.HiddenF() function to produce the enhanced interaction plot and therefore are not available for customization by the user.
The print() function returns results from the ACMIF test.
> print(cjejuni.out) The ACMIF test for the hidden additivity form of interaction F=8.965 p-value =0.03309 df=4,8 (Bonferroni-adjusted for all 7 possible configurations) This output can also be obtained by typing the name of an object of class HiddenF into the console.The p-value above has been adjusted for multiplicity using the Bonferroni multiplier of 2 4−1 − 1 = 7.Additionally, the method argument in the print.HiddenF() function supports all of the methods supported by the hiddenf package discussed later.To see which of these seven possible configurations of rows in two non-empty groups leads to the greatest additivity within groups, use the generic summary() function, which returns information useful for analysis based on (2): The R Journal Vol.The output above includes the number of configurations under consideration, the smallest Bonferroni adjusted p-value among these, the identity of the rows that fall into each group and the means across levels of C for both groups.Each of these items is returned in a list (not shown).To produce an ANOVA table : > anova(cjejuni.out) The ACMIF test for the hidden additivity form of interaction Analysis of Variance For these data, the interaction between time (col) and plant group (group) accounts for more variability (43%) than any other term in the hidden additivity model.This partial coefficient of determination is computed by dividing the group × column sums of squares by the total sum of squares displayed above.The anova() function can also accommodate other tests for non-additivity by specifying the method argument as "KKSA", "Mandel" or "Tukey" (see Other Tests for Non-additivity section).
The R Journal Vol.8/1, Aug. 2016 ISSN 2073-4859 Levels of factor C can be grouped instead by transposing the input data matrix before applying the HiddenF() function.The output below suggests no statistical evidence for hidden additivity in the levels of factor C.
> cjejuni.trans.mtx<-t(cjejuni.mtx)> cjejuni.trans.out<-HiddenF(cjejuni.trans.mtx)> print(cjejuni.trans.out) The ACMIF test for the hidden additivity form of interaction F=3.63 p-value =0.8671 df=3,9 (Bonferroni-adjusted for all 15 possible configurations) Note that support for more than 20 rows is not yet included due to the exponential increase in computational demand as the number of grouped levels increases.Factors with more levels than this will be accommodated in future versions of the software.

Centering
The center option in the plot command allows the user to graph the hidden additivity plot with data centered at the row means to better see the concordance of rows with regard to column effects.This is a way of de-noising the plot.Figure 4 demonstrates this feature using data from Graybill (1954).These data will be further analyzed later.Inspection of the right panel of Figure 4 suggests that the third genotype appears to give inferior yields for rows in the red group (relative to the performance of other genotypes in these blocks), but does better for rows colored black.To produce this plot: > data(Graybill.mtx)> Graybill.out<-HiddenF(Graybill.

Hidden Additivity plot center=TRUE
Columns Factor y 1 2 3 4 Figure 4: Hidden additivity plots for the wheat yield data (Graybill, 1954).The left and right panels exhibit the uncentered and centered graphical options, respectively.

Other tests for non-additivity
The ACMIF approach Franck et al. (2013) to detect hidden additivity has been described above.The subsections below include code examples and briefly review the other four supported approaches to The R Journal Vol.8/1, Aug. 2016 ISSN 2073-4859 detect non-additivity in unreplicated studies.

One degree of freedom approach
The one degree of freedom test (Tukey, 1949) is the earliest and most widely-known approach.The following model for non-additivity corresponds to Tukey's procedure: A hypothesis test for this approach is based on the null hypothesis H 0 : ν = 0.Under the null, (3) reduces to (1).The one degree of freedom test arises by fitting the squared fitted values from the additive model as a model term, then assessing the partial F-test for statistical significance corresponding to this term.
The TukeyPvalue() function accepts an object of class HiddenF and returns a list that contains the p-value for the one degree of freedom test and an lm object that contains the model that includes the squared predictions from (1) as described above.The method argument in the anova.HiddenF()

Rows-linear approach
An extension of the one degree of freedom test is based on the following model (Mandel, 1961): where ij are i.i.d.N(0, σ 2 ).The test for rows-linear non-additivity is based on the null hypothesis H 0 : θ i = 0 for i = 1, . . ., t.Under this null, (4) reduces to (1).
By setting µ i = µ + τ i and θ i = (φ i − 1), the response variable y ij in (4) can be represented as an intercept µ i and slope φ i that depend on effect β j .Letting i and j index "rows" and "columns" respectively leads to the phrase "rows-linear" model, which was established later (see Alin and Kurt, 2006).
The ANOVA table sum of squares associated with the θ i term is The residual sum of squares has the following form: The test statistic is: The R Journal Vol.8/1, Aug. 2016 ISSN 2073-4859 The statistic F rowlin has an F distribution with (a − 1) numerator and (a − 1)(b − 2) denominator degrees of freedom under the null hypothesis.Construction of a columns-linear test is accomplished by defining a HiddenF object on the transposed data matrix and calling the MandelPvalue() in a similar fashion.
The MandelPvalue() function accepts an object of class HiddenF and returns a list that includes a p-value, sums of squares, F ratio, and degrees of freedom for the rows-linear test.The output below fails to detect interaction using Mandel's rows-linear test: > MandelPvalue(cjejuni.out) $pvalue [1] 0.9458807 $SumSq SSRow SSCol SSMandel SSE SSTot 0.036735000 0.187530000 0.009849526 0.245740474 0.479855000

Error mean square subtable comparison approach
The error mean square (MSE) subtable comparison approach (Kharrati-Kopaei and Sadooghi-Alvandi, 2007) forms an F statistic that is the ratio of residual sums of squares that arise from rows being placed into two groups, with each group containing at least two rows, Under the assumptions of additivity and homogeneous variance, F KKSA has an F distribution with (r 1 − 1)(c − 1) numerator and (r 2 − 1)(c − 1) denominator degrees of freedom under the null hypothesis, no matter which rows are placed into which group.This approach suggests non-additivity when error mean square values within the subtables are discrepant.
As with the ACMIF procedure, grouping is subjective and is accomplished by choosing one of the two factors upon which to form the groups, and then placing that factor's levels into two groups.Different groupings result in different p-values.Groups are typically chosen based on a priori knowledge, inspection of an interaction plot, or by screening several candidate groupings and applying multiplicity adjustment to the resulting p-values.The hiddenf package adopts the authors' (Kharrati-Kopaei and Sadooghi-Alvandi, 2007) suggestion of assessing the 2 r−1 − r − 1 possible groupings for factor R and Bonferroni adjusting the resulting p-values to achieve an α level test.
The KKSAPvalue() function accepts an object of class HiddenF and returns a list that contains the maximal F statistic among configurations, a Bonferroni adjusted p-value, a length r vector that indicates the groupings of levels of factor R that correspond to the calculated p-value, and numerator and denominator degrees of freedom for the test statistic.Data may be transposed in order to form subtables according to factor C. At significance level α = .05the output below does not reject the null hypothesis of additivity according to the MSE subtable comparison approach when grouping levels of factor R.

Residual clustering approach
A recent approach (Malik et al., 2014) considers non-additivity elicited due to cell values which are remote relative to the additive model.Not all cells are assumed to contribute to the interaction pattern.The method detects non-additivity by exploiting the large residuals that arise under this structure.
The R Journal Vol.8/1, Aug. 2016 ISSN 2073-4859 Execution of the residual clustering method proceeds as follows.First, residuals from ( 1) rij(1) = y ij − ŷij(1) are placed into 3 groups using k-means clustering (MacQueen, 1967) in one dimension.Then, a two degrees of freedom cluster effect is incorporated into the additive model: where κ k(ij) represents the effect of the kth cluster, k = 1, 2, 3.The subscript k(ij) indicates that the ijth residual is assigned to exactly one of the k clusters for i = 1, . . ., c and j = 1, . . ., r.Since the cluster term is suggested by the data, the partial F test corresponding to the cluster term does not exhibit a central F distribution under the null.Rather, the null distribution for this statistic may be approximated via Monte Carlo simulation for the purpose of conducting statistical inference.Critical thresholds for a variety of c and r are available (Malik et al., 2014).The hiddenf package computes pvalues and critical thresholds for user-supplied c and r using the functions MalikPvalue and MalikTab, respectively.By default N = 500 Monte Carlo replicates are used to approximate the p-value, which guarantees the standard error of the p-value SE(p − value) ≤ 0.023.
The MalikPvalue function accepts an object of class HiddenF and returns a Monte Carlo p-value, observed test statistic, and Monte Carlo sample size used in the computation.The k-means approach uses the method of Hartigan and Wong (1979) with 100 starting values per Monte Carlo iteration.The output below suggests that the C. jejuni data do not exhibit non-additivity of the form suggested by this method.This approach is invariant to data transposition.

The additivityPvalues function
The additivityPvalues function accepts an object of class HiddenF and returns p-values for each of the supported methods.

Data examples
Table 1 summarizes an investigation of 17 data sets using the methods supported by hiddenf.These examples are taken from bioinformatic, agricultural, industrial, and medical settings, and are all publicly available.To our knowledge, this is the largest collection and description of data that has been used to study non-additivity in the literature to date.Table 1 includes columns for references, labels, and p-values.Note that ACMIF c, columns-linear, and KKSA c are obtained by submitting the transposed data matrix to HiddenF().A brief description of each data set follows.
The "Liming" data arises from an experiment that explores the utility of seven types of blast furnace slags as liming material in agriculture on three types of soil (Carter et al., 1951).The data were analyzed in the context of non-additivity in Johnson and Graybill (1972) and Kharrati-Kopaei and Sadooghi-Alvandi (2007).The outcome variable is yields of corn in bushels per acre.
The R Journal Vol.8/1, Aug. 2016 ISSN 2073-4859 The "Penicillin" data set (Davies and Goldsmith, 1972) assesses variability between six samples of penicillin on 24 plates.The outcome variable is the diameter of the zone of inhibition.The data were analyzed for non-additivity in Malik et al. (2014).
The "Osmotic" data set (Kharrati-Kopaei and Sadooghi-Alvandi, 2007) explores five varieties of safflower grown in solutions with six osmotic potentials.Safflowers are an important crop due to their oil which can be used for cooking and their flowers which can be used as a substitute for saffron.The outcome variable is average root weight.
The "Fertilizer" example originally appeared in Ostle (1963) and was re-analyzed to detect nonadditivity in Kharrati-Kopaei and Sadooghi-Alvandi (2007).This example explores the ratio of dry to wet wheat in four blocks for four fertilizers.
The "Bottles" example was originally analyzed in Ott and Snee (1973) and also explored in Boik (1993b) and Kharrati-Kopaei and Sadooghi-Alvandi (2007).The study assesses the performance of a six-headed machine on five occasions.
The "Grain Yields" data measures the yield of a grain crop in bushels per acre for five levels of fertilizer on five blocks.The data come from Ostle (1963) and are re-analyzed in Kharrati-Kopaei and Sadooghi-Alvandi (2007).
The "Absorbance" data (Mandel, 1991) measures absorbancy of wood pulp of nine polysachharide concentrations from seven different laboratories.The data are also analyzed in Alin and Kurt (2006).
The "Permeability" example examines permeability of three sheets of building material on nine different days.The data appear in Hald (1952) and Giesbrecht and Gumpertz (2004).
The "Wool" data (Lentner and Bishop, 1993) examines four cleaning processes for wool from five different batches.The outcome is losses in weight in mg of the sample after cleaning.
The "Red Blood Cells" data measures the number of red blood cells counted by five doctors in ten counting chambers.The data appear in Biggs and Macmillan (1948) and were also analyzed in Boik (1993a).
The "Tukey" data set is an illustrative example that includes three rows and four columns from Tukey (1949).
The "Ethyl Alcohol" data (Osborne et al., 1913) measures the density of aqueous solutions of ethyl alcohol at six concentrations and seven temperatures.These data were also analyzed in Mandel (1971).
The "Insecticide" data (Ott and Longnecker, 2001) comes from a complete block design that measures the impact of four plots and three insecticides on the number of string bean seedlings that emerge.
The "Rubber" data (Mandel, 1961) contains data on stress measurements in Kg/cm 2 on seven types of natural rubber vulcanizates from 11 laboratories.
The "C. jejuni" data measures the fraction of bacteria found on disease-resistant turkeys (Qiu, 2013).These data were collected over a five year period across four turkey plants in North Carolina.These data are displayed in Figures 1 and 3.
The "Wheat" data (Graybill, 1954) measures wheat yields in bushels per acre in four varieties in 13 locations.These data are plotted in Figure 4.
The eight Copy number variation data sets "CNV1-CNV8" compare discrepancies in the copy number signal between normal and tumor tissue samples from a comparative genomic hybridization array.Each data set corresponds to a separate location in the genome that is tested with the assay.Six dogs each had two tissue samples (one normal and one tumor) upon which the assay was conducted.The eight specific sets were selected from 5899 sets that were analyzed in a previous study and because they showed evidence of hidden additivity (Franck et al., 2013).The hidden additivity exhibited in these copy number responses might suggest the existence of multiple tumor sub-types for lymphoma in The R Journal Vol.8/1, Aug. 2016 ISSN 2073-4859 dogs.The full data for this example may be accessed here: http://www.sciencedirect.com/science/article/pii/S0167947313001618.
The results of non-additivity tests using methods supported by the hiddenf package are reported in Table 1.The data come from 17 sources, and there are 24 total sets with the copy number variation contributing eight individual sets.This collection is not intended as a representative sample of the larger class of all unreplicated two-factor data, since many sets were included due to their apparent non-additivity.This exercise is intended to show the breadth of applications for which the study of non-additivity is important.To facilitate discussion, the standard p-value less than α = 0.05 criterion is used to declare statistical significance without further multiplicity correction, although the reader is invited to interpret the raw p-values however they like.The ACMIF test for hidden additivity was significant for 16 of 24 sets when grouping levels of factor R. The test was significant for eight of 16 sets when grouping levels of factor C. The Tukey test was significant for eight of 24 sets.The rows-linear and columns-linear tests showed significant non-additivity for seven and six sets (out of 16 possible), respectively.The KKSA test grouping levels of R and C revealed non-additivity for 13 of 22 and five of 13 sets, respectively.The Malik approach yielded significant non-additivity for nine of 24 sets.Since all tests assume different restrictions on the form of interaction, no single test is optimal for every conceivable pattern.Hence, differential performance among methods is expected for different types of data.Each test excels at detecting different patterns of non-additivity.

Summary
This paper describes the main functionality of the hiddenf package.hiddenf provides descriptive and inferential tools to visualize and characterize hidden additivity.Further, hiddenf includes the ability to compute p-values for five tests for non-additivity.The package is illustrated using seventeen data sets spanning studies in industrial applications, agriculture, and the medical sciences.Non-additivity is evident in many of these data sets, motivating the importance of statistical interaction in unreplicated studies across a variety of scientific domains.The hiddenf package contributes to existing software resources specifically by making three recent tests for non-additivity available in an open-source repository for the first time (in addition to two historical methods) and by providing visualization and descriptive tools to further explore hidden additivity.
The R Journal Vol.

Figure 1 :
Figure1: Interaction plots.Lines correspond to levels of factor R and tick marks on the horizontal axis correspond to levels of factor C. Panel A is simulated data from (1).These data do not exhibit statistical interaction so departure from parallel lines is due to random noise.Panel B is cjejuni.mtxdata, which visually exhibit non-additivity.

Figure 2 :
Figure 2: Flowchart of hiddenf functionality.The user supplies a r × c data matrix to the HiddenF() function, which returns an object of the HiddenF class.The user can then characterize hidden additivity and obtain p-values for tests of non-additivity.
function considers the row factor as the

Table Response
function can be used to produce an ANOVA table for this test.Since the p-value is large, the output below does not provide statistical evidence for non-additivity of the form suggested in (3).

Table 1 :
P-values for various tests of non-additivity supported by hiddenf.‡: r and/or c insufficient for analysis.: grouping on c > 20 for Penicillin not currently available in hiddenf.The Malik p-values employ N=100,000 Monte Carlo replicates.