The R Package trafo for Transforming Linear Regression Models

Researchers and data analysts often use the linear regression model for descriptive, predictive, and inferential purposes. This model relies on a set of assumptions that, when not satisfied, yield biased results and noisy estimates. This common problem can be addressed in many ways: less restrictive methods (e.g. generalized linear regression models or non-parametric methods), variance corrections, or transformations of the response variable, to name a few. We focus on the latter option as it allows the analyst to keep using the simple and well-known linear regression model. The list of transformations proposed in the literature is long and varies according to the problem they aim to solve. Such diversity can leave analysts lost and confused. We provide a framework implemented as an R package, trafo, to help select a suitable transformation depending on the needs of the user and the data. The package trafo contains a collection of selected transformations and estimation methods that complement and enlarge the methods that exist in R so far.


Introduction
To study the relation between two or more variables, the linear regression model is one of the most employed statistical methods. For an appropriate usage of this model, a set of assumptions needs to be fulfilled. These assumptions are, among others, related to the functional form and to the error terms, such as linearity and homoscedasticity. However, in practical applications, these assumptions are not always satisfied. This leads to the question of how to move on with the analysis in such cases. One way to proceed is to conduct the analysis ignoring the model assumption violations, which is, of course, not recommended as it would likely yield misleading results. An alternative solution is to use more complex methods such as generalized linear regression models or non-parametric methods, as they might fit the data and problem better. A third method, and the focus of the present work, is the application of suitable transformations. Throughout the current manuscript, we use the term transformations to refer to the application of monotonic functions to the response variable of a linear regression model. For more flexible transformation functions, see e.g. Hothorn et al. (2018).
Transformations have the potential to correct certain violations of model assumptions and, by doing so, allow an analysis to continue with the linear regression model. Due to their convenience, transformations such as the logarithm or the Box-Cox transformation (Box and Cox, 1964) are commonly applied in many branches of science, for example in economics (Hossain, 2011) and neuroscience (Morozova et al., 2016). In order to simplify the choice and the usage of transformations in the linear regression model, the R package trafo (Medina et al., 2018) was developed. The present work is inspired by the framework proposed in Rojas-Perilla (2018, pp. 9-45) and extends other existing R packages that provide transformations.
Many packages that contain transformations do not specifically focus on the use of transformations (Venables and Ripley, 2002; Fox and Weisberg, 2011; Molina and Marhuenda, 2015; Ribeiro Jr. and Diggle, 2016). They often only include widely used transformations like the logarithmic or the Box-Cox transformation family. The package car (Fox and Weisberg, 2011) expands the selection of transformations; it includes the Box-Cox, the Tukey (Tukey, 1957), and the Yeo-Johnson (Yeo and Johnson, 2000) transformation families, and uses the maximum likelihood approach for the estimation of the transformation parameter (Box and Cox, 1964). The package rcompanion (Mangiafico, 2019) focuses on the Tukey transformation with estimation via goodness of fit tests. In addition to the logarithm and Box-Cox, the package bestNormalize (Peterson, 2019) also includes the glog (see e.g. Durbin et al., 2002) and Yeo-Johnson transformations. An exponential transformation proposed by Manly (1976) is provided in the package caret (Kuhn, 2008) and the multiple parameter Johnson transformation (Johnson, 1949) in the packages Johnson (Fernandez, 2014) and jtrans (Wang, 2015). While the packages MASS (Venables and Ripley, 2002) and car (Fox and Weisberg, 2011) solely provide the maximum likelihood approach for the estimation of the transformation parameter for the Box-Cox family, the package AID (Dag et al., 2017) includes a wide range of methods, mostly based on goodness of fit tests like the Shapiro-Wilk or the Anderson-Darling test, although the use of these methods is limited to the Box-Cox transformation. For a summary of the various transformations available in R packages, please see Table 1.
It is noticeable that most of the above-mentioned packages do not help the user in the process of deciding which transformation is actually suitable according to the user's needs. Furthermore, most packages do not provide tools to "eyeball" whether the employed transformation improves the data with regard to fulfilling the model assumptions. Package trafo combines and extends the features provided by the packages mentioned above. In addition to transformations that are already provided by existing packages, the trafo package includes, among others, the Bickel-Doksum (Bickel and Doksum, 1981), modulus (John and Draper, 1980), neglog (Whittaker et al., 2005) and glog (see e.g. Durbin et al., 2002) transformations, which are modifications of the Box-Cox and the logarithmic transformation designed to deal with negative values in the response variable. The selection of estimation methods for the transformation parameter is enlarged by methods based on moments and divergence measures (see e.g. Taylor, 1985; Yeo and Johnson, 2000; Royston et al., 2011). The main benefits of the package trafo can be summarized as follows:
• An initial check can be conducted that helps to decide if and which transformation is useful for the researcher's needs.
• The untransformed model and a model with a transformed dependent variable can be easily compared in the light of the model assumptions (more on this below). Alternatively, two transformed models can be run and compared simultaneously.
• Extensive diagnostics are provided in order to check if the transformation helps to fulfill the model assumptions of normality, homoscedasticity, and linearity.

Transformations and estimation methods
The equation describing and summarizing the relationship between a continuous outcome variable y and different covariates x (either categorical or continuous) is defined by $y_i = x_i^T \beta + e_i$, with $i = 1, \ldots, n$. This is also known as the linear regression model and is composed of a deterministic and a random component, which rely on different assumptions. Among others, these assumptions can be summarized as follows:
• Normality (N): The conditional distribution of y given x follows a normal distribution. This is an optional, but often desired assumption (e.g. Box and Cox (1964)).
• Homoscedasticity (H): The conditional variance of y given x is constant.
• Linearity (L): The conditional expectation of the outcome variable y given the continuous covariates x is a linear function in x.
As already mentioned, different approaches have been proposed to overcome violations of these model assumptions. Some of them include alternative estimation methods of the regression terms or more complex regression models (see e.g. Nelder and Wedderburn, 1972; Berry, 1993). In the present manuscript, we focus on defining a parsimonious modification of the model, namely the usage of non-linear transformations of the outcome variable. The transformations implemented in the package trafo particularly help to achieve normality. However, most of them simultaneously correct other assumptions (see also Table 2 and Table 3). We classify transformations in two groups: non-parametric transformations and data-driven transformations with a transformation parameter that needs to be estimated. The first set of transformations presented in Table 2 comprises, among others, the logarithmic transformation, which is considered due to its popularity and straightforward application. The data-driven transformations presented in Table 3 are dominated by the Box-Cox transformation and its modifications or alternatives, e.g. the modulus or Bickel-Doksum transformation. More flexible versions of the logarithmic transformation, such as the log-shift opt, or the Manly transformation, an exponential transformation, are also included in the package trafo.

Table 2: Non-parametric transformations.
Transformation | Source | Formula | Support
Log (shift) | Box and Cox (1964) | $\log(y + s)$ | $y \in (-s, \infty)$
Glog | Rocke and Durbin (2001); Durbin et al. (2002); Huber et al. (2002, 2003) | $\log(y + \sqrt{y^2 + 1})$ | $y \in \mathbb{R}$
Neglog | Whittaker et al. (2005) | $\mathrm{Sign}(y) \log(|y| + 1)$ | $y \in \mathbb{R}$
Reciprocal | Tukey (1957) | $1/y$ | $y \neq 0$

Table 3 (excerpt): Data-driven transformations. Bickel-Doksum (Bickel and Doksum, 1981); square root (shift) (Medina et al., 2018): $\sqrt{y + \lambda}$, $y \in \mathbb{R}$; Manly (Manly, 1976): $(e^{\lambda y} - 1)/\lambda$; modulus (John and Draper, 1980): $y \in \mathbb{R}$.

Tables 2 and 3 provide information about the range of y that is supported by each transformation. Some transformations are only suitable for positive values of y. This is generally true for the logarithmic and Box-Cox transformations. However, if the dependent variable contains negative values, the values are shifted by default in package trafo by a deterministic shift s such that y + s > 0. Furthermore, the tables highlight which assumptions each transformation helps to achieve. Note that these are general suggestions and that the benefits of transformations depend on the data. For specific properties of each transformation we refer to the cited references. The square root shift transformation with a data-driven shift, in analogy to the log-shift opt transformation, is, to the best of our knowledge, first implemented in this work. In contrast, a square root transformation with deterministic shift, for example, is suggested in Bartlett (1947).
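The formulas of Table 2 can be written down directly as R functions. The following is only a minimal sketch: the function names are hypothetical and are not the functions exported by trafo, and the shift `s = -min(y) + 1` is an illustrative choice rather than the package default.

```r
# Non-parametric transformations from Table 2 as plain R functions.
# Names are hypothetical; this is not the trafo implementation.
log_shift  <- function(y, s = 0) log(y + s)            # requires y + s > 0
glog       <- function(y) log(y + sqrt(y^2 + 1))       # defined for all real y
neglog     <- function(y) sign(y) * log(abs(y) + 1)    # defined for all real y
reciprocal <- function(y) 1 / y                        # requires y != 0

y <- c(-3, -0.5, 0.5, 2, 10)

# Illustrative deterministic shift so that y + s > 0 for every observation.
s <- -min(y) + 1
shifted <- log_shift(y, s)

# glog coincides with the inverse hyperbolic sine.
glog_matches_asinh <- isTRUE(all.equal(glog(y), asinh(y)))

# neglog is an odd function: neglog(-y) equals -neglog(y).
neglog_is_odd <- isTRUE(all.equal(neglog(-y), -neglog(y)))
```

Both glog and neglog accept negative responses without a shift, which is why they are listed with support on the whole real line.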
Below, we summarize the collection of methods included in trafo to estimate the parameters of the transformations presented in Table 3. The benefit of each estimation method depends on the research analysis and the underlying data.
• Maximum likelihood theory (Box and Cox, 1964)
• Distribution moments optimization: skewness or kurtosis (Carroll and Ruppert, 1987; Royston et al., 2011; Rojas-Perilla, 2018)
• Divergence minimization: following Kolmogorov-Smirnov (KS), Cramér-von-Mises (KM) or Kullback-Leibler (KL) measurements (Cramér, 1928; Kolmogorov, 1933; Smirnov, 1939; Kullback and Leibler, 1951; Hernandez and Johnson, 1980; Rojas-Perilla, 2018)
The maximum likelihood estimation method finds the set of values for the transformation parameter that maximizes the likelihood function of the dataset under the selected transformation (Box and Cox, 1964). This is a standard approach that is also implemented in several of the mentioned R packages (Venables and Ripley, 2002; Fox and Weisberg, 2011). However, since the maximum likelihood estimation is rather sensitive to outliers, the skewness or kurtosis optimization might be preferable for the estimation of the transformation parameter in the presence of outliers (see e.g. Royston et al., 2011). The choice between kurtosis and skewness optimization depends entirely on the shape of the distribution of the data and the goal of the analyst: skewness optimization corrects for asymmetry, and kurtosis optimization for light or heavy tails. Additionally, if the focus lies on comparing the whole distribution of the transformed data with a normal distribution, and not only on some moments, different divergence measures such as the KS, KM or KL can be used (see e.g. Yeo and Johnson, 2000). For all estimation methods, a range on which the functions are evaluated needs to be proposed. Therefore, default values are set for the predefined transformations. For more information about different estimation methods we refer to Rojas-Perilla (2018, pp. 9-45).
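To make the difference between two of these estimation methods concrete, the following base-R sketch estimates the Box-Cox parameter once by maximum likelihood and once by skewness minimization. It is not the trafo implementation: the simulated data, the helper names `boxcox_std`, `rss` and `skew`, and the range `c(-2, 2)` are assumptions chosen for illustration.

```r
set.seed(1)
n <- 200
x <- runif(n, 1, 10)
# Simulated response that is linear on the log scale, so lambda near 0 is expected.
y <- exp(0.5 + 0.3 * x + rnorm(n, sd = 0.4))

# Scaled (standardized) Box-Cox transformation based on the geometric mean of y.
boxcox_std <- function(y, lambda) {
  gm <- exp(mean(log(y)))
  if (abs(lambda) < 1e-12) {
    gm * log(y)
  } else {
    (y^lambda - 1) / (lambda * gm^(lambda - 1))
  }
}

# Maximum likelihood: for the scaled transformation, maximizing the profile
# likelihood is equivalent to minimizing the residual sum of squares.
rss <- function(lambda) sum(resid(lm(boxcox_std(y, lambda) ~ x))^2)

# Skewness minimization: absolute sample skewness of the residuals.
skew <- function(lambda) {
  r <- resid(lm(boxcox_std(y, lambda) ~ x))
  abs(mean((r - mean(r))^3) / mean((r - mean(r))^2)^1.5)
}

lambdarange <- c(-2, 2)   # range on which both criteria are evaluated
lambda_ml   <- optimize(rss,  interval = lambdarange)$minimum
lambda_skew <- optimize(skew, interval = lambdarange)$minimum
```

Both estimates are searched on the same interval, which mirrors the remark above that every estimation method needs a range on which the criterion is evaluated.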
Since the user can only decide if the transformation is helpful by checking the above-mentioned assumptions, the package trafo contains a wide range of diagnostic checks (e.g. Anderson and Darling, 1954; Shapiro and Wilk, 1965; Breusch and Pagan, 1979). A smaller selection is used in the fast check that helps to decide if a transformation might be useful. Table 4 summarizes the implemented diagnostic checks that are simultaneously returned for the untransformed and a transformed model, or for two differently transformed models, and indicates which diagnostics are conducted in the fast check. Additionally, plots are provided that help to detect outliers (the Cook's distance plot) and influential observations (the residuals vs leverage plot).
Another feature of the package trafo is the possibility of defining a customized transformation. Thus, a user can also use the infrastructure of the package for a transformation that suits the individual needs better than the predefined transformations. However, in this version of the package trafo the user needs to define both the transformation and the standardized transformation in order to use this feature. For the derivation of the standardized transformation of all predefined transformations, see the Appendix.

Applications
The usage of transformations in practice may help to meet model assumptions, but it can also lead to complexities such as the interpretation of parameters and standard errors in inference or back-transformation biases in prediction (Rojas-Perilla, 2018). For instance, it is questionable how to address the estimation of the transformation parameter in inference. Box and Cox (1964) point out that after applying the Box-Cox transformation to the outcome variable, the transformation parameter should be treated as fixed and known and the subsequent analysis could be done in the transformed scale. However, Bickel and Doksum (1981) emphasize that estimating a transformation parameter in a model could overestimate the parameters' variance, yielding conservative confidence intervals. In prediction, on the other hand, lost interpretability of parameters and standard errors may be less important, but the back-transformation could lead to a bias if the non-linearity of the transformation is neglected (see e.g. Mosimann et al., 2018).
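The back-transformation bias mentioned above can be illustrated with a small simulation. The sketch below uses the logarithmic transformation and the well-known log-normal mean correction; the data-generating model and all object names are assumptions made for this illustration and are not taken from the package.

```r
set.seed(42)
n <- 5000
x <- runif(n)
y <- exp(1 + 2 * x + rnorm(n, sd = 1))   # log-normal response

fit <- lm(log(y) ~ x)

# Naive back-transformation: exponentiate the fitted values.
pred_naive <- exp(fitted(fit))

# Log-normal correction: E[y | x] = exp(mu + sigma^2 / 2) under normal errors.
sigma2 <- summary(fit)$sigma^2
pred_corrected <- pred_naive * exp(sigma2 / 2)

mean_y         <- mean(y)
mean_naive     <- mean(pred_naive)
mean_corrected <- mean(pred_corrected)
```

In this setting the naive predictions target the conditional median rather than the conditional mean, so their average falls clearly below the average of the observed response, whereas the corrected predictions are much closer.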
Nevertheless, several studies show how transformations can be useful in applications. Pek et al. (2017) demonstrate how the log transformation can be used for describing the relation between earnings and years of experience and the reciprocal transformation for the effect of intelligence quotient (IQ) on performance on mental sum problems. The logarithm and the Box-Cox transformation are often applied in econometric research, e.g. to describe monetary policies (Zarembka, 1968, 1974). Transformations have also been used to improve the functional form in studies of demand functions for meat (Chang, 1977), travel costs (Vaughan et al., 1982), and recreation (Ziemer et al., 1980) in the U.S. and for import equations in the Republic of Ireland (Boylan et al., 1982). Another research field for the application of transformations is genetics (Huber et al., 2003). The data sets often exhibit high variability and non-normality problems. To address this, the glog and gpower transformations can be useful in practice (Durbin et al., 2002; Kelmansky et al., 2013).

Table 5 (excerpt): diagnostics() returns information about the transformation and different diagnostic checks in the form of tests; plot() returns graphical diagnostic checks.
When using package trafo for applications, it should be noted that the package focuses on finding a suitable transformation with regard to fulfilling specific model assumptions. The user still has to decide if the transformation is reasonable in a specific application. The following section shows which functionalities the package provides for the user.

Case study
In order to show the functionality of the package trafo, we present, in the form of a case study, the steps a user faces when checking the assumptions of the linear model. For this illustration, we use the data set called University from the R package Ecdat (Croissant, 2016). This data set contains variables measuring the equipment and costs of university teaching and research. These data can be made available as follows:

R> library(Ecdat)
R> data(University)
A practical question for the head of a university could be how study fees (stfees) raise the university's net assets (nassets). Both variables are metric. Thus, a linear regression could help to explain the relation between these two variables. A linear regression model can be fitted in R using the lm function.

R> linMod <- lm(nassets ~ stfees, data = University)
The features in the package trafo that help to find a suitable transformation for this model and to compare different models are summarized in Table 5 and illustrated in the next subsections.

Finding a suitable transformation
It is well known that the reliability of the linear regression model depends on the assumptions presented above. In this section, we focus on presenting how the user can decide and assess which (and whether) transformations help to fulfill these model assumptions. A first fast check of these model assumptions can be used in the package trafo in order to find out if the untransformed model meets these assumptions or if using a transformation seems suitable. The fast check can be conducted by the function assumptions. This function returns the skewness, the kurtosis and the Shapiro-Wilk/Anderson-Darling test for normality, the Breusch-Pagan test for homoscedasticity and scatter plots between the dependent and the explanatory variables for checking the linear relation. All possible arguments of the function assumptions are summarized in Table 6. In the following, we only show the returned normality and homoscedasticity tests. The results are ordered by the p value of the Shapiro-Wilk and Breusch-Pagan test.

R> assumptions(linMod)
The default lambdarange for the log shift opt transformation is calculated dependent on the data range. The lower value is set to -2035.751 and the upper value to 404527.249
The default lambdarange for the square root shift transformation is calculated dependent on the data range. The lower value is set to -2035.751 and the upper value to 404527.249

The R Journal Vol. 11/2, December 2019 ISSN 2073-4859
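The quantities reported by the fast check can also be computed by hand for a single model. The following base-R sketch does this on simulated data; it is not how assumptions() is implemented, the data and object names are illustrative assumptions, and the Breusch-Pagan statistic is computed in its studentized form, which may differ from the variant used in the package.

```r
set.seed(7)
n <- 150
x <- runif(n, 0, 5)
y <- exp(1 + 0.5 * x + rnorm(n, sd = 0.5))   # right-skewed, heteroscedastic
fit <- lm(y ~ x)
r <- resid(fit)

# Moment-based skewness and kurtosis of the residuals (normal: 0 and 3).
m2 <- mean((r - mean(r))^2)
skewness <- mean((r - mean(r))^3) / m2^1.5
kurtosis <- mean((r - mean(r))^4) / m2^2

# Shapiro-Wilk test for normality of the residuals.
sw <- shapiro.test(r)

# Studentized Breusch-Pagan test: regress squared residuals on x;
# n * R^2 is asymptotically chi-squared with 1 degree of freedom.
aux <- lm(r^2 ~ x)
bp_stat <- n * summary(aux)$r.squared
bp_pval <- pchisq(bp_stat, df = 1, lower.tail = FALSE)
```

Small p values in `sw` and `bp_pval` point to violations of normality and homoscedasticity, which is the situation in which a transformation is worth considering.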

Comparing the untransformed model with a transformed model
For a more detailed comparison of the transformed model with the untransformed model, a function called trafo_lm (for the arguments see Table 7) can be used as follows:

R> linMod_trafo <- trafo_lm(linMod)
The Box-Cox transformation is the default option such that only the lm object needs to be given to the function. The object linMod_trafo is of class trafo_lm and the user can conduct the methods print, summary and plot in the same way as for an object of class lm. The difference is that the new methods simultaneously return the results for both models, the untransformed model and the transformed model. Furthermore, a method called diagnostics helps to compare results of normality and homoscedasticity tests. In the following, we will show the return of the diagnostics method and some selected plots in order to check the normality, homoscedasticity, and the linearity assumption of the linear regression model.

R> diagnostics(linMod_trafo)
Diagnostics: Untransformed vs transformed model

The first part of the output shows information about the applied transformation. As chosen, the Box-Cox transformation is used with the optimal transformation parameter around 0.19, which is estimated using the maximum likelihood approach that is also set as default. The optimal transformation parameter differs from 0, which would be equal to the logarithmic transformation, and from 1, which would mean that no transformation is optimal. The Shapiro-Wilk test rejects normality of the residuals of the untransformed model, but it does not reject normality for the residuals of the transformed model at the 5% level of significance. Furthermore, the skewness shows that the residuals in the transformed model are more symmetric and the kurtosis is closer to 3, the value of the kurtosis of the normal distribution. The results of the Breusch-Pagan test clearly show that homoscedasticity is rejected in the untransformed model but not in the transformed model. These two findings can be supported by the diagnostic plots shown in Figure 1.

R> plot(linMod_trafo)
In order to evaluate the linearity assumption, scatter plots of the dependent variable against the explanatory variable can help. Figure 2 shows that the assumption of linearity is violated in the untransformed model. The upper panel shows the Pearson correlation coefficient. In contrast, the relation between the transformed net assets and the study fees seems to be linear. As shown above, the user can obtain diagnostics for an untransformed and a transformed model with only a little more effort in comparison to fitting the standard linear regression model without transformation. While we only show the example with the default transformation, the user can also easily change the transformation and the estimation method. For instance, the user could choose the log-shift opt transformation with the skewness minimization as estimation method.
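The alternative choice described in the last sentence might be requested as sketched below. The argument values `trafo = "logshiftopt"` and `method = "skew"` and the object name `linMod_trafo2` are assumptions based on the package documentation, not on this text, and the code only runs when the packages trafo and Ecdat are installed.

```r
# Runs only if both packages are installed; otherwise the sketch is skipped.
have_pkgs <- requireNamespace("trafo", quietly = TRUE) &&
  requireNamespace("Ecdat", quietly = TRUE)

if (have_pkgs) {
  data("University", package = "Ecdat")
  linMod <- lm(nassets ~ stfees, data = University)

  # Assumed argument values: log-shift opt transformation with the
  # transformation parameter estimated by skewness minimization.
  linMod_trafo2 <- trafo::trafo_lm(object = linMod, trafo = "logshiftopt",
                                   method = "skew")
  print(class(linMod_trafo2))
}
```

The resulting object can then be inspected with the same methods (print, summary, plot, diagnostics) as the default Box-Cox fit.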

Compare two transformed models
The user can also compare different transformations within the frame of the model assumptions. Oftentimes the logarithm is blindly used without much consideration of its usefulness. In order to compare it with, for example, the Box-Cox transformation, the user first creates one object per transformation (boxcox_uni and log_uni in the call below).

Figure 2: Selection of obtained diagnostic plots by using plot(linMod_trafo). (a) shows the scatter plot of the untransformed net assets and the study fees; (b) shows the scatter plot of the transformed net assets and the study fees. The numbers specify the correlation coefficient between the dependent and independent variable.

Second, the objects can be used to compare linear models with differently transformed dependent variables using the function trafo_compare. The arguments of this function are shown in Table 8. The user creates an object of class trafo_compare by:

R> linMod_comp <- trafo_compare(object = linMod,
+    trafos = list(boxcox_uni, log_uni))

For this object, the user can use the same methods as for an object of class trafo_lm. In this work, we only want to show the return of the method diagnostics.

R> diagnostics(linMod_comp)
Diagnostics of two transformed models

The first part of the return points out that the Box-Cox transformation is a data-driven transformation with a transformation parameter, while the logarithmic transformation does not adapt to the data. Furthermore, we can see that normality is rejected for the model with a logarithmically transformed dependent variable, while it is not rejected when the Box-Cox transformation is used. The violation of the homoscedasticity assumption can be fixed by both transformations.

Additional features

Extract the transformed model and vector
The trafo package provides focused but limited methods to analyze the model. However, the transformed model can be easily extracted from the trafo_lm object.

R> class(linMod_trafo$trafo_mod)
[1] "lm"
The extracted object is of class lm such that all available methods for "lm" objects can also be used for the extracted object.
Similarly, it is possible to get the transformed vector.

Customized transformation
As summarized in the introduction, many R packages, including package trafo, provide a large number of transformations. Naturally, we do not include a comprehensive list of available transformations, as this would be too ambitious a task, though we do acknowledge that, depending on the needs of the user, a non-implemented transformation might be of interest (for the wide range of possible transformations, see e.g. Rojas-Perilla, 2018). Motivated by this, we include the option to employ our framework, e.g. the estimation of the transformation parameter, with transformations not provided in our package. In the following lines, we show the application of this feature using the Tukey transformation (Tukey, 1957).
In a first step, the transformation and the standardized or scaled transformation need to be defined.
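The two functions that this first step requires might look as follows. This is a sketch under assumptions: the names `tukey` and `tukey_std` match the call that uses them, but the signature `function(y, lambda)` and the handling of `lambda = 0` are illustrative choices, and the scaled version is derived from the geometric-mean argument of the Appendix for positive `lambda` only.

```r
# Tukey transformation: y^lambda for lambda != 0, log(y) for lambda = 0 (y > 0).
tukey <- function(y, lambda) {
  if (abs(lambda) <= 1e-12) log(y) else y^lambda
}

# Scaled version: the transformation divided by the n-th root of the Jacobian,
# expressed through the geometric mean gm of y (valid for lambda >= 0).
tukey_std <- function(y, lambda) {
  gm <- exp(mean(log(y)))
  if (abs(lambda) <= 1e-12) {
    gm * log(y)
  } else {
    y^lambda / (lambda * gm^(lambda - 1))
  }
}

y <- c(1, 2, 4, 8)
tukey(y, 0.5)     # square root of y
tukey_std(y, 1)   # for lambda = 1 the scaled transformation returns y itself
```

The scaled function is what makes criteria such as the residual sum of squares comparable across different values of lambda.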
R> linMod_custom <- trafo_lm(linMod, trafo = "custom_one",
+    lambdarange = c(0, 2), custom_trafo = list(tukey, tukey_std))
One limitation of this feature is the necessity to insert both the transformation and the scaled transformation, since the latter is often not known. Furthermore, the framework is only suitable for transformations with no or one transformation parameter.

Conclusions and future developments
Although transformations were developed in the absence of efficient machines as an alternative to highly memory-consuming methods, they are still a parsimonious way to meet the model assumptions of the linear regression model. We showed how the package trafo helps the user to easily decide whether and which transformations are suitable to fulfill normality, homoscedasticity, and linearity. To the best of our knowledge, trafo is the only R package that supports this decision process. Furthermore, the package trafo provides an extensive collection of transformations usable in linear regression models and a wide range of estimation methods for the transformation parameter. In future versions, we plan to enlarge this collection and to provide similar functionality for other types of data, e.g. count data. Additionally, more methods that are available for the class lm could be developed for objects of class trafo_lm. We would also like to expand the infrastructure to linear mixed regression models.

Acknowledgment
We thank Prof. Dr. Timo Schmid for fruitful discussions.

Appendix: Likelihood derivation of the transformations

Log (shift) transformation
Let $J(y)$ denote the Jacobian of a transformation from $y_i$ to $y_i^*$. In order to obtain $z_i^*$, the scaled log (shift) transformation, given by $y_i^* / J(y)^{1/n}$, we use, for simplicity, a modification of the definition of the geometric mean, denoted by $\bar{y}_{LS}$. The Jacobian, the scaled, and the inverse of the log (shift) transformation are given below.
The log (shift) transformation presented in Table 2 is defined as: If the fixed shift parameter s is not necessary, the standard logarithm function (logarithmic transformation with s = 0) is applied.
The modification of the definition of the geometric mean for this transformation is: Therefore, the expression of the Jacobian is defined as: The scaled transformation is given by: The inverse function of the log (shift) transformation is denoted as:
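As a hedged reconstruction from the definitions in this subsection (the formula of Table 2 and the scaling $z_i^* = y_i^* / J(y)^{1/n}$), and not a copy of the authors' typeset formulas, the five expressions can be written as:

```latex
\begin{align*}
y_i^* &= \log(y_i + s), \\
\bar{y}_{LS} &= \Big[\prod_{i=1}^{n} (y_i + s)\Big]^{1/n}, \\
J(y) &= \prod_{i=1}^{n} \frac{1}{y_i + s} = \bar{y}_{LS}^{-n}, \\
z_i^* &= \frac{y_i^*}{J(y)^{1/n}} = \log(y_i + s)\,\bar{y}_{LS}, \\
y_i &= \exp(y_i^*) - s.
\end{align*}
```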

Glog transformation
Let $J(y)$ denote the Jacobian of a transformation from $y_i$ to $y_i^*$. In order to obtain $z_i^*$, the scaled glog transformation, given by $y_i^* / J(y)^{1/n}$, we use, for simplicity, a modification of the definition of the geometric mean, denoted by $\bar{y}_{GL}$. The Jacobian, the scaled, and the inverse of the glog transformation are given below.
The glog transformation presented in Table 2 is defined as: The modification of the definition of the geometric mean for this transformation is: Therefore, the expression of the Jacobian is defined as: The scaled transformation is given by: The inverse function of the glog transformation is denoted as:
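A possible reconstruction for the glog case is sketched below. The particular form of $\bar{y}_{GL}$ is an assumption (it is one natural choice that makes the Jacobian a power of a geometric mean); the derivative and the inverse follow from the formula in Table 2.

```latex
\begin{align*}
y_i^* &= \log\!\big(y_i + \sqrt{y_i^2 + 1}\big), \\
\bar{y}_{GL} &= \Big[\prod_{i=1}^{n} \sqrt{y_i^2 + 1}\Big]^{1/n} \quad \text{(assumed form)}, \\
J(y) &= \prod_{i=1}^{n} \frac{1}{\sqrt{y_i^2 + 1}} = \bar{y}_{GL}^{-n}, \\
z_i^* &= \frac{y_i^*}{J(y)^{1/n}} = \log\!\big(y_i + \sqrt{y_i^2 + 1}\big)\,\bar{y}_{GL}, \\
y_i &= \frac{\exp(y_i^*) - \exp(-y_i^*)}{2}.
\end{align*}
```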

Neglog transformation
Let $J(y)$ denote the Jacobian of a transformation from $y_i$ to $y_i^*$. In order to obtain $z_i^*$, the scaled neglog transformation, given by $y_i^* / J(y)^{1/n}$, we use, for simplicity, a modification of the definition of the geometric mean, denoted by $\bar{y}_{NL}$. The Jacobian, the scaled, and the inverse of the neglog transformation are given below.
The neglog transformation presented in Table 2 is defined as: The modification of the definition of the geometric mean for this transformation is: Therefore, the expression of the Jacobian comes to: The scaled transformation is given by: The inverse function of the neglog transformation is denoted as:
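For the neglog case, the same steps give the following sketch. It is derived here from the Table 2 formula, using that the derivative of $\mathrm{Sign}(y)\log(|y| + 1)$ is $1/(|y| + 1)$; it is a reconstruction under that definition rather than the original display.

```latex
\begin{align*}
y_i^* &= \mathrm{Sign}(y_i)\,\log(|y_i| + 1), \\
\bar{y}_{NL} &= \Big[\prod_{i=1}^{n} (|y_i| + 1)\Big]^{1/n}, \\
J(y) &= \prod_{i=1}^{n} \frac{1}{|y_i| + 1} = \bar{y}_{NL}^{-n}, \\
z_i^* &= \mathrm{Sign}(y_i)\,\log(|y_i| + 1)\,\bar{y}_{NL}, \\
y_i &= \mathrm{Sign}(y_i^*)\,\big(\exp(|y_i^*|) - 1\big).
\end{align*}
```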

Reciprocal transformation
Let $J(y)$ denote the Jacobian of a transformation from $y_i$ to $y_i^*$. In order to obtain $z_i^*$, the scaled reciprocal transformation, given by $y_i^* / J(y)^{1/n}$, we use, for simplicity, the definition of the geometric mean, denoted by $\bar{y}_{R}$. The Jacobian, the scaled, and the inverse of the reciprocal transformation are given below.
The reciprocal transformation presented in Table 2 is defined as: The definition of the geometric mean is: Therefore, the expression of the Jacobian is defined as: The scaled transformation is given by: The inverse function of the reciprocal transformation is denoted as:
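Under the assumption that the geometric mean is taken over absolute values (so that negative responses are allowed, as the support $y \neq 0$ suggests), the reciprocal case can be sketched as:

```latex
\begin{align*}
y_i^* &= \frac{1}{y_i}, \\
\bar{y}_{R} &= \Big[\prod_{i=1}^{n} |y_i|\Big]^{1/n} \quad \text{(absolute values assumed)}, \\
J(y) &= \prod_{i=1}^{n} \Big|\frac{\partial y_i^*}{\partial y_i}\Big| = \prod_{i=1}^{n} \frac{1}{y_i^2} = \bar{y}_{R}^{-2n}, \\
z_i^* &= \frac{y_i^*}{J(y)^{1/n}} = \frac{\bar{y}_{R}^{2}}{y_i}, \\
y_i &= \frac{1}{y_i^*}.
\end{align*}
```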

Box-Cox (shift) transformation case (A)
Let $J(\lambda, y)$ denote the Jacobian of a transformation from $y_i$ to $y_i^*(\lambda)$. In order to obtain $z_i^*(\lambda)$, the scaled Box-Cox (shift)(A) transformation, given by $y_i^*(\lambda) / J(\lambda, y)^{1/n}$, we use, for simplicity, a modification of the definition of the geometric mean, denoted by $\bar{y}_{BC}$. The Jacobian, the scaled, and the inverse of the Box-Cox (shift)(A) transformation are given below.
The Box-Cox (shift)(A) transformation presented in Table 3 is defined as:
If the fixed shift parameter s is not necessary for making the dataset positive, the standard Box-Cox transformation (with s = 0) is applied. The definition of the geometric mean is: Therefore, the expression of the Jacobian comes to: The scaled transformation is given by: The inverse function of the Box-Cox (shift)(A) transformation is denoted as:
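Assuming that case (A) denotes $\lambda \neq 0$ (case (B) being the logarithmic limit), the standard Box-Cox formula with shift $s$ leads to the following sketch of the expressions of this subsection:

```latex
\begin{align*}
y_i^*(\lambda) &= \frac{(y_i + s)^{\lambda} - 1}{\lambda}, \qquad \lambda \neq 0, \\
\bar{y}_{BC} &= \Big[\prod_{i=1}^{n} (y_i + s)\Big]^{1/n}, \\
J(\lambda, y) &= \prod_{i=1}^{n} (y_i + s)^{\lambda - 1} = \bar{y}_{BC}^{\,n(\lambda - 1)}, \\
z_i^*(\lambda) &= \frac{(y_i + s)^{\lambda} - 1}{\lambda\,\bar{y}_{BC}^{\,\lambda - 1}}, \\
y_i &= \big(\lambda\, y_i^*(\lambda) + 1\big)^{1/\lambda} - s.
\end{align*}
```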

Box-Cox (shift) transformation case (B)
This case is exactly equal to the log (shift) case.

Log-shift opt transformation
Let $J(\lambda, y)$ denote the Jacobian of a transformation from $y_i$ to $y_i^*(\lambda)$. In order to obtain $z_i^*(\lambda)$, the scaled log-shift opt transformation, given by $y_i^*(\lambda) / J(\lambda, y)^{1/n}$, we use, for simplicity, a modification of the definition of the geometric mean, denoted by $\bar{y}_{LSO}$. The Jacobian, the scaled, and the inverse of the log-shift opt transformation are given below.
The log-shift opt transformation presented in Table 3 is defined as: The modification of the definition of the geometric mean for this transformation is: Therefore, the expression of the Jacobian is defined as: The scaled transformation is given by: The inverse function of the log-shift opt transformation is denoted as:
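Since the log-shift opt transformation replaces the fixed shift $s$ of the log (shift) case by the estimated parameter $\lambda$, a sketch of the corresponding expressions, assuming the definition $\log(y + \lambda)$, is:

```latex
\begin{align*}
y_i^*(\lambda) &= \log(y_i + \lambda), \\
\bar{y}_{LSO} &= \Big[\prod_{i=1}^{n} (y_i + \lambda)\Big]^{1/n}, \\
J(\lambda, y) &= \prod_{i=1}^{n} \frac{1}{y_i + \lambda} = \bar{y}_{LSO}^{-n}, \\
z_i^*(\lambda) &= \log(y_i + \lambda)\,\bar{y}_{LSO}, \\
y_i &= \exp\!\big(y_i^*(\lambda)\big) - \lambda.
\end{align*}
```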

Bickel-Doksum transformation
Let $J(\lambda, y)$ denote the Jacobian of a transformation from $y_i$ to $y_i^*(\lambda)$. In order to obtain $z_i^*(\lambda)$, the scaled Bickel-Doksum transformation, given by $y_i^*(\lambda) / J(\lambda, y)^{1/n}$, we use, for simplicity, a modification of the definition of the geometric mean, denoted by $\bar{y}_{BD}$. The Jacobian, the scaled, and the inverse of the Bickel-Doksum transformation are given below.
The Bickel-Doksum transformation presented in Table 3 is defined as: The modification of the definition of the geometric mean for this transformation is: Therefore, the expression of the Jacobian comes to: The scaled transformation is given by: The inverse function of the Bickel-Doksum transformation is denoted as:
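The sketch below assumes the usual form of the Bickel-Doksum transformation for $\lambda > 0$, which is not spelled out in this text; the geometric-mean modification over absolute values is likewise an assumption.

```latex
\begin{align*}
y_i^*(\lambda) &= \frac{|y_i|^{\lambda}\,\mathrm{Sign}(y_i) - 1}{\lambda}, \qquad \lambda > 0, \\
\bar{y}_{BD} &= \Big[\prod_{i=1}^{n} |y_i|\Big]^{1/n}, \\
J(\lambda, y) &= \prod_{i=1}^{n} |y_i|^{\lambda - 1} = \bar{y}_{BD}^{\,n(\lambda - 1)}, \\
z_i^*(\lambda) &= \frac{|y_i|^{\lambda}\,\mathrm{Sign}(y_i) - 1}{\lambda\,\bar{y}_{BD}^{\,\lambda - 1}}, \\
y_i &= \mathrm{Sign}\big(\lambda\, y_i^*(\lambda) + 1\big)\,\big|\lambda\, y_i^*(\lambda) + 1\big|^{1/\lambda}.
\end{align*}
```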

Yeo-Johnson transformation case (A)
This case is exactly equal to the Box-Cox (shift) case (A), with s = 1.

Yeo-Johnson transformation case (B)
This case is exactly equal to the log (shift) case, with s = 1.

Yeo-Johnson transformation case (C)
Let $J(\lambda, y)$ denote the Jacobian of a transformation from $y_i$ to $y_i^*(\lambda)$. In order to obtain $z_i^*(\lambda)$, the scaled Yeo-Johnson(C) transformation, given by $y_i^*(\lambda) / J(\lambda, y)^{1/n}$, we use, for simplicity, a modification of the definition of the geometric mean, denoted by $\bar{y}_{YC}$. The Jacobian, the scaled, and the inverse of the Yeo-Johnson(C) transformation are given below.
The Yeo-Johnson(C) transformation presented in Table 3 is defined as:
The modification of the definition of the geometric mean for this transformation is: Therefore, the expression of the Jacobian comes to: The scaled transformation is given by: The inverse function of the Yeo-Johnson(C) transformation is denoted as:
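Assuming that case (C) is the branch of the Yeo-Johnson family for $y < 0$ and $\lambda \neq 2$ (cases (A) and (B) cover $y \geq 0$), the expressions can be sketched as follows; the case labelling is an inference from the neighbouring subsections.

```latex
\begin{align*}
y_i^*(\lambda) &= -\frac{(1 - y_i)^{2 - \lambda} - 1}{2 - \lambda}, \qquad y_i < 0,\ \lambda \neq 2, \\
\bar{y}_{YC} &= \Big[\prod_{i=1}^{n} (1 - y_i)\Big]^{1/n}, \\
J(\lambda, y) &= \prod_{i=1}^{n} (1 - y_i)^{1 - \lambda} = \bar{y}_{YC}^{\,n(1 - \lambda)}, \\
z_i^*(\lambda) &= -\frac{(1 - y_i)^{2 - \lambda} - 1}{(2 - \lambda)\,\bar{y}_{YC}^{\,1 - \lambda}}, \\
y_i &= 1 - \big(1 - (2 - \lambda)\, y_i^*(\lambda)\big)^{1/(2 - \lambda)}.
\end{align*}
```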

Yeo-Johnson transformation case (D)
Let $J(y)$ denote the Jacobian of the transformation from $y_i$ to $y_i^*$. The scaled Yeo-Johnson(D) transformation is given by $z_i^* = y_i^* / J(y)^{1/n}$. To simplify this expression, we use a modification of the definition of the geometric mean, denoted by $\bar{y}_{YD}$. The Jacobian, the scaled transformation, and the inverse of the Yeo-Johnson(D) transformation are given below.
The Yeo-Johnson(D) transformation presented in Table 3 is defined as: The modification of the definition of the geometric mean for this transformation is: Therefore, the expression of the Jacobian is defined as: The scaled transformation is given by: The inverse function of the Yeo-Johnson(D) transformation is denoted as:
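A sketch of the Yeo-Johnson(D) expressions follows. It assumes that case (D) is the branch for $y_i < 0$ and $\lambda = 2$, namely $y_i^* = -\log(1 - y_i)$, with the product taken over all observations as if every $y_i$ fell into this case. The definition of $\bar{y}_{YD}$ is an assumption consistent with that form.

```latex
\begin{align*}
y_i^* &= -\log(1 - y_i), \\
\bar{y}_{YD} &= \left[\prod_{i=1}^{n} (1 - y_i)\right]^{1/n}, \\
J(y) &= \prod_{i=1}^{n} \frac{1}{1 - y_i} = \bar{y}_{YD}^{-n}, \\
z_i^* &= \frac{y_i^*}{J(y)^{1/n}} = -\bar{y}_{YD}\,\log(1 - y_i), \\
f^{-1}(y_i^*) &= 1 - \exp(-y_i^*).
\end{align*}
```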

Square root-shift opt transformation
Let $J(\lambda, y)$ denote the Jacobian of the transformation from $y_i$ to $y_i^*(\lambda)$. The scaled square root-shift opt transformation is given by $z_i^*(\lambda) = y_i^*(\lambda) / J(\lambda, y)^{1/n}$. To simplify this expression, we use a modification of the definition of the geometric mean, denoted by $\bar{y}_{SR}$. The Jacobian, the scaled transformation, and the inverse of the square root-shift opt transformation are given below.
The square root-shift opt transformation presented in Table 3 is defined as: The definition of the geometric mean $\bar{y}_{SR}$ is: Therefore, the expression of the Jacobian is defined as: The scaled transformation is given by: The inverse function of the square root-shift opt transformation is denoted as:
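The square root-shift opt expressions can be sketched as follows, assuming the form $y_i^*(\lambda) = \sqrt{y_i + \lambda}$. The definition of $\bar{y}_{SR}$ as the geometric mean of the shifted responses is an assumption made for this sketch; a different normalisation would only change where the constant factor appears.

```latex
\begin{align*}
y_i^*(\lambda) &= \sqrt{y_i + \lambda}, \\
\bar{y}_{SR} &= \left[\prod_{i=1}^{n} (y_i + \lambda)\right]^{1/n}, \\
J(\lambda, y) &= \prod_{i=1}^{n} \frac{1}{2\sqrt{y_i + \lambda}} = 2^{-n}\,\bar{y}_{SR}^{-n/2}, \\
z_i^*(\lambda) &= \frac{y_i^*(\lambda)}{J(\lambda, y)^{1/n}} = 2\sqrt{\bar{y}_{SR}}\,\sqrt{y_i + \lambda}, \\
f^{-1}(y_i^*) &= (y_i^*)^2 - \lambda.
\end{align*}
```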

Manly transformation case (A)
Let $J(\lambda, y)$ denote the Jacobian of the transformation from $y_i$ to $y_i^*(\lambda)$. The scaled Manly(A) transformation is given by $z_i^*(\lambda) = y_i^*(\lambda) / J(\lambda, y)^{1/n}$. To simplify this expression, we use a modification of the definition of the geometric mean, denoted by $\bar{y}_{M}$. The Jacobian, the scaled transformation, and the inverse of the Manly(A) transformation are given below.
The Manly(A) transformation presented in Table 3 is defined as: The modification of the definition of the geometric mean for this transformation is: Therefore, the expression of the Jacobian comes to: The scaled transformation is given by: The inverse function of the Manly(A) transformation is denoted as:
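The Manly(A) quantities can be checked numerically with a short base-R sketch. It assumes the form $y_i^*(\lambda) = (e^{\lambda y_i} - 1)/\lambda$ for $\lambda \neq 0$, whose derivative is $e^{\lambda y_i}$, so that $J(\lambda, y)^{1/n} = \exp(\lambda \bar{y})$ with $\bar{y}$ the arithmetic mean of the responses. The function names are hypothetical and are not part of the trafo API.

```r
# Hypothetical helpers (base R only, not the trafo API).
# Assumed form: y* = (exp(lambda * y) - 1) / lambda, with lambda != 0.
manly_transform <- function(y, lambda) {
  (exp(lambda * y) - 1) / lambda
}

# Scaled version: J^(1/n) = (prod exp(lambda * y_i))^(1/n) = exp(lambda * mean(y)).
manly_scaled <- function(y, lambda) {
  manly_transform(y, lambda) / exp(lambda * mean(y))
}

# Inverse: lambda * y* + 1 = exp(lambda * y), which is always positive.
manly_inverse <- function(ystar, lambda) {
  log(lambda * ystar + 1) / lambda
}

y <- c(-1.5, 0, 0.3, 2, 4.2)  # made-up illustrative data
lambda <- 0.4

# Round-trip check: stops with an error if the inverse does not recover y.
stopifnot(isTRUE(all.equal(manly_inverse(manly_transform(y, lambda), lambda), y)))
```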

Manly transformation case (B)
The variable remains unchanged, $y_i^* = y_i$.

Modulus transformation case (A)
Let $J(\lambda, y)$ denote the Jacobian of the transformation from $y_i$ to $y_i^*(\lambda)$. The scaled modulus(A) transformation is given by $z_i^*(\lambda) = y_i^*(\lambda) / J(\lambda, y)^{1/n}$. To simplify this expression, we use a modification of the definition of the geometric mean, denoted by $\bar{y}_{MA}$. The Jacobian, the scaled transformation, and the inverse of the modulus(A) transformation are given below.
The modulus(A) transformation presented in Table 3 is defined as:
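The modulus(A) quantities can be sketched as follows, assuming the form $y_i^*(\lambda) = \mathrm{sign}(y_i)\,[(|y_i| + 1)^{\lambda} - 1]/\lambda$ for $\lambda \neq 0$. The definition of $\bar{y}_{MA}$ is an assumption consistent with that form. The inverse uses the fact that $y_i^*$ and $y_i$ share the same sign.

```latex
\begin{align*}
y_i^*(\lambda) &= \operatorname{sign}(y_i)\,\frac{(|y_i| + 1)^{\lambda} - 1}{\lambda}, \\
\bar{y}_{MA} &= \left[\prod_{i=1}^{n} (|y_i| + 1)\right]^{1/n}, \\
J(\lambda, y) &= \prod_{i=1}^{n} (|y_i| + 1)^{\lambda - 1} = \bar{y}_{MA}^{\,n(\lambda - 1)}, \\
z_i^*(\lambda) &= \frac{y_i^*(\lambda)}{J(\lambda, y)^{1/n}}
  = \operatorname{sign}(y_i)\,\frac{(|y_i| + 1)^{\lambda} - 1}{\lambda\,\bar{y}_{MA}^{\,\lambda - 1}}, \\
f^{-1}(y_i^*) &= \operatorname{sign}(y_i^*)\left[\left(\lambda\,|y_i^*| + 1\right)^{1/\lambda} - 1\right].
\end{align*}
```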

Modulus transformation case (B)
This case is exactly equal to the neglog transformation case.

Dual power transformation case (A)
Let $J(\lambda, y)$ denote the Jacobian of the transformation from $y_i$ to $y_i^*(\lambda)$. The scaled dual power(A) transformation is given by $z_i^*(\lambda) = y_i^*(\lambda) / J(\lambda, y)^{1/n}$. To simplify this expression, we use a modification of the definition of the geometric mean, denoted by $\bar{y}_{DA}$. The Jacobian, the scaled transformation, and the inverse of the dual power(A) transformation are given below.
The dual power(A) transformation presented in Table 3 is defined as: The modification of the definition of the geometric mean for this transformation is: Therefore, the expression of the Jacobian comes to: The scaled transformation is given by: The inverse function of the dual power(A) transformation is found by completing the square and solving for $y_i^{\lambda}$:
\begin{align*}
2\lambda y_i^* y_i^{\lambda} &= y_i^{2\lambda} - 1 \\
1 + \lambda^2 y_i^{*2} &= y_i^{2\lambda} - 2\lambda y_i^* y_i^{\lambda} + \lambda^2 y_i^{*2} \\
1 + \lambda^2 y_i^{*2} &= \left(y_i^{\lambda} - \lambda y_i^*\right)^2 \\
\sqrt{1 + \lambda^2 y_i^{*2}} + \lambda y_i^* &= y_i^{\lambda} \\
y_i &= \left(\sqrt{1 + \lambda^2 y_i^{*2}} + \lambda y_i^*\right)^{1/\lambda} \\
\Rightarrow f^{-1}(y_i^*) &= \left(\sqrt{1 + \lambda^2 y_i^{*2}} + \lambda y_i^*\right)^{1/\lambda}.
\end{align*}
The positive root is taken in the fourth step because the first line implies $y_i^{\lambda} - \lambda y_i^* = (y_i^{\lambda} + y_i^{-\lambda})/2 > 0$.
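The closed-form inverse can be checked numerically with a short base-R sketch. It assumes the dual power(A) form $y_i^*(\lambda) = (y_i^{\lambda} - y_i^{-\lambda})/(2\lambda)$ with $\lambda > 0$ and $y_i > 0$, which is the form implied by the first line of the derivation. The function names are hypothetical and are not part of the trafo API, and the scaling factor is computed directly as $J(\lambda, y)^{1/n}$ rather than through $\bar{y}_{DA}$.

```r
# Hypothetical helpers (base R only, not the trafo API).
# Assumed form: y* = (y^lambda - y^(-lambda)) / (2 * lambda), lambda > 0, y > 0.
dual_transform <- function(y, lambda) {
  (y^lambda - y^(-lambda)) / (2 * lambda)
}

# Derivative of the assumed form with respect to y.
dual_deriv <- function(y, lambda) {
  (y^(lambda - 1) + y^(-lambda - 1)) / 2
}

# Scaled version: divide by J^(1/n), the geometric mean of the derivatives,
# computed on the log scale.
dual_scaled <- function(y, lambda) {
  dual_transform(y, lambda) / exp(mean(log(dual_deriv(y, lambda))))
}

# Inverse obtained by completing the square (positive root).
dual_inverse <- function(ystar, lambda) {
  (sqrt(1 + lambda^2 * ystar^2) + lambda * ystar)^(1 / lambda)
}

y <- c(0.2, 0.9, 1, 3.5, 12)  # made-up illustrative data
lambda <- 0.5

# Round-trip check: stops with an error if the inverse does not recover y.
stopifnot(isTRUE(all.equal(dual_inverse(dual_transform(y, lambda), lambda), y)))
```

Responses below one map to negative transformed values, so the check also covers the case where $\lambda y_i^*$ is negative.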

Dual power transformation case (B)
This case is exactly equal to the Box-Cox (shift) transformation, case (B).

Gpower transformation case (A)
Let $J(\lambda, y)$ denote the Jacobian of the transformation from $y_i$ to $y_i^*(\lambda)$. The scaled gpower(A) transformation is given by $z_i^*(\lambda) = y_i^*(\lambda) / J(\lambda, y)^{1/n}$. To simplify this expression, we use a modification of the definition of the geometric mean, denoted by $\bar{y}_{GA}$. The Jacobian, the scaled transformation, and the inverse of the gpower(A) transformation are given below.
The gpower(A) transformation presented in Table 3 is defined as: The modification of the definition of the geometric mean for this transformation is: Therefore, the expression of the Jacobian comes to: The scaled transformation is given by: The inverse function of the gpower(A) transformation is denoted as:
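The gpower(A) expressions can be sketched as follows, assuming the form $y_i^*(\lambda) = [(y_i + \sqrt{y_i^2 + 1})^{\lambda} - 1]/\lambda$ for $\lambda \neq 0$. In this sketch $\bar{y}_{GA}$ is taken to equal $J(\lambda, y)^{1/n}$, which is an assumption and may differ from the normalisation used elsewhere. The inverse introduces an auxiliary quantity $u_i$, a name chosen here for illustration.

```latex
\begin{align*}
y_i^*(\lambda) &= \frac{\left(y_i + \sqrt{y_i^2 + 1}\right)^{\lambda} - 1}{\lambda}, \\
\frac{\partial y_i^*(\lambda)}{\partial y_i}
  &= \left(y_i + \sqrt{y_i^2 + 1}\right)^{\lambda - 1}\left(1 + \frac{y_i}{\sqrt{y_i^2 + 1}}\right)
  = \frac{\left(y_i + \sqrt{y_i^2 + 1}\right)^{\lambda}}{\sqrt{y_i^2 + 1}}, \\
J(\lambda, y) &= \prod_{i=1}^{n} \frac{\left(y_i + \sqrt{y_i^2 + 1}\right)^{\lambda}}{\sqrt{y_i^2 + 1}}
  = \bar{y}_{GA}^{\,n}, \\
z_i^*(\lambda) &= \frac{y_i^*(\lambda)}{J(\lambda, y)^{1/n}}
  = \frac{\left(y_i + \sqrt{y_i^2 + 1}\right)^{\lambda} - 1}{\lambda\,\bar{y}_{GA}}, \\
u_i &= \left(\lambda y_i^* + 1\right)^{1/\lambda} = y_i + \sqrt{y_i^2 + 1}, \\
f^{-1}(y_i^*) &= \frac{u_i^2 - 1}{2u_i}.
\end{align*}
```

The last line follows from $u_i - 1/u_i = 2y_i$, since $1/u_i = \sqrt{y_i^2 + 1} - y_i$.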

Gpower transformation case (B)
This case is exactly equal to the glog transformation case.