The assumption of elliptical symmetry has an important role in many theoretical developments and applications. Hence, it is of primary importance to be able to test whether that assumption actually holds true or not. Various tests have been proposed in the literature for this problem. To the best of our knowledge, none of them has been implemented in R. This article describes the R package ellipticalsymmetry which implements several well-known tests for elliptical symmetry together with some recent tests. We demonstrate the testing procedures with a real data example.
Let
Given the omnipresence of the assumption of elliptical symmetry, it is essential to be able to test whether that assumption actually holds true or not for the data at hand. Numerous tests have been proposed in the literature, including Beran (1979), Baringhaus (1991), Koltchinskii and Sakhanenko (2000), Manzotti et al. (2002), Schott (2002), Huffer and Park (2007), Cassart (2007), and Babić et al. (2021). Tests for elliptical symmetry based on Monte Carlo simulations can be found in Diks and Tong (1999) and Zhu and Neuhaus (2000); Li et al. (1997) resort to graphical methods, and Zhu and Neuhaus (2004) build conditional tests. We refer the reader to Serfling (2006) and Sakhanenko (2008) for extensive reviews and performance comparisons. To the best of our knowledge, none of these tests is available in the open-source software R. The focus of this paper is to close this gap by implementing several well-known tests for elliptical symmetry together with some recent tests. The test of Beran (1979) is neither distribution-free nor affine-invariant. Moreover, there are no practical guidelines for the choice of the basis functions involved in the test statistic. Therefore, we opt not to include it in the package. Baringhaus (1991) proposes a Cramér-von Mises type test for spherical symmetry based on the independence between norm and direction. Dyckerhoff et al. (2015) have shown by simulations that this test can be used as a test for elliptical symmetry in dimension 2. This test assumes the location parameter to be known, its asymptotic distribution is not simple to use, and its validity has not been proven in dimensions higher than 2. Hence, we decided not to include it in the package. Thus, the tests suggested by Koltchinskii and Sakhanenko (2000), Manzotti et al. (2002), Schott (2002), Huffer and Park (2007), Cassart (2007), and Babić et al. (2021) are implemented in the package ellipticalsymmetry.
This paper describes the tests for elliptical symmetry that have been implemented in the ellipticalsymmetry package, together with a detailed description of the functions that are available in the package. The use of the implemented functions is illustrated using financial data.
In this section, we focus on the tests for elliptical symmetry that have been implemented in our new ellipticalsymmetry package. Besides formal definitions of test statistics and limiting distributions, we also explain the details of computation.
Koltchinskii and Sakhanenko (2000) develop a class of omnibus bootstrap tests
for unspecified location that are affine invariant and consistent
against any fixed alternative. The estimators of the unknown parameters
are as follows:
In dimension
In our R package, this test can be run using a function called KoltchinskiiSakhanenko(). The syntax for this function is very simple:
KoltchinskiiSakhanenko(X, R = 1000, nJobs = -1),
where X is an input to this function consisting of a data set which must be a matrix, and R stands for the number of bootstrap replicates. The default number of replicates is set to 1000. The nJobs argument represents the number of CPU cores to use for the calculation. This is a purely technical option which is used to speed up the computation of bootstrap-based tests. The default value -1 indicates that all cores except one are used.
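As a rough illustration of these two arguments, the sketch below shows how an nJobs-style value could be mapped to a number of cores, and how a bootstrap p-value could be obtained from a set of replicated statistics. The helper names resolve_n_jobs() and bootstrap_p_value() are hypothetical, and taking the p-value as a simple proportion is an assumption, not necessarily what the package does internally.

```r
# Hypothetical helper: translate an nJobs-style argument into a core count.
# -1 is read as "all available cores except one", as described above.
resolve_n_jobs <- function(nJobs = -1) {
  available <- parallel::detectCores()
  if (is.na(available)) available <- 1L  # detectCores() may return NA
  if (nJobs == -1) {
    max(1L, available - 1L)
  } else {
    max(1L, min(as.integer(nJobs), available))
  }
}

# Hypothetical helper: proportion of bootstrap statistics that are at least
# as large as the observed one (an assumed convention for the p-value).
bootstrap_p_value <- function(observed, bootstrap_stats) {
  mean(bootstrap_stats >= observed)
}

# Toy numbers only: 100 made-up bootstrap statistics, 5 of which exceed 6.
boot <- c(rep(1, 95), rep(7, 5))
p <- bootstrap_p_value(6, boot)
n_cores <- resolve_n_jobs(-1)
```

Under this convention, a p-value based on R replicates can only take values that are multiples of 1/R, so a small R gives a coarse p-value.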
Manzotti et al. (2002) develop a test based on spherical harmonics. The
estimators of the unknown parameters are the sample mean denoted as
The test can be run using the MPQ() function with the following syntax
MPQ(X, epsilon = 0.05).
As before, X is a numeric matrix that represents the data, while epsilon is an option that allows the user to indicate the proportion of points closest to the origin that will not be used in the calculation. By default, epsilon is set to 0.05.
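To make the role of epsilon more concrete, here is a minimal sketch of the trimming step, assuming that the data are first standardized with the sample mean and the sample covariance matrix and that the proportion epsilon of standardized points with the smallest norm is discarded. The function trim_near_origin() is hypothetical and only mimics this preprocessing; it does not compute the MPQ test statistic.

```r
# Hypothetical helper: standardize the data, drop the points closest to the
# origin, and project the remaining points onto the unit sphere.
trim_near_origin <- function(X, epsilon = 0.05) {
  stopifnot(is.matrix(X), is.numeric(X))
  centered <- sweep(X, 2, colMeans(X))
  # Inverse square root of the sample covariance via its eigendecomposition.
  e <- eigen(cov(X), symmetric = TRUE)
  S_inv_sqrt <- e$vectors %*%
    diag(1 / sqrt(e$values), nrow = length(e$values)) %*%
    t(e$vectors)
  Z <- centered %*% S_inv_sqrt
  r <- sqrt(rowSums(Z^2))
  # Keep only the points whose norm exceeds the epsilon-quantile of the norms.
  keep <- r > quantile(r, probs = epsilon)
  Z[keep, , drop = FALSE] / r[keep]
}

set.seed(1)
X <- matrix(rnorm(600), ncol = 3)  # simulated data: 200 points in dimension 3
U <- trim_near_origin(X, epsilon = 0.05)
```

Each row of U is a unit vector, and roughly a proportion epsilon of the original observations has been removed.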
Schott (2002) develops a Wald-type test for elliptical symmetry based on the analysis of covariance matrices. The test compares the sample fourth moments with the expected theoretical ones under ellipticity. Given that the test statistic involves consistent estimates of the covariance matrix of the sample fourth moments, the existence of eighth-order moments is required. Furthermore, the test has very low power against several alternatives. The final test statistic is of a simple form, even though it requires lengthy notation.
For an elliptical distribution with mean
Moreover, let
The Schott test can be performed in our package by using the function Schott() with the very simple syntax Schott(X), where X is a numeric matrix of data values.
Huffer and Park (2007) propose a Pearson chi-square type test with
multi-dimensional cells. Under the null hypothesis of ellipticity, the
cells have asymptotically equal expected cell counts, and after
determining the observed cell counts, the test statistic is easily
computed. Let
In the R package, we are considering three particular ways to partition the space, which correspond to the sector options "orthants", "permutations", and "bivariateangles" described below. The test can be run using the function HufferPark(). The syntax, including all options, for the function HufferPark() is, for instance,
HufferPark(X, c, R = NA, sector = "orthants", g = NA, nJobs = -1).
We will now provide a detailed description of its arguments. X is an input to this function consisting of a data set. sector is an option that allows the user to specify the type of sectors used to divide the space. Currently supported options are "orthants", "permutations", and "bivariateangles", the last one being available only in dimension 2. The g argument indicates the number of sectors. The user has to choose g only if sector = "bivariateangles", and it denotes the number of regions used to divide the plane. In this case, regions consist of points whose angle in polar coordinates is between $2(m-1)\pi/g$ and $2m\pi/g$ for $m \in \{1, \ldots, g\}$. If sector is set to "orthants", then g is fixed and equal to $2^d$, while for sector = "permutations", g is $d!$, where $d$ is the dimension of the data. The c argument is the number of spherical shells used to divide the space. The value of c should be such that the average cell counts $n/(gc)$ are not too small. nJobs represents the number of CPU cores to use for the calculation. The default value -1 indicates that all cores except one are used.
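When choosing c, it can help to compute the resulting average cell count beforehand. The sketch below assumes that the number of cells is the number of sectors g times c, with g equal to 2^d for orthants and d! for permutations. The helper average_cell_count() and the sample sizes are hypothetical, and the argument n_shells plays the role of c (it is renamed only to avoid clashing with the base R function c()).

```r
# Hypothetical helper, not part of the package.
# n: sample size, d: dimension, n_shells: plays the role of the argument c.
average_cell_count <- function(n, d, n_shells,
                               sector = c("orthants", "permutations",
                                          "bivariateangles"),
                               g = NA) {
  sector <- match.arg(sector)
  n_sectors <- switch(sector,
    orthants = 2^d,               # one sector per orthant
    permutations = factorial(d),  # one sector per coordinate ordering
    bivariateangles = {
      stopifnot(d == 2, !is.na(g))  # g must be supplied, dimension 2 only
      g
    }
  )
  n / (n_sectors * n_shells)
}

# Made-up sample size of 240 observations.
avg_orthants <- average_cell_count(n = 240, d = 3, n_shells = 3)
avg_perms <- average_cell_count(240, 3, 3, sector = "permutations")
avg_angles <- average_cell_count(240, 2, 3, sector = "bivariateangles", g = 4)
```

For a fixed sample size, the number of orthants grows quickly with the dimension, so c may have to be kept small in higher dimensions to avoid nearly empty cells.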
The asymptotic distribution is available only under sector = "orthants" when the underlying distribution is close to normal. It is a linear combination of chi-squared random variables, and it depends on eigenvalues of congruent sectors used to divide the space. In all other cases, a bootstrap procedure is required; it is applied when the user specifies the number of bootstrap replicates R. Note that by default, sector is set to "orthants" and R = NA.
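A p-value under a limiting distribution of the form "linear combination of chi-squared variables" can be approximated by simulation. The sketch below is generic: the function p_value_weighted_chisq() is hypothetical, and the weights are made up for illustration and are not the eigenvalues used by the Huffer and Park test.

```r
# Monte Carlo tail probability of sum_i weights[i] * chisq(df[i]).
# The weights used below are made up for illustration only.
p_value_weighted_chisq <- function(statistic, weights,
                                   df = rep(1, length(weights)),
                                   n_sim = 1e5) {
  stopifnot(length(weights) == length(df))
  # One column of simulated, weighted chi-squared draws per weight.
  draws <- vapply(seq_along(weights), function(i) {
    weights[i] * rchisq(n_sim, df = df[i])
  }, numeric(n_sim))
  draws <- matrix(draws, nrow = n_sim)
  mean(rowSums(draws) >= statistic)
}

set.seed(123)
# Sanity check: with three unit weights, the sum is chi-squared with 3 df.
p_equal <- p_value_weighted_chisq(5, weights = c(1, 1, 1))
p_exact <- pchisq(5, df = 3, lower.tail = FALSE)
# Unequal, hypothetical weights.
p_unequal <- p_value_weighted_chisq(5, weights = c(2, 1, 0.5))
```

With equal unit weights, the simulated value should be close to the exact chi-squared tail probability, which gives a simple check of the simulation.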
Cassart (2007) and Cassart et al. (2008) construct Pseudo-Gaussian tests for specified and unspecified location that are most efficient against a multivariate form of Fechner-type asymmetry (defined in Cassart (2007), Chapter 3).
These tests are based on Le Cam’s asymptotic theory of statistical
experiments. We start by describing the specified-location
Pseudo-Gaussian test. The unknown parameter
In most cases, the assumption of a specified center is, however,
unrealistic. Cassart (2007), therefore, proposes also a test for the
scenario when the location is not specified. The estimator of the
unknown
This test can be run in our package by calling the function PseudoGaussian() with the simple syntax
PseudoGaussian(X, location = NA).
Besides X, which is a numeric matrix of data values, now we have an extra argument location, which allows the user to specify the known location. The default is set to NA, which means that the unspecified location test will be performed unless the user specifies location.
Recently, Babić et al. (2021) proposed a new test for elliptical symmetry both for specified and unspecified location. These tests are based on Le Cam’s asymptotic theory of statistical experiments and are optimal against generalized skew-elliptical alternatives (defined in Section 2 of said paper), but they remain quite powerful under a much broader class of non-elliptical distributions.
The test statistic for the specified location scenario has a very simple
form and an asymptotic chi-square distribution. The test rejects the
null hypothesis whenever
When the location is not specified, Babić et al. (2021) propose tests that have a
simple asymptotic chi-squared distribution under the null hypothesis of
ellipticity, are affine-invariant, computationally fast, have a simple
and intuitive form, only require finite moments of order 2, and offer
much flexibility in the choice of the radial density
From this general expression, one can readily derive the test statistics
for specific choices of
These tests can be run in R using a function called SkewOptimal() with the syntax
SkewOptimal(X, location = NA, f = "t", param = NA).
Depending on the type of the test, some of the input arguments are not required. X and location are the only input arguments for the specified location test and have the same role as for the Pseudo-Gaussian test. As before, the default value for location is set to NA, which implies that the unspecified location test will be performed unless the user specifies the location. For the unspecified location test, besides the data matrix X, the input arguments are f and param. The f argument is a string that specifies the type of the radial density based on which the test is built. Currently supported options are "t", "logistic", and "powerExp". Note that the default is set to "t". The role of the param argument is as follows. If f = "t", then param denotes the degrees of freedom of the multivariate t distribution. Given that the default radial density is "t", it follows that the default value of param represents the degrees of freedom of the multivariate t distribution. If f = "powerExp", then param denotes the kurtosis parameter, in which case the value of param has to be different from 1, the value that corresponds to the multivariate normal distribution.
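The different argument combinations can be tried on simulated data, as in the sketch below. The calls follow the syntax given above, but a few things are assumptions: that location is passed as a numeric vector of length d, the value param = 6, and the simulated data themselves (independent Student t margins, which are not elliptically symmetric). The calls are only run if the package is installed.

```r
set.seed(7)
# Simulated data: 200 observations in dimension 2 with independent t margins.
X <- matrix(rt(400, df = 5), ncol = 2)

if (requireNamespace("ellipticalsymmetry", quietly = TRUE)) {
  # Unspecified location, default radial density f = "t".
  print(ellipticalsymmetry::SkewOptimal(X))
  # Same radial density with a user-chosen number of degrees of freedom.
  print(ellipticalsymmetry::SkewOptimal(X, f = "t", param = 6))
  # Logistic and power-exponential radial densities.
  print(ellipticalsymmetry::SkewOptimal(X, f = "logistic"))
  print(ellipticalsymmetry::SkewOptimal(X, f = "powerExp", param = 1.2))
  # Specified-location test, assuming the center is known to be the origin.
  print(ellipticalsymmetry::SkewOptimal(X, location = c(0, 0)))
}
```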
We conclude the description of tests for elliptical symmetry by
comparing their time complexity in terms of the big O notation
(Cormen et al. 2009). More concretely, we are comparing the number
of simple operations that are required to evaluate the test statistics
and the
The test of Koltchinskii and Sakhanenko is computationally more
demanding than the bootstrap version of the test of Huffer and Park.
Among unspecified location tests that do not require bootstrap
procedures, the most computationally expensive test is the MPQ test
under the realistic assumption that
We have seen that several tests require bootstrap procedures and
therefore are, by default, computationally demanding. Such tests require
the calculation of the statistic on the resampled data
statistics | ||
---|---|---|
KoltchinskiiSakhanenko | ||
MPQ | ||
Schott | ||
HufferPark | ||
HufferPark (bootstrap) | ||
PseudoGaussian (specified location) | ||
PseudoGaussian | ||
SkewOptimal (specified location) | ||
SkewOptimal |
Mean-Variance analysis was introduced by Markowitz (1952) as a model for portfolio selection. In this model, the portfolio risk expressed through the historical volatility is minimized for a given expected return, or the expected return is maximized given the risk. The model is widely used for making portfolio decisions, primarily because it can be easily optimized using quadratic programming techniques. However, the model has some shortcomings, an important one being that it does not consider the prior wealth of the investor who makes the decisions. This prior wealth is important since it influences the satisfaction that an investor derives from gains. For example, a gain of 50 dollars will not bring the same satisfaction to someone whose wealth is 1 dollar as to someone whose wealth is 1000 dollars. This satisfaction further affects the decision-making process in portfolio selection. Because of that and other financial reasons, a more general concept of expected utility maximization is used (see, e.g., Schoemaker 2013). However, the expected utility maximization is not an easy optimization problem, and some additional assumptions must be made in order to solve it. Hence, despite the expected utility maximization being more general, the mean-variance approach is still used due to its computational simplicity. Chamberlain (1983) showed that the two approaches coincide if the returns are elliptically distributed. In other words, under elliptical symmetry, the mean-variance optimization solves the expected utility maximization for any increasing concave utility function. Therefore, we want to test if the assumption of elliptical
symmetry holds or not for financial return data. The data set that we
analyze contains daily stock log-returns of
We test if the returns are elliptically symmetric in different time
periods using a rolling window analysis. The window has a size of one
year, and it is rolled every month, i.e., we start with the window
January 2000 - December 2000, and we test for elliptical symmetry. Then
we shift the starting point by one month, that is, we consider February
2000 - January 2001, and we test again for elliptical symmetry. We keep
doing this until the last possible window. The following tests are used for every window: the test by Koltchinskii and Sakhanenko with R = 100 bootstrap replicates, the MPQ test, Schott’s test, the bootstrap test by Huffer and Park based on orthants with c = 3 and with the number of bootstrap replicates R = 100, the Pseudo-Gaussian test, and the SkewOptimal test with the default values of the parameters. For every window, we calculate the p-value of each test.
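The rolling-window scheme can be sketched in a few lines of base R. The helpers rolling_windows() and rolling_apply() are hypothetical and are not part of the package; the daily calendar and the placeholder matrix stand in for the actual return data, and fun can be any function that maps the data in a window to a single number, for instance a function returning the p-value of one of the tests.

```r
# Hypothetical helper: one-year windows whose start is shifted by one month.
# Each window covers the dates in [start, end).
rolling_windows <- function(dates, window_months = 12) {
  first <- as.Date(format(min(dates), "%Y-%m-01"))
  last <- max(dates)
  starts <- seq(first, last, by = "month")
  first_end <- seq(first, by = paste(window_months, "months"),
                   length.out = 2)[2]
  ends <- seq(first_end, by = "month", length.out = length(starts))
  complete <- ends <= last + 1  # keep only windows fully covered by the data
  data.frame(start = starts[complete], end = ends[complete])
}

# Hypothetical helper: apply fun to the rows of X falling in each window.
rolling_apply <- function(dates, X, fun, window_months = 12) {
  w <- rolling_windows(dates, window_months)
  w$value <- vapply(seq_len(nrow(w)), function(i) {
    in_window <- dates >= w$start[i] & dates < w$end[i]
    as.numeric(fun(X[in_window, , drop = FALSE]))
  }, numeric(1))
  w
}

# Placeholder calendar and data; here fun simply counts the observations.
dates <- seq(as.Date("2000-01-01"), as.Date("2001-12-31"), by = "day")
X <- matrix(0, nrow = length(dates), ncol = 2)
res <- rolling_apply(dates, X, fun = function(M) nrow(M))
```

In this toy calendar, the first window is January 2000 - December 2000, the second one February 2000 - January 2001, and so on until the last complete window.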
Even though all these tests address the null hypothesis of elliptical symmetry, they have different powers for different alternative distributions, and some tests may fail to detect certain departures from the null hypothesis. Certain tests are also by nature more conservative than others. We refer the reader to Babić et al. (2021) for a comparative simulation study that includes the majority of the tests available in this package. This diversity in behavior presents nice opportunities. For instance, when all tests agree, we can be fairly confident about the nature of the analyzed data. One could also combine the six tests into a multiple testing setting by using a Bonferroni correction, though this is not what we are doing here.
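For readers who wish to try the Bonferroni route, a minimal sketch using the base R function p.adjust() is given below. The p-values are those reported later in this article for the window January 2008 - December 2008 (with the bootstrap version of the Huffer and Park test based on orthants); combining them in this way is only an illustration and is not part of the analysis presented here.

```r
# p-values of the six tests for the window January 2008 - December 2008,
# as reported in the data example of this article.
p_values <- c(
  KoltchinskiiSakhanenko = 0.01,
  MPQ = 0.04047,
  Schott = 0.03531,
  HufferPark = 0.15,
  PseudoGaussian = 0.02349,
  SkewOptimal = 0.006702
)

# Bonferroni correction: each p-value is multiplied by the number of tests
# (here 6) and capped at 1.
adjusted <- p.adjust(p_values, method = "bonferroni")
rejected <- adjusted < 0.05
```

After the correction, only the SkewOptimal p-value stays below 0.05 (6 × 0.006702 ≈ 0.040), which illustrates how conservative the Bonferroni approach can be.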
The following general conclusions can be drawn from Figure 1.
Figure 1: The plots show the p-values of the corresponding tests for all rolling windows that we considered between 2000 and 2020. The years on the x-axis mark the rolling windows for which the starting point is January of that year. The horizontal line present on every plot indicates the 0.05 significance level.
With the aim of guiding the reader through the functions that are available in the ellipticalsymmetry package, we now focus on the window January 2008 - December 2008. We start with the test by Koltchinskii and Sakhanenko.
> KoltchinskiiSakhanenko(data2008, R = 100)
Test for elliptical symmetry by Koltchinskii and Sakhanenko
data: data2008
statistic = 6.0884, p-value = 0.01
alternative hypothesis: the distribution is not elliptically symmetric
The KoltchinskiiSakhanenko() output is simple and clear. It reports the value of the test statistic and the p-value. Here, the number of bootstrap replicates is set to R = 100.
The MPQ test and Schott’s test can be performed by running very simple commands:
> MPQ(data2008)
Test for elliptical symmetry by Manzotti et al.
data: data2008
statistic = 25.738, p-value = 0.04047
alternative hypothesis: the distribution is not elliptically symmetric
> Schott(data2008)
Schott test for elliptical symmetry
data: data2008
statistic = 24.925, p-value = 0.03531
alternative hypothesis: the distribution is not elliptically symmetric
Given the number of the input arguments, the function for the test by Huffer and Park deserves some further comments. The non-bootstrap version of the test can be performed by running the command:
> HufferPark(data2008, c = 3)
Test for elliptical symmetry by Huffer and Park
data: data2008
statistic = 24.168, p-value = 0.109
alternative hypothesis: the distribution is not elliptically symmetric
By specifying R, the bootstrap will be applied:
> HufferPark(data2008, c = 3, R = 100)
Test for elliptical symmetry by Huffer and Park
data: data2008
statistic = 24.168, p-value = 0.15
alternative hypothesis: the distribution is not elliptically symmetric
In both cases, we used the default value of sector, that is "orthants".
If we want to change the type of sectors used to divide the space, we can do it by running the command:
> HufferPark(data2008, c = 3, R = 100, sector = "permutations")
This version yields a
Another very easy-to-use test is the Pseudo-Gaussian test:
> PseudoGaussian(data2008)
Pseudo-Gaussian test for elliptical symmetry
data: data2008
statistic = 9.4853, p-value = 0.02349
alternative hypothesis: the distribution is not elliptically symmetric
Finally, the following simple command will run the SkewOptimal test based on the radial density of the multivariate t distribution (the degrees of freedom can be changed by specifying the param argument).
> SkewOptimal(data2008)
SkewOptimal test for elliptical symmetry
data: data2008
statistic = 12.208, p-value = 0.006702
alternative hypothesis: the distribution is not elliptically symmetric
The test based on the radial density of the multivariate logistic distribution can be performed by simply adding f = "logistic":
> SkewOptimal(data2008, f = "logistic")
Similarly, the test based on the radial density of the multivariate power-exponential distribution can be performed by setting f to "powerExp", in which case the default value of the kurtosis parameter is used:
> SkewOptimal(data2008, f = "powerExp")
The kurtosis parameter can be changed by specifying param. For example,
> SkewOptimal(data2008, f = "powerExp", param = 1.2)
We can conclude that the null hypothesis is rejected at the
In this paper, we have described several existing tests for elliptical symmetry and explained in detail their R implementation in our new package ellipticalsymmetry. The implemented functions are simple to use, and we illustrate this via a real data analysis. The availability of several tests for elliptical symmetry is clearly an appealing strength of our new package.
Slađana Babić was supported by a grant (165880) as a PhD Fellow of the Research Foundation-Flanders (FWO). Marko Palangetić was supported by the Odysseus program of the Research Foundation-Flanders.
Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Babić, et al., "Elliptical Symmetry Tests in R", The R Journal, 2021
BibTeX citation
@article{RJ-2021-078, author = {Babić, Slađana and Ley, Christophe and Palangetić, Marko}, title = {Elliptical Symmetry Tests in R}, journal = {The R Journal}, year = {2021}, note = {https://rjournal.github.io/}, volume = {13}, issue = {2}, issn = {2073-4859}, pages = {661-672} }