The package lg for the R programming language provides implementations of recent methodological advances on applications of the local Gaussian correlation. This includes the estimation of the local Gaussian correlation itself, multivariate density estimation, conditional density estimation, various tests for independence and conditional independence, as well as a graphical module for creating dependence maps. This paper describes the lg package, its principles, and its practical use.
Tjøstheim and Hufthammer (2013) propose the local Gaussian correlation (LGC) as a new measure of statistical dependence between two stochastic variables, which Berentsen et al. (2014) implement in the localgauss package.
We have lately seen a number of new applications of the LGC that the localgauss package does not support, however. Støve et al. (2014) use the LGC to test for financial contagion across markets during crises. Otneim and Tjøstheim (2017) present a procedure for estimating multivariate density functions via the LGC, which Otneim and Tjøstheim (2018) modify in order to compute estimates of conditional density functions. Lacal and Tjøstheim (2017) present a test for serial independence within a time series, which Lacal and Tjøstheim (2018) extend in order to include a test for cross-correlation between two time series. Finally, Otneim and Tjøstheim (2021) develop the local Gaussian partial correlation (LGPC) as a measure of conditional dependence and a corresponding test for conditional independence.
This paper describes the lg package (Otneim 2019), which provides a unified framework to implement all these methods, as well as a tool for visualizing the LGC and LGPC as dependence maps. Jordanger and Tjøstheim (2020) use the LGC in spectral analysis of time series, but those methods have their own computational ecosystem in the localgaussSpec package (Jordanger 2018).
In Section 2, we provide a brief introduction to the LGC as well as the methods and applications referred to above. In Section 3, we describe the core function in the lg package and move on to demonstrate the implementation of various applications in Section 4. We conclude this paper in Section 5 by demonstrating the graphical capabilities of the lg package.
Consider a random vector with an unknown joint density. Global summaries such as the Pearson correlation describe the dependence between its components with a single number. A different approach is to estimate the dependence locally, by fitting a family of parametric densities in each point. Hjort and Jones (1996) provide one such idea. They consider a parametric family of densities and fit it locally at each point x by maximizing a kernel-weighted local likelihood function, so that the fitted parameters become functions of x. Tjøstheim and Hufthammer (2013) consider the bivariate case and use the Gaussian family for this purpose: at each point x they fit a bivariate normal distribution with local means, local standard deviations, and a local correlation ρ(x). The local correlation ρ(x) is the LGC; it coincides with the ordinary correlation if the data are jointly normal, and it describes the strength of the dependence between the two variables in a neighborhood of x.
Figure 1: Estimated local Gaussian correlations on a grid for 1000 simulated observations from a bivariate normal distribution with correlation 0.5, and for 1000 daily returns on the CAC40 and FTSE100 stock indices.
In Figure 1, we have plotted the estimated LGC for two bivariate data sets on a grid; 1000 simulated observations from a binormal distribution having correlation equal to 0.5, and the daily return on the CAC40 and FTSE100 stock indices on 1000 consecutive trading days starting on May 5th 2014 (Datastream 2018). In the first panel, we see that the estimated local correlation coincides with the global correlation, except for the estimation error which is comparable to the uncertainty observed in other nonparametric estimation methods such as the kernel density estimator (see, for instance, Otneim and Tjøstheim (2017) for a formal asymptotic analysis of relevant convergence rates). In the second panel, we see clearly that the local correlation, and thus the dependence, is stronger in the lower left and upper right regions of the distribution than in the central parts. The phenomenon of local dependence is well known in the financial literature, and using the LGC it can be measured, interpreted, and visualized in a natural way. The interpretation of this particular figure is that extreme observations on the two stock indices are more strongly dependent than the less extreme observations.
One may obtain these particular estimates from the older localgauss package (as well as the lg package, of course), but the plotting routine that was used to produce these figures is included in the lg package and will be described in more detail in Section 5.
Taking the LGC as a measure of dependence opens up a number of possibilities for constructing statistical tests. Berentsen and Tjøstheim (2014) show that independence between the two variables implies that the LGC is zero everywhere, and they exploit this to construct a test for independence by aggregating a function of the estimated LGC over a suitable region and approximating its null distribution with the bootstrap. Consider next a stationary time series. Lacal and Tjøstheim (2017) use the same machinery to test for serial independence between X_t and X_{t-k} for a given lag k, and Lacal and Tjøstheim (2018) extend this problem to test for serial cross-dependence between two time series at a given lag.
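The exact test statistics differ slightly between these papers; as a rough sketch of their common structure (mirroring the h and S arguments of the ind_test() function described in Section 4.2), the estimated LGC is aggregated over a region S with respect to the empirical distribution function F_n,

$$T = \int_S h\bigl(\hat{\rho}(x)\bigr)\, \mathrm{d}F_n(x),$$

with h(r) = r^2 as the typical choice, and the null distribution of T is approximated by resampling. This is only an illustrative form; see the cited papers for the precise definitions.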
We find another application of the local Gaussian approximation in work by Støve et al. (2014), who measure and test for financial contagion. They define contagion as "a significant increase in cross-market linkages after a shock to one country" (Forbes and Rigobon 2002, 2223) and employ the LGC to quantify this potential linkage. The authors estimate the LGC on a grid of points along the diagonal for a non-crisis period and for a crisis period, and test whether the local correlations have increased significantly from the former to the latter, again using the bootstrap to approximate the null distribution of the test statistic.
Although the original work by Hjort and Jones (1996) provides a general framework for local likelihood density estimation using any parametric family of densities, the Gaussian family is particularly tractable in the multivariate case. Otneim and Tjøstheim (2017) exploit this by first transforming each variable to marginal standard normality, so that the local means and local standard deviations can be fixed at 0 and 1, and then estimating each pairwise local correlation as a function of its own pair of variables only. Collecting the pairwise local correlations in a local correlation matrix yields a simplified, locally Gaussian estimate of the joint density that works in principle in any dimension.

One particular feature enjoyed by a jointly normally distributed vector is that its conditional distributions are also Gaussian, with conditional mean and covariance matrix given by well-known formulas. Otneim and Tjøstheim (2018) apply these formulas locally to the estimated local parameters, which results in an estimate of the conditional density of a set of free variables given the remaining, conditioning variables.

Finally, we refer to Otneim and Tjøstheim (2021), who take the local version of the conditional covariance matrix (12) (or (14) in the transformed case) as a measure of conditional dependence, and thus as an instrument for testing conditional independence. Considering a stochastic vector consisting of two variables of interest and a set of conditioning variables, they call the corresponding off-diagonal element of the normalized local conditional covariance matrix the local Gaussian partial correlation (LGPC), which forms the basis of a bootstrap test for conditional independence.
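As background, the classical Gaussian conditioning formula that these local constructions adapt is, for a jointly normal vector partitioned into blocks X_1 and X_2 with covariance matrix Σ partitioned accordingly,

$$\Sigma_{1|2} = \Sigma_{11} - \Sigma_{12}\,\Sigma_{22}^{-1}\,\Sigma_{21},$$

the conditional covariance matrix of X_1 given X_2. The local versions (12) and (14) referred to above in essence replace Σ with its locally estimated counterpart.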
The local Gaussian correlation may be used to perform a number of statistical analyses, as is evident from the preceding section. The practitioner must first, however, make three quite specific modeling choices: (i) choose an estimation method, i.e., the level of simplification in multivariate applications, (ii) determine whether the data should be transformed towards marginal standard normality before estimating the LGC, and (iii) choose a set of bandwidths, or at least a method for calculating them. The architecture of the lg package requires the user to make these choices before moving on to specific applications, by imposing a strict two-step procedure:
Create an lg-object.
Apply relevant analysis functions to the lg-object.
In the following, we assume that one has a data set x loaded into the R workspace, which must be a numeric matrix with one column per variable (possibly containing NAs, which will be excluded from the analysis) or a data frame having the same dimensions. The fundamental syntax for creating an lg-object is lg_object <- lg_main(x), and we will, in this section, explain how the modeling decisions (i)-(iii) can be encoded into the lg-object by using appropriate arguments in this function.
Given a data set x, the user chooses among four estimation methods by passing one of the following values of the est_method argument to the lg_main()-function.
We look at the built-in bivariate data set faithful
, which records the
waiting time between eruptions and the duration of the eruption for the
Old Faithful geyser in the Yellowstone National Park, USA (see the help
file in R for more details: ?faithful
), and load the
lg package in order to
demonstrate the implementation:
R> x <- faithful
R> library(lg)
1. A full locally Gaussian fit for bivariate data. If est_method = "5par" in the call to lg_main(), all five parameters of the local bivariate Gaussian (two local means, two local standard deviations, and the local correlation) are estimated. This option is only available for bivariate data:
R> lg_object <- lg_main(x, est_method = "5par")
The resulting lg_object
is a list of class lg
, and we may confirm
that the assignment has been carried out correctly by inspecting its
est_method
-element:
R> lg_object$est_method
[1] "5par"
Note that the full locally Gaussian fit for raw data is not available if the number of variables exceeds two (except in the special trivariate case described below). The lg_main()-function will check for this and print out an error message if est_method = "5par" is requested for data with more than two variables.
2. A simplified locally Gaussian fit for multivariate data. As described in the preceding section, we may construct a simplified estimation procedure for calculating the LGC in two steps, which in principle works for any number of dimensions (including the bivariate case):

Calculate the local means and local standard deviations of each variable separately by fitting a univariate local Gaussian to each margin.

Keep these marginal estimates fixed in the second step, in which the local correlation of each pair of variables is estimated as a function of that pair only.
To use this method, create the lg-object by running the following line:
R> lg_object2 <- lg_main(x, est_method = "5par_marginals_fixed")
3. A simplified locally Gaussian fit for marginally standard normal data. This estimation method is applicable for marginally standard normal data, or data that have been transformed to approximate marginal standard normality by, e.g., the transformation (7). In that case, we fix the marginal expectation functions and standard deviation functions to the constant values 0 and 1, respectively, and estimate only the pairwise local correlations:
R> lg_object3 <- lg_main(x, est_method = "1par")
Note that the function call above will issue a warning if the option for transforming the data to marginal standard normality is not at the same time set to TRUE; see the next subsection on data transformation for details.
4. A full locally Gaussian fit for trivariate data. If the number of variables is exactly three, a full local fit (with three local means, three local standard deviations, and three pairwise local correlations) is available by setting est_method = "trivariate":
R> lg_object4 <- lg_main(x, est_method = "trivariate")
This command will throw an error if the data set x
does not have
exactly three columns.
Figure 2: The faithful data, and the same data transformed to marginal standard normality.
Next, the user must determine if the local Gaussian correlation should
be estimated directly on the raw data or on the marginally normal pseudo
observations (7). This is carried out by using the logical
transform_to_marginal_normality
-argument in lg_main
, for example:
R> lg_object <- lg_main(x, transform_to_marginal_normality = TRUE)
The resulting lg_object
now includes the element
transform_to_marginal_normality
set according to the input, and if
this is TRUE
, it also includes the transformed_data
and a function
trans_new()
that may be used later to apply the same transformation
to, e.g., grid points. If the transformation option is set to FALSE
,
the transformed_data
element contains the input data x
, and
trans_new()
is nothing more than the identity mapping.
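To fix ideas, here is a minimal sketch of a transformation of the type in (7), based on the rescaled empirical distribution function of each margin. This is only an illustration of the idea; it is not necessarily the exact routine used internally by lg_main() and trans_new(), and the function name is ours.

R> # Sketch: transform each column to approximate marginal standard
R> # normality via the rescaled empirical distribution function.
R> to_marginal_normal <- function(x) {
R+   apply(x, 2, function(column) qnorm(rank(column) / (length(column) + 1)))
R+ }
R> z <- to_marginal_normal(faithful)
R> # Each column of z is now approximately standard normal.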
Finally, the user must specify a set of bandwidths or a method for
calculating them. Given that the different estimation methods described
in Section 3.1 require different sets of bandwidths
(i.e., joint, marginal, and/or pairwise), the easiest approach for the
user is to leave the selection and formatting of the bandwidths to the
lg_main()
-function.
The bandwidth plays a slightly different role in local likelihood
estimation than elsewhere in the nonparametric literature. It controls
the level of localization and thus only indirectly the smoothness of
the estimates. Indeed, suppose we concentrate on the univariate case for the moment and assume that the (single) bandwidth is very large. The local likelihood weights are then approximately constant over the sample, and the local fit collapses to the globally fitted Gaussian distribution. If the bandwidth is small, on the other hand, the estimation becomes genuinely local, and the fitted parameters pick up departures from the global Gaussian approximation in different parts of the distribution, at the cost of a larger estimation variance. In the multivariate applications referred to in this paper, the bandwidths consist of marginal and pairwise joint bandwidths (depending on the estimation method), and they must be specified or selected before estimation.
We have in practice seen two automatic bandwidth selectors employed in
the applications referred to in Section 2: a
cross-validation procedure that is fairly slow to compute but accurate
with respect to density estimation, and a plug-in bandwidth that is much
quicker to calculate but less accurate with respect to density
estimation. We use the argument bw_method
to the lg_main()
-function
in order to choose between the two.
1. Choosing bandwidths by cross-validation. The cross-validation criterion selects the bandwidths that maximize a leave-one-out estimate of the expected log-likelihood of the density estimate. We choose this method by passing bw_method = "cv" to lg_main():
R> # Create the lg-object with bandwidths chosen by cross-validation
R> lg_object <- lg_main(x,
R+ est_method = "5par_marginals_fixed",
R+ transform_to_marginal_normality = TRUE,
R+ bw_method = "cv")
The lg_object
now contains the necessary bandwidths for this
configuration, as can be seen by inspecting the contents of its
bw
-element:
R> # Print out the bandwidths
R> lg_object$bw
$marginal
[1] 0.9989327 0.9875333
$marginal_convergence
[1] 0 0
$joint
x1 x2 bw1 bw2 convergence
1 1 2 0.2946889 0.331971 0
This is itself a list. The element marginal contains the marginal bandwidths (one per variable, used to estimate the marginal local means and standard deviations), while the data frame joint has one row per pair of variables; the pair is identified by the x1 and x2 columns, and the corresponding bandwidths are found in the bw1 and bw2 columns. The convergence flags stem from the built-in R functions optim() and optimize() that we use to obtain the minimizer of the cross-validation criterion.
2. Using plug-in bandwidths. Obtaining cross-validated bandwidths is unfortunately fairly slow on a standard computer. For sample sizes in the 500-1000 range, the process may take several minutes, which is infeasible when embarking on analyses that require, e.g., resampling. We have therefore implemented a quick plug-in bandwidth selector as well, which may suffice in many practical situations, especially at the initial or exploratory stage.
Otneim and Tjøstheim (2017) show that the simplified version of the local Gaussian fit has the same convergence rates as the corresponding nonparametric kernel density estimator, for which Silverman (1986) derives the well-known plug-in formula with the bandwidth proportional to n^(-1/5) in the univariate case. If bw_method = "plugin", the lg_main()-function will select the bandwidths correspondingly, except that the exponent changes to -1/6 for the pairwise joint bandwidths:
R> # Make the lg-object with plugin bandwidths
R> lg_object <- lg_main(x,
R+ est_method = "5par_marginals_fixed",
R+ transform_to_marginal_normality = TRUE,
R+ bw_method = "plugin")
R> # Print out the bandwidths
R> lg_object$bw
$marginal
[1] 0.5703274 0.5703274
$marginal_convergence
[1] NA NA
$joint
x1 x2 bw1 bw2 convergence
1 1 2 0.6875061 0.6875061 NA
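The numbers above are consistent with a Silverman-type plug-in formula of the form C·n^a with proportionality constant C = 1.75 and exponents a = -1/5 (marginal) and a = -1/6 (joint); these constants are inferred from the output rather than quoted from the documentation, but they are easy to verify by hand:

R> # Reproduce the plug-in bandwidths (constants inferred from the output
R> # above; see Table 1 for the corresponding arguments).
R> n <- nrow(x)        # 272 observations in the faithful data
R> 1.75 * n^(-1/5)     # marginal bandwidths: 0.5703274
R> 1.75 * n^(-1/6)     # joint bandwidth:     0.6875061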
Argument | Explanation | Default value |
---|---|---|
x | The data, a numeric matrix or data frame | |
bw_method | Method for calculating the bandwidths | "plugin" |
est_method | Estimation method | "1par" |
transform_to_marginal_normality | Transform the data to marginal standard normality | TRUE |
bw | The bandwidths to use if already calculated | NULL |
plugin_constant_marginal | Prop. const. in plugin formula for marg. bw. | |
plugin_exponent_marginal | Exponent in plugin formula for marg. bw. | |
plugin_constant_joint | Prop. const. in plugin formula for joint bw. | |
plugin_exponent_joint | Exponent in plugin formula for joint bw. | |
tol_marginal | Abs. tolerance when optimizing the marginal bandwidths | |
tol_joint | Abs. tolerance when optimizing the joint bandwidths | |

Table 1: Arguments to the lg_main()-function.
In the sub-section above, we present the three most important arguments
to lg_main()
. Each of them allows the user to configure one of the
three crucial modeling choices. Let us complete this treatment by
covering some possibilities to make further adjustments to those
choices.
The user may supply the bandwidths directly to lg_main()
by
passing them to the bw
-argument. They have to be in the correct
format, though, which is a list containing the vector $marginal
if
est_method = "5par_marginals_fixed"
, and always a data frame
$joint
specifying all variable pairs in the x1
and x2
columns
and the corresponding bandwidths in the bw1
and bw2
columns. The
function bw_simple()
will assist in creating bandwidth objects.
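As a minimal sketch of the list format described above (the numerical values and the object names are arbitrary and chosen only for illustration; bw_simple() can construct such objects for you):

R> # Supply precomputed bandwidths in the documented list format.
R> my_bw <- list(marginal = c(1, 1),
R+               joint = data.frame(x1 = 1, x2 = 2, bw1 = 0.5, bw2 = 0.5))
R> lg_object_manual <- lg_main(x,
R+                             est_method = "5par_marginals_fixed",
R+                             transform_to_marginal_normality = TRUE,
R+                             bw = my_bw)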
If bw_method = "plugin"
the user may change the proportionality
constant and exponent in the plugin formula for the joint and, if
applicable, the marginal bandwidths. See Table
1 for the necessary argument names.
If bw_method = "cv", the user may change the numerical tolerance in the optimization of the cross-validation criterion through the tol_marginal and tol_joint arguments; see Table 1.
We proceed in this section to demonstrate how to implement each of the tasks that we discussed in Section 2. The general pattern is to pass the lg-object to one of the estimation or test functions provided in the lg package. We will look at some financial data in the examples: the monthly returns on the S&P500, FTSE100, DAX30, and TOPIX stock indices from January 1985 to March 2018 (Datastream 2018).
We start by introducing a basic function for estimating the LGC on a
grid as described by Otneim and Tjøstheim (2017), and thus also a probability
density estimate. We create a grid, x0
, having the same number of
columns as the data in the code below. Note that we use the pipe
operator %>%
from the
magrittr package
(Bache and Wickham 2014) as well as functions from the
dplyr package
(Wickham et al. 2018) for easy manipulation of data frames. We then
pass the grid and the lg-object containing our modeling choices to the
dlg()
-function in order to do the estimation.
Figure 3: Local Gaussian correlations (a) and the corresponding density estimate along the diagonal (b) for the monthly stock returns, as estimated by the dlg()-function.
R> # Create an lg-object
R> lg_object <- lg_main(x = stock_data %>% select(-Date),
R+ est_method = "1par",
R+ bw_method = "plugin",
R+ transform_to_marginal_normality = TRUE)
R>
R> # Construct a grid diagonally through the data.
R> grid_size <- 100
R> x0 <- stock_data %>%
R+ select(-Date) %>%
R+ apply(2, function(y) seq(from = -7,
R+ to = 7,
R+ length.out = grid_size))
R>
R> # Estimate the local Gaussian correlation on the grid
R> density_object <- dlg(lg_object, grid = x0)
The last line of code creates a list containing a number of elements.
The two most important are $loc_cor
, which is a matrix of local
correlations having one row per grid point and one column per pair of
variables (the columns correspond to the rows in
density_object$pairs
), and $f_est, which is a vector containing the estimated density value in each grid point of x0. The estimated local correlations for this example are plotted in Figure 3a, and the corresponding density estimate is plotted (along the diagonal of the grid)
The list density_object
contains the estimated standard deviations of
the local correlations in $loc_cor_sd
, as well as lower and upper
confidence bands $loc_cor_lower
and $loc_cor_upper
at the 95% level.
We refer to Table 2 for a complete overview of the arguments
to dlg()
.
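For instance, a quick way to inspect the estimated local correlations for the first pair of variables together with their confidence bands, using the element names listed above (the variable name pair is ours), is:

R> # Sketch: local correlations for the first pair of variables with
R> # 95% confidence bands (columns refer to rows of density_object$pairs).
R> pair <- 1
R> head(data.frame(lower    = density_object$loc_cor_lower[, pair],
R+                 estimate = density_object$loc_cor[, pair],
R+                 upper    = density_object$loc_cor_upper[, pair]))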
Note that the configuration transform_to_marginal_normality = TRUE and est_method = "5par" in the bivariate case coincides with the situation considered by Tjøstheim and Hufthammer (2013). In that case, dlg() serves as a wrapper for the function localgauss() in the localgauss package (Berentsen et al. 2014).
Obtaining the estimate of a conditional density using the
Otneim and Tjøstheim (2018) algorithm described in Section 2 is
very similar. However, one must take particular care of the ordering
of the variables in the data set. The estimation function, clg()
, will
always assume that the free variables come first and the conditioning
variables last. Let us illustrate this in the following code chunk by
estimating the joint conditional density of S&P500 and FTSE100, given
that DAX30 = TOPIX = 0.
Argument | Explanation | Default value |
---|---|---|
lg_object | The lg-object created by lg_main() | |
grid | The evaluation points for the LGC | NULL |
level | Level for confidence bands | 0.95 |
normalization_points | The estimated density does not integrate to one by construction. dlg() will generate the given number of normal variables, having the same moments as the data, approximate the integral, and then normalize the density estimate accordingly | NULL |
bootstrap | Calculate bootstrapped confidence intervals instead of asymptotic expressions | FALSE |
B | Number of bootstrap replicates | 500 |

Table 2: Arguments to the dlg()-function.
R> # We must make sure that the free variables come first
R> returns1 <- stock_data %>% select(SP500, FTSE100, DAX30, TOPIX)
R>
R> # Create the lg-object
R> lg_object <- lg_main(returns1,
R+ est_method = "1par",
R+ bw_method = "plugin",
R+ transform_to_marginal_normality = TRUE)
R>
R> # Create a grid
R> x0 <- returns1 %>%
R+ select(SP500, FTSE100) %>%
R+ apply(2, function(y) seq(from = -7,
R+ to = 7,
R+ length.out = grid_size))
R>
R> # Calculate the conditional density
R> cond_density <- clg(lg_object, grid = x0, cond = c(0, 0))
The key argument in the call to clg() above is cond = c(0, 0). This means that the last two variables are conditioning variables, each fixed at the value zero, while the first two variables are the free variables. The number of columns in the grid x0 plus the number of elements in cond must equal the number of variables in the data set, and clg() will result in an error message if this requirement is not fulfilled. The clg()-function takes mostly the same arguments as dlg() listed in Table 2, and the conditional density estimate in our example is available in the vector cond_density$f_est.
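As a small illustration, assuming the objects created in the code chunk above, the estimated conditional density can be inspected along the grid as follows:

R> # Sketch: plot the estimated conditional density of (SP500, FTSE100)
R> # given DAX30 = TOPIX = 0 along the diagonal grid constructed above.
R> plot(x0[, "SP500"], cond_density$f_est, type = "l",
R+      xlab = "S&P500 return (FTSE100 varies along the diagonal)",
R+      ylab = "Estimated conditional density")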
Three independence tests based on the LGC have appeared in the literature thus far:

A test for independence between two stochastic variables based on i.i.d. data (Berentsen and Tjøstheim 2014).

A test for serial independence between X_t and X_{t-k} within a time series (Lacal and Tjøstheim 2017).

A test for serial cross-dependence between two time series at a given lag (Lacal and Tjøstheim 2018).

As we noted in Section 2, their practical implementations are very similar, and the lg package provides the function ind_test() to perform all three tests. Let us first consider the i.i.d. case and generate 500 observations, test_x, from the well-known parabola model, in which one variable is (up to noise) the square of the other, so that the variables are uncorrelated but strongly dependent.
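The exact parametrization of the parabola model is not given here, so the following data-generating code is only an assumed example of such a model:

R> # Assumed parabola model: X standard normal, Y = X^2 + noise.
R> set.seed(1)
R> x1 <- rnorm(500)
R> test_x <- data.frame(x1 = x1, x2 = x1^2 + rnorm(500, sd = 0.25))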
R> # Make the lg-object
R> lg_object <- lg_main(test_x,
R+ est_method = "5par",
R+ transform_to_marginal_normality = TRUE)
R> # Perform the independence test
R> test_result <- ind_test(lg_object, n_rep = 100)
R> # Print out the p-value of the test
R> test_result$p_value
[1] 0
This may take a few minutes to run on a desktop computer due to bootstrapping. The small p-value means that we reject the null hypothesis of independence in this case, as we would expect for data generated from the parabola model.
Argument | Explanation | Default value |
---|---|---|
lg_object | The lg-object created by lg_main() | |
h | The function h applied to the estimated LGC in the test statistic | function(x) x^2 |
S | The integration area, must be a logical function on potential grid points | function(x) as.logical(rep(1, nrow(x))) |
bootstrap_type | The bootstrap method, must be either "plain", "block" or "stationary" | "plain" |
block_length | Block length for the block bootstrap, mean block length for the stationary bootstrap. Calculated by np::b.star() (Hayfield and Racine 2008) if not provided | NULL |
n_rep | Number of bootstrap replicates | 1000 |

Table 3: Arguments to the ind_test()-function.
The only difference when testing for serial independence within a time series is that the data set must consist of two columns: the time series itself and its lagged values:
R> returns2 <- stock_data %>% select(SP500) %>%
R+ mutate(sp500_lagged = lag(SP500))
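From here, the test proceeds exactly as in the i.i.d. case. A minimal sketch (the object names and the choice of est_method are ours, and we drop the NA introduced by lag() explicitly to be safe):

R> # Sketch: test for serial independence at lag one in the S&P500 returns.
R> lg_object_serial <- lg_main(na.omit(returns2),
R+                             est_method = "5par",
R+                             transform_to_marginal_normality = TRUE)
R> serial_test <- ind_test(lg_object_serial, n_rep = 100)
R> serial_test$p_value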
Finally, the only thing that we must alter in order to perform the third
test for serial cross-dependence is the bootstrap method. In the
applications above, it suffices to use the standard bootstrap, where we
resample with replacement from the data. This is implemented in the
ind_test()
-function by setting the bootstrap_type
-argument to
"plain"
, which is the default option. When testing for serial
cross-dependence, we need to use a block-bootstrap procedure, and
Lacal and Tjøstheim (2018) consider two options here: The block bootstrap with
either fixed (Kunsch 1989) or random (Politis and Romano 1994) block sizes. This
choice is specified by choosing bootstrap_type = "block"
or
bootstrap_type = "stationary"
, respectively, in the call to
ind_test()
. Lacal and Tjøstheim (2018) do not report significant differences in
test performance using the different bootstrap types.
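A minimal sketch of the cross-dependence test under assumed choices (here: the S&P500 return against the one-month-lagged FTSE100 return, with the stationary bootstrap; the object names are ours):

R> # Sketch: serial cross-dependence test with the stationary bootstrap.
R> returns3 <- stock_data %>%
R+   transmute(SP500 = SP500, FTSE100_lag = lag(FTSE100)) %>%
R+   na.omit()
R> lg_object_cross <- lg_main(returns3,
R+                            est_method = "5par",
R+                            transform_to_marginal_normality = TRUE)
R> cross_test <- ind_test(lg_object_cross,
R+                        bootstrap_type = "stationary",
R+                        n_rep = 100)
R> cross_test$p_value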
Assume that we observe two financial time series covering a non-crisis period and a subsequent crisis period, and that we wish to test whether the cross-market dependence is significantly stronger in the latter.
Let us illustrate the implementation of this test by looking at the same
financial returns data that we have used in preceding sections. However,
this time we will, in the spirit of Støve et al. (2014), concentrate on
GARCH(1,1)-filtrated daily returns on the S&P500 and FTSE100 indices
from 2 January 1985 to 29 April 1988 in order to test for financial
contagion between the US and UK stock markets following the global stock
market crash of 19 October 1987 (“Black Monday”). Assume that these observations are loaded into the R workspace as the data sets x_nc, containing the observations covering the non-crisis period before the crash, and x_c, containing the observations covering the crisis period after the crash. We then create one lg-object for each period and pass both to the cont_test()-function. This function returns a list containing, among other things, the estimated p-value of the test.
Argument | Explanation | Default value |
---|---|---|
lg_object_nc | The lg-object covering the non-crisis period | |
lg_object_c | The lg-object covering the crisis period | |
grid_range | Range of diagonal for measuring the LGC | (5%, 95%) quantiles |
grid_length | The number of grid points to use | 30 |
n_rep | Number of bootstrap replicates | 1000 |
weight | Weight function | function(y) rep(1, nrow(y)) |

Table 4: Arguments to the cont_test()-function.
R> # Create the two lg-objects
R> lg_object_nc <- lg_main(x_nc,
R+ est_method = "5par",
R+ transform_to_marginal_normality = FALSE)
R>
R> lg_object_c <- lg_main(x_c,
R+ est_method = "5par",
R+ transform_to_marginal_normality = FALSE)
R>
R> # Run the test with a limited number of bootstrap replicates for
R> # demonstration purposes.
R> result <- cont_test(lg_object_nc, lg_object_c, n_rep = 100)
R>
R> # Print out the p-value
R> result$p_value
[1] 0.01
The small p-value indicates that the cross-market dependence is significantly stronger in the crisis period than in the non-crisis period, which in this setup is interpreted as evidence of financial contagion.
Consider finally the work by Otneim and Tjøstheim (2021), who take the off-diagonal element in the local correlation matrix corresponding to the local conditional covariance matrix (12), or (14) in the transformed case, as a local measure of conditional dependence between two stochastic variables given the remaining variables. The lg package provides two functions for this methodology. The first, partial_cor(), calculates the local partial correlations, as well as their estimated standard deviations, on a specified grid, and its arguments closely resemble those of the clg()-function presented in Section 4.1. See Table 5 for details. The second function, ci_test(), performs a test for conditional independence between the first two variables in a data set given the remaining variables, using the test statistic (17) and a special bootstrap procedure (described briefly below) for approximating the null distribution.
Argument | Explanation | Default value |
---|---|---|
lg_object | The lg-object created by lg_main() | |
grid | The evaluation points for the LGPC, must be a data frame or matrix having 2 columns | NULL |
cond | Vector with fixed values for the conditioning variables | NULL |
level | Significance level for approximated confidence bands | 0.95 |

Table 5: Arguments to the partial_cor()-function.
Argument | Explanation | Default value |
---|---|---|
lg_object | The lg-object created by lg_main() | |
h | The function h applied to the estimated LGPC in the test statistic | function(x) x^2 |
n_rep | Number of bootstrap replicates | 500 |

Table 6: Selected arguments to the ci_test()-function.
It is well known in the econometrics literature that conditional independence tests are instrumental in the empirical detection of Granger causality (Granger 1980). For example, if we continue to concentrate on the monthly stock returns data that we have already loaded into memory, we may test whether one return series Granger-causes another by testing whether the first series is conditionally independent of the lagged values of the second, given its own lagged values. When implementing such a test with ci_test(), the data set x must be a data frame having its columns strictly ordered so that the two variables being tested for conditional independence come first and the conditioning variables come last.
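As a hypothetical example of such a data set (the particular choice of variables is an assumption on our part), we may construct returns4 to test whether this month's S&P500 return is conditionally independent of last month's FTSE100 return, given last month's S&P500 return:

R> # Hypothetical construction of returns4: the two variables under test
R> # first (SP500 and lagged FTSE100), the conditioning variable last.
R> returns4 <- stock_data %>%
R+   transmute(SP500       = SP500,
R+             FTSE100_lag = lag(FTSE100),
R+             SP500_lag   = lag(SP500)) %>%
R+   na.omit()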
The critical values of this test are calculated using the bootstrap under the null hypothesis, by independently resampling replicates from the conditional density estimates of the two first variables given the rest, as produced by the clg()-function, using an approximated accept-reject algorithm designed to avoid excessive optimization of the local likelihood function (1).
We refer to the documentation of the lg package for details on how to fine-tune the behavior of the bootstrapping algorithm by altering the arguments of the ci_test()-function, and limit our treatment here to the arguments most suitable for modification by the user, which are listed in Table 6.
R> # Create the lg-object
R> lg_object <- lg_main(returns4)
R>
R> # Perform the test
R> test_result <- ci_test(lg_object, n_rep = 100)
R>
R> # Print out result
R> test_result$p_value
[1] 0.51
The conditional independence test does not provide evidence against the null hypothesis (18).
We conclude this article by describing the corplot()
function for
drawing dependence maps such as those displayed in Figure
1. Berentsen et al. (2014) report on such
capabilities in the
localgauss package,
but the possibility of creating dependence maps was unfortunately
removed from
localgauss in the
latest version 0.4.0 due to incompatibilities with the
ggplot2 (Wickham 2016)
plotting engine. We make up for this loss by providing corplot()
, a
function that plots the estimated local correlations as provided by
dlg()
, or the estimated local partial correlations as provided by
partial_cor()
.
The plotting function is highly customizable and provides a number of
options covering most basic graphical options. Users well versed in the
ggplot2 package may also
modify the graphical object returned by corplot()
in the standard way
by adding layers as demonstrated in the example below.
In the first example, we generate a set of bivariate normally
distributed data using the
mvtnorm package
(Genz et al. 2018) and estimate the local Gaussian correlation on a
regular grid
using the dlg()
-function. Passing the resulting
dlg_object
to corplot()
without further arguments results in Figure
4.
R> # Make a regular grid in the domain of the distribution
R> grid <- expand.grid(seq(-3, 3, length.out = 7),
R+ seq(-3, 3, length.out = 7))
R>
R> # Correlation of the simulated data (value chosen for illustration)
R> rho <- 0.5
R> x <- mvtnorm::rmvnorm(500, sigma = matrix(c(1, rho, rho, 1), 2))
R> lg_object <- lg_main(x,
R+ est_method = "5par",
R+ transform_to_marginal_normality = FALSE,
R+ plugin_constant_joint = 4)
R> dlg_object <- dlg(lg_object, grid = grid)
R>
R> # Make a dependence map using default setup
R> corplot(dlg_object)
We may tweak the appearance of our dependence map by passing further arguments to corplot(). Some of the options are demonstrated in the code chunk below, in which we, for example, superimpose the observations (by setting plot_obs = TRUE) and prevent the estimated local correlations from being plotted in areas without data. The latter option is available through the argument plot_thres, which works by calculating a bivariate kernel density estimate of the data and suppressing the color tiles at grid points where this density estimate falls below the value of plot_thres. Adding layers to a dependence map using the ordinary
ggplot2 syntax works as
well, which we demonstrate in Figure 5 by changing the
ggplot2 theme.
The plotting function works in the same way when plotting the local
partial correlations returned by partial_cor()
, and the arguments of
corplot()
are summarized in Table 7.
Argument | Explanation | Default value |
---|---|---|
dlg_object | The object created by dlg() or partial_cor() | |
pair | Which pair to plot if more than two variables | 1 |
gaussian_scale | Logical. Plot on the marginal st. normal scale? | FALSE |
plot_colormap | Logical. Plot the colormap? | TRUE |
plot_obs | Logical. Superimpose observations? | FALSE |
plot_labels | Logical. Plot labels on dependence map? | TRUE |
plot_legend | Logical. Add legend? | FALSE |
plot_thres | Threshold for plotting the estimated LGC | 0 |
alpha_tile | Transparency of color tiles | 0.8 |
alpha_point | Transparency of points | 0.8 |
low_color | Color representing negative local correlation | "blue" |
high_color | Color representing positive local correlation | "red" |
break_int | Break interval for color coding | 0.2 |
label_size | Size of labels in plot | 3 |
font_family | Font family for labels | "sans" |
point_size | Size of points, if plotted | NULL |
xlim, ylim | Axis limits | NULL |
xlab, ylab | Axis labels | NULL |
rholab | Title of legend | NULL |
main, subtitle | Title and subtitle of plot | NULL |

Table 7: Arguments to the corplot()-function.
R> corplot(dlg_object1,
R+ plot_obs = TRUE,
R+ plot_thres = 0.01,
R+ plot_labels = FALSE,
R+ alpha_point = 0.3,
R+ main = "",
R+ xlab = "",
R+ ylab = "") +
R+ theme_classic()
Figure 4: Dependence map produced by corplot() using the default configuration.
Figure 5: Dependence map produced by corplot() with customized options and an added ggplot2 theme.
The statistical literature has seen a number of applications of local Gaussian approximations in the last decade, covering several topics in dependence modeling and inference, as well as the estimation of multivariate density and conditional density functions. In this paper, we demonstrate the implementation of these methods in the R programming language using the lg package, as well as the graphical representation of the estimated local Gaussian correlation. The package is complete in the sense that all major methods that have been published within this framework are now easily accessible to the practitioner. The package is also designed with a modular infrastructure that allows future methodological developments using local Gaussian approximations to be easily added to the package.
The author gratefully acknowledges the constructive comments made by two anonymous referees.
Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Otneim, "lg: An R package for Local Gaussian Approximations", The R Journal, 2021
BibTeX citation
@article{RJ-2021-079, author = {Otneim, Håkon}, title = {lg: An R package for Local Gaussian Approximations}, journal = {The R Journal}, year = {2021}, note = {https://rjournal.github.io/}, volume = {13}, issue = {2}, issn = {2073-4859}, pages = {38-56} }