The Autoregressive Distributed Lag approach to cointegration or bound testing, proposed by Pesaran in 2001, has become prominent in empirical research. Although this approach has many advantages over the classical cointegration tests, it is not exempt from drawbacks, such as possible inconclusive inference and distortion in size. Recently, Bertelli and coauthors developed a bootstrap approach to the bound tests to overcome these drawbacks. This paper introduces the R package bootCT, which implements this method by deriving the bootstrap versions of the bound tests and of the asymptotic F-test on the independent variables proposed by Sam and coauthors in 2019. As a spinoff, a general method for generating random multivariate time series following a given VECM/ARDL structure is provided in the package. Empirical applications showcase the main functionality of the package.
Cointegration and error correction are fundamental concepts in the
analysis of economic data, insofar as they provide an appropriate
framework for testing economic hypotheses about growth and fluctuation.
Several approaches have been proposed in the literature to determine
whether two or more non-stationary time series are cointegrated, meaning
they share a common long-run relationship.
There are two basic types of tests for cointegration: single equation
tests and VAR-based tests. The former check the presence of unit roots
in cointegration residuals (see, e.g., Engle and Granger 1987; Engle and Yoo 1987; Mackinnon 1991; Gabriel et al. 2002; Cook 2006)
or test the significance of the error-correction (EC) term coefficient
(Kremers et al. 1992; Maddala and Kim 1998; Arranz and Escribano 2000; Ericsson and MacKinnon 2002). The
latter, such as the Johansen (1991) approach, tackle the problem of
detecting cointegrating relationships in a VAR model. This latter
approach, albeit having the advantage of avoiding the issue of
normalization, as well as allowing the detection of multiple
cointegrating vectors, is far from being perfect. In the VAR system all
variables are treated symmetrically, as opposed to the standard
univariate models that usually have a clear interpretation in terms of
exogenous and endogenous variables. Furthermore, in a VAR system all the
variables are estimated at the same time, which is problematic if the
relation between some variables is flawed, that is, affected by some
source of error. In this case, a simultaneous estimation process tends to
propagate the error affecting one equation to the others. Finally, a
multidimensional VAR model consumes many degrees of freedom.
The recent cointegration approach, known as Autoregressive Distributed
Lag (ARDL) approach to cointegration or bound testing, proposed by
Pesaran et al. (2001) (PSS), falls in the former strand of literature. It has
become prominent in empirical research because it shows several
advantages with respect to traditional methods for testing
cointegration. First, it is applicable also in cases of mixed order
integrated variables, albeit with integration not exceeding the first
order. Thus, it evades the necessity of pre-testing the variables and,
accordingly, avoids some common practices that may prevent finding
cointegrating relationships, such as dropping variables or transforming
them into stationary form (see McNown et al. 2018). Second,
cointegration bound tests are performed in an ARDL model that allows
different lag orders for each variable, thus providing a more flexible
framework than other commonly employed approaches. Finally, unlike other
cointegration techniques, which are sensitive to the sample size, the
ARDL approach provides robust and consistent results for small sample
sizes.
Notably, the ARDL bound testing methodology has quickly spread in
economics and econometrics to study the cointegrating relationships
between macroeconomic and financial variables, to evaluate the long-run
impact of energy variables, or to assess recent environmental policies
and their impact on the economy. Among the many applications, see for
instance Haseeb et al. (2019), Hussain et al. (2019), Menegaki (2019), Reda and Nourhan (2020), Yilanci et al. (2020) and Abbasi et al. (2021).
The original bound tests proposed by Pesaran et al. (2001) are an
Recently, McNown et al. (2018) pointed out how, due to the low power
problem of unit root tests, investigating the presence of a first type
degeneracy by testing the integration order of the dependent variable
may lead to incorrect conclusions. Therefore, they suggested checking
for its occurrence by testing the significance of the lagged levels of
the independent variables via an extra
Besides problems in testing the occurrence of degenerate cases, in
general, the main drawback of the bound tests is the occurrence of
potentially inconclusive results, if the test statistic lies between the
bounds of the test distribution under the null. Furthermore, the
asymptotic distributions of the statistics may provide a poor
approximation of the true distributions in small samples. Finite sample
critical values, even if only for a subset of all possible model
specifications, have been worked out in the literature (see Mills and Pentecost 2001; Narayan and Smyth 2004; Kanioura and Turner 2005; Narayan 2005),
while Kripfganz and Schneider (2020) provided the quantiles of the asymptotic
distributions of the tests as functions of the sample size, the lag
order and the number of long-run forcing variables. However, this
relevant improvement does not eliminate the uncertainty related to the
inconclusive regions, or the existence of other critical issues related
to the underlying assumptions of the bound test framework, such as the
(weak) exogeneity of the independent variables or the non-stationarity
of the dependent variable.
To overcome the mentioned bound test drawbacks, Bertelli et al. (2022)
proposed bootstrapping the ARDL cointegration test. Inference can always
be pursued with ARDL bootstrap tests, unlike what happens with both the
PSS tests and the SMG test on the independent variables. Bootstrap ARDL
tests were first put forward by McNown et al. (2018) in an
unconditional ARDL model, which omits the instantaneous differences of
the exogenous variables in the ARDL equation, rather than a conditional
one, as originally proposed by Pesaran et al. (2001). The unconditional model
is often used, for reasons of practical convenience, in empirical
research. Simulation results in Bertelli et al. (2022) have
highlighted the importance of employing the appropriate specification,
especially under degenerate cases. In fact, it has been pointed out that
a correct detection of these cases requires the comparison of the test
outcomes in both the conditional and unconditional settings. Erroneous
conclusions, based exclusively on one model specification, can thus be
avoided.
In this paper, bootstrap bound tests, thereby including the bootstrap
versions of the
The paper is organized as follows. Section 2
introduces the theoretical results of the ARDL cointegration bound
tests. Section 3 details the steps carried out by the
bootstrap procedure, which allows the construction of the (bootstrap)
distribution - under the null - for the R
package
bootCT (Vacca and Bertelli 2023) and
its functionalities: a method for the generation of random multivariate
time series that follow a user-specified VECM/ARDL structure, with some
examples, and the main function that carries out the aforementioned
bootstrap tests, while also computing the PSS and SMG bound tests. The
trade-off between accuracy and computational time of the bootstrap
procedure is also investigated, under several scenarios in terms of
sample size and number of replications. Notably, a function that
performs the PSS bound tests is already available in the
dynamac package (Jordan and Philips 2020), while no R routine has so far
been implemented for the SMG test, to the best of our knowledge. Section 5 gives some
empirical applications that employ the core function of the package and
its possible outputs. Section 6 concludes. Appendix
7 briefly delves into technical details of the
conditional ARDL model and its possible specifications.
R packages, either used in the creation of bootCT or employed in the analyses presented in this paper, are magrittr (Bache and Wickham 2022), gtools (Bolker et al. 2022), pracma (Borchers 2022), Rcpp (Eddelbuettel 2013), RcppArmadillo (Eddelbuettel et al. 2023), Rmisc (Hope 2022), dynamac (Jordan and Philips 2020), ARDL (Natsiopoulos and Tzeremes 2021), aod (Lesnoff et al. 2012), vars and urca (Pfaff 2008b; Pfaff 2008a), aTSA (Qiu 2015), tseries (Trapletti and Hornik 2023), reshape2, ggplot2 and stringr (Wickham 2007, 2016, 2022), tidyverse and dplyr (Wickham et al. 2019, 2023).
The starting point of the approach proposed by Pesaran et al. (2001) is a
The VECM representation of (1) is given by (see Appendix
7.1 for details)
Substituting (5) into (2) and assuming that
the
To test the hypothesis of cointegration between
Note that
The exact distribution of the
Pesaran et al. (2001) worked out two sets of (asymptotic) critical values: one,
In this regard, Pesaran et al. (2001) introduced five different specifications
for the ARDL model, depending on its deterministic components, which are
(see Appendix 7.2 for details)
No intercept and no trend
where
Restricted intercept and no trend
where
Unrestricted intercept and no trend
where
Unrestricted intercept, restricted trend
where
Unrestricted intercept, unrestricted trend
where
The model in (6) proposed by Pesaran et al. (2001) represents the
correct framework in which to carry out bound tests. However, bound tests
are often performed in an unconditional ARDL model setting, specified as
Bertelli et al. (2022) have highlighted that bootstrap tests performed
in these two ARDL specifications can lead to contrasting results. To
explain this divergence, note that the conditional model makes use of
the following vector in the EC term
In the following, bootstrap tests are carried out in the conditional
ARDL model (6). However, when a degeneracy of first type
occurs in the unconditional model, the outcomes of the
The bootstrap procedure proposed here focuses on an ARDL model specified
as in (14)-(18), depending on the assumptions on
the deterministic components.
The bootstrap procedure consists of the following steps:
The ARDL model is estimated via OLS and the related test statistics
In order to construct the distribution of each test statistic under the corresponding null, the same model is re-estimated imposing the appropriate restrictions on the coefficients according to the test under consideration.
Following McNown et al. (2018), the ARDL restricted residuals
are then computed. For example, under Case III, the residuals are
The VECM model
A large set of
Each set of
A sequential set of
The initial conditions, that is the observations before
An unrestricted ARDL model is estimated via OLS using the
bootstrap observations, and the statistics
The bootstrap distributions of
Here,
The null hypothesis is rejected if the
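As a rough illustration of the steps above, the following base R sketch bootstraps an F statistic under the null of no level relationship in a deliberately simplified setting: a single regressor, no lagged differences, and a regressor that is held fixed across replications instead of being regenerated from the marginal VECM. The helper f_ov, the simulated data and the number of replications B are assumptions made for this example; it is not the algorithm implemented in boot_ardl.

```r
set.seed(1)

# Toy data: x is a random walk and y is tied to x in the long run
n <- 120
x <- cumsum(rnorm(n))
y <- 0.8 * x + rnorm(n)

# F statistic for the joint exclusion of the lagged levels in
# dy_t = c + a_yy * y_{t-1} + a_yx * x_{t-1} + w * dx_t + e_t
f_ov <- function(y, x) {
  n  <- length(y)
  dy <- diff(y); dx <- diff(x)
  yl <- y[-n];   xl <- x[-n]            # lagged levels
  unrestricted <- lm(dy ~ yl + xl + dx)
  restricted   <- lm(dy ~ dx)           # null imposed: no lagged levels
  anova(restricted, unrestricted)$F[2]
}

F_obs <- f_ov(y, x)

# Restricted model and its centred residuals
dy   <- diff(y); dx <- diff(x)
fit0 <- lm(dy ~ dx)
e0   <- resid(fit0) - mean(resid(fit0))

# Resample residuals, rebuild the levels under the null, re-estimate
B <- 199
F_boot <- replicate(B, {
  dy_star <- fitted(fit0) + sample(e0, replace = TRUE)
  y_star  <- c(y[1], y[1] + cumsum(dy_star))
  f_ov(y_star, x)
})

# Bootstrap critical value at the 5% level and test decision
crit_05 <- quantile(F_boot, 0.95, names = FALSE)
reject  <- F_obs > crit_05
```

Because the critical value is a single quantile of the bootstrap distribution, the comparison always yields a decision, which is the feature that removes the inconclusive region of the bound tests.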
This section describes the main functionalities of the
bootCT package. The
functions included in the package are essentially of two types. The
function sim_vecm_ardl
generates data according to a given data
generating process (DGP), assuming either the presence or the absence of
cointegrating relationships between variables, or degenerate cases. The
function boot_ardl
tests the presence of cointegrating relationships
employing the Pesaran ARDL bound tests (
sim_vecm_ardl function

The function sim_vecm_ardl allows the simulation of a multivariate time
series from a given conditional ARDL specification for a dependent
variable
A group of parameters pertains to the VECM model (6) and
(7), with
The second group of parameters concerns the model intercept and trend of
the VAR specification,
The approach used to specify the function inputs offers great control to
the user, in terms of generating specific (conditional) ARDL-based
cointegration structures.
The function sim_vecm_ardl
takes the following arguments:
nobs: number of observations to generate;
case: indicates the conditional ARDL specification in terms of
deterministic components (intercept and trend) among the five
specifications proposed by Pesaran et al. (2001), given in
(14)-(18);
sigma.in: covariance matrix,
gamma.in: list of short-run parameter matrices
axx.in: cointegrating relationships,
ayx.uc.in: vector of parameters, as in
ayy.in: the
mu.in: mean vector,
eta.in: trend vector,
azero.in: unrestricted intercept of the VECM specification (valid
only for cases III, IV and V), when the intercept is not involved in
the EC term;
aone.in: unrestricted coefficient of the trend in the VECM
specification (valid only for case V), when the trend is not
involved in the EC term;
burn.in: additional burn-in observations to be
generated. A total of burn.in + nobs observations are generated,
but only the last nobs are kept in the data;
seed.in: seed number for the generation of
If the parameter values for mu.in, eta.in, azero.in, or aone.in and the
case number turn out to be in contradiction, an error message is
displayed.
As output, the function returns a list containing the data, both in
levels and first differences, along with all the parameter values given as
input. Additionally, all intermediate transformations of parameters via
VECM transformation or as a by-product of conditioning
Figure 1 depicts three time series, dep_1_0, ind_1_0 and ind_2_0, generated using this function and
affected by a cointegrating relationship, one panel for each case, from
I to V. The variable dep_1_0
represents the dependent variable
ind_1_0
and ind_2_0
the independent ones,
The code used to generate the data for case I is the following:
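library(bootCT)

# Correlation matrix of the errors, built from its lower triangle
corrm = matrix(c(0,    0,     0,
                 0.25, 0,     0,
                 0.4,  -0.25, 0), nrow = 3, ncol = 3, byrow = TRUE)
Corrm = (corrm + t(corrm)) + diag(3)

# Covariance matrix: correlations rescaled by the standard deviations
sds = diag(c(1.3, 1.2, 1))
sigma.in = (sds %*% Corrm %*% t(sds))

# Short-run parameter matrices for the first and second lag
gamma1 = matrix(c(0.6, 0,    0.2,
                  0.1, -0.3, 0,
                  0,   -0.3, 0.2), nrow = 3, ncol = 3, byrow = TRUE)
gamma2 = gamma1 * 0.3

# sigma_yx %*% solve(Sigma_xx); computed for reference, not passed to the call below
omegat = sigma.in[1, -1] %*% solve(sigma.in[-1, -1])

# Long-run parameters
axx.in = matrix(c( 0.3, 0.5,
                  -0.4, 0.3), nrow = 2, ncol = 2, byrow = TRUE)
ayx.uc.in = c(0.4, 0.4)
ayy.in = 0.6

# Case I: no intercept and no trend, so all deterministic inputs are zero
data.vecm.ardl_1 =
  sim_vecm_ardl(nobs = 200,
                case = 1,
                sigma.in = sigma.in,
                gamma.in = list(gamma1, gamma2),
                axx.in = axx.in,
                ayx.uc.in = ayx.uc.in,
                ayy.in = ayy.in,
                mu.in = rep(0, 3),
                eta.in = rep(0, 3),
                azero.in = rep(0, 3),
                aone.in = rep(0, 3),
                burn.in = 100,
                seed.in = 999)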
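Since sigma.in has to be a valid covariance matrix, it may be useful to check the matrix built above before passing it to the function. The following base R sketch rebuilds the same matrix and verifies symmetry and positive definiteness; the object names sigma, is_sym, is_pd and back_to_corr are introduced here for illustration only, and the check is a suggestion rather than something the package is known to require in this form.

```r
# Same correlation structure and standard deviations as in the case I example
corrm <- matrix(c(0,    0,     0,
                  0.25, 0,     0,
                  0.4,  -0.25, 0), nrow = 3, byrow = TRUE)
Corrm <- corrm + t(corrm) + diag(3)
sds   <- diag(c(1.3, 1.2, 1))
sigma <- sds %*% Corrm %*% t(sds)

# A covariance matrix must be symmetric with strictly positive eigenvalues
is_sym <- isSymmetric(sigma)
eig    <- eigen(sigma, symmetric = TRUE, only.values = TRUE)$values
is_pd  <- all(eig > 0)

# Rescaling the covariances should give back the original correlations
back_to_corr <- cov2cor(sigma)
```

If is_pd were FALSE, the off-diagonal correlations would have to be revised before simulating.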
Additionally, Figure 2 displays three other time
series, dep_1_0
(ind_1_0
(ind_2_0
(dep_1_0
on ind_1_0
, ind_2_0
. The five panels
represent the behavior of these series in Cases I to V. It is
worth noting the different scenarios implied by these cases: case III
depicts a trend for the
Finally, the flowchart in Figure 3 details the
internal steps of the function sim_vecm_ardl
and the data generation
workflow. There, it is specified how the parameters of the VAR, VECM and
ARDL equations are introduced. Attention is paid to whether the error
correction mechanism involves either intercept or trend (or both) via
the internal computation of the parameters
Figure 1: Simulated data from the VECM / conditional ARDL specifications, for every case. Made with ggplot.
Figure 2: Simulated data from the VECM / conditional ARDL specifications (degenerate case of type 2, a_{yy}=0), for every case. Made with ggplot.
Figure 3: Flowchart of the sim_vecm_ardl function inner steps. When applying (7) and (8),
boot_ardl function

This function implements the bootstrap procedure detailed previously. As
an option in the initial estimation phase, it offers the possibility of
automatically choosing the best order for the lagged differences of all
the variables in the ARDL and VECM models. This is done by using several
criteria. In particular, AIC, BIC, AICc,
In particular, the auto_ardl
function in the package
ARDL (Natsiopoulos and Tzeremes 2021) selects
the best ARDL order in terms of the short-run parameter vectors
VARselect
function in the
package vars (Pfaff 2008b)
selects the best VECM order in terms of the short-run parameter matrices
The function boot_ardl
takes the following arguments:
data: input dataset. Must contain a dependent variable and a set
of independent variables;
yvar: name of the dependent variable enclosed in quotation marks.
If unspecified, the first variable in the dataset is used;
xvar: vector of names of the independent variables, each enclosed
in quotation marks. If unspecified, all variables in the dataset
except the first are used;
fix.ardl: vector
info.ardl: (alternatively to fix.ardl) the information criterion
used to choose the best lag order for the short-term part of the
ARDL equation. It must be one among AIC (default), AICc, BIC, R2, adjR2;
fix.vecm: scalar
info.vecm: (alternatively to fix.vecm) the information criterion
used to choose the best lag order for the short-term part of the
VECM equation. It must be one among AIC (default), HQIC, SC, FPE;
maxlag: (in conjunction with info.ardl / info.vecm) maximum
number of lags for the auto_ardl function in the package
ARDL, and for the VARselect function in the package vars;
a.ardl: significance threshold for the short-term ARDL
coefficients (
a.vecm: significance threshold for the short-term VECM
coefficients (in
nboot: number of bootstrap replications;
case: type of the specification for the conditional ARDL in terms
of deterministic components (intercept and trend) among the five
proposed by Pesaran et al. (2001), given in
(14)-(18);
a.boot.H0: probability/ies
print: if set to TRUE, shows the progress bar.
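The role of an information criterion together with maxlag can be pictured with a small base R sketch: every candidate lag order up to maxlag is fitted on a common sample and the order with the smallest AIC is retained. The simulated series, the objects E, aic and best_p, and the use of a plain autoregression are assumptions made for this example; the actual selection in boot_ardl is delegated to auto_ardl and VARselect, whose internals are not reproduced here.

```r
set.seed(42)

# Toy stationary series standing in for a differenced variable
n  <- 200
dy <- as.numeric(arima.sim(list(ar = c(0.5, -0.3)), n = n))

maxlag <- 5

# embed() aligns dy_t with dy_{t-1}, ..., dy_{t-maxlag}, so that all
# candidate orders are compared on the same observations
E <- embed(dy, maxlag + 1)

# AIC of the autoregression of order p, for p = 1, ..., maxlag
aic <- sapply(1:maxlag, function(p) AIC(lm(E[, 1] ~ E[, 2:(p + 1)])))

best_p <- which.min(aic)
```

Fitting all orders on the same rows of E matters, since information criteria computed on samples of different length are not comparable.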
boot_ardl makes use of the lag_mts function, which produces lagged
versions of a given matrix of time series, each column with a separate
order. lag_mts takes as parameters the data included in a matrix X
and the lag orders in a vector k, with the addition of a boolean
parameter last.only, which allows the user to specify whether only the
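The idea of lagging each column of a matrix by its own order can be sketched in base R as follows. The helper lag_cols is hypothetical and written only for this illustration: it is not the lag_mts function of the package, it returns one lagged column per input column, and it does not reproduce the last.only option.

```r
# Lag column j of X by k[j] periods, padding the first k[j] rows with NA
lag_cols <- function(X, k) {
  X <- as.matrix(X)
  stopifnot(length(k) == ncol(X), all(k >= 0))
  out <- sapply(seq_len(ncol(X)), function(j) {
    c(rep(NA, k[j]), X[seq_len(nrow(X) - k[j]), j])
  })
  colnames(out) <- paste0(colnames(X), "_lag", k)
  out
}

# Two toy series of length 5, lagged by one and two periods respectively
X <- cbind(a = 1:5, b = 11:15)
L <- lag_cols(X, k = c(1, 2))
```

Here the first column of L starts with one missing value and the second with two, so the rows with complete data begin at the largest lag order.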
boot_ardl also acts as a wrapper for the most common methodologies
detecting cointegration, offering a comprehensive view on the testing
procedures involved in the analysis. The resulting object, of class
bootCT, contains all the information about
the conditional ARDL model estimates, and the unconditional VECM model estimates;
the bootstrap tests performed in the conditional ARDL model;
the Pesaran, Shin and Smith bound testing procedure (
the Sam, McNown and Goh bound testing procedure for
the Johansen rank and trace cointegration tests on the independent variables.
Internally, the bootstrap data generation under the null is executed via
an Rcpp function, employing the Rcpp and RcppArmadillo
packages (Eddelbuettel 2013), so as to greatly speed up computational times. As
explained in the previous section, cointegration tests in the
unconditional ARDL model are performed in order to uncover the presence
of spurious cointegrating relationships.
To this end, the function provides
the bootstrap critical values of the a.boot.H0
, along with the
same statistics computed in the conditional model.
a flag, called fakecoint
, that indicates divergence between the
outcomes of the
A summary method has been implemented to present the results in a
visually clear manner. It accepts the additional argument out, which
lets the user choose which output(s) to visualize: ARDL prints the
conditional ARDL model summary, VECM prints the VECM model summary,
cointARDL prints the summary of the bound tests and the bootstrap
tests, cointVECM prints the summary of the Johansen test on the
independent variables.
A detailed flowchart showing the function’s workflow is displayed in
Figure 4. There, the expressions "C ARDL"
and "UC ARDL" stand for conditional and unconditional ARDL model,
respectively.
Figure 4: Flowchart of the boot_ardl function inner steps. Boxes denote parameter definitions and transformations. Diamonds denote function outputs. Dashed diamonds denote intermediate output (not shown after function call). Empty nodes denote function inputs. The first p+1 rows of
In order to investigate the sensitivity of the procedure to different
sample sizes and number of bootstrap replicates, an experiment has been
run using a three-dimensional time series of length
sim_vecm_ardl
function (Case II, with cointegrated variables,
and 2 lags in the short-run section of the model).
Then, the boot_ardl function has been called as follows:
boot_ardl(data = df_sim,
nboot = bootr,
case = 2,
fix.ardl = rep(2, 3),
fix.vecm = 2)
In the code above, bootr
has been set equal to
fix.ardl
and fix.vecm
), while default values have been used for
every other argument (such as a.ardl
, a.vecm
and a.boot.H0
).
Table 1 shows the average running time per replication
together with the coefficient of variation (%) of the bootstrap critical
values of the
Naturally, the running time increases as both sample size and bootstrap
replicates increase. However, it can be noticed how the coefficients of
variation tend to stabilize for
Sample size | Bootstrap replicates | Exec. Time (sec) | Coefficient of variation (%) of the bootstrap critical values | | |
---|---|---|---|---|---|
50 | 200 | 23.38 | 8.648 | 10.925 | 13.392 |
50 | 500 | 48.37 | 6.312 | 6.952 | 8.640 |
50 | 1000 | 96.65 | 4.806 | 5.613 | 6.288 |
50 | 2000 | 231.15 | 4.255 | 4.226 | 4.946 |
80 | 200 | 23.46 | 7.251 | 8.936 | 11.263 |
80 | 500 | 50.19 | 4.998 | 6.220 | 7.946 |
80 | 1000 | 143.00 | 3.882 | 4.453 | 5.305 |
80 | 2000 | 255.64 | 2.912 | 3.623 | 4.518 |
100 | 200 | 37.89 | 7.707 | 8.583 | 10.955 |
100 | 500 | 52.86 | 4.691 | 5.304 | 7.557 |
100 | 1000 | 184.51 | 3.512 | 4.567 | 5.695 |
100 | 2000 | 212.65 | 3.519 | 3.674 | 4.185 |
200 | 200 | 35.46 | 6.644 | 7.173 | 10.365 |
200 | 500 | 76.78 | 4.734 | 5.355 | 6.225 |
200 | 1000 | 148.25 | 3.124 | 4.177 | 5.034 |
200 | 2000 | 484.51 | 2.811 | 3.361 | 3.907 |
500 | 200 | 54.47 | 6.641 | 8.694 | 10.414 |
500 | 500 | 133.17 | 5.137 | 5.816 | 6.408 |
500 | 1000 | 271.87 | 3.905 | 4.585 | 5.283 |
500 | 2000 | 561.71 | 3.221 | 3.490 | 4.145 |
Table 1: Average execution times (in seconds) of the boot_ardl
function, for different combinations of sample size
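The coefficient of variation is the standard deviation of a quantity expressed as a percentage of its mean. The base R sketch below shows how such a figure could be obtained for a bootstrap-type critical value and how it shrinks as the number of draws grows. The toy statistic (a scaled chi-square draw), the helper names cv_pct and crit_run, and the choice of 100 repetitions are assumptions for illustration; this is not the experiment that produced Table 1.

```r
set.seed(123)

# Coefficient of variation in percent
cv_pct <- function(v) 100 * sd(v) / abs(mean(v))

# One "run": the 95% quantile of B simulated values of a toy F-like statistic
crit_run <- function(B) quantile(rchisq(B, df = 3) / 3, 0.95, names = FALSE)

# Repeat each run 100 times and measure the variability of the critical value
B_grid  <- c(200, 500, 1000, 2000)
cv_by_B <- sapply(B_grid, function(B) cv_pct(replicate(100, crit_run(B))))
names(cv_by_B) <- B_grid
```

In this toy setting the variability decreases with B, which mirrors the qualitative pattern of the table.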
This section provides two illustrative applications that highlight the performance of the bootstrap ARDL tests.
In the first example, the occurrence of a long-run relationship between
consumption [C], income [INC], and investment [INV] of Germany has
been investigated via a set of ARDL models, where each variable takes in
turn the role of dependent one, while the remaining are employed as
independent. The models have been estimated by employing the dataset of
Lütkepohl (2005) which includes quarterly data of the series over the
years 1960 to 1982. The data have been employed in logarithmic form.
Figure 5 displays these series over the sample
period.
Before applying the bootstrap procedure, the order of integration of
each series has been analyzed. Table 2 shows the results of
ADF test performed on both the series and their first-differences (
The following ARDL equations have been estimated:
First ARDL equation (C | INC, INV):
Second ARDL equation (INC | C, INV):
Third ARDL equation (INV | C, INC):
Table 3 shows the estimation results for each ARDL and VECM
model. It is worth noting that the instantaneous differences of the
independent variables are highly significant in each conditional ARDL
model. Thus, neglecting these variables in the ARDL equation, as happens
in the unconditional version of the model, may potentially lead to
biased estimates and incorrect inference. For the sake of completeness,
the results of the marginal VECM estimation are also reported for each
model.
The code to prepare the data, available in the package as the
ger_macro
dataset, is:
data("ger_macro")
LNDATA = apply(ger_macro[,-1], 2, log)
col_ln = paste0("LN", colnames(ger_macro)[-1])
LNDATA = as.data.frame(LNDATA)
colnames(LNDATA) = col_ln
Then, the boot_ardl function is called to perform the bootstrap
tests. In the code chunk below, Model I is considered.
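set.seed(999)  # fix the seed so that the bootstrap draws are reproducible
BCT_res_CONS = boot_ardl(data = LNDATA,
                         yvar = "LNCONS",
                         xvar = c("LNINCOME", "LNINVEST"),
                         maxlag = 5,          # upper bound for the automatic lag search
                         a.ardl = 0.1,
                         a.vecm = 0.1,
                         nboot = 2000,        # bootstrap replications
                         case = 3,            # unrestricted intercept, no trend
                         a.boot.H0 = c(0.05),
                         print = TRUE)        # show the progress bar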
followed by the calls to the summary function:
summary(BCT_res_CONS, out = "ARDL")
summary(BCT_res_CONS, out = "VECM")
summary(BCT_res_CONS, out = "cointVECM")
summary(BCT_res_CONS, out = "cointARDL")
The first summary line displays the output in the ARDL column of Table
3 and the second column of Table 4, Model
I. The second line corresponds to the VECM columns of Table
3, Model I - only for the independent variables. The
information on the rank of the "cointARDL"
summary, if detected.
In this example, the bootstrap and bound testing procedures are in
agreement only for model I, indicating the existence of a cointegrating
relationship. Additionally, no spurious cointegration is detected for
this model. As for models II and III, the null hypothesis is not
rejected by the bootstrap tests, while the PSS and SMG bound tests fail
to give a conclusive answer in the
The running time of the entire analysis is roughly 11 minutes, using
an Intel(R) Core(TM) i7-1165G7 CPU @ 2.80GHz processor, 16GB of RAM.
 | | level variable | | first difference | |
---|---|---|---|---|---|
Series | lag | ADF | p-value | ADF | p-value |
0 | -1.690 | 0.450 | -9.750 | <0.01 | |
1 | -1.860 | 0.385 | -5.190 | <0.01 | |
2 | -1.420 | 0.549 | -3.130 | 0.030 | |
3 | -1.010 | 0.691 | -2.720 | 0.080 | |
0 | -2.290 | 0.217 | -11.140 | <0.01 | |
1 | -1.960 | 0.345 | -7.510 | <0.01 | |
2 | -1.490 | 0.524 | -5.120 | <0.01 | |
3 | -1.310 | 0.587 | -3.290 | 0.020 | |
0 | -1.200 | 0.625 | -8.390 | <0.01 | |
1 | -1.370 | 0.565 | -5.570 | <0.01 | |
2 | -1.360 | 0.570 | -3.300 | 0.020 | |
3 | -1.220 | 0.619 | -3.100 | 0.032 |
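The ADF statistics in Table 2 come from a regression of the differenced series on its lagged level and on lagged differences. A minimal base R sketch of that regression, with an intercept and k lagged differences, is given below. The helper adf_stat and the simulated series are assumptions for illustration; no p-values are computed, since they require non-standard critical values, and this is not the routine used to produce the table.

```r
# t statistic on the lagged level in
# dy_t = c + rho * y_{t-1} + sum_{i=1..k} phi_i * dy_{t-i} + e_t
adf_stat <- function(y, k = 0) {
  dy   <- diff(y)
  E    <- embed(dy, k + 1)                 # columns: dy_t, dy_{t-1}, ..., dy_{t-k}
  ylag <- y[(k + 1):(length(y) - 1)]       # y_{t-1}, aligned with the rows of E
  fit  <- if (k > 0) lm(E[, 1] ~ ylag + E[, -1]) else lm(E[, 1] ~ ylag)
  coef(summary(fit))["ylag", "t value"]
}

set.seed(10)
rw <- cumsum(rnorm(200))   # random walk: unit root
wn <- rnorm(200)           # white noise: stationary

stat_rw <- adf_stat(rw, k = 1)
stat_wn <- adf_stat(wn, k = 1)
```

A strongly negative statistic, as expected for the stationary series, speaks against a unit root, which is the pattern shown by the first differences in Table 2.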
Figure 5: log-consumption/investment/income graphs (level variables and first differences). Made with ggplot.
Model I | Model II | Model III | |||||||
---|---|---|---|---|---|---|---|---|---|
ARDL | VECM | ARDL | VECM | ARDL | VECM | ||||
-0.307 *** (0.055) | 0.168 * (0.081) | -0.0011 (0.0126) | 0.1286 * (0.0540) | 0.611 . (0.339) | -0.2727 *** (0.0704) | -0.0508 (0.0796) | |||
0.297 *** (0.055) | 0.124 * (0.054) | -0.017 (0.014) | -0.183 * (0.079) | -0.491 (0.340) | 0.2619 *** (0.0681) | 0.0464 (0.0772) | |||
-0.001 (0.011) | -0.152 * (0.063) | 0.016 (0.017) | 0.0209 (0.0135) | -0.00107 (0.0142) | -0.1531 * (0.0607) | -0.1212 * (0.060) | |||
-0.248 ** (0.079) | 0.899 * (0.442) | 0.211 . (0.113) | 0.375 *** (0.1086) | 0.9288 * (0.442) | 1.113 * (0.441) | 0.2072 . (0.1142) | |||
0.744 (0.431) | 0.8049 . (0.4345) | ||||||||
-0.1404 (0.1095) | |||||||||
0.2675 ** (0.0958) | 0.1522 . (0.0912) | ||||||||
-0.18 (0.111) | 0.035 (0.029) | -0.189 . (0.1097) | -0.175 (0.1075) | 0.0479 . (0.0282) | |||||
0.049 . (0.027) | 0.0591 * (0.0245) | 0.0578 * (0.0223) | 0.0562 * (0.0266) | ||||||
0.7070 *** (0.1093) | 1.8540 *** (0.5425) | ||||||||
0.471 *** (0.074) | -0.445 *** (0.4726) | ||||||||
0.065 ** (0.019) | -0.0230 (0.025) | ||||||||
const. | 0.048 *** (0.013) | 0.036 (0.066) | 0.033 * (0.017) | 0.002 (0.018) | 0.0266 . (0.0155) | 0.023 (0.0666) | -0.056 (0.072) | 0.0517 ** (0.0157) | 0.0378 * (0.0177) |
J-test |
PSS / SMG Threshold | Outcome | |||||||
---|---|---|---|---|---|---|---|---|
Model | Lags | Test | Boot. Critical Values | I(0) 5% | I(1) 5% | Statistic | Boot | Bound |
I | (1,0,0) | 3.79 | 3.79 | 4.85 | 10.75 | Y | Y | |
-2.88 | -2.86 | -3.53 | -5.608 | |||||
4.92 | 3.01 | 5.42 | 15.636 | |||||
II | (1,1,0) | 5.79 | 3.79 | 4.85 | 2.867 | N | U | |
-3.69 | -2.86 | -3.53 | -2.315 | |||||
7.38 | 3.01 | 5.42 | 3.308 | |||||
III | (1,1,0) | 5.50 | 3.79 | 4.85 | 3.013 | N | U | |
-3.32 | -2.86 | -3.53 | -2.020 | |||||
6.63 | 3.01 | 5.42 | 4.189 |
Following Bertelli et al. (2022), the relationship between foreign direct investment [FDI], exports [EXP], and gross domestic product [GDP] in Italy is investigated. The data of these three yearly variables have been retrieved from the World Bank Database and cover the period from 1970 to 2020. In the analysis, the log of the variables has been used and [EXP] and [FDI] have been adjusted using the GDP deflator. Figure 6 displays these series over the sample period.
Figure 6: log-GDP/export/investment graphs (level variables and first differences). Made with ggplot.
Table 5 shows the outcomes of the ADF test performed on each variable, which ensure that the integration order is not higher than one for all variables. Table 6 shows the results of the bound and bootstrap tests performed in the ARDL model by taking each variable, in turn, as the dependent one. The following ARDL equations have been estimated:
First ARDL equation (GDP | EXP, FDI):
Second ARDL equation (EXP | GDP, FDI):
Third ARDL equation (FDI | GDP, EXP):
The code to load the data and perform the analysis (e.g. for Model I) is:
data("ita_macro")
# Model I: LGDP as dependent variable, same settings as in the first application
BCT_res_GDP = boot_ardl(data = ita_macro,
                        yvar = "LGDP",
                        xvar = c("LEXP", "LFI"),
                        maxlag = 5,
                        a.ardl = 0.1,
                        a.vecm = 0.1,
                        nboot = 2000,
                        case = 3,
                        a.boot.H0 = c(0.05),
                        print = TRUE)
For the sake of simplicity, the outputs of the conditional ARDL and marginal VECM models included in each cointegration analysis are omitted. The summary of the cointegration tests for Model I is called via
summary(BCT_res_GDP, out = "ARDL") # extract lags
summary(BCT_res_GDP, out ="cointARDL") # ARDL cointegration
This empirical application further highlights the importance of dealing with inconclusive inference via the bootstrap procedure, while naturally including the effect of conditioning in the ARDL model, as highlighted in Bertelli et al. (2022).
No Drift, No Trend | Drift, No Trend | Drift and Trend | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
Variable | Lag = 0 | Lag = 1 | Lag = 2 | Lag = 3 | Lag = 0 | Lag = 1 | Lag = 2 | Lag = 3 | Lag = 0 | Lag = 1 | Lag = 2 | Lag = 3 |
0.99 | 0.974 | 0.941 | 0.796 | <0.01 | <0.01 | <0.01 | 0.084 | 0.99 | 0.99 | 0.99 | 0.99 | |
0.572 | 0.599 | 0.675 | 0.725 | <0.01 | 0.0759 | 0.3199 | 0.5174 | <0.01 | 0.013 | 0.151 | 0.46 | |
0.787 | 0.71 | 0.698 | 0.684 | 0.479 | 0.288 | 0.467 | 0.433 | 0.629 | 0.35 | 0.463 | 0.379 | |
<0.01 | <0.0164 | 0.0429 | 0.0402 | <0.01 | 0.0861 | 0.3989 | 0.4267 | <0.01 | <0.01 | 0.0166 | 0.017 | |
<0.01 | <0.01 | <0.01 | <0.01 | <0.01 | <0.01 | <0.01 | <0.01 | <0.01 | <0.01 | <0.01 | <0.01 | |
<0.01 | <0.01 | <0.01 | <0.01 | <0.01 | <0.01 | <0.01 | <0.01 | <0.01 | <0.01 | 0.0336 | 0.0315 |
PSS / SMG Threshold | Outcome | |||||||
---|---|---|---|---|---|---|---|---|
Model | Lags | Test | Boot. Critical Values | I(0) 5% | I(1) 5% | Statistic | Boot | Bound |
I | (1,1,0) | 3.730 | 4.070 | 5.190 | 9.758 | D1 | N | |
-2.020 | -2.860 | -3.530 | -2.338 | |||||
3.710 | 3.220 | 5.620 | 2.273 | |||||
II | (1,0,0) | 5.400 | 4.070 | 5.190 | 2.649 | N | U | |
-3.380 | -2.860 | -3.530 | -1.889 | |||||
5.630 | 3.220 | 5.620 | 3.481 | |||||
III | (1,0,0) | 5.360 | 4.070 | 5.190 | 6.716 | Y | Y | |
-3.550 | -2.860 | -3.530 | -4.202 | |||||
6.500 | 3.220 | 5.620 | 7.017 |
The bootCT package allows the user to perform bootstrap cointegration tests in ARDL models,
overcoming the problem of inconclusive inference, which is a
well-known drawback of standard bound tests. The package makes use of
different functions. The function boot_ardl performs the bootstrap
tests, and it acts as a wrapper of both the bootstrap and the standard
bound tests, also including the Johansen test on the independent
variables of the model. Finally, it also performs the bound R
packages. The function sim_vecm_ardl, which allows the simulation
of multivariate time series data following a user-defined DGP, enriches
the available procedures for multivariate data generation, while the
function lag_mts provides a supporting tool in building datasets of
lagged variables for any practical purpose. Finally, the use of Rcpp
functions gives a technical advantage in terms of computational speed,
performing the bootstrap analysis within an acceptable time frame.
Expanding the matrix polynomial
where
By partitioning the vectors
According to (37), the long-run relationships of the VECM turn out to be now included in the matrix
To rule out the presence of long-run relationships between
After these algebraic transformations, the ARDL equation for
In light of the factorization (34) of the matrix
where
Bearing in mind that
Pesaran et al. (2001) introduced five different specifications for the ARDL
model, which depend on the deterministic components that can be absent
or restricted to the values they assume in the parent VAR model. In this
connection, note that, in light of (33), the drift and the
trend coefficient in the conditional VECM (37) are defined
as
Accordingly, after partitioning the mean and the drift vectors as
In the error correction term
The five specifications proposed by Pesaran et al. (2001) are
No intercept and no trend:
Restricted intercept and no trend:
Unrestricted intercept and no trend:
Unrestricted intercept, restricted trend:
Unrestricted intercept, unrestricted trend:
Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Vacca, et al., "bootCT: An R Package for Bootstrap Cointegration Tests in ARDL Models", The R Journal, 2025
BibTeX citation
@article{RJ-2024-003, author = {Vacca, Gianmarco and Zoia, Maria and Bertelli, Stefano}, title = {bootCT: An R Package for Bootstrap Cointegration Tests in ARDL Models}, journal = {The R Journal}, year = {2025}, note = {https://doi.org/10.32614/RJ-2024-003}, doi = {10.32614/RJ-2024-003}, volume = {16}, issue = {1}, issn = {2073-4859}, pages = {39-66} }