bootCT: An R Package for Bootstrap Cointegration Tests in ARDL Models

Abstract:

The Autoregressive Distributed Lag approach to cointegration or bound testing, proposed by Pesaran in 2001, has become prominent in empirical research. Although this approach has many advantages over the classical cointegration tests, it is not exempt from drawbacks, such as possible inconclusive inference and distortion in size. Recently, Bertelli and coauthors developed a bootstrap approach to the bound tests to overcome these drawbacks. This paper introduces the R package bootCT, which implements this method by deriving the bootstrap versions of the bound tests and of the asymptotic F-test on the independent variables proposed by Sam and coauthors in 2019. As a spinoff, a general method for generating random multivariate time series following a given VECM/ARDL structure is provided in the package. Empirical applications showcase the main functionality of the package.


Published: Jan. 10, 2025

Received: Jul 25, 2022

DOI: 10.32614/RJ-2024-003

Volume: 16/1

Pages: 39 - 66


1 Introduction

Cointegration and error correction are fundamental concepts in the analysis of economic data, insofar as they provide an appropriate framework for testing economic hypotheses about growth and fluctuation. Several approaches have been proposed in the literature to determine whether two or more non-stationary time series are cointegrated, meaning they share a common long-run relationship.
There are two basic types of tests for cointegration: single equation tests and VAR-based tests. The former check for the presence of unit roots in the cointegration residuals or test the significance of the error-correction (EC) term coefficient. The latter, such as the Johansen approach, tackle the problem of detecting cointegrating relationships in a VAR model. This latter approach, albeit having the advantage of avoiding the issue of normalization, as well as allowing the detection of multiple cointegrating vectors, is far from perfect. In the VAR system all variables are treated symmetrically, as opposed to the standard univariate models, which usually have a clear interpretation in terms of exogenous and endogenous variables. Furthermore, in a VAR system all the variables are estimated at the same time, which is problematic if the relation between some variables is flawed, that is, affected by some source of error. In this case a simultaneous estimation process tends to propagate the error affecting one equation to the others. Finally, a multidimensional VAR model consumes many degrees of freedom.
A more recent cointegration approach, known as the Autoregressive Distributed Lag (ARDL) approach to cointegration or bound testing, proposed by Pesaran and coauthors in 2001 (PSS), falls in the former strand of literature. It has become prominent in empirical research because it shows several advantages with respect to traditional methods for testing cointegration. First, it is also applicable in cases of mixed order integrated variables, albeit with integration not exceeding the first order. Thus, it evades the necessity of pre-testing the variables and, accordingly, avoids some common practices that may prevent finding cointegrating relationships, such as dropping variables or transforming them into stationary form. Second, cointegration bound tests are performed in an ARDL model that allows different lag orders for each variable, thus providing a more flexible framework than other commonly employed approaches. Finally, unlike other cointegration techniques, which are sensitive to the sample size, the ARDL approach provides robust and consistent results for small sample sizes.
Notably, the ARDL bound testing methodology has quickly spread in economics and econometrics to study the cointegrating relationships between macroeconomic and financial variables, to evaluate the long-run impact of energy variables, or to assess recent environmental policies and their impact on the economy.
The original bound tests proposed by PSS are an F-test for the significance of the coefficients of all lagged level variables entering the error correction term (Fov), and a t-test for the coefficient of the lagged dependent variable. When either the dependent or the independent variables do not appear in the long-run relationship, a degenerate case arises. The bound t-test provides answers on the occurrence of a degenerate case of second type, while the occurrence of a degenerate case of first type can be assessed by testing whether the dependent variable is of integration order I(1). This type of check violates the spirit and motivation of the bound tests, which are supposed to be applicable in situations of unknown order of integration for the variables.
Recently, it has been pointed out that, due to the low power of unit root tests, investigating the presence of a first type degeneracy by testing the integration order of the dependent variable may lead to incorrect conclusions. It has therefore been suggested to check for its occurrence by testing the significance of the lagged levels of the independent variables via an extra F-test (Find), which was also worked out in its asymptotic version (SMK).
Besides problems in testing the occurrence of degenerate cases, the main drawback of the bound tests is the occurrence of potentially inconclusive results, which arise when the test statistic lies between the bounds of the test distribution under the null. Furthermore, the asymptotic distributions of the statistics may provide a poor approximation of the true distributions in small samples. Finite sample critical values, even if only for a subset of all possible model specifications, have been worked out in the literature, and the quantiles of the asymptotic distributions of the tests have been provided as functions of the sample size, the lag order and the number of long-run forcing variables. However, this relevant improvement does not eliminate the uncertainty related to the inconclusive regions, or the existence of other critical issues related to the underlying assumptions of the bound test framework, such as the (weak) exogeneity of the independent variables or the non-stationarity of the dependent variable.
To overcome the aforementioned drawbacks of the bound tests, Bertelli and coauthors proposed bootstrapping the ARDL cointegration test. Inference can always be pursued with ARDL bootstrap tests, unlike what happens with both the PSS tests and the SMK test on the independent variables. Bootstrap ARDL tests were first put forward in an unconditional ARDL model, which omits the instantaneous differences of the exogenous variables in the ARDL equation, rather than in a conditional one, as originally proposed by PSS. The unconditional model is often used in empirical research for reasons of practical convenience. Simulation results have highlighted the importance of employing the appropriate specification, especially under degenerate cases. In fact, it has been pointed out that a correct detection of these cases requires the comparison of the test outcomes in both the conditional and unconditional settings. Erroneous conclusions, based exclusively on one model specification, can thus be avoided.
In this paper, bootstrap bound tests, including the bootstrap versions of the Fov, t and Find bound tests, are carried out in a conditional ARDL model setting. This approach makes it possible to overcome the problem of inconclusive regions of the standard bound tests. A comparison with the outcomes of the unconditional ARDL bootstrap tests is nevertheless provided for the Find test, to avoid erroneous inference in the presence of degenerate cases.
The paper is organized as follows. Section 2 introduces the theoretical results of the ARDL cointegration bound tests. Section 3 details the steps carried out by the bootstrap procedure, which allows the construction of the (bootstrap) distribution, under the null, for the Fov, t, conditional Find and unconditional Find tests. Section 4 introduces the R package bootCT and its functionalities: a method for the generation of random multivariate time series that follow a user-specified VECM/ARDL structure, with some examples, and the main function that carries out the aforementioned bootstrap tests, while also computing the PSS and SMK bound tests. The trade-off between accuracy and computational time of the bootstrap procedure is also investigated, under several scenarios in terms of sample size and number of replications. Notably, a function that performs the PSS bound tests is already available in the dynamac package, while no R routine has so far been implemented for the SMK test, to the best of our knowledge. Section 5 gives some empirical applications that employ the core function of the package and its possible outputs. Section 6 concludes. Appendix 7 briefly delves into technical details of the conditional ARDL model and its possible specifications.
The R packages either used in the creation of bootCT or employed in the analyses presented in this paper are magrittr, gtools, pracma, Rcpp, RcppArmadillo, Rmisc, dynamac, ARDL, aod, vars, urca, aTSA, tseries, reshape2, ggplot2, stringr, tidyverse and dplyr.

2 Cointegration bound tests in ARDL models

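The starting point of the approach proposed by PSS is a $(K+1)$-dimensional VAR($p$) model
$$A(L)(z_t - \mu - \eta t) = \varepsilon_t, \qquad \varepsilon_t \sim N(0, \Sigma), \qquad A(L) = I_{K+1} - \sum_{j=1}^{p} A_j L^j, \qquad t = 1, 2, \dots, T. \tag{1}$$
Here, $A_j$ are square $(K+1)$ matrices, $z_t$ is a vector of $(K+1)$ variables, $\mu$ and $\eta$ are $(K+1)$ vectors representing the drift and the trend respectively, and the roots of $\det(A(z)) = 0$ satisfy $|z| \geq 1$. If the matrix $A(1) = I_{K+1} - \sum_{j=1}^{p} A_j$ is singular, the components of $z_t$ turn out to be integrated and possibly cointegrated.
The VECM representation of (1) is given by (see Appendix 7.1 for details)
$$\Delta z_t = \alpha_0 + \alpha_1 t - A(1) z_{t-1} + \sum_{j=1}^{p-1} \Gamma_j \Delta z_{t-j} + \varepsilon_t. \tag{2}$$
Now, to study the adjustment to the equilibrium of a single variable $y_t$, given the other $x_t$ variables, the vectors $z_t$ and $\varepsilon_t$ are partitioned as
$$z_t = \begin{bmatrix} \underset{(1,1)}{y_t} \\ \underset{(K,1)}{x_t} \end{bmatrix}, \qquad \varepsilon_t = \begin{bmatrix} \underset{(1,1)}{\varepsilon_{yt}} \\ \underset{(K,1)}{\varepsilon_{xt}} \end{bmatrix}. \tag{3}$$
The matrix $A(1)$, which is assumed to be singular to allow cointegration, is partitioned conformably to $z_t$ as
$$A(1) = \begin{bmatrix} \underset{(1,1)}{a_{yy}} & \underset{(1,K)}{a_{yx}} \\ \underset{(K,1)}{a_{xy}} & \underset{(K,K)}{A_{xx}} \end{bmatrix}.$$
If the explanatory variables are stationary, $A_{xx}$ is non-singular ($\mathrm{rk}(A_{xx}) = K$), while when they are integrated but without cointegrating relationship, $A_{xx}$ is a null matrix.
Under the assumption
$$\varepsilon_t \sim N\left(0, \begin{bmatrix} \underset{(1,1)}{\sigma_{yy}} & \underset{(1,K)}{\sigma_{yx}} \\ \underset{(K,1)}{\sigma_{xy}} & \underset{(K,K)}{\Sigma_{xx}} \end{bmatrix}\right), \tag{4}$$
the following holds
$$\varepsilon_{yt} = \omega \varepsilon_{xt} + \nu_{yt} \sim N(0, \sigma_{y.x}), \tag{5}$$
where $\sigma_{y.x} = \sigma_{yy} - \omega \sigma_{xy}$ with $\omega = \sigma_{yx} \Sigma_{xx}^{-1}$, and $\nu_{yt}$ is independent of $\varepsilon_{xt}$.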
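Substituting (5) into (2) and assuming that the $x_t$ variables are exogenous towards the ARDL parameters (that is, setting $a_{xy} = 0$ in $A(1)$) yields the system (see Appendix 7.1 for details)
$$\Delta y_t = \alpha_{0.y} + \alpha_{1.y} t - a_{yy} EC_{t-1} + \sum_{j=1}^{p-1} \gamma_{y.x,j} \Delta z_{t-j} + \omega \Delta x_t + \nu_{yt} \tag{6}$$

$$\Delta x_t = \alpha_{0x} + \alpha_{1x} t + A_{(x)} z_{t-1} + \Gamma_{(x)}(L) \Delta z_t + \varepsilon_{xt}, \tag{7}$$
where
$$\gamma_{y.x,j} = \gamma_{y,j} - \omega \Gamma_{(x),j} \tag{8}$$

$$\alpha_{0.y} = \alpha_{0y} - \omega \alpha_{0x}, \qquad \alpha_{1.y} = \alpha_{1y} - \omega \alpha_{1x}, \tag{9}$$
and where the error correction term, $EC_{t-1}$, expressing the long-run equilibrium relationship between $y_t$ and $x_t$, is given by
$$EC_{t-1} = y_{t-1} - \theta_0 - \theta_1 t - \theta x_{t-1}, \tag{10}$$
with
$$\theta_0 = \mu_y - \theta \mu_x, \qquad \theta_1 = \eta_y - \theta \eta_x, \qquad \theta = -\frac{\tilde{a}_{y.x}}{a_{yy}} = -\frac{a_{yx} - \omega A_{xx}}{a_{yy}}. \tag{11}$$
Thus, no cointegration occurs when $\tilde{a}_{y.x} = 0$ or $a_{yy} = 0$. These two circumstances are referred to as degenerate cases of first and second type, respectively. Degenerate cases imply no cointegration between $y_t$ and $x_t$.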
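As a numerical illustration of the conditioning step in (5) and of the long-run vector in (11), the following base R sketch computes $\omega$, $\sigma_{y.x}$, $\tilde{a}_{y.x}$ and $\theta$. The covariance and long-run parameter values are the ones used in the simulation example of Section 4; the object names are chosen here for illustration only and are not part of the package.

```r
# Error covariance matrix (values from the Section 4 simulation example)
corr_lower <- matrix(c(0,    0,     0,
                       0.25, 0,     0,
                       0.4,  -0.25, 0), nrow = 3, byrow = TRUE)
corr  <- corr_lower + t(corr_lower) + diag(3)
sds   <- diag(c(1.3, 1.2, 1))
sigma <- sds %*% corr %*% sds

# Partition conformably to z_t = (y_t, x_t)
sigma_yy <- sigma[1, 1]
sigma_yx <- sigma[1, -1, drop = FALSE]   # 1 x K
sigma_xx <- sigma[-1, -1]                # K x K

# omega = sigma_yx Sigma_xx^{-1} and conditional variance sigma_{y.x}
omega      <- sigma_yx %*% solve(sigma_xx)
sigma_cond <- drop(sigma_yy - omega %*% t(sigma_yx))

# Long-run parameters (illustrative values from the same example)
a_yy <- 0.6
a_yx <- matrix(c(0.4, 0.4), nrow = 1)
A_xx <- matrix(c(0.3, 0.5,
                 -0.4, 0.3), nrow = 2, byrow = TRUE)

a_tilde <- a_yx - omega %*% A_xx   # conditional long-run vector
theta   <- -a_tilde / a_yy         # as in (11)

# Covariance between nu_y and eps_x: zero by construction
resid_cov <- sigma_yx - omega %*% sigma_xx
```

The last line checks the property stated after (5): once $\omega \varepsilon_{xt}$ is removed, the remaining error is uncorrelated with $\varepsilon_{xt}$.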
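To test the hypothesis of cointegration between $y_t$ and $x_t$, PSS proposed an F-test, Fov hereafter, based on the hypothesis system
$$H_0: a_{yy} = 0 \;\cap\; \tilde{a}_{y.x} = 0 \qquad H_1: a_{yy} \neq 0 \;\cup\; \tilde{a}_{y.x} \neq 0. \tag{12}$$
Note that $H_1$ also covers the degenerate cases
$$H_1^{y.x}: a_{yy} = 0, \; \tilde{a}_{y.x} \neq 0 \qquad H_1^{yy}: a_{yy} \neq 0, \; \tilde{a}_{y.x} = 0. \tag{13}$$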
The exact distribution of the F statistic under the null is unknown, but it is limited from above and below by two asymptotic distributions: one corresponding to the case of stationary regressors, and another corresponding to the case of first-order integrated regressors. As a consequence, the test is called bound test and has an inconclusive area. The knowledge of the rank of the cointegrating matrix is necessary to overcome this impasse.
PSS worked out two sets of (asymptotic) critical values: one, $\{\tau_{L,F}\}$, for the case when $x_t \sim I(0)$ and another, $\{\tau_{U,F}\}$, for the case when $x_t \sim I(1)$. These values vary in accordance with the number of regressors in the ARDL equation, the sample size and the assumptions made about the deterministic components (intercept and trend) of the data generating process.
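The three possible outcomes of a bound F-test can be sketched in base R as follows. The function name `bound_decision` and the critical values are hypothetical placeholders chosen for illustration; they are not tabulated values and not part of any package.

```r
# Decision rule of a bound F-test given lower/upper critical values
bound_decision <- function(stat, lower, upper) {
  stopifnot(lower <= upper)
  if (stat < lower) {
    "do not reject H0"       # below the I(0) bound
  } else if (stat > upper) {
    "reject H0"              # above the I(1) bound
  } else {
    "inconclusive"           # between the two bounds
  }
}

# Placeholder critical values, purely illustrative
tau_L <- 3.0
tau_U <- 4.0
outcomes <- sapply(c(2.1, 3.5, 5.2), bound_decision,
                   lower = tau_L, upper = tau_U)
print(outcomes)
```

The middle branch is the inconclusive area that the bootstrap procedure is designed to remove.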
In this regard, PSS introduced five different specifications for the ARDL model, depending on its deterministic components, which are (see Appendix 7.2 for details)

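  1. No intercept and no trend
    $$\Delta y_t = -a_{yy} EC_{t-1} + \sum_{j=1}^{p-1} \gamma_{y.x,j} \Delta z_{t-j} + \omega \Delta x_t + \nu_{yt}, \tag{14}$$
    where $EC_{t-1} = y_{t-1} - \theta x_{t-1}$.

  2. Restricted intercept and no trend
    $$\Delta y_t = -a_{yy} EC_{t-1} + \sum_{j=1}^{p-1} \gamma_{y.x,j} \Delta z_{t-j} + \omega \Delta x_t + \nu_{yt}, \tag{15}$$
    where $EC_{t-1} = y_{t-1} - \theta_0 - \theta x_{t-1}$. The intercept extracted from the EC term is $\alpha_{0.y}^{EC} = a_{yy} \theta_0$.

  3. Unrestricted intercept and no trend
    $$\Delta y_t = \alpha_{0.y} - a_{yy} EC_{t-1} + \sum_{j=1}^{p-1} \gamma_{y.x,j} \Delta z_{t-j} + \omega \Delta x_t + \nu_{yt}, \tag{16}$$
    where $EC_{t-1} = y_{t-1} - \theta x_{t-1}$.

  4. Unrestricted intercept, restricted trend
    $$\Delta y_t = \alpha_{0.y} - a_{yy} EC_{t-1} + \sum_{j=1}^{p-1} \gamma_{y.x,j} \Delta z_{t-j} + \omega \Delta x_t + \nu_{yt}, \tag{17}$$
    where $EC_{t-1} = y_{t-1} - \theta_1 t - \theta x_{t-1}$. The trend extracted from the EC term is $\alpha_{1.y}^{EC} = a_{yy} \theta_1$.

  5. Unrestricted intercept, unrestricted trend
    $$\Delta y_t = \alpha_{0.y} + \alpha_{1.y} t - a_{yy} EC_{t-1} + \sum_{j=1}^{p-1} \gamma_{y.x,j} \Delta z_{t-j} + \omega \Delta x_t + \nu_{yt}, \tag{18}$$
    where $EC_{t-1} = y_{t-1} - \theta x_{t-1}$.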
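The five cases can be summarized in a small lookup table. The base R sketch below is only a mnemonic device: the object `ardl_cases` and the helper `ec_terms` are hypothetical names introduced here, and the strings are plain labels for the terms entering the EC term in each case.

```r
# Deterministic components of the five ARDL specifications
ardl_cases <- data.frame(
  case      = 1:5,
  intercept = c("none", "restricted", "unrestricted",
                "unrestricted", "unrestricted"),
  trend     = c("none", "none", "none", "restricted", "unrestricted"),
  stringsAsFactors = FALSE
)

# Labels of the terms entering the EC term for a given case:
# only restricted components appear inside it
ec_terms <- function(case) {
  row   <- ardl_cases[ardl_cases$case == case, ]
  terms <- "y[t-1]"
  if (row$intercept == "restricted") terms <- c(terms, "theta0")
  if (row$trend == "restricted")     terms <- c(terms, "theta1 * t")
  c(terms, "theta * x[t-1]")
}

print(ardl_cases)
print(ec_terms(4))
```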

The model in (6) proposed by PSS represents the correct framework in which to carry out bound tests. However, bound tests are often performed in an unconditional ARDL model setting, specified as
$$\Delta y_t = \alpha_{0.y} + \alpha_{1.y} t - a_{yy} EC_{t-1} + \sum_{j=1}^{p-1} \gamma_j \Delta z_{t-j} + \varepsilon_{yt}, \tag{19}$$
which omits the term $\omega \Delta x_t$.
It has been highlighted that bootstrap tests performed in these two ARDL specifications can lead to contrasting results. To explain this divergence, note that the conditional model makes use of the vector $\tilde{a}_{y.x} = a_{yx} - \omega A_{xx}$ (divided by $a_{yy}$, see (11)) in the EC term to carry out bound tests, while the unconditional one only uses the vector $a_{yx}$ (divided by $a_{yy}$), since it neglects the term $\omega A_{xx}$. The latter is introduced in the ARDL equation by the operation of conditioning $y_t$ on the other variables $x_t$ of the model. This can lead to contrasting inference in two instances. The first happens when a degeneracy of first type occurs in the conditional model, that is
$$\tilde{a}_{y.x} = 0, \tag{20}$$
because $a_{yx} = \omega A_{xx}$. In this case, the conditional model rejects cointegration, while the unconditional one concludes the opposite. The other case happens when a degeneracy of first type occurs in the unconditional model, that is
$$a_{yx} = 0, \tag{21}$$
but $\tilde{a}_{y.x} = -\omega A_{xx} \neq 0$. In this case, the unconditional model rejects cointegration, while the conditional one concludes for the existence of cointegrating relationships, which are however spurious. Only a comparison of the outcomes of the Find test performed in both the conditional and unconditional ARDL equations can help to disentangle this problem. In fact, as $\omega A_{xx} x_t \sim I(0)$, the conclusion that $y_t \sim I(0)$ must hold. This in turn entails that no cointegration occurs between $y_t$ and $x_t$.
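The two instances (20) and (21) can be reproduced numerically. In the base R sketch below, the values of $\omega$ and $A_{xx}$ are hypothetical and serve only to show how the conditional vector $\tilde{a}_{y.x}$ and the unconditional vector $a_{yx}$ can disagree.

```r
# Hypothetical conditioning vector and long-run matrix of the regressors
omega <- matrix(c(0.4, 0.6), nrow = 1)
A_xx  <- matrix(c(0.3, 0.5,
                  -0.4, 0.3), nrow = 2, byrow = TRUE)

# Instance (20): a_yx = omega A_xx, so the conditional vector vanishes
a_yx_20    <- omega %*% A_xx
a_tilde_20 <- a_yx_20 - omega %*% A_xx

# Instance (21): a_yx = 0, but the conditional vector equals -omega A_xx
a_yx_21    <- matrix(0, nrow = 1, ncol = 2)
a_tilde_21 <- a_yx_21 - omega %*% A_xx

print(rbind(a_yx_20, a_tilde_20))
print(rbind(a_yx_21, a_tilde_21))
```

In the first instance only the unconditional vector is non-zero; in the second only the conditional one is, which is the spurious relationship the Find comparison is meant to detect.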
In the following, bootstrap tests are carried out in the conditional ARDL model (6). However, when a degeneracy of first type occurs in the unconditional model, the outcomes of the Find bootstrap test performed in both the conditional and unconditional settings are provided. This, as previously outlined, is performed to avoid the acceptance of spurious long-run relationships among the dependent variable and the independent variables.

3 The new bootstrap procedure

The bootstrap procedure proposed here focuses on an ARDL model specified as in (14)-(18), depending on the assumptions on the deterministic components.
The bootstrap procedure consists of the following steps:

  1. The ARDL model is estimated via OLS and the related test statistics Fov, t or Find are computed.

  2. In order to construct the distribution of each test statistic under the corresponding null, the same model is re-estimated imposing the appropriate restrictions on the coefficients according to the test under consideration.

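  3. The ARDL restricted residuals are then computed. For example, under Case III, the residuals are
    $$\hat{\nu}_{yt}^{F_{ov}} = \Delta y_t - \hat{\alpha}_{0.y} - \sum_{j=1}^{p-1} \hat{\gamma}_{y.x,j} \Delta z_{t-j} - \hat{\omega} \Delta x_t \tag{22}$$

    $$\hat{\nu}_{yt}^{t} = \Delta y_t - \hat{\alpha}_{0.y} + \hat{\tilde{a}}_{y.x} x_{t-1} - \sum_{j=1}^{p-1} \hat{\gamma}_{y.x,j} \Delta z_{t-j} - \hat{\omega} \Delta x_t \tag{23}$$

    $$\hat{\nu}_{yt}^{F_{ind}} = \Delta y_t - \hat{\alpha}_{0.y} + \hat{a}_{yy} y_{t-1} - \sum_{j=1}^{p-1} \hat{\gamma}_{y.x,j} \Delta z_{t-j} - \hat{\omega} \Delta x_t. \tag{24}$$
    Here, the hat denotes estimated parameters. The other cases can be dealt with in a similar manner.

  4. The VECM model
    $$\Delta z_t = \alpha_0 - A z_{t-1} + \sum_{j=1}^{p-1} \Gamma_j \Delta z_{t-j} + \varepsilon_t \tag{25}$$
    is estimated as well (imposing weak exogeneity), and the residuals
    $$\hat{\varepsilon}_{xt} = \Delta x_t - \hat{\alpha}_{0x} + \hat{A}_{xx} x_{t-1} - \sum_{j=1}^{p-1} \hat{\Gamma}_{(x)j} \Delta z_{t-j} \tag{26}$$
    are computed. This approach guarantees that the residuals $\hat{\varepsilon}_{xt}$, associated with the variables $x_t$ explained by the marginal model (7), are uncorrelated with the ARDL residuals $\hat{\nu}_{yt}$.

  5. A large set of $B$ bootstrap replicates is sampled from the residuals calculated as in (22), (23), (24) and (26). In each replication, the following operations are carried out:

    1. Each set of $(T-p)$ residuals resampled with replacement, $\hat{\nu}_{zt}^{(b)} = (\hat{\nu}_{yt}^{(b)}, \hat{\varepsilon}_{xt}^{(b)})$, is re-centered
      $$\hat{\dot{\nu}}_{yt}^{(b)} = \hat{\nu}_{yt}^{(b)} - \frac{1}{T-p} \sum_{t=p+1}^{T} \hat{\nu}_{yt}^{(b)} \tag{27}$$
      $$\hat{\dot{\varepsilon}}_{x_i t}^{(b)} = \hat{\varepsilon}_{x_i t}^{(b)} - \frac{1}{T-p} \sum_{t=p+1}^{T} \hat{\varepsilon}_{x_i t}^{(b)}, \qquad i = 1, \dots, K. \tag{28}$$

    2. A sequential set of $(T-p)$ bootstrap observations, $y_t, x_t$, $t = p+1, \dots, T$, is generated as follows
      $$y_t = y_{t-1} + \Delta y_t, \qquad x_t = x_{t-1} + \Delta x_t,$$
      where $\Delta x_t$ are obtained from (26) and $\Delta y_t$ from either (22), (23) or (24) after replacing in each of these equations the original residuals with the bootstrap ones.
      The initial conditions, that is the observations before $t = p+1$, are obtained by randomly drawing a block of $p$ observations from the original data, so as to preserve the data dependence structure.

    3. An unrestricted ARDL model is estimated via OLS using the bootstrap observations, and the statistics $F_{ov}^{(b),H_0}$, $t^{(b),H_0}$ and $F_{ind}^{(b),H_0}$ are computed.

  6. The bootstrap distributions of $\{F_{ov}^{(b),H_0}\}_{b=1}^{B}$, $\{F_{ind}^{(b),H_0}\}_{b=1}^{B}$ and $\{t^{(b),H_0}\}_{b=1}^{B}$ under the null are then employed to determine the critical values of the tests. By denoting with $M_b$ the ordered bootstrap test statistic, and with $\alpha$ the nominal significance level, the bootstrap critical values are determined as follows
    $$c_{\alpha,M} = \min\left\{c : \sum_{b=1}^{B} 1_{\{M_b > c\}} \leq \alpha\right\}, \qquad M \in \{F_{ov}, F_{ind}\} \tag{29}$$
    for the F tests and
    $$c_{\alpha,t} = \max\left\{c : \sum_{b=1}^{B} 1_{\{t_b < c\}} \leq \alpha\right\} \tag{30}$$
    for the t test.
    Here, $1_{\{x \in A\}}$ is the indicator function, which is equal to one if the condition in subscript is satisfied and zero otherwise.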
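The resampling and re-centering operations of step 5 can be sketched in base R. This is a simplified illustration under stated assumptions: the residual matrix is simulated rather than estimated, rows are assumed to be resampled jointly so that the pairing between the ARDL and the marginal residuals is kept, and all object names are hypothetical.

```r
set.seed(42)
n_obs <- 100   # sample size T
p     <- 2     # lag order
K     <- 2     # number of independent variables

# Stand-in for the restricted residuals (nu_y, eps_x1, eps_x2)
res <- matrix(rnorm((n_obs - p) * (K + 1)), ncol = K + 1,
              dimnames = list(NULL, c("nu_y", "eps_x1", "eps_x2")))

# Resample (T - p) rows with replacement
idx   <- sample(nrow(res), replace = TRUE)
res_b <- res[idx, , drop = FALSE]

# Re-center each column on its own bootstrap mean, as in (27)-(28)
res_b_centered <- sweep(res_b, 2, colMeans(res_b))

# Initial conditions: a block of p consecutive rows of the original data
z      <- apply(matrix(rnorm(n_obs * (K + 1)), ncol = K + 1), 2, cumsum)
start  <- sample(n_obs - p + 1, 1)
z_init <- z[start:(start + p - 1), , drop = FALSE]
```

Drawing the initial rows as one contiguous block, rather than independently, is what preserves the dependence structure mentioned in step 5.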

The null hypothesis is rejected if the F statistic computed at step 1, Fov or Find, is greater than the respective $c_{\alpha,M}$, or if the t statistic computed at the same step is lower than $c_{\alpha,t}$.
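A minimal base R sketch of the critical value rules in (29) and (30) and of the rejection rule follows. It assumes that the exceedance counts are read as proportions of the $B$ replicates, so that they can be compared with the nominal level $\alpha$; the function names and the simulated bootstrap statistics are hypothetical.

```r
# Smallest candidate c such that at most a share alpha of the
# bootstrap F statistics exceeds it (upper-tail test)
boot_crit_F <- function(stats, alpha) {
  cand <- sort(stats)
  ok   <- vapply(cand, function(c0) mean(stats > c0) <= alpha, logical(1))
  min(cand[ok])
}

# Largest candidate c such that at most a share alpha of the
# bootstrap t statistics lies below it (lower-tail test)
boot_crit_t <- function(stats, alpha) {
  cand <- sort(stats)
  ok   <- vapply(cand, function(c0) mean(stats < c0) <= alpha, logical(1))
  max(cand[ok])
}

set.seed(1)
F_boot <- rchisq(1000, df = 3) / 3   # stand-in bootstrap F statistics
t_boot <- rnorm(1000, mean = -1)     # stand-in bootstrap t statistics

c_F <- boot_crit_F(F_boot, alpha = 0.05)
c_t <- boot_crit_t(t_boot, alpha = 0.05)

# Rejection rule with hypothetical observed statistics
F_obs <- 6.2
t_obs <- -3.9
reject_F <- F_obs > c_F
reject_t <- t_obs < c_t
```

Restricting the candidates to the observed bootstrap values keeps the search finite and returns an order statistic of the bootstrap distribution.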

4 Illustration of the bootCT package

This section describes the main functionalities of the bootCT package. The functions included in the package are essentially of two types. The function sim_vecm_ardl generates data according to a given data generating process (DGP), assuming either the presence or the absence of cointegrating relationships between variables, or degenerate cases. The function boot_ardl tests the presence of cointegrating relationships employing the Pesaran ARDL bound tests (Fov and t), the SMK bound test on lagged independent variables (Find), and the novel ARDL bootstrap testing procedure.

Generating a multivariate time series: the sim_vecm_ardl function

The function sim_vecm_ardl simulates a multivariate time series from a given conditional ARDL specification for a dependent variable $y_t$ and a VAR/VECM specification for the remaining independent variables $x_t$. In this regard, it represents an interesting addition to extant data generating procedures for VAR/VECM models. The arguments of this function can be divided into two subgroups.
A first group of parameters pertains to the VECM model (6) and (7), with $A_{xx}$ identifying the matrix of the long-run relationships among the $x_t$ variables, and the $\Gamma_j$'s, $j = 1, \dots, p-1$, the short-run matrices of the system variables. Additionally, the parameter $a_{yy}$ weighs the EC term for $y_t$, while $a_{yx}$ is the parameter vector weighting the variables $x_t$ in the ARDL equation. The vector $a_{yx}$, after conditioning $y_t$ on the other variables ($x_t$, see model (6)), becomes $\tilde{a}_{y.x} = a_{yx} - \omega A_{xx}$.
The second group of parameters concerns the model intercept and trend of the VAR specification, $\mu$ and $\eta$, which in the VECM representation become $\alpha_0 = A\mu + (I_{K+1} - \sum_{j=1}^{p-1} \Gamma_j - A)\eta$ and $\alpha_1 = A\eta$, and in the conditional ARDL become $\alpha_{0.y}^{EC} = a_{yy}(\mu_y - \tilde{a}_{y.x}\mu_x) + \gamma_{y.x}(1)\eta$ and $\alpha_{1.y}^{EC} = a_{yy}(\eta_y - \tilde{a}_{y.x}\eta_x)$. As explained in Appendix 7.2, intercept and trend appear in the error correction (EC) term of the ARDL equation only when restricted. Accordingly, neither appears in the EC term in case I; the intercept does not appear in the EC term in cases III, IV and V (where it is freely set to $\alpha_{0.y}$), while the trend appears in the EC term only in case IV (it is freely set to $\alpha_{1.y}$ in case V). Hence, when these terms are not restricted, they need to be supplied by the user.
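The mapping from the VAR drift and trend to the VECM intercept and trend can be checked numerically. The base R sketch below assumes $a_{xy} = 0$ when assembling $A$, reuses the long-run and short-run values of the simulation example in this section, and uses hypothetical values for $\mu$ and $\eta$; it is not the package's internal code.

```r
# Long-run matrix A with a_xy = 0 (weak exogeneity of x_t)
a_yy <- 0.6
a_yx <- c(0.4, 0.4)
A_xx <- matrix(c(0.3, 0.5,
                 -0.4, 0.3), nrow = 2, byrow = TRUE)
A <- rbind(c(a_yy, a_yx), cbind(0, A_xx))

# Short-run matrices (two lags of the differences)
gamma1 <- matrix(c(0.6,  0,   0.2,
                   0.1, -0.3, 0,
                   0,   -0.3, 0.2), nrow = 3, byrow = TRUE)
gamma2 <- 0.3 * gamma1

# Hypothetical VAR drift and trend
mu  <- c(1, 0.5, 0.5)
eta <- c(0.1, 0, 0)

# VECM intercept and trend
alpha0 <- A %*% mu + (diag(3) - gamma1 - gamma2 - A) %*% eta
alpha1 <- A %*% eta

print(cbind(alpha0, alpha1))
```

With $\eta = 0$ the trend coefficient $\alpha_1$ vanishes and $\alpha_0$ reduces to $A\mu$, which corresponds to the specifications without trend.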
The approach used to specify the function inputs offers great control to the user, in terms of generating specific (conditional) ARDL-based cointegration structures.
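The arguments of sim_vecm_ardl used in the example call shown later in this section are nobs, case, sigma.in, gamma.in, axx.in, ayx.uc.in, ayy.in, mu.in, eta.in, azero.in, aone.in, burn.in and seed.in.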

If parameter values for mu.in, eta.in, azero.in, or aone.in and case number turn out to be in contradiction, an error message is displayed.
As output, the function returns a list containing the data, both in levels and in first differences, along with all the parameter values given as input. Additionally, all intermediate transformations of the parameters, obtained via the VECM transformation or as a by-product of conditioning $y_t$ on $x_t$, are included in the output.
Figure 1 depicts three time series, dep_1_0, ind_1_0 and ind_2_0, generated using this function and linked by a cointegrating relationship, one panel for each case, from I to V. The variable dep_1_0 represents the dependent variable $y_t$ of the ARDL equation, while ind_1_0 and ind_2_0 are the independent ones, $x_{1t}$ and $x_{2t}$.
The code used to generate the data for case I is the following:

    library(bootCT)

    # Lower triangle of the error correlation matrix
    corrm = matrix(c(   0,     0, 0,
                     0.25,     0, 0,
                      0.4, -0.25, 0), nrow = 3, ncol = 3, byrow = TRUE)

    # Full symmetric correlation matrix with unit diagonal
    Corrm = (corrm + t(corrm)) + diag(3)

    # Standard deviations of the errors
    sds = diag(c(1.3, 1.2, 1))

    # Error covariance matrix
    sigma.in = (sds %*% Corrm %*% t(sds))

    # Short-run matrices for the first and second lagged differences
    gamma1 = matrix(c(0.6,    0, 0.2,
                      0.1, -0.3,   0,
                        0, -0.3, 0.2), nrow = 3, ncol = 3, byrow = TRUE)
    gamma2 = gamma1 * 0.3

    # Conditioning vector omega and long-run parameters
    omegat = sigma.in[1, -1] %*% solve(sigma.in[-1, -1])
    axx.in = matrix(c( 0.3, 0.5,
                      -0.4, 0.3), nrow = 2, ncol = 2, byrow = TRUE)
    ayx.uc.in = c(0.4, 0.4)
    ayy.in = 0.6

    # Case I: no intercept and no trend
    data.vecm.ardl_1 =
    sim_vecm_ardl(nobs = 200,
                  case = 1,
                  sigma.in = sigma.in,
                  gamma.in = list(gamma1, gamma2),
                  axx.in = axx.in,
                  ayx.uc.in = ayx.uc.in,
                  ayy.in = ayy.in,
                  mu.in = rep(0, 3),
                  eta.in = rep(0, 3),
                  azero.in = rep(0, 3),
                  aone.in = rep(0, 3),
                  burn.in = 100,
                  seed.in = 999)

Additionally, Figure 2 displays three further time series, dep_1_0 ($y_t$), ind_1_0 ($x_{1t}$) and ind_2_0 ($x_{2t}$), when a degeneracy of second type occurs ($a_{yy} = 0$) in the long-run relationship in the ARDL equation of dep_1_0 on ind_1_0 and ind_2_0. The five panels represent the behavior of these series in cases I to V. It is worth noting the different scenarios implied by these cases: case III depicts a trend for the $y_t$ variable, case IV highlights the inclusion of a trend in the cointegrating relationship, and case V exhibits a quadratic trend in the $y_t$ variable.
Finally, the flowchart in Figure 3 details the internal steps of the function sim_vecm_ardl and the data generation workflow. There, it is specified how the parameters of the VAR, VECM and ARDL equations are introduced. Attention is paid to whether the error correction mechanism involves either intercept or trend (or both) via the internal computation of the parameters $\theta_0$ and $\theta_1$ (and thus $\alpha_{0.y}^{EC}$ and $\alpha_{1.y}^{EC}$). When the EC term does not involve intercept and/or trend, $\alpha_0$ and $\alpha_1$ are supplied by the user, depending on the case under study.
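To give an idea of the recursion behind this kind of data generation, the base R sketch below simulates a first-order VECM without short-run terms and without deterministic components, $\Delta z_t = -A z_{t-1} + \varepsilon_t$. It is a deliberately simplified stand-in, not the algorithm implemented in sim_vecm_ardl: it ignores the conditioning step and the lagged differences, and it assumes $a_{xy} = 0$. The parameter values are borrowed from the example above.

```r
set.seed(999)
n_obs <- 200
burn  <- 100

# Long-run matrix with a_xy = 0 (values from the example above)
A <- rbind(c(0.6,  0.4, 0.4),
           c(0,    0.3, 0.5),
           c(0,   -0.4, 0.3))

# Error covariance matrix
corr  <- matrix(c(1,    0.25,  0.4,
                  0.25, 1,    -0.25,
                  0.4, -0.25,  1), nrow = 3, byrow = TRUE)
sds   <- diag(c(1.3, 1.2, 1))
sigma <- sds %*% corr %*% sds

# Correlated Gaussian errors via the Cholesky factor
eps <- matrix(rnorm((n_obs + burn) * 3), ncol = 3) %*% chol(sigma)

# Recursion z_t = z_{t-1} + Delta z_t, starting from zero
z <- matrix(0, nrow = n_obs + burn, ncol = 3)
for (t in 2:(n_obs + burn)) {
  dz     <- drop(-A %*% z[t - 1, ]) + eps[t, ]
  z[t, ] <- z[t - 1, ] + dz
}

# Drop the burn-in observations
z <- z[-seq_len(burn), ]
colnames(z) <- c("y", "x1", "x2")
```

Since $A_{xx}$ is non-singular here, the simulated regressors are stationary, which is one of the two polar situations covered by the bounds.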

Figure 1: Simulated data from the VECM / conditional ARDL specifications, for every case. Made with ggplot.

Figure 2: Simulated data from the VECM / conditional ARDL specifications (degenerate case of type 2, $a_{yy}=0$), for every case. Made with ggplot.

Figure 3: Flowchart of the sim_vecm_ardl function inner steps. When applying (7) and (8), $y_{t-j}=0$, $\Delta y_{t-j}=0$, $x_{t-j}=0$, $\Delta x_{t-j}=0$ for any $t-j<1$. Boxes denote parameter definitions and transformations. Circles denote crucial actions. Empty nodes denote function inputs.

Bootstrapping the ARDL bound tests: the boot_ardl function

This function implements the bootstrap procedure detailed previously. As an option in the initial estimation phase, it offers the possibility of automatically choosing the best order for the lagged differences of all the variables in the ARDL and VECM models. This is done by using several criteria. In particular, AIC, BIC, AICc, $R^2$ and adjusted $R^2$ are used as lag selection criteria for the ARDL model, while the overall minimum between AIC, HQIC, SC and FPE is used for the lag selection for the VECM.
In particular, the auto_ardl function in the package ARDL selects the best ARDL order in terms of the short-run parameter vectors $\gamma_{y.x,j}$, while the VARselect function in the package vars selects the best VECM order in terms of the short-run parameter matrices $\Gamma_{(x),j}$. Furthermore, the user can input a significance threshold for the retention of single parameters in the $\Gamma_j$ matrices and in the $\gamma_{y.x,j}$ vectors.
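The idea of selecting a lag order by information criteria can be illustrated with a small base R sketch on a univariate autoregression in first differences. This is only an analogy under simplifying assumptions: it is neither the search performed by auto_ardl nor the one performed by VARselect, and the data are simulated.

```r
set.seed(1)
y  <- cumsum(rnorm(200))   # simulated I(1) series
dy <- diff(y)

max_lag <- 5
# Column 1 holds dy_t, column j + 1 holds dy_{t-j};
# using one common sample keeps the criteria comparable across lags
E <- embed(dy, max_lag + 1)

fits <- lapply(seq_len(max_lag), function(p) {
  lm(E[, 1] ~ E[, 2:(p + 1), drop = FALSE])
})

ic <- data.frame(lag = seq_len(max_lag),
                 AIC = sapply(fits, AIC),
                 BIC = sapply(fits, BIC))

best_aic <- ic$lag[which.min(ic$AIC)]
best_bic <- ic$lag[which.min(ic$BIC)]
print(ic)
```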
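The calls to boot_ardl shown in this paper use the arguments data, yvar, xvar, maxlag, a.ardl, a.vecm, nboot, case, a.boot.H0, print, fix.ardl and fix.vecm.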

boot_ardl makes use of the lag_mts function, which produces lagged versions of a given matrix of time series, each column with a separate order. lag_mts takes as parameters the data included in a matrix X and the lag orders in a vector k, with the addition of a boolean parameter last.only, which specifies whether only the k-th order lags have to be retained, or all the lag orders from the first to the k-th.
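The behavior described above can be mimicked in base R. The helper below, `lag_cols`, is a hypothetical re-implementation written for illustration: its name, its column naming scheme and its handling of missing values are assumptions and do not reproduce the actual lag_mts code.

```r
# Lag each column of X by its own order; k must contain positive integers
lag_cols <- function(X, k, last.only = FALSE) {
  X <- as.matrix(X)
  stopifnot(length(k) == ncol(X), all(k >= 1), all(k < nrow(X)))
  if (is.null(colnames(X))) colnames(X) <- paste0("V", seq_len(ncol(X)))
  out <- list()
  for (i in seq_len(ncol(X))) {
    # either only the k-th lag, or all lags from 1 to k
    lags <- if (last.only) k[i] else seq_len(k[i])
    for (l in lags) {
      nm <- paste0(colnames(X)[i], ".l", l)
      out[[nm]] <- c(rep(NA, l), head(X[, i], -l))
    }
  }
  do.call(cbind, out)
}

X <- cbind(a = 1:5, b = 11:15)
all_lags  <- lag_cols(X, k = c(2, 1))
last_lags <- lag_cols(X, k = c(2, 1), last.only = TRUE)
print(all_lags)
print(last_lags)
```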
boot_ardl also acts as a wrapper for the most common methodologies detecting cointegration, offering a comprehensive view on the testing procedures involved in the analysis. The resulting object, of class bootCT, stores the estimated models and the test results, which can be inspected through the summary method.

Internally, the bootstrap data generation under the null is executed via an Rcpp function, employing the Rcpp and RcppArmadillo packages, so as to greatly speed up computational times. As explained in the previous section, cointegration tests in the unconditional ARDL model are performed in order to uncover the presence of spurious cointegrating relationships.
To this end, the function reports the outcome of the Find bootstrap test in the unconditional model alongside the conditional one, and the "cointARDL" summary flags spurious cointegration when it is detected.

A summary method has been implemented to present the results in a visually clear manner. It accepts the additional argument "out" that lets the user choose which output(s) to visualize: ARDL prints the conditional ARDL model summary, VECM prints the VECM model summary, cointARDL prints the summary of the bound tests and the bootstrap tests, cointVECM prints the summary of the Johansen test on the independent variables.
A detailed flowchart showing the function’s workflow is displayed in Figure 4. There, the expressions "C ARDL" and "UC ARDL" stand for conditional and unconditional ARDL model, respectively.

Figure 4: Flowchart of the boot_ardl function inner steps. Boxes denote parameter definitions and transformations. Diamonds denote function outputs. Dashed diamonds denote intermediate output (not shown after function call). Empty nodes denote function inputs. The first $p+1$ rows of $z_t^{(b)}$ are set equal to the first $p+1$ rows of the original data. The best lag order for each difference variable in the ARDL model is determined via auto_ardl(). It is reported as a unique value $p$ in $\gamma_{y.x,j}$ for brevity in the flowchart.

Execution time and technical remarks

In order to investigate the sensitivity of the procedure to different sample sizes and number of bootstrap replicates, an experiment has been run using a three-dimensional time series of length T={50,80,100,200,500}, generating 100 datasets for each sample size with the sim_vecm_ardl function (Case II, with cointegrated variables, and 2 lags in the short-run section of the model).
Then, the boot_ardl function has been called as follows:

    boot_ardl(data = df_sim,
              nboot = bootr,
              case = 2,
              fix.ardl = rep(2, 3),
              fix.vecm = 2)

In the code above, bootr has been set equal to B={200,500,1000,2000}, the number of lags has been assumed known (fix.ardl and fix.vecm), while default values have been used for every other argument (such as a.ardl, a.vecm and a.boot.H0).
Table 1 shows the average running time per replication together with the coefficient of variation (%) of the bootstrap critical values of the Fov test, for each value of T and B, across 100 replications for each scenario.
Naturally, the running time increases as both sample size and bootstrap replicates increase. However, the coefficients of variation tend to stabilize for $B \geq 1000$, especially for $T > 80$, at the 5% significance level. Therefore, at least $B = 1000$ bootstrap replicates are recommended for larger samples, and at least $B = 2000$ for smaller samples. The analysis has been carried out using an Intel(R) Core(TM) i7-1165G7 CPU @ 2.80GHz processor, 16GB of RAM.

T     B      Exec. Time (sec)   cv(Fov) 5%   cv(Fov) 2.5%   cv(Fov) 1%
50    200    23.38              8.648        10.925         13.392
50    500    48.37              6.312        6.952          8.640
50    1000   96.65              4.806        5.613          6.288
50    2000   231.15             4.255        4.226          4.946
80    200    23.46              7.251        8.936          11.263
80    500    50.19              4.998        6.220          7.946
80    1000   143.00             3.882        4.453          5.305
80    2000   255.64             2.912        3.623          4.518
100   200    37.89              7.707        8.583          10.955
100   500    52.86              4.691        5.304          7.557
100   1000   184.51             3.512        4.567          5.695
100   2000   212.65             3.519        3.674          4.185
200   200    35.46              6.644        7.173          10.365
200   500    76.78              4.734        5.355          6.225
200   1000   148.25             3.124        4.177          5.034
200   2000   484.51             2.811        3.361          3.907
500   200    54.47              6.641        8.694          10.414
500   500    133.17             5.137        5.816          6.408
500   1000   271.87             3.905        4.585          5.283
500   2000   561.71             3.221        3.490          4.145

Table 1: Average execution times (in seconds) of the boot_ardl function, for different combinations of sample size T and bootstrap replicates B. Coefficients of variation (cv) reported for the Fov bootstrap critical values at level 5%, 2.5% and 1%.
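For reference, a coefficient of variation in percentage terms, like the ones reported in Table 1, is the ratio between standard deviation and mean multiplied by 100. The base R sketch below uses simulated critical values as a stand-in for the 100 bootstrap critical values obtained in one scenario; the helper name `cv_percent` is hypothetical.

```r
# Coefficient of variation in percent
cv_percent <- function(x) 100 * sd(x) / mean(x)

set.seed(7)
# Stand-in for 100 bootstrap critical values of the Fov test
crit_values <- rnorm(100, mean = 4.5, sd = 0.2)

cv_sim <- cv_percent(crit_values)
print(round(cv_sim, 3))
```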

5 Empirical applications

This section provides two illustrative applications which highlight the performance of the bootstrap ARDL tests.

An application to the German macroeconomic dataset

In the first example, the occurrence of a long-run relationship between consumption [C], income [INC], and investment [INV] of Germany has been investigated via a set of ARDL models, where each variable takes in turn the role of dependent one, while the remaining are employed as independent. The models have been estimated on a dataset which includes quarterly data of the series over the years 1960 to 1982. The data have been employed in logarithmic form. Figure 5 displays these series over the sample period.
Before applying the bootstrap procedure, the order of integration of each series has been analyzed. Table 2 shows the results of the ADF test performed on both the series and their first differences (k=3 maximum lags). The results confirm the applicability of the ARDL framework as no series is integrated of order higher than one.
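The regression underlying an ADF statistic can be written in a few lines of base R. The sketch below is a simplified stand-in, not the routine used for Table 2: it includes a constant but no trend, fixes the number of lagged differences at k, runs on simulated series, and the helper name `adf_stat` is hypothetical. The resulting statistic has to be compared with Dickey-Fuller critical values rather than with Student t quantiles.

```r
# t statistic on the lagged level in
# dx_t = a + rho * x_{t-1} + sum_{j=1}^{k} phi_j * dx_{t-j} + e_t
adf_stat <- function(x, k = 3) {
  x  <- as.numeric(x)
  dx <- diff(x)
  E  <- embed(dx, k + 1)                  # col 1: dx_t, others: lagged dx
  x_lag <- x[(k + 1):(length(x) - 1)]     # x_{t-1} aligned with dx_t
  fit <- lm(E[, 1] ~ x_lag + E[, -1, drop = FALSE])
  coef(summary(fit))["x_lag", "t value"]
}

set.seed(123)
x_stat <- as.numeric(arima.sim(model = list(ar = 0.2), n = 500))  # stationary
x_rw   <- cumsum(rnorm(500))                                      # random walk

stat_level <- adf_stat(x_rw, k = 3)
stat_diff  <- adf_stat(diff(x_rw), k = 3)
stat_ar    <- adf_stat(x_stat, k = 3)
```

For a series integrated of order one, the statistic on the first differences is expected to be strongly negative, while the one on the levels is not.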
The following ARDL equations have been estimated:

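  1. First ARDL equation (C | INC, INV):
    $$\Delta \log C_t = \alpha_{0.y} - a_{yy} \log C_{t-1} - a_{y.x_1} \log INC_{t-1} - a_{y.x_2} \log INV_{t-1} + \sum_{j=1}^{p-1} \gamma_{y.j} \Delta \log C_{t-j} + \sum_{j=1}^{s-1} \gamma_{x_1.j} \Delta \log INC_{t-j} + \sum_{j=1}^{r-1} \gamma_{x_2.j} \Delta \log INV_{t-j} + \omega_1 \Delta \log INC_t + \omega_2 \Delta \log INV_t + \nu_t.$$

  2. Second ARDL equation (INC | C, INV):
    $$\Delta \log INC_t = \alpha_{0.y} - a_{yy} \log INC_{t-1} - a_{y.x_1} \log C_{t-1} - a_{y.x_2} \log INV_{t-1} + \sum_{j=1}^{p-1} \gamma_{y.j} \Delta \log INC_{t-j} + \sum_{j=1}^{s-1} \gamma_{x_1.j} \Delta \log C_{t-j} + \sum_{j=1}^{r-1} \gamma_{x_2.j} \Delta \log INV_{t-j} + \omega_1 \Delta \log C_t + \omega_2 \Delta \log INV_t + \nu_t.$$

  3. Third ARDL equation (INV | C, INC):
    $$\Delta \log INV_t = \alpha_{0.y} - a_{yy} \log INV_{t-1} - a_{y.x_1} \log C_{t-1} - a_{y.x_2} \log INC_{t-1} + \sum_{j=1}^{p-1} \gamma_{y.j} \Delta \log INV_{t-j} + \sum_{j=1}^{s-1} \gamma_{x_1.j} \Delta \log C_{t-j} + \sum_{j=1}^{r-1} \gamma_{x_2.j} \Delta \log INC_{t-j} + \omega_1 \Delta \log C_t + \omega_2 \Delta \log INC_t + \nu_t.$$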
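An equation of this form can be estimated by OLS, and the Fov, t and Find statistics obtained from nested regressions. The base R sketch below uses simulated cointegrated data and a single lagged difference per variable; it only shows how the statistics are computed and does not provide critical values or reproduce the boot_ardl output. All object names are hypothetical.

```r
set.seed(123)
n  <- 120
x1 <- cumsum(rnorm(n))
x2 <- cumsum(rnorm(n))
y  <- 0.5 * x1 + 0.3 * x2 + rnorm(n)    # cointegrated with x1 and x2

d    <- function(v) c(NA, diff(v))       # first difference
lag1 <- function(v) c(NA, head(v, -1))   # one-period lag

df <- na.omit(data.frame(
  dy     = d(y),
  y_l1   = lag1(y),  x1_l1 = lag1(x1), x2_l1 = lag1(x2),   # lagged levels
  dy_l1  = lag1(d(y)),
  dx1_l1 = lag1(d(x1)), dx2_l1 = lag1(d(x2)),              # lagged differences
  dx1    = d(x1), dx2 = d(x2)                              # instantaneous differences
))

# Unrestricted conditional ARDL equation with intercept (Case III form)
full <- lm(dy ~ y_l1 + x1_l1 + x2_l1 + dy_l1 + dx1_l1 + dx2_l1 + dx1 + dx2,
           data = df)

# Restricted regressions under the respective nulls
no_levels   <- update(full, . ~ . - y_l1 - x1_l1 - x2_l1)
no_x_levels <- update(full, . ~ . - x1_l1 - x2_l1)

F_ov   <- anova(no_levels, full)$F[2]             # all lagged levels
F_ind  <- anova(no_x_levels, full)$F[2]           # lagged independent levels
t_stat <- coef(summary(full))["y_l1", "t value"]  # lagged dependent level
```

Dropping dx1 and dx2 from the formula gives the unconditional counterpart of the same equation.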

Table 3 shows the estimation results for each ARDL and VECM model. It is worth noting that the instantaneous differences of the independent variables are highly significant in each conditional ARDL model. Thus, neglecting these variables in the ARDL equation, as happens in the unconditional version of the model, may potentially lead to biased estimates and incorrect inference. For the sake of completeness, the results of the marginal VECM estimation are also reported for each model.
The code to prepare the data, available in the package as the ger_macro dataset, is:

    data("ger_macro")
    LNDATA = apply(ger_macro[,-1], 2, log)
    col_ln = paste0("LN", colnames(ger_macro)[-1])
    LNDATA = as.data.frame(LNDATA)
    colnames(LNDATA) = col_ln

Then, the boot_ardl function is called, to perform the bootstrap tests. In the code chunk below, Model I is considered.

    set.seed(999)                            # reproducible bootstrap draws
    BCT_res_CONS = boot_ardl(data = LNDATA,
                         yvar = "LNCONS",
                         xvar = c("LNINCOME", "LNINVEST"),
                         maxlag = 5,
                         a.ardl = 0.1,
                         a.vecm = 0.1,
                         nboot = 2000,       # bootstrap replications
                         case = 3,           # case III, as in Table 4
                         a.boot.H0 = c(0.05),
                         print = TRUE)

followed by calls to the summary function:

    summary(BCT_res_CONS, out = "ARDL")
    summary(BCT_res_CONS, out = "VECM")
    summary(BCT_res_CONS, out = "cointVECM")
    summary(BCT_res_CONS, out = "cointARDL")

The first summary line displays the output in the ARDL column of Table 3 and the second column of Table 4, Model I. The second line corresponds to the VECM columns of Table 3, Model I, for the independent variables only. The information on the rank of the matrix $A_{xx}$ in Table 3 is inferred from the third line. Finally, the fourth summary line corresponds to the test results in Table 4, Model I. If spurious cointegration is detected, a textual indication is displayed at the bottom of the "cointARDL" summary.
In this example, the bootstrap and bound testing procedures agree only for Model I, where both indicate the existence of a cointegrating relationship. Additionally, no spurious cointegration is detected for this model. As for Models II and III, the null hypothesis is not rejected by the bootstrap tests, while the PSS and SMG bound tests fail to give a conclusive answer in the $F_{ind}$ test.
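The difference between the two procedures can be sketched with two small decision helpers applied to figures from Table 4. The bound test compares the statistic with a pair of thresholds and is inconclusive when the statistic falls between them, whereas the bootstrap test compares it with a single critical value. The functions `bound_decision` and `boot_decision` are hypothetical names introduced here for illustration and are not part of bootCT; they cover a single test and do not reproduce the combined Y/N/D1/D2/U outcome.

```r
# Hypothetical helpers (not bootCT functions).
# Bound test: reject beyond the I(1) bound, do not reject inside the I(0) bound,
# inconclusive in between.
bound_decision <- function(stat, i0, i1, type = c("F", "t")) {
  type <- match.arg(type)
  # the t test rejects in the left tail, so flip signs to reuse the F logic
  if (type == "t") { stat <- -stat; i0 <- -i0; i1 <- -i1 }
  if (stat > i1) "reject H0" else if (stat < i0) "do not reject H0" else "inconclusive"
}

# Bootstrap test: a single critical value, so no inconclusive region.
boot_decision <- function(stat, crit, type = c("F", "t")) {
  type <- match.arg(type)
  reject <- if (type == "t") stat < crit else stat > crit
  if (reject) "reject H0" else "do not reject H0"
}

# Model II, F_ind test (Table 4): statistic 3.308, bounds (3.01, 5.42), bootstrap c.v. 7.38
m2_bound <- bound_decision(3.308, i0 = 3.01, i1 = 5.42, type = "F")
m2_boot  <- boot_decision(3.308, crit = 7.38, type = "F")

# Model I, t test (Table 4): statistic -5.608, bounds (-2.86, -3.53), bootstrap c.v. -2.88
m1_bound <- bound_decision(-5.608, i0 = -2.86, i1 = -3.53, type = "t")
m1_boot  <- boot_decision(-5.608, crit = -2.88, type = "t")

c(m2_bound = m2_bound, m2_boot = m2_boot, m1_bound = m1_bound, m1_boot = m1_boot)
```

For Model II the bound test on $F_{ind}$ lands in the inconclusive region, while the bootstrap critical value yields a definite non-rejection, which is the pattern described above.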
The running time of the entire analysis is roughly 11 minutes on an Intel(R) Core(TM) i7-1165G7 CPU @ 2.80GHz processor with 16GB of RAM.

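Table 2: ADF preliminary test (null hypothesis: random walk with drift).

| Series | Lag | ADF (level) | p-value (level) | ADF (first difference) | p-value (first difference) |
|---|---|---|---|---|---|
| $\log\text{C}_t$ | 0 | -1.690 | 0.450 | -9.750 | <0.01 |
| | 1 | -1.860 | 0.385 | -5.190 | <0.01 |
| | 2 | -1.420 | 0.549 | -3.130 | 0.030 |
| | 3 | -1.010 | 0.691 | -2.720 | 0.080 |
| $\log\text{INC}_t$ | 0 | -2.290 | 0.217 | -11.140 | <0.01 |
| | 1 | -1.960 | 0.345 | -7.510 | <0.01 |
| | 2 | -1.490 | 0.524 | -5.120 | <0.01 |
| | 3 | -1.310 | 0.587 | -3.290 | 0.020 |
| $\log\text{INV}_t$ | 0 | -1.200 | 0.625 | -8.390 | <0.01 |
| | 1 | -1.370 | 0.565 | -5.570 | <0.01 |
| | 2 | -1.360 | 0.570 | -3.300 | 0.020 |
| | 3 | -1.220 | 0.619 | -3.100 | 0.032 |
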

Figure 5: log-consumption/investment/income graphs (level variables and first differences). Made with ggplot.

Table 3: Conditional ARDL and VECM results for the consumption/income/investment dataset, along with rank of the Axx matrix via the Johansen (J) test.
Significance codes: (***) 1%; (**) 5%; (.) 10%.
Model I Model II Model III
ARDL VECM ARDL VECM ARDL VECM
ΔlogCt ΔlogINVt ΔlogINCt ΔlogINCt ΔlogCt ΔlogINVt ΔlogINVt ΔlogCt ΔlogINCt
logCt1 -0.307 *** (0.055) 0.168 * (0.081) -0.0011 (0.0126) 0.1286 * (0.0540) 0.611 . (0.339) -0.2727 *** (0.0704) -0.0508 (0.0796)
logINCt1 0.297 *** (0.055) 0.124 * (0.054) -0.017 (0.014) -0.183 * (0.079) -0.491 (0.340) 0.2619 *** (0.0681) 0.0464 (0.0772)
logINVt1 -0.001 (0.011) -0.152 * (0.063) 0.016 (0.017) 0.0209 (0.0135) -0.00107 (0.0142) -0.1531 * (0.0607) -0.1212 * (0.060)
ΔlogCt1 -0.248 ** (0.079) 0.899 * (0.442) 0.211 . (0.113) 0.375 *** (0.1086) 0.9288 * (0.442) 1.113 * (0.441) 0.2072 . (0.1142)
ΔlogCt2 0.744 (0.431) 0.8049 . (0.4345)
ΔlogINCt1 -0.1404 (0.1095)
ΔlogINCt2 0.2675 ** (0.0958) 0.1522 . (0.0912)
ΔlogINVt1 -0.18 (0.111) 0.035 (0.029) -0.189 . (0.1097) -0.175 (0.1075) 0.0479 . (0.0282)
ΔlogINVt2 0.049 . (0.027) 0.0591 * (0.0245) 0.0578 * (0.0223) 0.0562 * (0.0266)
ΔlogCt 0.7070 *** (0.1093) 1.8540 *** (0.5425)
ΔlogINCt 0.471 *** (0.074) -0.445 *** (0.4726)
ΔlogINVt 0.065 ** (0.019) -0.0230 (0.025)
const. 0.048 *** (0.013) 0.036 (0.066) 0.033 * (0.017) 0.002 (0.018) 0.0266 . (0.0155) 0.023 (0.0666) -0.056 (0.072) 0.0517 ** (0.0157) 0.0378 * (0.0177)
J-test rk(Axx)=2 rk(Axx)=2 rk(Axx)=2
Table 4: Cointegration analysis for the three ARDL equations in the German macroeconomic data. The optimal number of ARDL lags in the short-run (in the form (y,x1,x2), matching the model definition), bootstrap critical values, bound test thresholds and test statistics for each test are shown (case III).
The outcome columns draw conclusions on each type of model (bootstrap or bound): Y = cointegrated, N = not cointegrated, D1 = degenerate of type 1, D2 = degenerate of type 2, U = inconclusive inference.

| Model | Lags | Test | Boot. critical value | PSS/SMG threshold I(0) 5% | PSS/SMG threshold I(1) 5% | Statistic | Outcome (Boot) | Outcome (Bound) |
|---|---|---|---|---|---|---|---|---|
| I | (1,0,0) | $F_{ov}$ | 3.79 | 3.79 | 4.85 | 10.75 | Y | Y |
| | | $t$ | -2.88 | -2.86 | -3.53 | -5.608 | | |
| | | $F_{ind}$ | 4.92 | 3.01 | 5.42 | 15.636 | | |
| II | (1,1,0) | $F_{ov}$ | 5.79 | 3.79 | 4.85 | 2.867 | N | U |
| | | $t$ | -3.69 | -2.86 | -3.53 | -2.315 | | |
| | | $F_{ind}$ | 7.38 | 3.01 | 5.42 | 3.308 | | |
| III | (1,1,0) | $F_{ov}$ | 5.50 | 3.79 | 4.85 | 3.013 | N | U |
| | | $t$ | -3.32 | -2.86 | -3.53 | -2.020 | | |
| | | $F_{ind}$ | 6.63 | 3.01 | 5.42 | 4.189 | | |

An application on Italian Macroeconomic Data

Following , the relationship between foreign direct investment [FDI], exports [EXP], and gross domestic product [GDP] in Italy is investigated. The data of these three yearly variables have been retrieved from the World Bank Database and cover the period from 1970 to 2020. In the analysis, the log of the variables has been used and [EXP] and [FDI] have been adjusted using the GDP deflator. Figure 6 displays these series over the sample period.
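The transformation described here (deflating a nominal series and taking logs) can be sketched as follows. The numbers are made up, and the assumption that the deflator is an index with base 100 is ours; the source does not state the base or the exact adjustment formula.

```r
# Illustrative values only (not World Bank data)
nominal  <- c(100, 110, 125)        # hypothetical nominal series, e.g. exports
deflator <- c(100, 104, 110)        # hypothetical GDP deflator, assumed base = 100

real     <- nominal / (deflator / 100)   # deflated (real) series
log_real <- log(real)                    # log transformation used in the analysis
round(data.frame(nominal, deflator, real, log_real), 4)
```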


Figure 6: log-GDP/export/investment graphs (level variables and first differences). Made with ggplot.

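Table 5 shows the outcomes of the ADF test performed on each variable, which confirm that no variable is integrated of order higher than one. Table 6 shows the results of the bound and bootstrap tests performed in the ARDL models obtained by taking each variable, in turn, as the dependent one. The following ARDL equations have been estimated:

  1. First ARDL equation (GDP | EXP, FDI): $\Delta\log\text{GDP}_t=\alpha_{0.y}-a_{yy}\log\text{GDP}_{t-1}-a_{y.x_1}\log\text{EXP}_{t-1}-a_{y.x_2}\log\text{FDI}_{t-1}+\sum_{j=1}^{p-1}\gamma_{y.j}\Delta\log\text{GDP}_{t-j}+\sum_{j=1}^{s-1}\gamma_{x_1.j}\Delta\log\text{EXP}_{t-j}+\sum_{j=1}^{r-1}\gamma_{x_2.j}\Delta\log\text{FDI}_{t-j}+\omega_1\Delta\log\text{EXP}_t+\omega_2\Delta\log\text{FDI}_t+\nu_t.$ For this model, a degenerate case of the first type can be observed, while the simpler bound testing procedure does not signal cointegration.

  2. Second ARDL equation (EXP | GDP, FDI): $\Delta\log\text{EXP}_t=\alpha_{0.y}-a_{yy}\log\text{EXP}_{t-1}-a_{y.x_1}\log\text{GDP}_{t-1}-a_{y.x_2}\log\text{FDI}_{t-1}+\sum_{j=1}^{p-1}\gamma_{y.j}\Delta\log\text{EXP}_{t-j}+\sum_{j=1}^{s-1}\gamma_{x_1.j}\Delta\log\text{GDP}_{t-j}+\sum_{j=1}^{r-1}\gamma_{x_2.j}\Delta\log\text{FDI}_{t-j}+\omega_1\Delta\log\text{GDP}_t+\omega_2\Delta\log\text{FDI}_t+\nu_t.$ For this model, the ARDL bootstrap test indicates absence of cointegration, while the bound testing approach is inconclusive for the $F_{ind}$ test.

  3. Third ARDL equation (FDI | GDP, EXP): $\Delta\log\text{FDI}_t=\alpha_{0.y}-a_{yy}\log\text{FDI}_{t-1}-a_{y.x_1}\log\text{GDP}_{t-1}-a_{y.x_2}\log\text{EXP}_{t-1}+\sum_{j=1}^{p-1}\gamma_{y.j}\Delta\log\text{FDI}_{t-j}+\sum_{j=1}^{s-1}\gamma_{x_1.j}\Delta\log\text{GDP}_{t-j}+\sum_{j=1}^{r-1}\gamma_{x_2.j}\Delta\log\text{EXP}_{t-j}+\omega_1\Delta\log\text{GDP}_t+\omega_2\Delta\log\text{EXP}_t+\nu_t.$ For this model, the long-run cointegrating relationship is confirmed using both bootstrap and bound testing. No spurious cointegration is detected.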

The code to load the data and perform the analysis (e.g. for Model I) is:

    data("ita_macro")
    BCT_res_GDP = boot_ardl(data = ita_macro,
                         yvar = "LGDP",
                         xvar = c("LEXP", "LFI"),
                         maxlag = 5,
                         a.ardl = 0.1,
                         a.vecm = 0.1,
                         nboot = 2000,       # bootstrap replications
                         case = 3,           # case III, as in Table 6
                         a.boot.H0 = c(0.05),
                         print = TRUE)

For the sake of simplicity, the outputs of the conditional ARDL and marginal VECM models included in each cointegration analysis are omitted. The summary of the cointegration tests for Model I is obtained via

    summary(BCT_res_GDP, out = "ARDL") # extract lags
    summary(BCT_res_GDP, out = "cointARDL") # ARDL cointegration

This empirical application further highlights the importance of dealing with inconclusive inference via the bootstrap procedure, while naturally including the effect of conditioning in the ARDL model, as highlighted in .

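Table 5: ADF preliminary test for the second example.

| Variable | No drift, no trend: lag 0 | No drift, no trend: lag 1 | No drift, no trend: lag 2 | No drift, no trend: lag 3 | Drift, no trend: lag 0 | Drift, no trend: lag 1 | Drift, no trend: lag 2 | Drift, no trend: lag 3 | Drift and trend: lag 0 | Drift and trend: lag 1 | Drift and trend: lag 2 | Drift and trend: lag 3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| $\log\text{GDP}_t$ | 0.99 | 0.974 | 0.941 | 0.796 | <0.01 | <0.01 | <0.01 | 0.084 | 0.99 | 0.99 | 0.99 | 0.99 |
| $\log\text{FDI}_t$ | 0.572 | 0.599 | 0.675 | 0.725 | <0.01 | 0.0759 | 0.3199 | 0.5174 | <0.01 | 0.013 | 0.151 | 0.46 |
| $\log\text{EXP}_t$ | 0.787 | 0.71 | 0.698 | 0.684 | 0.479 | 0.288 | 0.467 | 0.433 | 0.629 | 0.35 | 0.463 | 0.379 |
| $\Delta\log\text{GDP}_t$ | <0.01 | <0.0164 | 0.0429 | 0.0402 | <0.01 | 0.0861 | 0.3989 | 0.4267 | <0.01 | <0.01 | 0.0166 | 0.017 |
| $\Delta\log\text{FDI}_t$ | <0.01 | <0.01 | <0.01 | <0.01 | <0.01 | <0.01 | <0.01 | <0.01 | <0.01 | <0.01 | <0.01 | <0.01 |
| $\Delta\log\text{EXP}_t$ | <0.01 | <0.01 | <0.01 | <0.01 | <0.01 | <0.01 | <0.01 | <0.01 | <0.01 | <0.01 | 0.0336 | 0.0315 |
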
Table 6: Cointegration analysis for the three ARDL equations in the Italian macroeconomic data. The optimal number of ARDL lags in the short-run (in the form (y,x1,x2), matching the model definition), bootstrap critical values, bound test thresholds and test statistics for each test are shown (case III).
The outcome columns draw conclusions on each type of model (bootstrap or bound): Y = cointegrated, N = not cointegrated, D1 = degenerate of type 1, D2 = degenerate of type 2, U = inconclusive inference.

| Model | Lags | Test | Boot. critical value | PSS/SMG threshold I(0) 5% | PSS/SMG threshold I(1) 5% | Statistic | Outcome (Boot) | Outcome (Bound) |
|---|---|---|---|---|---|---|---|---|
| I | (1,1,0) | $F_{ov}$ | 3.730 | 4.070 | 5.190 | 9.758 | D1 | N |
| | | $t$ | -2.020 | -2.860 | -3.530 | -2.338 | | |
| | | $F_{ind}$ | 3.710 | 3.220 | 5.620 | 2.273 | | |
| II | (1,0,0) | $F_{ov}$ | 5.400 | 4.070 | 5.190 | 2.649 | N | U |
| | | $t$ | -3.380 | -2.860 | -3.530 | -1.889 | | |
| | | $F_{ind}$ | 5.630 | 3.220 | 5.620 | 3.481 | | |
| III | (1,0,0) | $F_{ov}$ | 5.360 | 4.070 | 5.190 | 6.716 | Y | Y |
| | | $t$ | -3.550 | -2.860 | -3.530 | -4.202 | | |
| | | $F_{ind}$ | 6.500 | 3.220 | 5.620 | 7.017 | | |

6 Conclusion

The bootCT package allows the user to perform bootstrap cointegration tests in ARDL models, thereby overcoming the problem of inconclusive inference, a well-known drawback of standard bound tests. The function boot_ardl performs the bootstrap tests and acts as a wrapper of both the bootstrap and the standard bound tests, including the Johansen test on the independent variables of the model. It also performs the bound F-test on the lagged independent variables, so far not available in other R packages. The function sim_vecm_ardl, which allows the simulation of multivariate time series data following a user-defined DGP, enriches the available procedures for multivariate data generation, while the function lag_mts provides a supporting tool for building datasets of lagged variables. Finally, the use of Rcpp functions improves computational speed, so that the bootstrap analysis runs within an acceptable time frame.

7 Appendix

Section A - the methodological framework of (conditional) VECM and ARDL models

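Expanding the matrix polynomial $A(z)$ about $z=1$ yields

$$A(z)=A(1)z+(1-z)\Gamma(z), \tag{31}$$

where $A(1)=I_{K+1}-\sum_{j=1}^{p}A_j$ and

$$\Gamma(z)=I_{K+1}-\sum_{i=1}^{p-1}\Gamma_i z^i,\qquad \Gamma_i=-\sum_{j=i+1}^{p}A_j. \tag{32}$$

The VECM model (2) follows accordingly, and

$$\alpha_0=A(1)\mu+(\Gamma(1)-A(1))\eta,\qquad \alpha_1=A(1)\eta. \tag{33}$$

Assume that $A(1)$ is singular and that the variables $x_t$ are cointegrated. This entails the following factorization:

$$A(1)=\begin{bmatrix}\underset{(1,1)}{a_{yy}} & \underset{(1,K)}{a_{yx}'}\\ \underset{(K,1)}{a_{xy}} & \underset{(K,K)}{A_{xx}}\end{bmatrix}=\underset{(K+1,r+1)}{B}\;\underset{(r+1,K+1)}{C'}=\begin{bmatrix}b_{yy} & b_{yx}'\\ b_{xy} & B_{xx}\end{bmatrix}\begin{bmatrix}c_{yy} & c_{yx}'\\ c_{xy} & C_{xx}'\end{bmatrix}=\begin{bmatrix}b_{yy}c_{yy}+b_{yx}'c_{xy} & b_{yy}c_{yx}'+b_{yx}'C_{xx}'\\ b_{xy}c_{yy}+B_{xx}c_{xy} & b_{xy}c_{yx}'+A_{xx}\end{bmatrix},\qquad \operatorname{rk}(A(1))=\operatorname{rk}(B)=\operatorname{rk}(C), \tag{34}$$

where $B$ and $C$ are full column rank matrices arising from the rank factorization $A(1)=BC'$, with $C$ the matrix of the long-run relationships of the process, and $B_{xx}$, $C_{xx}$ arise from the rank factorization $A_{xx}=B_{xx}C_{xx}'$, with $\operatorname{rk}(A_{xx})=\operatorname{rk}(B_{xx})=\operatorname{rk}(C_{xx})=r$. If the explanatory variables are stationary, $A_{xx}$ is non-singular ($\operatorname{rk}(A_{xx})=K$), while when they are integrated but without a cointegrating relationship $A_{xx}$ is a null matrix.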
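The expansion in (31) and (32) can be checked numerically. The sketch below draws arbitrary coefficient matrices $A_1,\dots,A_p$, builds $\Gamma_i=-\sum_{j=i+1}^{p}A_j$, and compares both sides of (31) at an arbitrary point $z$. The dimension, the VAR order and the value of $z$ are illustrative choices, not values taken from the paper.

```r
set.seed(42)
K1 <- 3                                  # K + 1, dimension of z_t (illustrative)
p  <- 3                                  # VAR order (illustrative)
A  <- replicate(p, matrix(rnorm(K1^2, sd = 0.2), K1, K1), simplify = FALSE)
I  <- diag(K1)

# A(z) = I - sum_{j=1}^{p} A_j z^j
A_poly <- function(z) {
  I - Reduce(`+`, lapply(seq_len(p), function(j) A[[j]] * z^j))
}

# Gamma_i = - sum_{j=i+1}^{p} A_j, for i = 1, ..., p - 1
Gam <- lapply(seq_len(p - 1), function(i) -Reduce(`+`, A[(i + 1):p]))

# Gamma(z) = I - sum_{i=1}^{p-1} Gamma_i z^i
Gamma_poly <- function(z) {
  I - Reduce(`+`, lapply(seq_len(p - 1), function(i) Gam[[i]] * z^i))
}

z   <- 0.37
lhs <- A_poly(z)
rhs <- A_poly(1) * z + (1 - z) * Gamma_poly(z)
max(abs(lhs - rhs))                      # numerically zero if (31) holds
```

The check also makes clear why the minus sign in the definition of $\Gamma_i$ is needed: without it, the two sides differ already in the coefficient of $z$.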
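By partitioning the vectors $\alpha_0$, $\alpha_1$, the matrix $A(1)$ and the polynomial matrix $\Gamma(L)$ conformably to $z_t$, as follows

$$\alpha_0=\begin{bmatrix}\underset{(1,1)}{\alpha_{0y}}\\ \underset{(K,1)}{\alpha_{0x}}\end{bmatrix},\qquad \alpha_1=\begin{bmatrix}\underset{(1,1)}{\alpha_{1y}}\\ \underset{(K,1)}{\alpha_{1x}}\end{bmatrix} \tag{35}$$

$$A(1)=\begin{bmatrix}\underset{(1,K+1)}{a_{(y)}'}\\ \underset{(K,K+1)}{A_{(x)}}\end{bmatrix}=\begin{bmatrix}\underset{(1,1)}{a_{yy}} & \underset{(1,K)}{a_{yx}'}\\ \underset{(K,1)}{a_{xy}} & \underset{(K,K)}{A_{xx}}\end{bmatrix},\qquad \Gamma(L)=\begin{bmatrix}\underset{(1,K+1)}{\gamma_{y}'(L)}\\ \underset{(K,K+1)}{\Gamma_{(x)}(L)}\end{bmatrix}=\begin{bmatrix}\underset{(1,1)}{\gamma_{yy}(L)} & \underset{(1,K)}{\gamma_{yx}'(L)}\\ \underset{(K,1)}{\gamma_{xy}(L)} & \underset{(K,K)}{\Gamma_{xx}(L)}\end{bmatrix}, \tag{36}$$

and substituting (5) into (2) yields

$$\Delta z_t=\begin{bmatrix}\Delta y_t\\ \Delta x_t\end{bmatrix}=\begin{bmatrix}\alpha_{0.y}\\ \alpha_{0x}\end{bmatrix}+\begin{bmatrix}\alpha_{1.y}\\ \alpha_{1x}\end{bmatrix}t-\begin{bmatrix}a_{(y).x}'\\ A_{(x)}\end{bmatrix}\begin{bmatrix}y_{t-1}\\ x_{t-1}\end{bmatrix}+\begin{bmatrix}\gamma_{y.x}'(L)\\ \Gamma_{(x)}(L)\end{bmatrix}\Delta z_t+\begin{bmatrix}\omega'\Delta x_t\\ 0\end{bmatrix}+\begin{bmatrix}\nu_{yt}\\ \varepsilon_{xt}\end{bmatrix}, \tag{37}$$

where

$$\alpha_{0.y}=\alpha_{0y}-\omega'\alpha_{0x},\qquad \alpha_{1.y}=\alpha_{1y}-\omega'\alpha_{1x} \tag{38}$$

$$a_{(y).x}'=a_{(y)}'-\omega'A_{(x)},\qquad \gamma_{y.x}'(L)=\gamma_{y}'(L)-\omega'\Gamma_{(x)}(L). \tag{39}$$

According to (37), the long-run relationships of the VECM are now included in the matrix

$$\begin{bmatrix}a_{(y).x}'\\ A_{(x)}\end{bmatrix}=\begin{bmatrix}a_{yy}-\omega'a_{xy} & a_{yx}'-\omega'A_{xx}\\ a_{xy} & A_{xx}\end{bmatrix}. \tag{40}$$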

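To rule out the presence of long-run relationships between $y_t$ and $x_t$ in the marginal model, the $x_t$ variables are assumed to be exogenous with respect to the ARDL parameters, that is, $a_{xy}$ is assumed to be a null vector. Accordingly, the long-run matrix in (40) becomes

$$\tilde{A}=\begin{bmatrix}a_{yy} & a_{yx}'-\omega'A_{xx}\\ 0 & A_{xx}\end{bmatrix}=\begin{bmatrix}a_{yy} & \tilde{a}_{y.x}'\\ 0 & A_{xx}\end{bmatrix}=\begin{bmatrix}b_{yy}c_{yy} & b_{yy}c_{yx}'+(b_{yx}'-\omega'B_{xx})C_{xx}'\\ 0 & B_{xx}C_{xx}'\end{bmatrix}. \tag{41}$$

After these algebraic transformations, the ARDL equation for $\Delta y_t$ can be rewritten as in (6).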
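The effect of conditioning on the long-run matrix can be verified numerically. In the sketch below, an arbitrary $A(1)$ with $a_{xy}=0$ and an arbitrary vector $\omega$ are drawn (both are illustrative assumptions, $\omega$ is simply treated as given), and the block expression in (41) is compared with the premultiplication of $A(1)$ by the matrix that subtracts $\omega'$ times the $x$-rows from the $y$-row.

```r
set.seed(7)
K  <- 2
A1 <- matrix(rnorm((K + 1)^2), K + 1, K + 1)   # illustrative A(1); first row/column refers to y
A1[-1, 1] <- 0                                 # exogeneity assumption: a_xy = 0
omega <- c(0.4, -0.2)                          # illustrative omega (K x 1), treated as given

a_yy <- A1[1, 1]
a_yx <- A1[1, -1]                              # a'_yx (1 x K)
A_xx <- A1[-1, -1]

# a~'_{y.x} = a'_yx - omega' A_xx
a_tilde <- a_yx - as.vector(t(omega) %*% A_xx)

# Block form of the conditional long-run matrix
A_tilde <- rbind(c(a_yy, a_tilde), cbind(0, A_xx))

# Same matrix obtained by premultiplying A(1) by [1, -omega'; 0, I_K]
Tm <- rbind(c(1, -omega), cbind(0, diag(K)))
max(abs(A_tilde - Tm %*% A1))                  # numerically zero
```

The premultiplication form is convenient because it applies the same transformation to every conformably partitioned quantity, which is what (38) and (39) do term by term.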
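In light of the factorization (34) of the matrix $A(1)$, the long-run equilibrium vector $\theta$ can be expressed as

$$\theta'=-\frac{1}{a_{yy}}\underset{(1,r+1)}{\begin{bmatrix}b_{yy} & (b_{yx}'-\omega'B_{xx})\end{bmatrix}}\;\underset{(r+1,K)}{\begin{bmatrix}c_{yx}'\\ C_{xx}'\end{bmatrix}}, \tag{42}$$

where

$$\tilde{a}_{y.x}'=a_{yx}'-\omega'A_{xx}.$$

Bearing in mind that $C_{xx}$ is the cointegrating matrix for the variables $x_t$, equation (42) leads to the following conclusion:

$$\operatorname{rk}\begin{bmatrix}c_{yx}'\\ C_{xx}'\end{bmatrix}=\begin{cases}r & y_t\sim I(0)\\ r+1 & y_t\sim I(1),\end{cases} \tag{43}$$

where $r=\operatorname{rk}(A_{xx})$ and $0\le r\le K$.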

Section B - Intercept and trend specifications

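introduced five different specifications for the ARDL model, which depend on the deterministic components that can be absent or restricted to the values they assume in the parent VAR model. In this connection, note that, in light of (33), the drift and the trend coefficient in the conditional VECM (37) are defined as $$\alpha_0^c=\tilde{A}(1)(\mu-\eta)+\tilde{\Gamma}(1)\eta,\qquad \alpha_1^c=\tilde{A}(1)\eta,$$ where $\tilde{A}(1)$ is as in (41) and $\tilde{\Gamma}(1)=\begin{bmatrix}\gamma_{y.x}'(1)\\ \Gamma_{(x)}(1)\end{bmatrix}$.
Accordingly, after partitioning the mean and the drift vectors as $$\underset{(1,K+1)}{\mu'}=[\underset{(1,1)}{\mu_y},\underset{(1,K)}{\mu_x'}],\qquad \underset{(1,K+1)}{\eta'}=[\underset{(1,1)}{\eta_y},\underset{(1,K)}{\eta_x'}],$$ the intercept and the coefficient of the trend of the ARDL equation (6) are defined as $$\alpha_{0.y}^{EC}=e_1'\alpha_0^c=a_{yy}\mu_y+\tilde{a}_{y.x}'\mu_x+\gamma_{y.x}'(1)\eta=a_{yy}(\mu_y-\theta'\mu_x)+\gamma_{y.x}'(1)\eta,\qquad \theta'=-\frac{\tilde{a}_{y.x}'}{a_{yy}}$$

$$\alpha_{1.y}^{EC}=e_1'\alpha_1^c=a_{yy}\eta_y+\tilde{a}_{y.x}'\eta_x=a_{yy}(\eta_y-\theta'\eta_x),$$ where $e_1$ is the first elementary vector of dimension $K+1$.
In the error correction term $$EC_{t-1}=y_{t-1}-\theta_0-\theta_1 t-\theta'x_{t-1}$$ the parameters that partake in the calculation of intercept and trend are $$\theta_0=\mu_y-\theta'\mu_x,\qquad \theta_1=\eta_y-\theta'\eta_x.$$ In particular, the latter are non-null only when they are assumed to be restricted in the model specification.
The five specifications proposed by   are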

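  1. No intercept and no trend: $\mu=\eta=0$. It follows that $\theta_0=\theta_1=\alpha_{0.y}=\alpha_{1.y}=0$. Accordingly, the model is as in (14).

  2. Restricted intercept and no trend: $\alpha_0^c=\tilde{A}(1)\mu$, $\eta=0$, which entails $\theta_0\neq 0$, $\alpha_{0.y}^{EC}=a_{yy}\theta_0$, $\alpha_{0.y}=\theta_1=\alpha_{1.y}=0$. Therefore, the intercept stems from the EC term of the ARDL equation. The model is specified as in (15).

  3. Unrestricted intercept and no trend: $\alpha_0^c\neq\tilde{A}(1)\mu$, $\eta=0$. Thus, $\alpha_{0.y}\neq 0$, $\theta_0=\theta_1=\alpha_{1.y}=0$. Accordingly, the model is as in (16).

  4. Unrestricted intercept, restricted trend: $\alpha_0^c\neq\tilde{A}(1)(\mu-\eta)+\tilde{\Gamma}(1)\eta$, $\alpha_1^c=\tilde{A}(1)\eta$, which entails $\alpha_{0.y}\neq 0$, $\theta_0=0$, $\theta_1\neq 0$, $\alpha_{1.y}^{EC}=a_{yy}\theta_1$, $\alpha_{1.y}=0$. Accordingly, the trend stems from the EC term of the ARDL equation. The model is as in (17).

  5. Unrestricted intercept, unrestricted trend: $\alpha_0^c\neq\tilde{A}(1)(\mu-\eta)+\tilde{\Gamma}(1)\eta$, $\alpha_1^c\neq\tilde{A}(1)\eta$. Accordingly, $\alpha_{0.y}\neq 0$, $\alpha_{1.y}\neq 0$, $\theta_0=\theta_1=0$. The model is as in (18).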
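The five specifications can be summarized in a small lookup that records, for each case, whether the intercept and the trend are absent, restricted (entering through the EC term) or unrestricted (entering as free coefficients). The helper `det_terms` is a hypothetical name introduced for illustration; it is not a bootCT function and only restates the list above.

```r
# Hypothetical helper (not a bootCT function): deterministic terms by case
det_terms <- function(case) {
  stopifnot(length(case) == 1, case %in% 1:5)
  data.frame(
    case      = case,
    intercept = c("none", "restricted", "unrestricted",
                  "unrestricted", "unrestricted")[case],
    trend     = c("none", "none", "none",
                  "restricted", "unrestricted")[case]
  )
}

all_cases <- do.call(rbind, lapply(1:5, det_terms))
all_cases

# Case III is the specification used in both empirical applications (case = 3)
det_terms(3)
```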

8 CRAN packages used

bootCT, dynamac, magrittr, gtools, pracma, Rcpp, RcppArmadillo, Rmisc, ARDL, aod, vars, urca, aTSA, tseries, reshape2, ggplot2, stringr, tidyverse, dplyr, ggplot


Footnotes

  1. The R packages, either used in the creation of bootCT or employed in the analyses presented in this paper, are magrittr, gtools, pracma, Rcpp, RcppArmadillo, Rmisc, dynamac, ARDL, aod, vars and urca, aTSA, tseries, reshape2, ggplot2 and stringr, tidyverse and dplyr.
  2. If the explanatory variables are stationary, $A_{xx}$ is non-singular ($\operatorname{rk}(A_{xx})=K$), while when they are integrated but without a cointegrating relationship $A_{xx}$ is a null matrix.
  3. The knowledge of the rank of the cointegrating matrix is necessary to overcome this impasse.
  4. The latter is introduced in the ARDL equation by the operation of conditioning $y_t$ on the other variables $x_t$ of the model.
  5. In fact, as $\omega'A_{xx}x_t\sim I(0)$, the conclusion that $y_t\sim I(0)$ must hold. This in turn entails that no cointegration occurs between $y_t$ and $x_t$.
  6. If the explanatory variables are stationary, $A_{xx}$ is non-singular ($\operatorname{rk}(A_{xx})=K$), while when they are integrated but without a cointegrating relationship $A_{xx}$ is a null matrix.

References

K. R. Abbasi, M. Shahbaz, Z. Jiao and M. Tufail. How energy consumption, industrial growth, urbanization, and CO2 emissions affect economic growth in pakistan? A novel dynamic ARDL simulations approach. Energy, 221: 119793, 2021. DOI 10.1016/j.energy.2021.119793.
M. A. Arranz and A. Escribano. Cointegration testing under structural breaks: A robust extended error correction model. Oxford Bulletin of Economics and Statistics, 62(1): 23–52, 2000. DOI 10.1111/1468-0084.00158.
S. M. Bache and H. Wickham. magrittr: A forward-pipe operator for r. 2022. URL https://CRAN.R-project.org/package=magrittr. R package version 2.0.3.
S. Bertelli, G. Vacca and M. Zoia. Bootstrap cointegration tests in ARDL models. Economic Modelling, 116: 105987, 2022. DOI 10.1016/j.econmod.2022.105987.
B. Bolker, G. R. Warnes and T. Lumley. gtools: Various r programming tools. 2022. URL https://CRAN.R-project.org/package=gtools. R package version 3.9.4.
H. W. Borchers. pracma: Practical numerical math functions. 2022. URL https://CRAN.R-project.org/package=pracma. R package version 2.4.2.
S. Cook. The power of single equation tests for cointegration. Applied Economics Letters, 13(5): 265–267, 2006. DOI 10.1080/13504850500398534.
R. Davidson and J. G. MacKinnon. The case against JIVE. Journal of Applied Econometrics, 21(6): 827–833, 2005. DOI 10.1002/jae.873.
D. Eddelbuettel. Seamless R and C++ integration with Rcpp. New York: Springer, 2013. DOI 10.1007/978-1-4614-6868-4. ISBN 978-1-4614-6867-7.
D. Eddelbuettel, R. Francois, D. Bates, B. Ni and C. Sanderson. RcppArmadillo: Rcpp integration for the Armadillo templated linear algebra library. 2023. URL https://CRAN.R-project.org/package=RcppArmadillo. R package version 0.12.4.0.0.
R. F. Engle and C. W. Granger. Co-integration and error correction: Representation, estimation, and testing. Econometrica: journal of the Econometric Society, 251–276, 1987. DOI 10.2307/1913236.
R. F. Engle and B. S. Yoo. Forecasting and testing in co-integrated systems. Journal of Econometrics, 35(1): 143–159, 1987. DOI 10.1016/0304-4076(87)90085-6.
N. R. Ericsson and J. G. MacKinnon. Distributions of error correction tests for cointegration. The Econometrics Journal, 5(2): 285–318, 2002. DOI 10.1111/1368-423X.00085.
V. J. Gabriel, Z. Psaradakis and M. Sola. A simple method of testing for cointegration subject to multiple regime changes. Economics Letters, 76(2): 213–221, 2002.
M. Haseeb, I. S. Z. Abidin, Q. M. A. Hye and N. H. Hartani. The impact of renewable energy on economic well-being of malaysia: Fresh evidence from auto regressive distributed lag bound testing approach. International Journal of Energy Economics and Policy, 9(1): 269, 2019. DOI 10.32479/ijeep.7229.
R. M. Hope. Rmisc: Ryan miscellaneous. 2022. URL https://CRAN.R-project.org/package=Rmisc. R package version 1.5.1.
H. I. Hussain, M. A. Salem, A. Z. A. Rashid and F. Kamarudin. Environmental impact of sectoral energy consumption on economic growth in malaysia: Evidence from ARDL bound testing approach. Ekoloji Dergisi, (107): 2019.
S. Johansen. Estimation and hypothesis testing of cointegration vectors in gaussian vector autoregressive models. Econometrica: journal of the Econometric Society, 1551–1580, 1991. DOI 10.2307/2938278.
S. Jordan and A. Q. Philips. Dynamac: Dynamic simulation and testing for single-equation ARDL models. 2020. URL https://CRAN.R-project.org/package=dynamac. R package version 0.1.11.
A. Kanioura and P. Turner. Critical values for an f-test for cointegration in a multivariate model. Applied Economics, 37(3): 265–270, 2005. DOI 10.1080/00036840412331315051.
J. J. Kremers, N. R. Ericsson and J. J. Dolado. The power of cointegration tests. Oxford bulletin of economics and statistics, 54(3): 325–348, 1992. DOI 10.1111/j.1468-0084.1992.tb00005.x.
S. Kripfganz and D. C. Schneider. Response surface regressions for critical value bounds and approximate p-values in equilibrium correction models 1. Oxford Bulletin of Economics and Statistics, 82(6): 1456–1481, 2020. DOI 10.1111/obes.12377.
M. Lesnoff and R. Lancelot. aod: Analysis of overdispersed data. 2012. URL https://cran.r-project.org/package=aod. R package version 1.3.2.
H. Lütkepohl. New introduction to multiple time series analysis. Springer Science & Business Media, 2005. DOI 10.1007/978-3-540-27752-1.
J. G. Mackinnon. Critical values for cointegration tests. In Eds.), long-run economic relationship: Readings in cointegration, 1991. Oxford Press.
G. S. Maddala and I.-M. Kim. Unit roots, cointegration, and structural change. 1998. DOI 10.1017/CBO9780511751974.
R. McNown, C. Y. Sam and S. K. Goh. Bootstrapping the autoregressive distributed lag test for cointegration. Applied Economics, 50(13): 1509–1521, 2018. DOI 10.1080/00036846.2017.1366643.
A. N. Menegaki. The ARDL method in the energy-growth nexus field; best implementation strategies. Economies, 7(4): 105, 2019. DOI 10.3390/economies7040105.
T. C. Mills and E. J. Pentecost. The real exchange rate and the output response in four EU accession countries. Emerging Markets Review, 2(4): 418–430, 2001. DOI 10.1016/S1566-0141(01)00027-9.
P. K. Narayan. The saving and investment nexus for china: Evidence from cointegration tests. Applied economics, 37(17): 1979–1990, 2005. DOI 10.1080/00036840500278103.
P. K. Narayan and R. Smyth. Crime rates, male youth unemployment and real income in australia: Evidence from granger causality tests. Applied Economics, 36(18): 2079–2095, 2004. DOI 10.1080/0003684042000261842.
K. Natsiopoulos and N. Tzeremes. ARDL: ARDL, ECM and bounds-test for cointegration. 2021. URL https://CRAN.R-project.org/package=ARDL. R package version 0.1.1.
M. H. Pesaran, Y. Shin and R. J. Smith. Bounds testing approaches to the analysis of level relationships. Journal of applied econometrics, 16(3): 289–326, 2001. DOI 10.1002/jae.616.
B. Pfaff. Analysis of integrated and cointegrated time series with r. Second New York: Springer, 2008a. URL https://www.pfaffikus.de. ISBN 0-387-27960-1.
B. Pfaff. VAR, SVAR and SVEC models: Implementation within R package vars. Journal of Statistical Software, 27(4): 2008b. URL https://www.jstatsoft.org/v27/i04/.
D. Qiu. aTSA: Alternative time series analysis. 2015. URL https://CRAN.R-project.org/package=aTSA. R package version 3.1.2.
A. M. Reda and E. Nourhan. Using the ARDL bound testing approach to study the inflation rate in egypt. Economic consultant, (3 (31)): 24–41, 2020. DOI 10.46224/ecoc.2020.3.2.
C. Y. Sam, R. McNown and S. K. Goh. An augmented autoregressive distributed lag bounds test for cointegration. Economic Modelling, 80: 130–141, 2019. DOI 10.1016/j.econmod.2018.11.001.
A. Trapletti and K. Hornik. tseries: Time series analysis and computational finance. 2023. URL https://CRAN.R-project.org/package=tseries. R package version 0.10-54.
G. Vacca and S. Bertelli. bootCT: Bootstrapping the ARDL tests for cointegration. 2023. R package version 2.0.0.
H. Wickham. ggplot2: Elegant graphics for data analysis. Springer-Verlag New York, 2016. URL https://ggplot2.tidyverse.org.
H. Wickham. Reshaping data with the reshape package. Journal of Statistical Software, 21(12): 1–20, 2007. URL http://www.jstatsoft.org/v21/i12/.
H. Wickham. stringr: Simple, consistent wrappers for common string operations. 2022. URL https://CRAN.R-project.org/package=stringr. R package version 1.5.0.
H. Wickham, M. Averick, J. Bryan, W. Chang, L. D. McGowan, R. François, G. Grolemund, A. Hayes, L. Henry, J. Hester, et al. Welcome to the tidyverse. Journal of Open Source Software, 4(43): 1686, 2019. DOI 10.21105/joss.01686.
H. Wickham, R. François, L. Henry, K. Müller and D. Vaughan. dplyr: A grammar of data manipulation. 2023. URL https://CRAN.R-project.org/package=dplyr. R package version 1.1.2.
V. Yilanci, S. Bozoklu and M. S. Gorus. Are BRICS countries pollution havens? Evidence from a bootstrap ARDL bounds testing approach with a fourier function. Sustainable Cities and Society, 55: 102035, 2020. DOI 10.1016/j.scs.2020.102035.

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

Citation

For attribution, please cite this work as

Vacca, et al., "bootCT: An R Package for Bootstrap Cointegration Tests in ARDL Models", The R Journal, 2025

BibTeX citation

@article{RJ-2024-003,
  author = {Vacca, Gianmarco and Zoia, Maria and Bertelli, Stefano},
  title = {bootCT: An R Package for Bootstrap Cointegration Tests in ARDL Models},
  journal = {The R Journal},
  year = {2025},
  note = {https://doi.org/10.32614/RJ-2024-003},
  doi = {10.32614/RJ-2024-003},
  volume = {16},
  issue = {1},
  issn = {2073-4859},
  pages = {39-66}
}