lpirfs: An R Package to Estimate Impulse Response Functions by Local Projections

Philipp Adämmer

1 Introduction

Since the seminal paper of (Sims 1980), analysing economic time series by Vector Auto Regressive (VAR) models has become a main pillar in empirical macroeconomic analysis. VARs have been traditionally used to recover structural shocks in order to estimate their propagating effects on economic variables. This approach, however, has been criticized for several drawbacks such as the imposed dynamics on the (economic) system, the curse of dimensionality and the more difficult application to nonlinearities (Auerbach and Gorodnichenko 2013).

Estimating impulse response functions using local projections (LPs) has become an appealing alternative, which is reflected in the over 1,000 citations of the pioneering paper by (Jordà 2005). Instead of extrapolating the parameters into increasingly distant horizons, LPs estimate the parameters sequentially at each point of interest. It is argued that LPs offer three advantages over the traditional structural vector autoregressive (SVAR) approach: first, LPs are easier to estimate since they rely solely on simple linear regressions; second, the point or joint-wise inference is easily conducted; and third, impulse responses that are estimated using LPs are more robust when a (linear) VAR is misspecified (Jordà 2005). Although the latter argument has been questioned by (Kilian and Kim 2011), the recent study by (Brugnolini 2018) shows equal and even better performance of LPs when the lag lengths for each forecast horizon are adequately fixed. Yet, (Plagborg-Møller and Wolf 2019) proved that LPs and VAR models estimate the same impulse responses when the lag structures are unrestricted. This finding implies that empirical impulse responses that are estimated using LPs and SVARs are likely similar at short horizons but differ at longer ones.¹

Since their introduction in 2005, LPs have been broadly applied to investigate, among others, the macroeconomic effects of oil price shocks (Hamilton 2011); state-dependent government spending multipliers (Auerbach and Gorodnichenko 2012; Auerbach and Gorodnichenko 2013; Owyang et al. 2013); the effects of monetary policy on financial markets and economic aggregates (Tenreyro and Thwaites 2016; Swanson 2017; Jordà et al. 2019); and the link between credit growth, monetary policy, house prices, and financial stability (Favara and Imbs 2015; Jordà et al. 2015; Jordà and Taylor 2016). Apart from the different research questions, these studies further differ regarding the data structures because they use panel and nonpanel data.

Despite the rising popularity and applications, no R package yet exists that can estimate impulse responses using LPs. The only exception is the code for the smooth LP approach by (Barnichon and Brownlees 2019). The approach reduces the variance of the LP parameters with a linear B-spline basis function because LP coefficients can suffer from high variance, sometimes making the interpretation more difficult. The code is partly available on GitHub and has been applied by (Garı́n et al. 2019). The vars package by (Pfaff 2008) only allows estimating impulse response functions that are based on the traditional SVAR approach.

As a remedy, this paper introduces lpirfs (Adämmer 2019), a fast and flexible R package that enables estimating and visualizing impulse responses using LPs for a variety of data sets. The first part of this paper outlines the theory of LPs and the differences from the traditional SVAR approach. The second part outlines the main functions and options of the package, and the last section applies lpirfs by replicating the empirical results from the economic literature.

2 Estimating impulse response functions using local projections

An SVAR with \(n\) variables can be written as follows:

\[\begin{aligned} \label{VAR_struct} \nonumber \begin{pmatrix} \beta_{11}^0 & \dots & \beta_{1n}^{0}\\ \vdots & \ddots & \vdots \\ \beta_{n1}^0 & \dots & \beta_{nn}^0 \end{pmatrix} \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix}_{t} = &\begin{pmatrix} \alpha_1 \\ \vdots \\ \alpha_n \end{pmatrix} + \begin{pmatrix} \beta_{11}^1 & \dots & \beta_{1n}^{1}\\ \vdots & \ddots & \vdots \\ \beta_{n1}^1 & \dots & \beta_{nn}^1 \end{pmatrix} \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix}_{t-1} + \dots + \\ &\begin{pmatrix} \beta_{11}^p & \dots & \beta_{1n}^{p}\\ \vdots & \ddots & \vdots \\ \beta_{n1}^p & \dots & \beta_{nn}^p \end{pmatrix} \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix}_{t-p} + \begin{pmatrix} \varepsilon_1 \\ \vdots \\ \varepsilon_n \end{pmatrix}_{t}, \nonumber \end{aligned} \tag{1}\]

which more concisely becomes the following:

\[\nonumber \mathbf{B_0} \mathbf{Y}_t = \boldsymbol{\alpha}_t + \mathbf{B(L)} \mathbf{Y}_t + \boldsymbol{\varepsilon}_t.\]

The residuals \(\boldsymbol{\varepsilon}_t\) are assumed to be white noise with zero mean.² This representation is appealing from an economic perspective because the structural shocks are contemporaneously uncorrelated, and the variables have a contemporaneous effect on each other. The contemporaneous effect is measured by the square matrix \(\mathbf{B_0}\). However, estimating this SVAR without further assumptions is not possible because of the simultaneous identification problem. Merely assuming that the structural shocks are orthogonal does not fully identify the system.

The SVAR in reduced form (henceforth VAR) equals: \[\label{reduced_VAR} \nonumber \mathbf{Y}_t = \boldsymbol{\tilde{\alpha}} + \boldsymbol{\tilde{B}(L)} \mathbf{Y}_t + \boldsymbol{u}_t, \tag{2}\]

where \[\boldsymbol{\tilde{\alpha}} = \mathbf{B_0}^{-1}\boldsymbol{\alpha}, \ \boldsymbol{\tilde{B}(L)} = \mathbf{B_0}^{-1} \mathbf{B(L)}\]
and \[E[\boldsymbol{u}_t, \boldsymbol{u}_\tau^{'} ]= \left\{ \begin{array}{l@{\quad \quad}l} \begin{pmatrix} \sigma_{1}^2 & \dots & \sigma_{1,n}^2 \\ \vdots & \ddots & \vdots \\ \sigma_{n,1} & \dots & \sigma_{n}^2 \end{pmatrix} ,& for \ \ t = \tau \\ \ 0, & else. \end{array} \right.\]

The coefficient matrix \(\boldsymbol{\tilde{B}(L)}\) is a nonlinear function of the contemporaneous parameter matrix \(\mathbf{B_0}\) and the structural parameter matrix \(\boldsymbol{B(L)}\). In contrast to the SVAR, the VAR residuals \(\boldsymbol{u}_t\) are contemporaneously correlated, which impedes an unbiased economic interpretation. The VAR residuals are assumed to be linked to the SVAR shocks by the following:

\[\mathbf{u_t} = \mathbf{B_0}^{-1}\boldsymbol{\varepsilon_t}, \quad \quad E[\mathbf{u_t u_t^{'}} ] = \mathsf{\Sigma}_\mathbf{u} = \mathbf{B_0}^{-1} \mathbf{B_0}^{-1'}.\]

Given that the covariance matrix of \(\boldsymbol{\varepsilon}_t\) equals the identity matrix, one must still impose \(n(n-1)/2\) restrictions to estimate the structural form. The most general approach is to separate the residuals into orthogonal shocks by calculating a Cholesky decomposition of the covariance matrix \(\mathsf{\Sigma}_u\). The first variable in such a system responds to its own exogenous shock, the second variable to the first variable plus an exogenous shock to the second variable, and so on. The results thus depend on ordering (Keating 1992). The Wold representation states that any covariance-stationary time series can be rewritten as a sum of present and past innovations. This theorem enables mapping the estimated VAR(\(p\)) coefficients recursively to the infinite-order vector moving-average coefficients (Brugnolini 2018). Impulse response functions are estimated iteratively by rewriting VAR(\(p\)) into its companion form (i.e., a VAR(1)):

\[\begin{aligned} \nonumber \hat{IR}(0) &= \mathbf{B_0}^{-1} \\ \nonumber \hat{IR}(1) &= \boldsymbol{\Phi^{1}} \mathbf{B_0}^{-1} \\ \nonumber \hat{IR}(2) &= \boldsymbol{\Phi^{2}} \mathbf{B_0}^{-1} \\ \nonumber \vdots \ \ , \end{aligned}\]

where the matrix \(\boldsymbol{\Phi}\) contains the coefficients of the VAR(1).

In his pioneering paper, (Jordà 2005) proposed an alternative approach to estimate impulse responses. His first step consists of ordinary least squares (OLS) regressions for each forecast horizon:

\[\label{eq_lp} \mathbf{y}_{t+h} = \boldsymbol{\alpha}^h + \mathbf{B}_1^h \mathbf{{y}}_{t-1} + \dots + \mathbf{B}_p^{h} \mathbf{{y}}_{t-p} + \mathbf{u}_{t+h}^h, \quad h = 0,1, \ \dots \ ,H-1, \tag{3}\]

where \(\boldsymbol{\alpha}^h\) is a vector of constants, and \(\mathbf{B}_i^h\) are parameter matrices for lag \(p\) and forecast horizon \(h\). The vector elements \(\mathbf{u}_{t+h}^h\) are autocorrelated and/or heteroscedastic disturbances. The collection of all regressions of Eq. ((3)) are called LPs. The slope matrix \(\mathbf{B}_1^h\) can be interpreted as the response of \(\mathbf{y}_{t+h}\) to a reduced form shock in \(t\) (Kilian and Kim 2011). Structural impulse responses are then estimated by the following:

\[\label{eq_lp_ir} \hat{IR}(t, h, \mathbf{d}_i) = \mathbf{\hat{B}}_1^h\mathbf{d}_i, \tag{4}\]

where \(\mathbf{d}_i = \mathbf{B}_0^{-1}.\) As in the SVAR approach, the shock matrix \(\mathbf{d}_i\) must be identified from a linear VAR. The LP approach thus does not overcome the problem of identification. Given the serial correlation of \(\mathbf{u}_{t+h}^h\), (Jordà 2005) proposed to estimate robust standard errors using the approach by (Newey and West 1987).

A great advantage of LPs is their easy extension to nonlinear frameworks. The simplest approach to separate data into two regimes is using a binary (dummy) variable. The drawback, however, is that it lowers the degrees of freedom. As a remedy, (Auerbach and Gorodnichenko 2012) proposed computing state probabilities with a logistic function that allows using all observations for the estimations. The logistic function equals the following:

\[\begin{aligned} F({z_t}) & = \ \frac{e^{(-\gamma z_t)}}{\left(1 + e^{(-\gamma z_t)}\right)}, \label{switching_variable_1} \\ \end{aligned} \tag{5}\]

\[\begin{aligned} var(z_t) & = 1, \ E(z_t) = 0, \label{switching_variable_2} \end{aligned} \tag{6}\]

where \(z_t\) is standardized so that \(\gamma\) (> 0) is scale-invariant. The value of \(\gamma\) must be provided by the user. For example, if \(z_t\) corresponds to changes in the gross domestic product (GDP) at time \(t\), an increase in \(z_t\) would lead to a decrease in \(F(z_t)\). Values close to zero of \(F(z_t)\) would thus indicate periods of economic expansion. (Auerbach and Gorodnichenko 2013) proposed standardizing the cyclical components of the filter according to the method by (Hodrick and Prescott 1997) to obtain the variable \(z_t\). The observations for the two regimes are the product of the transition function and the endogenous variables:

\[\begin{aligned} \label{regime_values} \begin{aligned} \text{Regime 1} \ (R_1) &: \mathbf{y}_{t-l}\cdot(1-F(z_{t-1})), \qquad l = 1, \ \dots \ , p, \\ \text{Regime 2} \ (R_2) &:\mathbf{y}_{t-l}\cdot F(z_{t-1}), \hskip 4.5em l = 1, \ \dots \ , p. \end{aligned} \end{aligned} \tag{7}\]

(Auerbach and Gorodnichenko 2012) used the values of the transition function at \(t-1\) to avoid contemporaneous feedback from policy actions regarding whether the economy is in a recession or an expansion. Structural nonlinear impulse responses are estimated using the following:

\[\begin{aligned} \nonumber \hat{IR}^{R_1}(t,h,\mathbf{d}_i) & = \hat{\mathbf{B}}_{1, R_1}^h \mathbf{d}_i, \ \ \ \ \ \ \ \ \ h = 0, \ \dots \ , H-1, \\ \nonumber \hat{IR}^{R_2}(t,h,\mathbf{d}_i) & = \hat{\mathbf{B}}_{1, R_2}^h \mathbf{d}_i, \ \ \ \ \ \ \ \ \ h = 0, \ \dots \ , H-1, \end{aligned}\]

where \(\hat{\mathbf{B}}_{1, R1}^0 = I\) and \(\hat{\mathbf{B}}_{1, R2}^0 = I\). The coefficient matrices \(\hat{\mathbf{B}}_{1, R_1}^h\) and \(\hat{\mathbf{B}}_{1, R_2}^h\) are obtained from the following LPs:

\[\begin{aligned} \label{nonlinear_lp} \begin{aligned} \mathbf{y}_{t+h} = \boldsymbol{\alpha}^h \ + & \ \mathbf{B}_{1, R_1}^{h} \left(\mathbf{y}_{t-1}\cdot(1-F(z_{t-1})\right) \ + \ \dots \ +\ \mathbf{B}_{p, R_1}^{h} \left(\mathbf{y}_{t-p}\cdot(1-F(z_{t-1})\right) + \\ & \ \mathbf{B}_{1, R_2}^{h} \left(\mathbf{y}_{t-1}\cdot F(z_{t-1}\right)) \ \ \quad \quad + \ \dots \ + \ \mathbf{B}_{p, R_2}^{h} \left(\boldsymbol{y}_{t-p}\cdot F(z_{t-1}\right)) \quad \quad + \ \mathbf{u}_{t+h}^h, \end{aligned} \end{aligned} \tag{8}\]

with \(h = 0, \ \dots \ , H-1.\) This nonlinear approach has been used by (Ahmed and Cassou 2016) to investigate the effect of consumer confidence on durable goods during periods of economic expansion and recession.

Estimating impulse responses with an identified shock

Besides the easy extension to nonlinear frameworks, another advantage of LPs is their flexible application to situations in which an exogenous shock can be identified outside of an SVAR. For example, (Ramey and Zubairy 2018) constructed a military news shock to investigate whether US government spending multipliers are higher during periods of economic slack or when interest rates are near the zero lower bound. Once an exogenous shock has been identified, impulse responses can be directly estimated using OLS regressions:

\[\label{eq_lp_lin_iv} y_{t+h} = \alpha^h + \beta_h shock_t + \phi x_{t} + u_{t+h}^h, \quad h = 0,1, \ \dots \ ,H-1, \tag{9}\]

where \(\alpha^h\) denotes the regression constant, \(x_{t}\) is a vector of control variables, and \(shock_t\) is the identified shock variable. The coefficient \(\beta_h\) corresponds to the response of \(y\) at time \(t + h\) to the shock variable (\(shock\)) at time \(t\). The impulse responses are the sequence of all estimated \(\beta_h\). As above, robust standard errors can be estimated using the approach by (Newey and West 1987). If the shock variable is endogenous, \(shock_t\) can be estimated using the two-stage least squares (2SLS) regression. In the case of nonlinearities, the variables can either be multiplied with a dummy variable or with the values of the transition function in Eq. ((7)).

Estimating impulse responses for panel data

Another advantage of LPs is that they can be applied to panel data as well. Estimating impulse responses based on panel data have been put forward by (Auerbach and Gorodnichenko 2013), (Owyang et al. 2013), and (Jordà et al. 2015), among others. The general equation for panel data is the following:

\[\nonumber y_{i,t+h} = \alpha_{i,h} + shock_{i,t}\beta_h + {x}_{i,t}\boldsymbol{\gamma}_h + \varepsilon_{i, t + h}, \quad h = 0,1, \ \dots \ ,H-1,\]

where \(\alpha_{i, h}\) denotes (cross-section) fixed effect, \({x}_{i,t}\) is a vector of control variables, and \(shock_{i,t}\) denotes the identified shock variable. Besides using the absolute values of \(y_t\), lpirfs also allows estimating cumulative impulse responses using (\(y_{i,t+h} - y_{i,t-1}\)) as the endogenous variable, which is often done for panel data (see, e.g., (Jordà et al. 2015)). Similar to the univariate approach, \(shock_t\) can also be first estimated by an instrument variable approach (see, e.g., (Jordà et al. 2019)). It is further crucial to account for heteroskedasticity and autocorrelation in panel models. The importance of robust standard errors in the context of corporate finance and asset pricing has been shown by (Petersen 2009).

3 The lpirfs package

lpirfs enables estimating all of the above models and specifications. The main functions of the package are the following:

lp_lin() and lp_nl(), which estimate linear and nonlinear impulse responses based on structural VARs,
lp_lin_iv() and lp_nl_iv(), which estimate linear and nonlinear impulse responses for a shock that has been identified outside of the VAR, and
lp_lin_panel() and lp_nl_panel(), which estimate linear and nonlinear impulse responses for panel data.

The functions lp_lin() and lp_nl() estimate linear and nonlinear impulse responses based on SVARs whose shocks are identified by the Cholesky decomposition. The functions lp_lin_iv() and lp_nl_iv() allow estimating impulse responses when a shock has been identified outside of the VAR. The functions lp_lin_panel() and lp_nl_panel() can be used for panel data.

Nonpanel functions rely on several routines written with Rcpp by (Eddelbuettel et al. 2011), making the computations very fast. The functions for panel data are based on the well-established plm package by (Croissant and Millo 2008). Parallel computation, which is optional and available for all functions, can further reduce the computation time.

Nonpanel functions allow computing ordinary and heteroskedasticity and autocorrelation consistent (HAC) estimators for the impulse responses based on the approach by (Newey and West 1987). By default, lpirfs increases the truncation parameter with the number of horizons \(h\), but a fixed value can also be manually provided. In addition, pre-whitening is another option that might improve the confidence interval coverage (Andrews and Monahan 1992). The user can also apply an information criterion for each forecast horizon to find the optimal number of lags. The included criteria are those developed by (Akaike 1974), (Schwarz 1978), and (Hurvich and Tsai 1989). The endogenous, exogenous, and switching variables must be given separately as a data.frame. Table 1 summarizes the input options of the nonpanel functions.

Applying lpirfs to panel data works slightly differently than using nonpanel functions due to the dependency on the plm package. Instead of providing the endogenous and exogenous variables separately, the user must provide the entire panel data set first, and then give the column names for the endogenous, exogenous, and other variables. The default is to estimate a fixed-effects model, but all options available for the plm package are also available within lpirfs. Table 2 summarizes the input options for the linear and nonlinear panel functions.

Each function output becomes an S3 object, which enables using the generic R functions plot() and summary(). In addition, the package also contains the functions plot_lin() and plot_nl(), which enable creating individual graphs of impulse responses.

Table 1: Comparison of linear and nonlinear LP functions.
	Function names
Input name	`lp_lin()`	`lp_nl()`	`lp_lin_iv()`	`lp_nl_iv()`	Input description
endog_data	\(\checkmark\)	\(\checkmark\)	\(\checkmark\)	\(\checkmark\)	Data.frame with endogenous variables for VAR model.
shock	-	-	\(\checkmark\)	\(\checkmark\)	One column data.frame with the identified shock.
use_twosls	-	-	\(\checkmark\)	-	Option to estimate shock with 2SLS approach.
instrum	-	-	\(\checkmark\)	-	Data.frame with the instrument(s) for the 2SLS approach.
lags_endog_lin	\(\checkmark\)	\(\checkmark\)	\(\checkmark\)	-	Number of lags for linear model.
lags_endog_nl	-	\(\checkmark\)	-	\(\checkmark\)	Number of lags for nonlinear model in Eq. ((8)).
lags_criterion	\(\checkmark\)	\(\checkmark\)	\(\checkmark\)	\(\checkmark\)	Choose lags based on information criterion (AICc, AIC or BIC).
max_lags	\(\checkmark\)	\(\checkmark\)	\(\checkmark\)	\(\checkmark\)	Maximum number of lags for information criterion.
trend	\(\checkmark\)	\(\checkmark\)	\(\checkmark\)	\(\checkmark\)	Options to include constant, trend and quadratic trend.
shock_type	\(\checkmark\)	\(\checkmark\)	-	-	Two types of shock: standard deviation or unit shock.
use_nw	\(\checkmark\)	\(\checkmark\)	\(\checkmark\)	\(\checkmark\)	Option to estimate standard errors by (Newey and West 1987).
nw_lag	\(\checkmark\)	\(\checkmark\)	\(\checkmark\)	\(\checkmark\)	Option to manually fix the truncation parameter.
nw_prewhite	\(\checkmark\)	\(\checkmark\)	\(\checkmark\)	\(\checkmark\)	Option for pre-whitening (Andrews and Monahan 1992).
adjust_se	\(\checkmark\)	\(\checkmark\)	\(\checkmark\)	\(\checkmark\)	Option to adjust standard errors for small samples.
confint	\(\checkmark\)	\(\checkmark\)	\(\checkmark\)	\(\checkmark\)	Value of width for confidence bands.
hor	\(\checkmark\)	\(\checkmark\)	\(\checkmark\)	\(\checkmark\)	Number of horizons for impulse responses.
switching	-	\(\checkmark\)	-	\(\checkmark\)	Switching variable \(z_t\). See Eq. (5).
lag_switching	-	\(\checkmark\)	-	\(\checkmark\)	Option to lag the values of the logistic function \(F(z_t)\).
use_logistic	-	\(\checkmark\)	-	\(\checkmark\)	Option to use the logistic function. See Eq. (5).
use_hp	-	\(\checkmark\)	-	\(\checkmark\)	Option to use the filter by (Hodrick and Prescott 1997).
lambda	-	\(\checkmark\)	-	\(\checkmark\)	Value of \(\lambda\) for the HP-filter. See (Ravn and Uhlig 2002).
gamma	-	\(\checkmark\)	-	\(\checkmark\)	Value of \(\gamma\). See Eq. ((5)).
exog_data	\(\checkmark\)	\(\checkmark\)	\(\checkmark\)	\(\checkmark\)	Optional data for exogenous variables.
lags_exog	\(\checkmark\)	\(\checkmark\)	\(\checkmark\)	\(\checkmark\)	Number of lags for exogenous variables.
contemp_data	\(\checkmark\)	\(\checkmark\)	\(\checkmark\)	\(\checkmark\)	Variables with contemporaneous impact.
num_cores	\(\checkmark\)	\(\checkmark\)	\(\checkmark\)	\(\checkmark\)	Option to choose number of cores.

The table compares the options for lp_lin(), lp_nl(), lp_lin_iv(), and lp_nl_iv(). The symbol \(\checkmark\) indicates whether the option, denoted by the row, is available for the function, denoted by the column. The optional lag length criteria are those by (Hurvich and Tsai 1989), (Akaike 1974) and (Schwarz 1978).

Table 2: Comparison of linear and nonlinear LP functions for panel data.
	Function names
Input name	`lp_lin_panel()`	`lp_nl_panel()`	Input description
data_set	\(\checkmark\)	\(\checkmark\)	A data.frame, containing the panel data set.
data_sample	\(\checkmark\)	\(\checkmark\)	Option to estimate a subset of the data.
endog_data	\(\checkmark\)	\(\checkmark\)	Character name of the endogenous variable.
cumul_mult	\(\checkmark\)	\(\checkmark\)	Option to estimate cumulative multipliers.
shock	\(\checkmark\)	\(\checkmark\)	Character name of the variable to shock with.
diff_shock	\(\checkmark\)	\(\checkmark\)	Option to use first differences of the shock variable.
iv_reg	\(\checkmark\)	-	Option to use instrument variable approach.
instrum	\(\checkmark\)	-	The name(s) of the instrument variable(s).
panel_model	\(\checkmark\)	\(\checkmark\)	Option to choose type of panel model. See plm package.
panel_effect	\(\checkmark\)	\(\checkmark\)	The effects introduced in the panel-model. See plm package.
robust_cov	\(\checkmark\)	\(\checkmark\)	Options for robust covariance matrix estimator. See plm package.
robust_method	\(\checkmark\)	\(\checkmark\)	Option for robust_cov. See plm package.
robust_type	\(\checkmark\)	\(\checkmark\)	Option for robust_cov. See plm package.
robust_cluster	\(\checkmark\)	\(\checkmark\)	Option for robust_cov. See plm package.
robust_maxlag	\(\checkmark\)	\(\checkmark\)	Option for robust_cov. See plm package.
use_gmm	\(\checkmark\)	\(\checkmark\)	Option to use GMM for estimation.
gmm_model	\(\checkmark\)	\(\checkmark\)	Option to use "onestep" or "twosteps" approach. See plm package.
gmm_effect	\(\checkmark\)	\(\checkmark\)	The effects introduced in the panel-model. See plm package.
gmm_transformation	\(\checkmark\)	\(\checkmark\)	Additional option for GMM model. See plm package.
c_exog_data	\(\checkmark\)	\(\checkmark\)	Name(s) of the exogenous variable(s) with contemporaneous impact.
l_exog_data	\(\checkmark\)	\(\checkmark\)	Name(s) of the exogenous variable(s) with lagged impact.
lags_exog_data	\(\checkmark\)	\(\checkmark\)	Lag length for the exogenous variable(s) with lagged impact.
c_fd_exog_data	\(\checkmark\)	\(\checkmark\)	Exogenous variable(s) with contemporaneous impact of first differences.
l_fd_exog_data	\(\checkmark\)	\(\checkmark\)	Exogenous variable(s) with lagged impact of first differences.
lags_fd_exog_data	\(\checkmark\)	\(\checkmark\)	Number of lags for variable(s) with impact of first differences.
confint	\(\checkmark\)	\(\checkmark\)	Value of width for confidence bands.
switching	-	\(\checkmark\)	Column name of the switching variable.
use_logistic	-	\(\checkmark\)	Option to use the logistic function. See Eq. (5).
use_hp	-	\(\checkmark\)	Option to use the filter by (Hodrick and Prescott 1997) to obtain \(z_t\).
lag_switching	-	\(\checkmark\)	Option to lag the values of the logistic function \(F(z_t)\).
lambda	-	-	Value of \(\lambda\) for the HP-filter. See (Ravn and Uhlig 2002).
gamma	-	-	Value of \(\gamma\). See Eq. ((5)).
hor	\(\checkmark\)	\(\checkmark\)	Number of horizons for impulse responses.

The table compares the options for lp_lin_panel() and lp_nl_panel(). The symbol \(\checkmark\) indicates whether the option, denoted by the row, is available for the function, denoted by the column. The functions estimate linear and nonlinear impulse responses for models with panel data.

4 Examples and replications

In this section, I apply all the main functions of lpirfs to three different settings. The impulse responses are visualized by the generic plot() function, which serves as a wrapper for the built-in functions plot_lin() and plot_nl().

Two exercises replicate empirical results by (Jordà 2005) and (Ramey and Zubairy 2018). The data sets are included in lpirfs. The third example uses the external Jordà-Schularick-Taylor Macrohistory Database, which covers 17 advanced economies since 1870 on an annual basis and comprises 25 real and nominal variables. I estimate how an increase in the interest rate affects mortgage lending. This example is based on a STATA code provided on Oscar Jordá’s website. Due to copyright issues, the database could not be included in the package, but I show how it can be easily downloaded with R.

Traditional approach: Replicating results by Jordà (2005)

The following code replicates parts of Figure 5 in Jordà (2005 176). It shows how the output gap, inflation rate, and federal funds rate react to the corresponding structural shocks. The results are shown in Figure 1.³ lpirfs follows the convention by (Jordà 2005), namely that the first horizon, denoted on the \(x\)-axis, equals \(h = 0\). Applying the generic function summary() to the output returns a list of several matrices with OLS diagnostics. The first level of the list corresponds to the shock variable and the second level to the horizon. Table 3 shows OLS diagnostics for the first horizon of the first shock (output gap).

    
# --- Code to replicate Figure 5 in Jordá (2005, p. 176)
                        
# Load packages 
  library(lpirfs)
    
# Load data set
  endog_data  <- interest_rules_var_data
        
# Estimate linear model
  results_lin <- lp_lin(endog_data     = endog_data, 
                        lags_endog_lin = 4,   
                        trend          = 0,    
                        shock_type     = 1,
                        confint        = 1.96, 
                        hor            = 12)   

# Show impulse responses
  plot(results_lin)
  
# Show OLS diagnostics for the first shock of the first horizon  
  summary(results_lin)[[1]][1] 
                
                
# --- End example code

Table 3: OLS diagnostics shown by using `summary()`.
	R.sqrd.	Adj. R.sqrd.	F.stat
h 1: GDP_gap	0.91	0.90	146.88
h 1: Infl	0.84	0.83	77.17
h 1: FF	0.94	0.93	218.74

The table shows OLS diagnostics for the (Jordà 2005) example of the first horizon for the first identified shock (output gap).

graphic without alt text — Figure 1: Replication of Figure 5 in Jordà (2005 176).

The example above only estimates impulse responses for the linear case, but (Jordà 2005) also tested for nonlinearities. Although he found no “business-cycle” asymmetries, he identified significant asymmetries for several lags of both inflation and the federal funds rate. The following code uses a dummy approach to estimate the nonlinear impulse responses of the variables to a shock in the federal funds rate. (Jordà 2005) used a threshold of 4.75% for the inflation rate, applied to its third lag. Figure 2 shows the empirical results for the nonlinear example. The results are comparable to the findings by (Jordà 2005), namely that the magnitudes of responses of inflation and output to interest rates are more responsive in the low-inflation regime (left panel) than in the high-inflation regime (right panel).

# --- Code for nonlinear effects of the federal funds rate. 

# Load packages for creating plots
  library(dplyr)
  library(gridExtra)
  library(ggpubr)
    
# Create dummy: apply threshold of 4.75 percent to the third lag of the inflation rate
  switching_data <-  if_else(dplyr::lag(endog_data$Infl, 3) > 4.75, 1, 0)

# Estimate nonlinear model 
  results_nl <- lp_nl(endog_data, 
                       lags_endog_lin = 4,              lags_endog_nl = 4,   
                       trend          = 1,              shock_type    = 0,   
                       confint        = 1.67,           hor           = 12,   
                       switching      = switching_data, lag_switching = FALSE,  
                       use_logistic   = FALSE)  
                
# Create nonlinear impulse responses
  nl_plots <- plot_nl(results_nl)   
    
# Combine and show plots using 'ggpubr' and 'gridExtra'
  single_plots      <- nl_plots$gg_s1[c(3, 6, 9)]
  single_plots[4:6] <- nl_plots$gg_s2[c(3, 6, 9)]
  all_plots         <- sapply(single_plots, ggplotGrob)
  marrangeGrob(all_plots, nrow = 3, ncol = 2, top = NULL)   
  
# --- End example code

The figure depicts nonlinear impulse responses of the output gap, inflation rate, and federal funds rate to a shock in the federal funds rate during periods of low (left panel) and high (right panel) inflation rates. The threshold of 4.75 is applied to the third lag of the inflation rate.

Using an external shock: Replicating results by Ramey and Zubairy (2018)

In this section, I replicate the empirical results by (Ramey and Zubairy 2018). The authors, among others, re-evaluate the findings by (Auerbach and Gorodnichenko 2012), who argued that government spending multipliers are more pronounced during economic recession than during economic expansion. (Auerbach and Gorodnichenko 2012) applied a smooth transition VAR (STVAR) to estimate state-dependent fiscal multipliers. (Ramey and Zubairy 2018), however, showed that the estimated fiscal multipliers are much smaller when the impulse responses are estimated using LPs. The reason is that the LP approach does not assume that the system remains in a fixed regime once it has entered it.

The following code replicates parts of Figure 12 in the supplementary appendix by Ramey and Zubairy (2018 35). The results are depicted in Figure 3. It shows how government spending and the GDP react to a government spending shock in the linear case as well as during periods of economic expansion and recession. The linear shock is identified according to (Blanchard and Perotti 2002). The absolute values of the figures differ because (Ramey and Zubairy 2018) multiplied the log output response by a conversion factor of \(5.6\).


# --- Code to replicate parts of Figure 12 in the supplementary appendix by 
# --- Ramey and Zubairy (2018, p.35)

# Load packages for creating plots
  library(gridExtra)
  library(ggpubr)

# Load data from package
  ag_data      <- ag_data
  sample_start <- 7
  sample_end   <- dim(ag_data)[1]

# Endogenous data
  endog_data <- ag_data[sample_start:sample_end,3:5]

# Shock variable
  shock      <- ag_data[sample_start:sample_end, 3]

# Estimate linear model
  results_lin_iv <- lp_lin_iv(endog_data = endog_data, lags_endog_lin = 4,
                    shock      = shock,      trend          = 0,
                    confint    = 1.96,       hor            = 20)

# Make and save linear plots
  iv_lin_plots <- plot_lin(results_lin_iv)

# Nonlinear shock (estimated by Ramey and Zubairy (2018))
  shock <- ag_data[sample_start:sample_end, 7]

# Use moving average growth rate of GDP as exogenous variable
  exog_data <- ag_data[sample_start:sample_end, 6]

# Use moving average growth rate of GDP as switching variable
  switching_variable <- ag_data$GDP_MA[sample_start:sample_end] - 0.8

# Estimate nonlinear model
  results_nl_iv <- lp_nl_iv(endog_data = endog_data, lags_endog_nl  = 3,
                            shock      = shock,      exog_data      = exog_data,
                            lags_exog  = 4,          trend          = 0,
                            confint    = 1.96,       hor            = 20,
                            switching  = switching_variable, use_hp = FALSE,
                            gamma      = 3)

# Make and save nonlinear plots
  plots_nl_iv <- plot_nl(results_nl_iv)

# Make list to save all plots 
  combine_plots <- list()

# Save linear plots in list
  combine_plots[[1]] <- iv_lin_plots[[1]]
  combine_plots[[2]] <- iv_lin_plots[[3]]

# Save nonlinear plots for expansion period
  combine_plots[[3]] <- plots_nl_iv$gg_s1[[1]]
  combine_plots[[4]] <- plots_nl_iv$gg_s1[[3]]

# Save nonlinear plots for recession period 
  combine_plots[[5]] <- plots_nl_iv$gg_s2[[1]]
  combine_plots[[6]] <- plots_nl_iv$gg_s2[[3]]
  
# Show all plots  
  lin_plots_all      <- sapply(combine_plots, ggplotGrob)
  marrangeGrob(lin_plots_all, nrow = 2, ncol = 3, top = NULL)

# --- End example code

Estimating impulse responses for panel data

Using the Jordà-Schularick-Taylor Macrohistory Database, the following example estimates the impulse responses of the ratio of mortgage lending divided by the GDP to a 1% increase in the short-term interest rate. Observations during World Wars I and II and observations after 2013 are excluded.⁴ The empirical results are shown in Figure 4. An increase in the short-term interest rate leads to a decrease in the mortgage lending rate, whose effect attenuates after approximately 8 years.

 
#--- Begin code for panel data
 
# Load libraries to download and read excel file from the website
  library(httr)
  library(readxl)
  library(dplyr)
  
# Retrieve the external JST Macrohistory Database
  url_jst <-"http://www.macrohistory.net/JST/JSTdatasetR3.xlsx"
  GET(url_jst, write_disk(jst_link <- tempfile(fileext = ".xlsx")))
  jst_data <- read_excel(jst_link, 2L)
  
# Remove observations after 2013 and swap the first two columns 
  jst_data <- jst_data                      %>%
          dplyr::filter(year <= 2013)   %>%
          dplyr::select(country, year, everything())
    
# Prepare variables
  data_set <- jst_data                                       %>%
              mutate(stir    = stir)                         %>%
          mutate(mortgdp = 100*(tmort/gdp))              %>%
          mutate(hpreal  = hpnom/cpi)                    %>%
          group_by(country)                              %>%
          mutate(hpreal  = hpreal/hpreal[year==1990][1]) %>%
          mutate(lhpreal = log(hpreal))                  %>%
    
         mutate(lhpy    = lhpreal - log(rgdppc))        %>%
          mutate(lhpy    = lhpy - lhpy[year == 1990][1]) %>%
          mutate(lhpreal = 100*lhpreal)                  %>%
          mutate(lhpy    = 100*lhpy)                     %>%
          ungroup()                                      %>%
    
          mutate(lrgdp   = 100*log(rgdppc))              %>%
          mutate(lcpi    = 100*log(cpi))                 %>%
          mutate(lriy    = 100*log(iy*rgdppc))           %>%
          mutate(cay     = 100*(ca/gdp))                 %>%
          mutate(tnmort  = tloans - tmort)               %>%
          mutate(nmortgdp = 100*(tnmort/gdp))            %>%
          dplyr::select(country, year, mortgdp, stir, ltrate, 
          lhpy, lrgdp, lcpi, lriy, cay, nmortgdp)

 
# Exclude observations from WWI and WWII
  data_sample <- seq(1870, 2016)[which(!(seq(1870, 2016) %in%
                                       c(seq(1914, 1918), 
                                         seq(1939, 1947))))]

# Estimate linear panel model with robust covariance matrix
  results_panel <- lp_lin_panel(data_set  = data_set,  data_sample  = data_sample,
                        endog_data        = "mortgdp", cumul_mult   = TRUE,
                        shock             = "stir",    diff_shock   = TRUE,
                        panel_model       = "within",  panel_effect = "individual",
                        robust_cov        = "vcovSCC", c_exog_data  = "cay",
                        c_fd_exog_data    = colnames(data_set)[c(seq(4,9),11)],
                        l_fd_exog_data    = colnames(data_set)[c(seq(3,9),11)],
                        lags_fd_exog_data = 2,      confint      = 1.67,
                        hor               = 10)
                               
# Show irfs
  plot(results_panel)

#--- End example

The following example uses the Hodrick–Prescott filter to decompose the log-GDP time series for each country to obtain the standardized variable \(z_t\) for the logistic function in Eq. (5). Figure 5 shows the impulse responses for both regimes. The mortgage lending ratio declines in both regimes, although it is more pronounced during periods of economic expansion (left panel) than periods of economic slack (right panel).


# --- Begin example 
    
# Estimate panel model
  results_panel <- lp_nl_panel(data_set   = data_set,  data_sample    = data_sample,
                        endog_data        = "mortgdp", cumul_mult     = TRUE,
                        shock             = "stir",    diff_shock     = TRUE,
                        panel_model       = "within",  panel_effect   = "individual",
                        robust_cov        = "vcovSCC", switching      = "lrgdp",
                        lag_switching     = TRUE,      use_hp         = TRUE,
                        lambda            = 6.25,      gamma          = 10,
                        c_exog_data       = "cay",     
                        c_fd_exog_data    = colnames(data_set)[c(seq(4,9),11)],
                        l_fd_exog_data    = colnames(data_set)[c(seq(3,9),11)], 
                        lags_fd_exog_data = 2,      
                        confint           = 1.67,
                        hor               = 10)

# Show irfs
  plot(results_panel)


# --- End example

5 Summary

Since the 1980s, impulse response analysis has become a cornerstone in (macro-)econometrics. The traditional approach of recovering the impulse responses recursively from a (linear) VAR has been criticized due to some drawbacks, such as the imposed dynamics on the (economic) system, the curse of dimensionality, and the more difficult application to nonlinear frameworks.

The LPs by (Jordà 2005) have become a widely applied alternative to estimate impulse response functions. This paper introduced lpirfs, an R package that provides a broad framework for estimating and visualizing impulse response functions using LPs for a variety of data sets. I replicated the empirical results from the economic literature to prove the validity of the package and to show its usefulness for future research.

6 Appendix

This appendix contains a comparison between impulse responses estimated using the vars package and lpirfs. In addition, I conduct sensitivity analyses regarding the choices of \(\gamma\) in Eq. (5) for the switching function and different values of \(\lambda\) for the filter by (Hodrick and Prescott 1997). The code for all examples can be found in the vignette of lpirfs.

Comparison of impulse responses between lpirfs and vars

(Plagborg-Møller and Wolf 2019) showed that LPs and VARs compute the same impulse responses when the lag structure is unrestricted. For empirical studies, this implies that impulse responses that are estimated by LPs and SVARs are likely to agree at short horizons but differ at longer ones. To verify this implication, I compare the impulse responses estimated using the lpirfs and vars packages. The latter relies on the common SVAR approach.

Figure 6 shows the empirical results. The black lines and gray-shaded areas are the same as in Figure 1, which replicates Figure 5 in Jordà (2005 176). The orange lines and shaded areas correspond to impulse responses estimated by the vars package (i.e., a standard SVAR). The results show that impulse responses are similar up to that horizon, which equals the lag length \(p\). For example, when the lag length \(p\) equals \(2\), the impulse responses and confidence intervals diverge very quickly (first column). When the lag length \(p\) equals \(6\), however, the impulse responses are much more similar. This finding coincides with the empirical results by (Plagborg-Møller and Wolf 2019), who compared the dynamic response of corporate bond spreads to a monetary policy shock.

Sensitivity analyses for \(\gamma\)

The nonlinear functions in lpirfs allow separating the data into two regimes by either using a dummy approach or computing state probabilities with the logistic function given in Eq. (5). The logistic function depends on the parameter \(\gamma\), which defines how sharply the two regimes are separated. To investigate how different choices of \(\gamma\) might affect the results, I compare the nonlinear impulse responses for a shock of the federal funds rate on the output gap.

Figure 7 shows the empirical results. Each column corresponds to one choice of \(\gamma\), namely \(\gamma = 1, \ 5\), and \(10\). I use the output gap as a switching variable and decompose it using the filter by (Hodrick and Prescott 1997). The penalty term \(\lambda\) is set to \(1 \ 600\) as suggested by (Ravn and Uhlig 2002). The first row of Figure 7 shows the evolution of the transition variable \(F(z_t)\), along with NBER-dated recessions. By construction, a high value of the transition variable corresponds to a low output gap (i.e., periods of economic slack). Choosing a low value of \(\gamma\) makes the regime-switching smooth, whereas higher values of \(\gamma\) cause the switching to be quick. The second and third rows show nonlinear impulse responses for Regimes 1 (economic expansion) and 2 (economic recession). Although the choice of \(\gamma\) has an effect on the results, it does not change the overall conclusion, namely that no “business-cycle” effects exist regarding the changes in the federal funds rate, which is in accordance with the findings by (Jordà 2005). At most, the effect would be very shortly negative during periods of economic downturn.

Sensitivity analyses for \(\lambda\)

To use the switching function in Eq. (5), one must provide a standardized variable \(z_t\). One option is to decompose a time series into a trend and a cyclical component using the filter by (Hodrick and Prescott 1997) (hereafter the HP-filter). lpirfs includes this option whose routine is written in Rcpp, making the computation very fast. If applied, the cyclical component of the HP-filter will be standardized and used for \(z_t\). The filter depends on a penalty term \(\lambda\), which must be given by the user. (Ravn and Uhlig 2002) argued that the parameter should be \(6.25\) for annual data, \(1 \ 600\) for quarterly data, and \(129 \ 600\) for monthly data. To see how different choices of \(\lambda\) influence the nonlinear impulse responses, I decompose the output gap for three different values of \(\lambda\) and compare the results, which are shown in Figure 8. The value of \(\gamma\) is fixed to \(5\). The first row shows the cyclical component of the HP-filter along with the NBER-dated recessions. A low value in the cyclical component denotes periods of economic downturn. Note that the results of the second column in Figure 8 are identical to those in the second column of Figure 7. In contrast to the previous analysis, empirical results do change significantly, depending on the choice of \(\lambda\). Thus, it is important to set the penalty term adequately.

7 Acknowledgement

I am grateful to the two anonymous reviewers, Philipp Wittenberg, Jon Danielsson, Rainer Schüssler, Tom Philipp Dybowski, and Detlef Steuer for their helpful comments and suggestions on the paper and package.

This article is converted from a Legacy LaTeX article using the texor package. The pdf version is the official version. To report a problem with the html, refer to CONTRIBUTE on the R Journal homepage.

lpirfs: An R Package to Estimate Impulse Response Functions by Local Projections

1 Introduction

2 Estimating impulse response functions using local projections

Estimating impulse responses with an identified shock

Estimating impulse responses for panel data

3 The lpirfs package

4 Examples and replications

Traditional approach: Replicating results by Jordà (2005)

Using an external shock: Replicating results by Ramey and Zubairy (2018)

Estimating impulse responses for panel data

5 Summary

6 Appendix

Comparison of impulse responses between lpirfs and vars

Sensitivity analyses for \(\gamma\)

Sensitivity analyses for \(\lambda\)

7 Acknowledgement

CRAN packages used

CRAN Task Views implied by cited packages

Note

References

Reuse

Citation