An R Package for the Panel Approach Method for Program Evaluation: pampe
by Ainhoa Vega-Bayo
The R Journal Vol. 7/2, December 2015, ISSN 2073-4859

Abstract The pampe package for R implements the panel data approach method for program evaluation designed to estimate the causal effects of political interventions or treatments. This procedure exploits the dependence among cross-sectional units to construct a counterfactual of the treated unit(s), and it is an appropriate method for research events that occur at an aggregate level like countries or regions and that affect only one or a small number of units. The implementation of the pampe package is illustrated using data from Hong Kong and 24 other units, by examining the economic impact of the political and economic integration of Hong Kong with mainland China in 1997 and 2004, respectively.

An introduction to the panel data approach and program evaluation methods
Program evaluation methodologies have long been used by social scientists to measure the effect of different economic or political interventions (treatments). The problem is, of course, that the outcome cannot be observed both under the intervention and in the absence of the intervention simultaneously, hence the need for program evaluation methods. Traditionally, comparative case studies have been the method preferred by researchers for comparing units affected by a treatment or event (dubbed the treatment group) to one or more units not affected by this intervention (the control group). The idea is to use the outcome of the control group to obtain an approximation of what would have been the outcome of the treated group had it not been treated.

In more recent years, synthetic control methods (Abadie and Gardeazabal, 2003; Abadie et al., 2010) have addressed the problem of choosing the control group by introducing a data-driven procedure for selecting it. However, the synthetic control methods are not without shortcomings: since the synthetic control is calculated as a convex combination of the units in the donor pool, and thus does not allow for extrapolation, it might be that a suitable synthetic control for the treated unit does not exist. Furthermore, the synthetic control is designed to be used with explanatory variables or covariates that help explain the variance in the outcome variable. When the researcher finds that extrapolation is needed to obtain a suitable comparison for the treated unit, or when the available covariates do not properly explain the outcome on which the effect of the treatment is to be measured, he or she might prefer to use the panel data approach for program evaluation by Hsiao et al. (2012).

The panel data approach constructs the counterfactual of the unit subjected to the intervention by using other units that are not subject to the treatment to predict what would have happened to the treated unit had it not been subject to the policy intervention. The basic idea behind this approach is to rely on the correlations among cross-sectional units. Hsiao et al. (2012) attribute the cross-sectional dependence to the presence of common factors that drive all the relevant cross-sectional units.
As such, the aim of this article is to present the package pampe, which implements the panel data approach for program evaluation in R and is available from the Comprehensive R Archive Network (CRAN) at http://CRAN.R-project.org/package=pampe. The main function in the package is pampe(), which computes the counterfactual for the treated unit using the modeling strategy outlined by Hsiao et al. (2012). The function includes an option to obtain placebo tests. There is an additional function, robustness(), which conducts a leave-one-out robustness check on the results. The data example is also from Hsiao et al. (2012), who introduced the panel data approach methodology to study the effect of the political and economic integration of Hong Kong with mainland China, using other countries geographically and economically close to Hong Kong as possible controls.
The article is organized as follows. The following section is a brief overview of the panel data approach as defined by Hsiao et al. (2012). The main section of the paper, titled Implementing pampe in R, demonstrates the implementation of this method and the use of the pampe package with an example, including how to perform inference and robustness checks.

The panel data approach method for program evaluation
The panel data approach for program evaluation exploits the dependence among cross-sectional units to construct a counterfactual of the treated unit(s), to estimate how the affected unit would have developed in the absence of an intervention. The estimated effect of the policy intervention is therefore simply the difference between the actual observed outcome of the treated unit and this estimated counterfactual. Hsiao et al. (2012) provide a thorough description of the methodology. Here the focus is on how this method is implemented in the pampe package and thus only a brief overview of the procedure is provided.
Let us consider $J + 1$ units over $t = 1, \dots, T, \dots, T'$ periods. Without loss of generality, only the first unit is affected, uninterruptedly, by an intervention that starts in period $T$ and lasts during periods $T, T + 1, \dots, T'$, after an initial pre-intervention period $1, \dots, T - 1$. The remaining $J$ units are the controls that form the so-called "donor pool", and they are not affected by the intervention. Let $Y_{jt}$ denote the outcome variable (the variable for which the intervention effect is being measured) of unit $j$ at period $t$. $Y_{jt}^{1}$ and $Y_{jt}^{0}$ denote the outcome of unit $j$ at time $t$ under treatment and in the absence of treatment, respectively. We usually do not simultaneously observe both $Y_{jt}^{1}$ and $Y_{jt}^{0}$, but instead we observe $Y_{jt}$, which can be written as

$$Y_{jt} = d_{jt} Y_{jt}^{1} + (1 - d_{jt}) Y_{jt}^{0},$$

where $d_{jt}$ is a dummy variable that takes value 1 if unit $j$ is under treatment at time $t$, and value 0 otherwise (Rubin, 1974). In this case and without loss of generality, only the first unit is under intervention, so we have that

$$d_{jt} = \begin{cases} 1 & \text{if } j = 1 \text{ and } t \geq T, \\ 0 & \text{otherwise.} \end{cases}$$

The treatment or intervention effect for the treated unit can therefore be expressed as

$$\alpha_{1t} = Y_{1t}^{1} - Y_{1t}^{0}.$$

Of course, we do not observe $Y_{1t}^{0}$ for $t \geq T$. Thus, the goal of the panel data approach is to obtain an estimate for the effect of the intervention, $\hat{\alpha}_{1t}$, during the post-treatment period $T, \dots, T'$ by attempting to replicate the economy of the treated unit in the pre-intervention period $1, \dots, T - 1$; that is, by obtaining an estimate of the outcome variable under no treatment, $Y_{1t}^{0}$. It is assumed that there is no treatment interference between units, i.e., the outcome of the untreated units is not affected by the treatment of the treated unit.
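The potential-outcomes notation of this paragraph can be made concrete with a small simulation. The following is only a minimal sketch in base R: the number of periods, the treatment date, and the constant effect of 0.5 are hypothetical values chosen for illustration, and in real data the untreated outcome of the treated unit is of course not available after the intervention.

```r
set.seed(1)

n_periods <- 10   # hypothetical total number of periods (T')
t_start   <- 7    # hypothetical first treated period (T)
alpha     <- 0.5  # hypothetical constant treatment effect

y0 <- rnorm(n_periods)   # outcome of unit 1 in the absence of treatment
y1 <- y0 + alpha         # outcome of unit 1 under treatment

# Treatment dummy for unit 1: 1 from period T onwards, 0 before
d <- as.numeric(seq_len(n_periods) >= t_start)

# Observed outcome: a mix of the two potential outcomes
y <- d * y1 + (1 - d) * y0

# Treatment effect; only computable here because y0 was simulated
effect <- y - y0
print(round(effect, 3))
```

In the simulation the effect is zero before period 7 and 0.5 afterwards; the panel data approach has to replace the unobserved `y0` in the post-treatment periods with a prediction.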
The panel data approach developed by Hsiao et al. (2012) attempts to predict $Y_{1t}^{0}$ for $t \geq T$, and therefore to estimate the treatment effect $\alpha_{1t}$, by exploiting the dependency among cross-sectional units in the donor pool and the treated unit, using the following modeling strategy: use $R^2$ (or likelihood values) in order to select the best OLS estimator for $Y_{1t}^{0}$ using $j$ out of the $J$ units in the donor pool, denoted by $M(j)^*$ for $j = 1, \dots, J$; then choose $M(m)^*$ from $M(1)^*, \dots, M(J)^*$ in terms of a model selection criterion, like AICc, AIC or BIC. (Hsiao et al. (2012) conduct the analysis using both the AIC and AICc criteria; the implementation of the method in the pampe package includes both criteria plus BIC.)

This strategy is founded on the following underlying model. Hsiao et al. (2012) assume that $Y_{jt}^{0}$ is generated by a dynamic factor model of the form

$$Y_{jt}^{0} = \gamma_j + f_t b_j + \varepsilon_{jt}, \qquad (1)$$

where $\gamma_j$ denotes an individual-specific effect, $f_t$ is a $(1 \times K)$ vector that denotes time-varying unobserved common factors, $b_j$ denotes a $(K \times 1)$ vector of constants that can vary across units, $K$ is the number of common factors, and $\varepsilon_{jt}$ is the time-varying idiosyncratic component of individual $j$.
$Y_{1t}^{0}$ could be predicted using the underlying model that Hsiao et al. (2012) specify and the assumptions they delineate. Instead, they suggest a more practical approach, i.e., using the remaining non-intervened units in the donor pool, $Y_{-1t} = (Y_{2t}, \dots, Y_{J+1,t})$, to predict $Y_{1t}^{0}$:

$$Y_{1t}^{0} = \alpha + \beta Y_{-1t} + \varepsilon_{1t}. \qquad (2)$$

Note that the panel data approach calculates OLS models of up to $J + 1$ parameters, so that if the length of the pre-treatment period $t = 1, 2, \dots, T - 1$ is not of a much higher order than that, the regressions $M(J - 1)^*$, $M(J)^*$ cannot be calculated because there are not enough degrees of freedom. To avoid this problem, we propose the following slight modification to the previously outlined modeling strategy, where $T_0$ denotes the length of the pre-treatment period: use $R^2$ in order to select the best OLS estimator for $Y_{1t}^{0}$ using $j$ out of the $J$ units in the donor pool, denoted by $M(j)^*$ for $j = 1, \dots, T_0 - 4$; then choose $M(m)^*$ from $M(1)^*, \dots, M(T_0 - 4)^*$ in terms of a model selection criterion (in our case AICc). The key difference is that while the original strategy allowed models up to $M(J)^*$, the modified one allows models up to $M(T_0 - 4)^*$, with $T_0 - 4 < J$. (Subtracting four allows for at least three degrees of freedom.)

To implement the method, the pampe package relies on the regsubsets() function from the leaps package by Lumley (2014). The main user-available function of the pampe package, also called pampe(), calculates all OLS models with $Y_{1t}^{0}$ as the dependent variable and $j$ out of the $J$ units in the donor pool as explanatory variables, denoted by $M(j)^*$ for $j = 1, \dots, J$, or up to order $J' < J$ if specified by the user, which overrides the default outlined above; the best model is then kept in terms of a model selection criterion (AIC, AICc, or BIC) also specified by the user.
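The two-step strategy described here can be illustrated with a short base R sketch. It is not the package's implementation (which relies on regsubsets() from leaps): the simulated data, the brute-force enumeration with combn(), and the helper aicc() are assumptions made for illustration, and the AICc penalty counts the regression coefficients plus the error variance as parameters.

```r
set.seed(42)

n_pre  <- 20   # hypothetical number of pre-treatment periods
n_ctrl <- 5    # hypothetical size of the donor pool

# Simulated data driven by one common factor (illustrative only)
f <- rnorm(n_pre)
X <- sapply(1:n_ctrl, function(j) 0.5 + j / 5 * f + rnorm(n_pre, sd = 0.3))
colnames(X) <- paste0("ctrl", 1:n_ctrl)
y <- 1 + f + rnorm(n_pre, sd = 0.3)   # pre-treatment outcome of the treated unit
dat <- data.frame(y = y, X)

# Small-sample corrected AIC (hypothetical helper)
aicc <- function(fit) {
  k <- length(coef(fit)) + 1   # coefficients plus error variance
  n <- nobs(fit)
  AIC(fit) + 2 * k * (k + 1) / (n - k - 1)
}

# Step 1: for each subset size j, keep the OLS model with the highest R^2
best_by_size <- lapply(1:n_ctrl, function(j) {
  subsets <- combn(colnames(X), j, simplify = FALSE)
  fits <- lapply(subsets, function(s) {
    lm(reformulate(s, response = "y"), data = dat)
  })
  r2 <- sapply(fits, function(fit) summary(fit)$r.squared)
  fits[[which.max(r2)]]
})

# Step 2: choose among the size-wise winners by AICc
scores <- sapply(best_by_size, aicc)
best <- best_by_size[[which.min(scores)]]
selected <- names(coef(best))[-1]   # controls kept in the final model
print(selected)
```

Enumerating all subsets is only feasible for a small donor pool; with the 24 controls of the growth data an efficient search such as the one in leaps is needed.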
In order to perform inference on the results obtained, the package implements the so-called placebo studies procedure outlined in Abadie and Gardeazabal (2003), Abadie et al. (2010), and Abadie et al. (2015). The basic idea behind the placebo studies is to iterate the application of the panel data approach by reassigning the treatment to other non-treated units, i.e., to the controls in the donor pool; or by reassigning the treatment to other pre-intervention periods, when the treatment had yet to occur. The set of placebo effects can therefore be compared to the effect that was estimated for the "real" time and unit, in order to evaluate whether the effect estimated by the panel data approach when and where the treatment actually occurred is large relative to the placebo effects.
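The logic of reassigning the treatment to the controls can be sketched in base R as follows. This is a simplified illustration under stated assumptions, not the package's procedure: the panel is simulated with a hypothetical effect of 3, each counterfactual is a plain OLS fit on all available donors (no subset selection), and the truly treated unit is left out of the donor pool in the placebo runs.

```r
set.seed(7)

n_time  <- 30   # hypothetical number of periods
t_start <- 21   # hypothetical first treated period
n_ctrl  <- 6    # hypothetical number of controls

# Simulated panel: periods in rows, units in columns (treated unit first)
f <- rnorm(n_time)
Y <- sapply(1:(n_ctrl + 1), function(j) 0.2 * j + f + rnorm(n_time, sd = 0.2))
colnames(Y) <- c("treated", paste0("ctrl", 1:n_ctrl))
pre  <- 1:(t_start - 1)
post <- t_start:n_time
Y[post, "treated"] <- Y[post, "treated"] + 3   # hypothetical treatment effect

# Actual minus predicted path for one (pseudo-)treated unit
effect_for <- function(unit, donors) {
  dat <- as.data.frame(Y)
  fit <- lm(reformulate(donors, response = unit), data = dat[pre, ])
  Y[, unit] - predict(fit, newdata = dat)
}

ctrls <- colnames(Y)[-1]
real <- effect_for("treated", ctrls)
placebos <- sapply(ctrls, function(u) effect_for(u, setdiff(ctrls, u)))
tr_effect <- cbind(treated = real, placebos)

# Pre-treatment fit and post-treatment effect size for every unit
mspe_pre  <- colMeans(tr_effect[pre, ]^2)
post_size <- colMeans(abs(tr_effect[post, ]))
rank_of_real <- rank(-post_size)["treated"]
print(rank_of_real)
```

If the real unit ranks first among all units, as it does in this simulation, the estimated effect is large relative to the placebo effects.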

Implementing pampe in R
This section expands on the implementation of the method itself as well as the placebo studies and how they can be interpreted by the user, by means of two examples: the political and economic integration of Hong Kong with mainland China in 1997 and 2004, plus the reassignment of the treatment to other units in the control group and to different pre-treatment dates. Hsiao et al. (2012) use a combination of other countries to construct a counterfactual for Hong Kong that resembles its economy prior to the political and economic integration. The growth dataset, obtained from the supplemental materials of Hsiao et al. (2012), contains information on the quarterly real GDP growth rate of 24 countries (the donor pool) and Hong Kong from 1993 Q1 to 2008 Q1, computed as the change with respect to the same quarter in the previous year.

> library("pampe")
> data("growth")
The data is organized in standard cross-sectional data format, with the variables extending across the columns (the quarterly real GDP growth rates of the countries in the donor pool act as explanatory variables) and the quarters (time periods) extending across the rows. It is important for the user to have his or her data in this standard format to correctly apply the methodology. Naming the rows and especially the columns is also strongly recommended, though not required.
If the user does not have the data in standard wide format, the pampe package also includes an optional pampeData function that prepares the data according to the required format. It also helps reshape the data in case the user has it in long format. This function should be run prior to the pampe function. For example, if we had a dataset in long format such as the Produc dataset from the plm package:

> library("plm")
> data(Produc, package = "plm")
> long.data <- plm.data(Produc)
> head(long.data)
    state year     pcap     hwy   water    util       pc   gsp    emp unemp
1 ALABAMA 1970 15032.67 7325.80 1655.68 6051.20 35793.80 28418 1010.5   4.7
2 ALABAMA 1971 15501.94 7525.94 1721.02 6254.98 37299.91 29375 1021.9   5.2
3 ALABAMA 1972 15972.41 7765.42 1764.75 6442.23 38670.30 31303 1072.3   4.7
4 ALABAMA 1973 16406.26 7907.66 1742.41 6756.19 40084.01 33430 1135.5   3.9
5 ALABAMA 1974 16762.67 8025.52 1734.85 7002.29 42057.31 33749 1169.8   5.5
6 ALABAMA 1975 17316.26 8158.23 1752.27 7405.76 43971.71 33604 1155.4   7.7

Note that "year" is the name of the time index and "state" is the id index, and that there is data on eight variables. If we want to keep data on the variable "pcap" and transform that into a wide format, in which "year" spans across rows and "state" across columns, we can use the following call to pampeData for that.
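The long-to-wide transformation itself can also be sketched with base R's reshape(). This is not the pampeData function, whose arguments are not reproduced here; the small data frame below is a hypothetical stand-in for a long panel with an id column ("state"), a time column ("year"), and one outcome ("pcap").

```r
# Hypothetical long panel: 3 units observed over 4 years
long <- data.frame(
  state = rep(c("A", "B", "C"), each = 4),
  year  = rep(1970:1973, times = 3),
  pcap  = c(10, 11, 12, 13, 20, 21, 22, 23, 30, 31, 32, 33)
)

# One row per year, one column per state
wide_df <- reshape(long, idvar = "year", timevar = "state",
                   direction = "wide")

# Convert to a named matrix: periods in rows, units in columns
wide <- as.matrix(wide_df[, setdiff(names(wide_df), "year")])
rownames(wide) <- wide_df$year
colnames(wide) <- sub("^pcap\\.", "", colnames(wide))
print(wide)
```

The result has the layout the article asks for: time periods across rows, units across columns, and both dimensions named.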

Table 1: Minimum arguments required by pampe().

time.pretr
The pre-treatment periods, up to the introduction of the treatment. For example, if you have 10 pre-treatment periods and the treatment was introduced in the 11th period, this argument could be 1:10.

time.tr
The treatment period plus the periods for which you want to check the effect. For example, if the treatment was introduced in the 11th period and you want results for 9 more periods, it would be 11:20.

treated
The treated unit, i.e., the unit that receives the intervention. It can be the index of the column or, if the columns in the data matrix are named (which is recommended), the column name.
Having introduced the optional data preparation, let us now continue with the main function and example of this paper: the growth dataset and the pampe function. Observe how the wide.data object obtained by applying the pampeData function has a format equivalent to that of the growth data below.
> growth[1:10, 1:5]

In this example, the treated unit (Hong Kong) is in the first column, while the 24 non-treated units are in columns 2 to 25, and the time periods (quarters) are in rows. Note how both the rows and the columns are named for ease of use and interpretation.

Using the function pampe()
Once the data is in the correct format, it is just a matter of applying the pampe() command to the dataset. Note that it requires a balanced dataset, i.e., no missing values are allowed. As the bare minimum, the command requires the arguments specified in Table 1.
No additional arguments are necessary, though one may choose to pass other arguments as well. Setting the controls argument is especially recommended; otherwise the default is to use all the remaining (non-treated) columns in the dataset as controls. For example, let us run the pampe() function using only the bare-bones arguments for the economic integration of Hong Kong as carried out by Hsiao et al. (2012). We first set the pre-treatment and treatment periods; the economic integration of Hong Kong happened in 2004Q1. The pre-treatment period therefore ranges from 1993Q1 to 2003Q4, and the treatment and post-treatment period goes from 2004Q1 to 2008Q1. It is useful to define the period objects before calling the function so that you can use them later when processing the results, although inputting them directly into the function call is of course an option.
> time.pretr <- c("1993Q1", "2003Q4")
> time.tr <- c("2004Q1", "2008Q1")
> ## Or if you know the row indexes use those directly, e.g.
> time.tr <- 45:61
> ## The treated unit
> treated <- "HongKong"
> ## Call the function with the bare minimum arguments specified
> econ.integ <- pampe(time.pretr = time.pretr, time.tr = time.tr,
+                     treated = treated, data = growth)

This replicates the results obtained by Hsiao et al. (2012) for the economic integration of Hong Kong using all 24 units as the possible controls, and the AICc selection criterion. That is, the counterfactual for Hong Kong is built as a combination of Austria, Italy, Korea, Mexico, Norway, and Singapore; i.e., those are the countries that best replicate the economy of Hong Kong in the period prior to the economic integration, according to the model by Hsiao et al. (2012).
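The correspondence between quarter labels and row indexes used in these calls can be checked with a few lines of base R. The sketch assumes that the rows of the data are labelled "1993Q1", "1993Q2", and so on up to "2008Q1", as in the growth data; the label vector is rebuilt by hand here and the helper idx() is hypothetical.

```r
# Quarter labels from 1993Q1 to 2008Q1 (61 quarters)
quarters <- paste0(rep(1993:2008, each = 4), "Q", 1:4)[1:61]

# Row indexes spanned by a pair of quarter labels (hypothetical helper)
idx <- function(from, to) match(from, quarters):match(to, quarters)

time.pretr <- idx("1993Q1", "2003Q4")   # pre-treatment rows
time.tr    <- idx("2004Q1", "2008Q1")   # treatment and post-treatment rows

print(range(time.pretr))
print(range(time.tr))
```

The second range matches the 45:61 shorthand used in the call above, and the same helper reproduces the 1:18 and 19:44 ranges of the political integration example.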
If one wishes to fine-tune the process, the arguments specified in Table 2 can be passed to the function as well. For example, let us now try to replicate the results obtained by Hsiao et al. (2012) for the political integration of Hong Kong using the AICc criterion. If one were not to specify the set of possible controls as they do, the function would aim to use all controls. Since those are too many given the pre-treatment period and the user has not specified a custom nvmax, the default nvmax setting would switch to the length of the pre-treatment period minus four, to allow for at least three degrees of freedom.
Table 2: Optional arguments of pampe().

controls
The units used as controls to calculate the counterfactual, i.e., units that have not received the treatment. By default, all the remaining columns in the data matrix (after removing the treated unit) are included as controls, but specific controls can be specified using their column names, e.g., c("Australia", "Austria", "Canada"), or their column indexes, e.g., 2:4.

nbest
The original method by Hsiao et al. (2012) specifies keeping the best model in terms of R2 for each M(j), hence the default of this argument is one. However, the user might choose to keep the best 2, 3, . . . before moving on to the second step of the method by changing the default of this argument.

nvmax
Indicates the maximum size of the subsets of controls that the method checks in the first step. The original method by Hsiao et al. (2012) checks all subsets up to the biggest size, M(j)* up to M(J)*, hence the default (nvmax = J); but if the pre-treatment period is too short for this to be possible, the slight modification of checking subsets up to T0 − 4 is proposed, and this is the alternative default the method takes if J is too big for the pre-treatment period and the user has not specified an alternative nvmax. If the user-specified nvmax is too big, the function throws an error asking the user to change this argument or reduce the number of controls.

select
The model selection criterion for the second step of the method. Hsiao et al. (2012) propose either AICc (the default) or AIC. The user can choose between those two or BIC as well.

The treated unit remains the same, but the pre-treatment and treatment periods are different. If we call the function with the AICc criterion and use the default ("All") for the controls:

> treated <- "HongKong"
> time.pretr <- 1:18 # 1993Q1-1997Q2
> time.tr <- 19:44 # 1997Q3-2003Q4
> pol.integ.all <- pampe(time.pretr = time.pretr, time.tr = time.tr,
+                        treated = treated, data = growth, select = "AICc")

That is, the call does not throw an error because of the slightly modified modeling strategy specified in the previous section, but this does not replicate the results obtained by Hsiao et al. (2012) and might not be what the user wants.
In this case, to replicate their results, the set of possible controls has to be specified first, and then we call the function:

> possible.ctrls <- c("China", "Indonesia", "Japan", "Korea", "Malaysia",
+                     "Philippines", "Singapore", "Taiwan", "UnitedStates", "Thailand")
> pol.integ <- pampe(time.pretr = time.

This replicates their results. Note that, out of the pre-selected potential controls (China, Indonesia, Japan, Korea, Malaysia, Philippines, Singapore, Taiwan, United States, and Thailand), the model suggests that a combination of Japan, Korea, United States, and Taiwan is the optimal one to replicate the economy of Hong Kong in the period before the political integration, while discarding the remaining countries.
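The interplay between the number of controls and the length of the pre-treatment period in the examples above comes down to simple arithmetic. The helper below is a hypothetical sketch that assumes the default is the smaller of the number of controls J and T0 − 4; the exact condition the package uses to switch defaults is not spelled out in this article.

```r
# Hypothetical helper: largest subset size checked by default,
# assuming the rule min(J, T0 - 4)
default_nvmax <- function(n_controls, n_pretr) {
  min(n_controls, n_pretr - 4)
}

# Economic integration: 24 controls, 44 pre-treatment quarters
default_nvmax(24, 44)   # 24, all model sizes can be checked

# Political integration, all controls: 24 controls, 18 quarters
default_nvmax(24, 18)   # 14, the modified strategy applies

# Political integration, 10 pre-selected controls, 18 quarters
default_nvmax(10, 18)   # 10, all model sizes can be checked
```

Under this assumption, only the call that uses all 24 controls with the short pre-treatment period falls back to the modified strategy, which is consistent with it not replicating the original results.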

data
The data used for the estimation, stored for later use in, for example, the robustness function also included in the package, which is explained later on.

Obtaining and transmitting results
Once the function has been correctly run and the user is satisfied with the model, the next step is to process the results obtained. The pampe() function returns an object of class 'pampe' with the objects specified in Table 3. Continuing with the example of the political integration of Hong Kong with the AICc criterion and the set of possible controls as specified by Hsiao et al. (2012):

> ## Setup
> treated <- "HongKong"
> time.pretr <- 1:18 # 1993Q1-1997Q2
> time.tr <- 19:44 # 1997Q3-2003Q4
> possible.ctrls <- c("China", "Indonesia", "Japan", "Korea", "Malaysia",
+                     "Philippines", "Singapore", "Taiwan", "UnitedStates", "Thailand")
> ## Call the function with AICc and the possible controls
> pol.integ <- pampe(time.pretr = time.

Although not directly printed, the summary() method for objects of class 'pampe' also includes the actual and counterfactual paths, which can be accessed via summary(pol.integ)$res.table. We can also manipulate pol.integ$model as we wish with the usual methods since it is an object of class 'lm'. For example, summary(pol.integ$model) can be used to obtain a summary, residuals(pol.integ$model) for the residuals, or fitted(pol.integ$model) to recover the estimated values.
Another method included in the package is plot(). It works on objects of class 'pampe' to produce a plot of the actual evolution of the treated unit together with the predicted counterfactual path. A simple plot call to our saved 'pampe' object, plot(pol.integ), would produce Figure 2.

[Figure 2: Actual and Counterfactual Path]
To obtain a plot of the estimated treatment effect, we first calculate the treatment effect, which is the difference between the actual and predicted (counterfactual) path; then we plot it (see Figure 4). Note that if the method works well to replicate the economy in the pre-treatment period, the treatment effect should be around zero in the pre-treatment period.

[...]), las = 3)
> title(xlab = "Quarter", mgp = c(3.6, 0.5, 0))
> ## Legend
> legend("topleft", "Treatment Effect", col = 1, lty = 1, lwd = 2)
> ## Add a vertical line when the treatment starts
> abline(v = time.pretr[length(time.pretr)], lty = 3, lwd = 2)
> ## Horizontal line on zero
> abline(h = 0, lty = 3, lwd = 2)

The user might also be interested in exporting tables that show the results of the procedure, to be used in a LaTeX document. Simply manipulating the data and using xtable() from the package of the same name (Dahl, 2014), one can obtain the tables shown in Hsiao et al. (2012). An xtable method is also included for this purpose, which requires the output of the 'pampe' object and the user specifying which table type he or she wants: the table of the model or the treatment table, which includes the actual, predicted, and treatment paths.
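The calculation behind such a plot is just a difference of two series. The following base R sketch uses simulated actual and counterfactual paths rather than the output of pampe(); the magnitudes, and the vectors actual and counterfactual themselves, are hypothetical.

```r
set.seed(3)

time.pretr <- 1:18    # pre-treatment rows, as in the example
time.tr    <- 19:44   # treatment and post-treatment rows

# Simulated paths (illustrative only): close fit before the treatment,
# a gap of about -0.04 afterwards
actual <- c(rnorm(18, sd = 0.01), rnorm(26, mean = -0.04, sd = 0.01))
counterfactual <- c(actual[time.pretr] + rnorm(18, sd = 0.002),
                    rnorm(26, sd = 0.01))

# Treatment effect: actual minus predicted (counterfactual) path
tr.effect <- actual - counterfactual

plot(tr.effect, type = "l", lwd = 2, xlab = "Period", ylab = "Treatment effect")
abline(v = time.pretr[length(time.pretr)], lty = 3, lwd = 2)  # treatment start
abline(h = 0, lty = 3, lwd = 2)                               # zero line
```

A pre-treatment effect that hovers around zero indicates that the counterfactual tracks the treated unit well before the intervention.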

Placebo tests
In order to perform inference on the results obtained, the package implements the so-called placebo studies procedure outlined in Abadie and Gardeazabal (2003), Abadie et al. (2010), and Abadie et al. (2015). The idea is to iterate the application of the panel data approach by reassigning the treatment to other non-treated units, i.e., to the controls in the donor pool, or by reassigning the treatment to other pre-intervention periods, when the treatment had yet to occur. The set of placebo effects can then be compared to the effect that was estimated for the "real" time and unit, in order to evaluate whether the effect estimated by the panel data approach when and where the treatment actually occurred is large relative to the placebo effects.

The R Journal Vol. 7/2, December 2015 ISSN 2073-4859

placebo.ctrls
A list which includes another two objects: $mspe and $tr.effect. The first contains the mean squared prediction error (mspe) for the pre-treatment period (time.pretr), and the second contains the estimated treatment effect for the treated unit in the first column and for the countries in the original donor pool (possible.ctrls) in the remaining columns.

placebo.time
The same as placebo.ctrls but with the reassignment of the treatment in time, to periods in the pre-treatment period (the reassignment runs from half of the pre-treatment period until the period before the actual treatment).
The function pampe() has both placebo studies (placebo-controls and placebos-in-time) built in. Thus the user can obtain the results from the placebo studies and perform this type of statistical inference simply by switching the last argument of the function pampe(), placebos, from the default FALSE to either "controls", "time", or "Both". Continuing the previous example, the call to the function is identical to the pol.integ one except that now we also ask for the placebos. The other arguments are inherited from before.
For example, if we take a look at the first five rows and columns of pol.integ.placebos$placebo.time$tr.effect:

             1997Q3        1995Q3        1995Q4        1996Q1        1996Q2
1993Q1  0.007000500 -0.0084630055  0.0005825015  0.0005170915  3.642217e-05
1993Q2 -0.001831535  0.0046961110  0.0036438429  0.0009914089 -6.703595e-04
1993Q3 -0.007021179  0.0005439412 -0.0035785257 -0.0003308651 -2.804963e-03
1993Q4  0.000975869 -0.0065735180 -0.0041505788 -0.0021485760  3.376669e-03
1994Q1  0.005045203  0.0081985975  0.0036386090  0.0003446360  1.552006e-04

We can see that it is a table with the estimated treatment effects (difference between actual and predicted); the first column shows the actual treatment effect, whereas the remaining columns contain the estimated treatment effect after the treatment has been reassigned to other periods, specified in the column name. In this case, the second column has the treatment reassigned to 1995Q3.

Now these additional results can be used for plots and to check whether the treatment effect is significant. When the saved 'pampe' object has placebo studies stored inside, a plot() call to the 'pampe' object will produce the placebo plot(s) as well as the initial actual/predicted path plot. The placebo plot for the control reassignment is given in Figure 5.
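The logic of the placebos-in-time table can be sketched in base R as follows. This is only an illustration on simulated data, not the code used by pampe(): the helper effect.path() and the choice of placebo start periods (10 to 18) are hypothetical, and the controls are fixed rather than selected by AICc. For each assumed start period the model is refitted on the periods before it, and the effect is the difference between the actual and predicted paths.

```r
## Simulated data with a treatment effect of 1 starting in period 19
set.seed(3)
n.periods <- 44
time.pretr <- 1:18
sim <- data.frame(A = rnorm(n.periods), B = rnorm(n.periods))
sim$Treated <- 0.6 * sim$A + 0.4 * sim$B + rnorm(n.periods, sd = 0.1)
sim$Treated[19:44] <- sim$Treated[19:44] + 1

## Hypothetical helper: effect path when treatment is assumed to begin at 'start'
effect.path <- function(start, data) {
  fit <- lm(Treated ~ A + B, data = data[seq_len(start - 1), ])
  data$Treated - predict(fit, newdata = data)
}

actual.start <- max(time.pretr) + 1
placebo.starts <- 10:18  # assumed reassignment window
tr.effect <- sapply(c(actual.start, placebo.starts), effect.path, data = sim)
colnames(tr.effect) <- paste0("start", c(actual.start, placebo.starts))

## First column: actual treatment; remaining columns: placebo reassignments
tr.effect[1:5, 1:5]

## Mean placebo effect between each fake start and the true treatment date,
## compared with the mean effect after the actual treatment
placebo.means <- sapply(seq_along(placebo.starts), function(j) {
  mean(tr.effect[placebo.starts[j]:max(time.pretr), j + 1])
})
actual.mean <- mean(tr.effect[actual.start:n.periods, 1])
range(placebo.means)
actual.mean
```

If the actual mean effect lies well outside the range of the placebo means, as it does by construction in this simulation, the estimated effect is large relative to the placebo effects.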
As an example of what a significant treatment effect would look like, we carry out the treatment-reassignment placebo tests for the economic integration of Hong Kong, which Hsiao et al. (2012) find to be significant.

Robustness checks
Besides placebo studies for inference, Abadie et al. (2015) show the importance of running robustness checks on the results obtained. This section demonstrates how to implement the so-called leave-one-out robustness check, which iteratively removes one of the units in the control group of the final model, to check whether the results are driven by one unit in particular or whether, in contrast, they are robust to removing any single unit.
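A minimal base R sketch of the leave-one-out idea is given below. It uses simulated data with hypothetical control names (C1 to C4) and a hypothetical helper cf.path(); it is meant to show the mechanics of the check, not to reproduce the package's implementation.

```r
## Simulated data: one treated unit and four controls (hypothetical names)
set.seed(4)
n.periods <- 44
time.pretr <- 1:18
controls <- paste0("C", 1:4)
sim <- as.data.frame(matrix(rnorm(n.periods * 4), ncol = 4,
                            dimnames = list(NULL, controls)))
sim$Treated <- as.vector(as.matrix(sim[, controls]) %*% c(0.4, 0.3, 0.2, 0.1)) +
  rnorm(n.periods, sd = 0.1)

## Hypothetical helper: fit on the pre-treatment period with the given
## controls and predict the path over the whole sample
cf.path <- function(ctrls, data, pretr) {
  fit <- lm(reformulate(ctrls, response = "Treated"), data = data[pretr, ])
  predict(fit, newdata = data)
}

## Prediction with all controls, then with each control removed in turn
full <- cf.path(controls, sim, time.pretr)
loo <- sapply(controls, function(drop) {
  cf.path(setdiff(controls, drop), sim, time.pretr)
})
colnames(loo) <- paste0("Without.", controls)

## Matrix with the actual path, the initial prediction, and each
## leave-one-out prediction
rob <- cbind(Actual = sim$Treated, Predicted = full, loo)
rob[1:5, ]

## Largest deviation of each leave-one-out path from the initial prediction
apply(abs(loo - full), 2, max)
```

In this simulation, dropping the control with the largest weight changes the prediction the most; a result is considered robust when none of the leave-one-out paths departs substantially from the initial prediction.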
The leave-one-out robustness check can be applied using the other user-available function included in the package, robustness(). After applying the initial function pampe(), the user can carry out the robustness check simply by calling the function robustness() and specifying the name of the saved object from the pampe() call.

> rob.check <-robustness(pol.integ)
This produces a matrix with the actual path, the initial predicted path, and each leave-one-out predicted path. Its first five rows are given by:

The user can then plot this robustness check by calling the plot() method on the saved object. xtable and summary methods are also provided.

> plot(rob.check)
The following plot (Figure 8) shows that the results obtained with four countries (Japan, Korea, US and Taiwan) are robust to the removal of one of them.That is, the results are not driven by one particular country.
Users who prefer to reproduce the plot and adapt the code to their liking can modify the following code, which replicates the above plot.

Figure 3: Plot of the estimated treatment effect.

Table 1: Necessary arguments for the pampe function.
Notice that the defaults, which are used in this case, are to use all the controls and the AICc criterion. You can print out a summary of the optimal model:

Table 2: Other non-required arguments for the pampe function.
Multiple R-squared: 0.9982, Adjusted R-squared: 0.9962
F-statistic: 498.3 on 9 and 8 DF, p-value: 5.029e-10

pretr, time.tr = time.tr, treated = treated,
+   controls = possible.ctrls, data = growth)

Thus, the user can play around with the controls, nbest, nvmax and select arguments so that they better suit their needs and their particular dataset. As a bare minimum, time.pretr, time.tr, treated and data are required for the function to run.

Table 3: Results given by the pampe function.

model
An object of class 'lm' with the optimal model. Usual methods such as fitted(), residuals(), summary(), etc. can be used on it.

counterfactual
A named matrix of the actual path together with the path of the estimated counterfactual for the time.pretr and time.tr periods.

A summary() method is included for objects of class 'pampe', with useful information for the researcher:

Table 4: Additional results from the pampe function.