Temporal disaggregation methods are used to disaggregate low frequency time series to higher frequency series, where either the sum, the average, the first or the last value of the resulting high frequency series is consistent with the low frequency series. Temporal disaggregation can be performed with or without one or more high frequency indicator series. The package tempdisagg is a collection of several methods for temporal disaggregation.
Not having a time series at the desired frequency is a common problem for researchers and analysts. For example, instead of quarterly sales, they only have annual sales. Instead of a daily stock market index, they only have a weekly index. While there is no way to fully make up for the missing data, there are useful workarounds: with the help of one or more high frequency indicator series, the low frequency series may be disaggregated into a high frequency series. For example, quarterly exports could help disaggregate annual sales, and a foreign stock market index could help disaggregate the stock market index at home.
Even when there is no high frequency indicator series, one still may want to disaggregate a low frequency series. While the accuracy of the resulting high frequency series will be low, it may still be worth doing so. For example, estimating a vector-autoregressive model requires all variables to have the same frequency. Having one bad high frequency series could still be preferable to the switch to a lower frequency.
The package tempdisagg (Sax and Steiner 2013) implements the following standard methods for temporal disaggregation: Denton, Denton-Cholette, Chow-Lin, Fernández and Litterman. On the one hand, Denton (Denton 1971) and Denton-Cholette (e.g. Dagum and Cholette 2006) are primarily concerned with movement preservation, generating a series that is similar to the indicator series whether or not the indicator is correlated with the low frequency series. These methods can also disaggregate a series without an indicator. On the other hand, Chow-Lin, Fernández and Litterman use one or several indicators and perform a regression on the low frequency series. Chow-Lin (Chow and Lin 1971) is suited for stationary or cointegrated series, while Fernández (Fernández 1981) and Litterman (Litterman 1983) deal with non-cointegrated series.
All disaggregation methods ensure that either the sum, the average, the first or the last value of the resulting high frequency series is consistent with the low frequency series. They can deal with situations where the high frequency is an integer multiple of the low frequency (e.g. years to quarters, weeks to days), but not with irregular frequencies (e.g. weeks to months).
Temporal disaggregation methods are widely used in official statistics.
For example, in France, Italy and other European countries, quarterly
figures of Gross Domestic Product (GDP) are computed using
disaggregation methods. Outside of R, there are several software
packages to perform temporal disaggregation: Ecotrim by Barcellan et al. (2003);
a Matlab extension by Quilis (2012); and a RATS extension by Doan (2008). An
overview of the capabilities of the different software programs is given
in Table 1.
The first section discusses the standard methods for temporal disaggregation and summarizes them in a unifying framework. Section 2 discusses the working and implementation of the tempdisagg package. Section 3 presents an illustrative example.
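The aim of temporal disaggregation is to find an unknown high frequency series $y$, whose sums, averages, first or last values are consistent with a known low frequency series $y_l$ (the subscript $l$ denotes low frequency variables). In order to estimate $y$, one or more high frequency indicator variables can be used. We collect these high frequency series in a matrix $X$. For ease of exposition and without loss of generality, the terms annual and quarterly will be used instead of low frequency and high frequency hereafter.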
Table 1: Capabilities of software packages for temporal disaggregation.

Methods | Ecotrim | Matlab add-on | RATS add-on | tempdisagg |
---|---|---|---|---|
Chow-Lin (max. log) | erroneous | full | full | full |
Chow-Lin (min. RSS) | full | full | no | full |
Fernández | full | full | full | full |
Litterman (max. log) | erroneous | full | full | full |
Litterman (min. RSS) | full | full | no | full |
Denton-Cholette | partial | erroneous | no | full |
Denton | no | partial | no | full |
Contemporaneous constraint | partial | partial | no | no |
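The diversity of temporal disaggregation methods can be narrowed by putting the methods in a two-step framework: First, a preliminary quarterly series $p$ has to be determined; second, the differences between the annual values of the preliminary series and the annual values of the observed series have to be distributed among the preliminary quarterly series. The sum of the preliminary quarterly series and the distributed annual residuals yields the final estimation of the quarterly series, $\hat{y}$. Formally,

$$\hat{y} = p + D u_l. \tag{1}$$

$D$ is an $n \times n_l$ distribution matrix, with $n$ and $n_l$ denoting the number of quarterly and annual observations, respectively. $u_l$ is a vector of length $n_l$ and contains the differences between the annualized values of $p$ and the actual annual values, $y_l$:

$$u_l \equiv y_l - C p. \tag{2}$$

Multiplying the $n_l \times n$ conversion matrix, $C$, with a quarterly series performs annualization. With two years and eight quarters, and annual values representing the sum of the quarterly values (e.g., GDP), the conversion matrix, $C$, is constructed the following way:

$$C = \begin{bmatrix} 1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 \end{bmatrix}. \tag{3}$$

Equation (1) constitutes a unifying framework for all disaggregation methods. The methods differ in how they determine the preliminary series, $p$, and the distribution matrix, $D$. Table 2 summarizes the differences in the calculation of $p$ and $D$. We will discuss them in turn.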
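Annualization can be written as a matrix product: a conversion matrix with one row per year and one column per quarter adds up the quarters that belong to each year. The following base R sketch illustrates this for sum conversion. The helper name `conversion_matrix` and the quarterly numbers are assumptions made for this illustration and are not part of tempdisagg.

```r
# Hypothetical helper (not part of tempdisagg): conversion matrix for
# "sum" conversion with n_l low frequency periods and fr high frequency
# periods per low frequency period.
conversion_matrix <- function(n_l, fr = 4) {
  kronecker(diag(n_l), matrix(1, nrow = 1, ncol = fr))
}

C <- conversion_matrix(2)   # two years, eight quarters
C

# Made-up quarterly series and its annual totals
y_q <- c(10, 12, 11, 13, 14, 15, 13, 16)
y_a <- drop(C %*% y_q)
y_a                         # 46 58
```

For average conversion, the same matrix would be divided by the number of quarters per year; for first or last values, each row would contain a single 1.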
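The methods of Denton and Denton-Cholette use a single indicator as their preliminary series:

$$p = X, \tag{4}$$

where $X$ is an $n \times 1$ matrix. As a special case, a constant (e.g., a series consisting of only 1s in each quarter) can be used as an indicator, allowing for temporal disaggregation without a high frequency indicator series.

The regression-based methods Chow-Lin, Fernández and Litterman perform a Generalized Least Squares (GLS) regression of the annual values, $y_l$, on the annualized quarterly indicator series, $CX$, where $C$ is the conversion matrix. In this case, $X$ represents an $n \times m$ matrix, where $m$ denotes the number of indicators (including a possible constant). For a given variance-covariance matrix, $\Sigma$, the GLS estimator, $\hat{\beta}$, is calculated in the standard way (the estimation of $\Sigma$ is discussed below):

$$\hat{\beta}(\Sigma) = \left[X' C' (C \Sigma C')^{-1} C X\right]^{-1} X' C' (C \Sigma C')^{-1} y_l. \tag{5}$$

The critical assumption of the regression-based methods is that the linear relationship between the annual series $CX$ and $y_l$ also holds between the quarterly series $X$ and $y$. Thus, the preliminary series is calculated as the fitted values of the GLS regression:

$$p = X \hat{\beta}. \tag{6}$$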
With the exception of Denton-Cholette, the distribution matrix of all temporal disaggregation methods is a function of the variance-covariance matrix, $\Sigma$, and the conversion matrix, $C$:

$$D = \Sigma C' (C \Sigma C')^{-1}. \tag{7}$$
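The Denton methods minimize the squared absolute or relative deviations from a (differenced) indicator series, where the parameter $h$ defines the degree of differencing. For the additive Denton methods and for $h = 0$, the sum of the squared absolute deviations between the indicator and the final series is minimized. For $h = 1$, the deviations of first differences are minimized, for $h = 2$, the deviations of the differences of the first differences, and so forth. For the proportional Denton methods, deviations are measured in relative terms.

For the additive Denton method with $h = 1$, the variance-covariance matrix has the following structure:

$$\Sigma_D = (\Delta' \Delta)^{-1} = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ 1 & 2 & \cdots & 2 \\ \vdots & \vdots & \ddots & \vdots \\ 1 & 2 & \cdots & n \end{bmatrix}, \tag{8}$$

where $\Delta$ is an $n \times n$ difference matrix with 1 on its main diagonal, $-1$ on its first subdiagonal and 0 elsewhere. For $h = 2$, $\Delta' \Delta$ is multiplied by $\Delta'$ from the left and $\Delta$ from the right side. For $h = 0$, it is the identity matrix of size $n$.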
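The two steps of the framework, a preliminary series plus distributed annual residuals, can be sketched in base R. The sketch assumes sum conversion, the additive first-difference covariance structure $(\Delta'\Delta)^{-1}$, and made-up numbers; it illustrates the framework as described here and is not the code used inside tempdisagg.

```r
n_l <- 3; fr <- 4; n <- n_l * fr
C <- kronecker(diag(n_l), matrix(1, 1, fr))      # sum conversion matrix

# Difference matrix: 1 on the main diagonal, -1 on the first subdiagonal
Delta <- diag(n)
Delta[cbind(2:n, 1:(n - 1))] <- -1
Sigma <- solve(crossprod(Delta))                 # (Delta' Delta)^{-1}

# Distribution matrix D = Sigma C' (C Sigma C')^{-1}
D <- Sigma %*% t(C) %*% solve(C %*% Sigma %*% t(C))

# Made-up data: annual values and a quarterly preliminary series
y_l <- c(100, 120, 150)
p   <- seq(20, 42, by = 2)                       # 12 quarterly values

u_l   <- y_l - drop(C %*% p)                     # annual residuals
y_hat <- p + drop(D %*% u_l)                     # final quarterly series

all.equal(drop(C %*% y_hat), y_l)                # annual sums are reproduced
```

Because $C D$ is the identity matrix, the distributed residuals add up to the annual residuals, which is why the final series meets the aggregation constraint whatever preliminary series is used.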
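Denton-Cholette is a modification of the original approach and removes the spurious transient movement at the beginning of the resulting series. While generally preferable, the calculation of the distribution matrix, $D_{DC}$, is different and does not fit into the simple framework (see Dagum and Cholette 2006 for an extensive description).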
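Chow-Lin assumes that the quarterly residuals follow an autoregressive process of order 1 (AR1), i.e., $u_t = \rho u_{t-1} + \epsilon_t$, where $\epsilon$ is $WN(0, \sigma_\epsilon^2)$ (with WN denoting white noise) and $|\rho| < 1$. The resulting covariance matrix has the following form:

$$\Sigma_{CL}(\rho) = \frac{\sigma_\epsilon^2}{1 - \rho^2} \begin{bmatrix} 1 & \rho & \cdots & \rho^{n-1} \\ \rho & 1 & \cdots & \rho^{n-2} \\ \vdots & \vdots & \ddots & \vdots \\ \rho^{n-1} & \rho^{n-2} & \cdots & 1 \end{bmatrix}. \tag{9}$$

The estimation of $\Sigma_{CL}$ thus requires the estimation of an AR1 parameter $\rho$, which is discussed below. The variance, $\sigma_\epsilon^2$, cancels out and affects neither the calculation of the distribution matrix nor the GLS estimator.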
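Table 2: Preliminary series, distribution matrix and variance-covariance matrix of the methods in tempdisagg.

Methods | Preliminary series $p$ | Distribution matrix $D$ | Variance-covariance matrix $\Sigma$ |
---|---|---|---|
`denton` | $X$ | $\Sigma C' (C \Sigma C')^{-1}$ | $\Sigma_D$ |
`denton-cholette` | $X$ | $D_{DC}$ | |
`chow-lin-maxlog`, `chow-lin-minrss-ecotrim`, `chow-lin-minrss-quilis` | $X \hat{\beta}$ | $\Sigma C' (C \Sigma C')^{-1}$ | $\Sigma_{CL}(\rho)$ |
`litterman-maxlog`, `litterman-minrss` | $X \hat{\beta}$ | $\Sigma C' (C \Sigma C')^{-1}$ | $\Sigma_L(\rho)$ |
`fernandez` | $X \hat{\beta}$ | $\Sigma C' (C \Sigma C')^{-1}$ | $\Sigma_L(0)$ |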
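The remaining methods deal with cases when the quarterly indicators and the annual series are not cointegrated. Fernández and Litterman assume that the quarterly residuals follow a non-stationary process, i.e. $u_t = u_{t-1} + v_t$, where $v$ is an AR1 ($v_t = \rho v_{t-1} + \epsilon_t$, where $\epsilon$ is $WN(0, \sigma_\epsilon^2)$). Fernández is a special case of Litterman, where $\rho = 0$, and, therefore, $u$ follows a random walk. The variance-covariance matrix can be calculated as follows:

$$\Sigma_L(\rho) = \sigma_\epsilon^2 \left[\Delta' H(\rho)' H(\rho) \Delta\right]^{-1}, \tag{10}$$

where $\Delta$ is the $n \times n$ difference matrix with 1 on its main diagonal, $-1$ on its first subdiagonal and 0 elsewhere; $H(\rho)$ is an $n \times n$ matrix with 1 on its main diagonal, $-\rho$ on its first subdiagonal and 0 elsewhere. For the special case of Fernández, with $\rho = 0$, the resulting covariance matrix has the following form:

$$\Sigma_L(0) = \sigma_\epsilon^2 (\Delta' \Delta)^{-1} = \sigma_\epsilon^2 \Sigma_D, \tag{11}$$

where $\Sigma_D = (\Delta' \Delta)^{-1}$ is the covariance matrix of the additive Denton method with $h = 1$.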
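The two residual models translate into covariance matrices that are straightforward to build in base R. In the sketch below, the helper names `sigma_chow_lin` and `sigma_litterman` are hypothetical (they are not functions of tempdisagg), and the innovation variance is set to 1.

```r
# Chow-Lin: stationary AR1 residuals,
# Sigma[i, j] = rho^|i - j| / (1 - rho^2)
sigma_chow_lin <- function(n, rho) {
  toeplitz(rho^(0:(n - 1))) / (1 - rho^2)
}

# Litterman: Sigma = [Delta' H(rho)' H(rho) Delta]^{-1};
# rho = 0 gives the Fernandez (random walk) case
sigma_litterman <- function(n, rho) {
  Delta <- diag(n)
  Delta[cbind(2:n, 1:(n - 1))] <- -1     # -1 on the first subdiagonal
  H <- diag(n)
  H[cbind(2:n, 1:(n - 1))] <- -rho       # -rho on the first subdiagonal
  solve(crossprod(H %*% Delta))
}

sigma_chow_lin(4, 0.5)
sigma_litterman(4, 0)    # element [i, j] equals min(i, j)
```

With `rho = 0`, the Litterman matrix has `min(i, j)` in position `[i, j]`, which is the covariance of a random walk started at zero.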
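There are several ways to estimate the autoregressive parameter $\rho$ in the Chow-Lin and Litterman methods. An iterative procedure has been proposed by Chow and Lin (1971). It infers the parameter from the observed autocorrelation of the low frequency residuals, $u_l$.

In a different approach, Bournay and Laroque (1979, p. 23) suggest the maximization of the likelihood of the GLS regression:

$$L(\rho, \sigma_\epsilon^2, \beta) = \frac{\exp\left[-\frac{1}{2} u_l' (C \Sigma C')^{-1} u_l\right]}{(2\pi)^{n_l/2} \left[\det(C \Sigma C')\right]^{1/2}}, \tag{12}$$

where $u_l = y_l - C X \beta$ are the annual residuals and $n_l$ is the number of annual observations. The maximizing $\hat{\beta}$ turns out to be the GLS estimator. The maximum likelihood estimator of the autoregressive parameter, $\hat{\rho}$, is a consistent estimator of the true value, thus it has been chosen as the default estimator. However, in some cases, $\hat{\rho}$ turns out to be negative even if the true $\rho$ is positive. Thus, by default, tempdisagg constrains the optimization space for $\rho$ to positive values.

A final approach is the minimization of the weighted residual sum of squares, as suggested by Barbone et al. (1981):

$$RSS(\rho, \sigma_\epsilon^2, \beta) = u_l' (C \Sigma C')^{-1} u_l. \tag{13}$$

Contrary to the maximum likelihood approach, $\sigma_\epsilon^2$ does not cancel out. The results are thus sensitive to the specification of $\Sigma$, with different implementations leading to different but inconsistent estimations of $\rho$.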
The selection of a temporal disaggregation model is similar to the selection of a linear regression model. Thus, `td`, the main function of the package, closely mirrors the working of the `lm` function (package stats), including taking advantage of the formula interface. There is no `data` argument in `td`, however. Because `td` is working with series of different length and frequencies, it is not possible to combine them in a single `"data.frame"`.
td(formula, conversion = "sum", to = "quarterly", method = "chow-lin-maxlog",
truncated.rho = 0, fixed.rho = 0.5, criterion = "proportional", h = 1,
start = NULL, end = NULL, ...)
The left hand side of the formula denotes the low frequency series, the right hand side the indicators. If no indicator is specified, the right hand side must be set equal to `1`. The variables can be entered as time series objects of class `"ts"` or as standard vectors or matrices. If entered as `"ts"` objects, the resulting series will be `"ts"` objects as well.
The `conversion` argument indicates whether the low frequency values are sums, averages, first or last values of the high frequency values (`"sum"` (default), `"average"`, `"first"` or `"last"`, respectively).
The `method` argument indicates the method of temporal disaggregation, as shown in Table 2 (see `?td` for a complete listing of methods). The `to` argument indicates the high frequency destination as a character string (`"quarterly"` (default) or `"monthly"`) or as a scalar (e.g. `2`, `7`, for year-semester or week-day conversion). It is only required if no indicator series is specified (Denton methods), or if standard vectors are used instead of time series objects. Finally, you can set an optional start or end date. This is identical to pre-processing the input series with `window`.
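As a minimal sketch of the scalar form of `to`, the following disaggregates a standard numeric vector into two high frequency values per low frequency value. The input numbers are made up for illustration, and the example assumes that tempdisagg is installed.

```r
library(tempdisagg)

# Made-up low frequency values, entered as a standard vector
y_low <- c(10, 14, 20, 22, 30)

# Two high frequency values per low frequency value (e.g. year to semester)
m <- td(y_low ~ 1, to = 2, conversion = "sum", method = "denton-cholette")
y_high <- predict(m)

# Each pair of high frequency values should add up to its low frequency value
pair_sums <- colSums(matrix(y_high, nrow = 2))
all.equal(as.numeric(pair_sums), y_low)
```

Because the input is not a `"ts"` object, the result carries no time series attributes, so the pairing of values has to be done by position.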
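`td` returns an object of class `"td"`. The function `predict` computes the disaggregated high frequency series, $\hat{y}$. If the high frequency indicator series are longer than the low frequency series, the resulting series will be extrapolated.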
The implementation of tempdisagg follows the same notation and modular structure as the exposition in the previous section. Internally, `td` uses the `optimize` function (package stats) to solve the one-dimensional optimization problem at the core of the Chow-Lin and Litterman methods. For GLS estimation, `td` uses an efficient and numerically stable algorithm that is based on the QR decomposition (Paige 1979).
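The one-dimensional optimization can be illustrated in base R. The sketch below simulates quarterly data, aggregates them to annual sums, and maximizes the likelihood of the annual GLS regression over the AR1 parameter with `optimize`, restricted to positive values. All data are simulated, the variable names are assumptions, and the plain `solve` calls stand in for the QR-based algorithm; this is not the code used inside `td`.

```r
set.seed(1)
n_l <- 20; fr <- 4; n <- n_l * fr
C <- kronecker(diag(n_l), matrix(1, 1, fr))      # sum conversion matrix

# Simulated quarterly indicator and series with AR1 residuals
x <- cumsum(rnorm(n)) + 50
u <- as.numeric(arima.sim(list(ar = 0.7), n = n))
y <- 2 + 0.5 * x + u
X <- cbind(1, x)

y_l <- drop(C %*% y)                             # observed annual series
X_l <- C %*% X                                   # annualized indicators

# Negative log-likelihood for a given rho, with the GLS estimator
# plugged in and the innovation variance concentrated out
neg_loglik <- function(rho) {
  S   <- toeplitz(rho^(0:(n - 1))) / (1 - rho^2) # AR1 covariance matrix
  V   <- C %*% S %*% t(C)
  Vi  <- solve(V)
  b   <- solve(t(X_l) %*% Vi %*% X_l, t(X_l) %*% Vi %*% y_l)
  u_l <- y_l - drop(X_l %*% b)
  s2  <- drop(t(u_l) %*% Vi %*% u_l) / n_l
  0.5 * (n_l * log(2 * pi * s2) + as.numeric(determinant(V)$modulus) + n_l)
}

opt <- optimize(neg_loglik, interval = c(0, 0.999))
opt$minimum                                      # estimated AR1 parameter
```

Restricting the search interval to positive values mirrors the default truncation described above; a grid search over the same interval would be a simple way to cross-check the result.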
Suppose we have an annual series and want to create quarterly values that sum up to the annual values. Panel 1 of Fig. 1 depicts annual sales of the pharmaceutical and chemical industry in Switzerland, `sales.a`, from which we want to create a quarterly series. The following example demonstrates the basic use of tempdisagg. It can also be run by `demo(tempdisagg)`.
The simplest method is `"denton-cholette"` without an indicator series. It performs a simple interpolation that meets the temporal additivity constraint. In R, this can be done the following way:
> library(tempdisagg)
> data(swisspharma)
> m1 <- td(sales.a ~ 1, to = "quarterly", method = "denton-cholette")
> predict(m1)
`td` produces an object of class `"td"`. The formula, `sales.a ~ 1`, indicates that our low frequency variable, `sales.a`, will be disaggregated with a constant, `1` (see `?formula` for the handling of the intercept in the formula interface). The resulting quarterly values of sales can be extracted with the `predict` function. As there is no additional information on quarterly movements, the resulting series is very smooth (Panel 2 of Fig. 1).
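The temporal additivity constraint can be checked by aggregating the quarterly result back to annual sums. The sketch below does this with `aggregate` from base R; the names `sales.q.hat` and `annual.sums` are introduced here for illustration only.

```r
library(tempdisagg)
data(swisspharma)

m1 <- td(sales.a ~ 1, to = "quarterly", method = "denton-cholette")
sales.q.hat <- predict(m1)

# Aggregate the quarterly estimate back to annual sums
annual.sums <- aggregate(sales.q.hat, nfrequency = 1, FUN = sum)

# The annual sums should reproduce the original annual series
all.equal(as.numeric(annual.sums), as.numeric(sales.a))
```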
While this purely mathematical approach is easy to perform and does not need any other data series, the economic value of the resulting series may be limited. There might be a related quarterly series that follows a similar movement to sales. For example, we may use quarterly exports of pharmaceutical and chemical products, `exports.q` (Panel 3 of Fig. 1):
> m2 <- td(sales.a ~ 0 + exports.q, method = "denton-cholette")
Because we cannot use more than one indicator with the `"denton-cholette"` (or `"denton"`) method, the intercept must be specified as missing in the formula (`0`). Contrary to the first example, the `to` argument is redundant, because the destination frequency can be inferred from the time series properties of `exports.q`. Applying the `predict` function to the resulting model leads to a much more interesting series, as shown in Panel 4 of Fig. 1. As the indicator series is longer than the annual series, there is an extrapolation period, in which quarterly sales are forecast.
With an indicator, the `"denton-cholette"` method simply transfers the movement of the indicator to the resulting series. Even if in fact there were no correlation between the two series, there would be a strong similarity between the indicator and the resulting series. In contrast, regression-based methods transfer the movement only if the indicator series and the resulting series are actually correlated on the annual level. For example, a Chow-Lin regression of the same problem as above can be performed the following way:
> m3 <- td(sales.a ~ exports.q)
As `"chow-lin-maxlog"` is the default method, it does not need to be specified. As with the corresponding `lm` method, `summary` produces an overview of the regression:
> summary(m3)
Call:
td(formula = sales.a ~ exports.q)

Residuals:
    Min      1Q  Median      3Q     Max
-77.892  -7.711  -4.628   9.647  36.448

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.241e+01  1.493e+00   8.311 1.06e-09 ***
exports.q   1.339e-02  1.672e-04  80.111  < 2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

'chow-lin-maxlog' disaggregation with 'sum' conversion
36 low-freq. obs. converted to 146 high-freq. obs.
Adjusted R-squared: 0.9946      AR1-Parameter:     0 (truncated)
There is indeed a strong correlation between exports and sales, as was assumed in the `"denton-cholette"` example above. The coefficient of `exports.q` is highly significant, and the very high adjusted $R^2$ points to a strong relationship between the two variables. As usual, with the `predict` function, we can extract the resulting quarterly series of sales (Panel 5 of Fig. 1). Like all regression-based methods, `"chow-lin-maxlog"` can also be used with more than one indicator series:
> m4 <- td(formula = sales.a ~ exports.q + imports.q)
In our example, we actually know the true data on quarterly sales, so we can compare the estimated values to the true values. With an indicator series, both the Denton method and Chow-Lin produce a series that is close to the true series (Panel 6 of Fig. 1). This is, of course, due to the fact that in this example, exports are a good indicator for sales. If the indicator is less closely related to the series of interest, the resulting series will be further from the true series.
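One way to quantify the comparison is the mean absolute percentage error over the period for which true values exist. The sketch below assumes that the true quarterly series is available as `sales.q` in the `swisspharma` data set; the helper `mape` is a hypothetical name introduced for this illustration.

```r
library(tempdisagg)
data(swisspharma)

m2 <- td(sales.a ~ 0 + exports.q, method = "denton-cholette")
m3 <- td(sales.a ~ exports.q)   # "chow-lin-maxlog" is the default method

# Mean absolute percentage error against the true quarterly series,
# restricted to the period for which true values are available
mape <- function(model, truth) {
  est <- window(predict(model), start = start(truth), end = end(truth))
  mean(abs(est - truth) / abs(truth))
}

c(denton_cholette = mape(m2, sales.q), chow_lin = mape(m3, sales.q))
```

The `window` call drops the extrapolation period, for which no true values are available to compare against.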
tempdisagg implements the standard methods for temporal disaggregation. It offers a way to disaggregate a low frequency time series into a higher frequency series, while either the sum, the average, the first or the last value of the resulting high frequency series is consistent with the low frequency series. Temporal disaggregation can be performed with or without the help of one or more high frequency indicators. If good indicators are at hand, the resulting series may be close to the true series.
Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Sax & Steiner, "Temporal Disaggregation of Time Series", The R Journal, 2013
BibTeX citation
@article{RJ-2013-028, author = {Sax, Christoph and Steiner, Peter}, title = {Temporal Disaggregation of Time Series}, journal = {The R Journal}, year = {2013}, note = {https://rjournal.github.io/}, volume = {5}, issue = {2}, issn = {2073-4859}, pages = {80-87} }