CONTRIBUTED ARTICLE 1 ftsa: An R Package for Analyzing Functional Time Series

Recent advances in computer recording and storing technology have tremendously increased the presence of functional data, whose graphical representation can be infinite-dimensional curve, image, or shape. When the same functional object is observed over a period of time, such data are known as functional time series. This article makes first attempt to describe several techniques (centered around functional principal component analysis) for modeling and forecasting functional time series from a computational aspect, using a readily-available R addon package. These methods are demonstrated using age-specific Australian fertility rate data from


Introduction
The aim of this article is to describe the R functions that are readily-available in the ftsa package (Hyndman and Shang, 2012), for modeling and forecasting functional time series.This article was motivated by recent advances in computer recording and storing technology that have enabled researchers to collect and store (ultra) high-dimensional data.When the high-dimensional data are repeatedly measured on the same object over a period of time, a time series of continuous functions is observed within a common bounded interval (Shen and Huang, 2008).
Analyzing functional time series has received increasing attention in the functional data analysis literature (see for example, Hörmann and Kokoszka, 2010;Horváth et al., 2010;Horváth and Kokoszka, 2012).Hyndman and Shang (2010) presented a rainbow plot for visualizing functional time series, where the distant past data are shown in red and most recent data are shown in purple.Aguilera et al. (1999) proposed functional principal component regression (FPCR) to model and forecast functional time series.
Before reviewing the FPCR, we first define the problem more precisely.Let y t (x) denote a function, such as age-specific fertility rates for the continuous age variable x in year t, or monthly sea surface temperatures for the continuous time variable x in year t.In the latter example, functional time series techniques allow us to capture the underlying dynamic of the multiple seasonality in the data (see Shang and Hyndman, 2011;Shang, 2012b, for example).We assume that there is an underlying smooth function f t (x) that observes with error at discretized grid points of x.
In practice, we observe {x i , y t (x i )} for t = 1, 2, . . ., n and i = 1, 2, . . ., p, from which we extract a smooth function f t (x), given by where ε t,i is an independent and identically distributed (iid) standard normal random variable, σ t (x i ) allows the amount of noise to vary with x i , and {x 1 , x 2 , . . ., we are interested in finding underlying patterns using the FPCR, from which we obtain forecasts of y n+h (x), where h denotes the forecast horizon.This article proceeds as follows.Techniques for modeling and forecasting functional time series are reviewed and their implementations using the ftsa package are described.Conclusions are given at the end.

Functional principal component regression
The theoretical, methodological and practical aspects of functional principal component analysis (FPCA) have been extensively studied in the functional data analysis literature, since it allows finite dimensional analysis of a problem that is intrinsically infinitedimensional (Hall and Hosseini-Nasab, 2006).Numerous examples of using FPCA as an estimation tool in regression problem can be found in different fields of applications, such as breast cancer mortality rate modeling and forecasting (Erbas et al., 2007), call volume forecasting (Shen and Huang, 2008), climate forecasting (Shang and Hyndman, 2011), demographical modeling and forecasting (Hyndman and Shang, 2009), and electricity demand forecasting (Antoch et al., 2008).
At a population level, a stochastic process denoted by f can be decomposed into the mean function and the sum of the products of orthogonal functional principal components and uncorrelated principal component scores.It can be expressed as where µ is the unobservable population mean function, β k is the k th principal component scores, and φ k is the k th population functional principal component.
In practice, we can only observe n realizations of f evaluated on a compact interval x ∈ [0, τ], denoted by f t (x) for t = 1, 2, . . ., n.At a sample level, the The R Journal Vol.X/Y, Month, Year ISSN 2073-4859 functional principal component decomposition can be written as where f (x) = 1 n ∑ n t=1 f t (x) is the estimated mean function, φ k (x) is the k th estimated orthonormal eigenfunction of the empirical covariance operator the coefficient β t,k is the k th principal component score for year t, it is given by the projection of is the residual, and K is the opti- mal number of components, which can be chosen by cross validation.Hyndman and Booth (2008) studied the impact on forecast accuracy with a smaller or larger than the optimal value of K.
The functional principal component decomposition is first demonstrated using the age-specific Australian fertility rate data between ages 15 and 49 observed from 1921 to 2006.This data set was obtained from the Australian Bureau of Statistics (Cat No,3105.0.65.001,Table 38).A functional graphical display is given in Shang (2011).
Figure 1 presents the first two functional principal components and their associated principal component scores.The bottom panel of Figure 1 also plots the forecasted principal component scores, and their 80% prediction intervals (in yellow color), using an exponential smoothing state-space model (Hyndman et al., 2008).As pointed out by a referee, the forecasts of principal component scores appear to quickly level off, and the prediction intervals widen very quickly.This reflects the difficulty of our model in forecasting medium or long term horizon, as a result of the increase in variability.# load the package used throughout this article library("ftsa") # Fit and plot functional principal components # order specifies the number of principal components # h specifies the forecast horizon plot(forecast(ftsm(Australiasmoothfertility, order=2), h=20), "components") By conditioning on the observed data f (x) = [ f 1 (x), f 2 (x), . . ., f n (x)] and the fixed functional principal components B = [ φ 1 (x), φ 2 (x), . . ., φ K (x)] , the h-step-ahead forecasts of y n+h (x) can be obtained as where β n+h|n,k denotes the h-step-ahead forecasts of β n+h,k using a univariate time series.
Figure 2 shows the forecasts of Australian fertility rate data from 2007 to 2026 highlighted in rainbow color, while the data used for estimation are grayed out.Both the multi-step-ahead and iterative one-stepahead forecasts exhibit a continuing shift to older ages of peak fertility rates, caused by a recent tendency to postpone child-bearing while pursuing careers.To construct prediction interval, we calculate the forecast variance that follows from (1) and ( 2).Because of orthogonality, the forecast variance can be approximated by the sum of component variances where u n+h,k = Var(β n+h,k |β 1,k , β 2,k , . . ., β n,k ) can be obtained from the time series model, and the model error variance v(x) is estimated by averaging { ε 2 1 (x), ε 2 2 (x), . . ., ε 2 n (x)} for each x, and σ 2 µ (x) and σ 2 n+h (x) can be obtained from the nonparametric smoothing method used.
Based on the normality assumption, the 100(1 − α)% prediction interval for y n+h (x) is constructed as y n+h|n (x) ± z α ξ n+h (x), where z α is the (1 − α/2) standard normal quantile.Figure 3 displays the forecasts of fertility rates in 2007, along with the 80% prediction interval.It was created by the following code.

Updating point and interval forecasts
A special case of functional time series is when the continuous variable is also a time variable, such as the monthly sea surface temperature data from 1950 to 2011, obtained from National Oceanic and Atmospheric Administration (http://www.cpc.noaa.gov/data/indices/sstoi.indices).A similar type of functional graphical display is given in Shang (2011, Figure 2).Such data originate from a univariate seasonal time series.Let {Z w , w ∈ [1, N]} be a seasonal time series which has been observed at N equispaced times.We divide the observed time series into n trajectories, and then consider each trajectory of length p as a curve rather than p distinct data points.The functional time series is given by The problem of interest is to forecast y n+h (x), where h denotes forecast horizon.In the sea surface temperature data, we consider {Z w } to be monthly sea surface temperatures from 1950 to 2011, so that p = 12 and N = 62 × 12 = 744, and we are interested in forecasting sea surface temperatures in 2012 and beyond.
When N = np, all trajectories are complete, and forecasts can be obtained by the FPCR.However, when N = np, we revisited the block moving (BM) and penalized least squares (PLS) proposed by Shang and Hyndman (2011) to update point and interval forecasts, when the most recent curve is partially observed.
When functional time series are segments of a univariate time series, the most recent trajectory is observed sequentially (Hyndman and Shang, 2010).
When we observe the first m 0 time period of y n+1 (x l ), denoted by y n+1 (x e ) = [y n+1 (x 1 ), y n+1 (x 2 ), . . ., y n+1 (x m 0 )] , we are interested in forecasting the data in the remaining time period, denoted by y n+1 (x l ) for m 0 < l ≤ p.By using the FPCR, the partially observed data in the most recent curve are not incorporated into the forecasts of The R Journal Vol.X/Y, Month, Year ISSN 2073-4859 y n+1 (x l ).Indeed, the point forecasts obtained from the FPCR can be expressed as for m 0 < l ≤ p, where f (x l ) denotes the historical data corresponding to the remaining time periods; f (x l ) is the mean function corresponding to the remaining time periods; and B l = { φ 1 (x l ), φ 2 (x l ), . . ., φ K (x l )} is a set of the estimated functional principal components corresponding to the remaining time periods.
In order to improve point forecast accuracy, it is desirable to dynamically update the point and interval forecasts for the rest of year n + 1 by incorporating the partially observed data.In what follows, I shall revisit two methods for updating point and interval forecasts.

Block moving (BM)
The BM method re-defines the start and end points of trajectories.Because time is a continuous variable, we can change the function support from [1, p] to . The re-defined functional time series forms a complete block, at the cost of losing some observations in the first year.With the complete data block, the FPCR can then be applied to update the point and interval forecasts.

Penalized least squares (PLS)
We can also update the remaining part of the trajectory by using regression-based approaches.Let F e be m 0 × K matrix, whose (j, k) th entry is The β n+1 can be estimated by ordinary least squares, assuming (F e F e ) is invertible, However, if (F e F e ) is not invertible, then a regularized approach can be implemented, such as the ridge regression (RR) and penalized least squares (PLS).
The regression coefficients of the RR and PLS are where β RR n+1 → 0 as λ → ∞, and β RR n+1 → β OLS n+1 as λ → 0. In contrast, the β PLS n+1 → β n+1|n as λ → ∞, and β PLS n+1 → β OLS n+1 as λ → 0. The point forecasts of y n+1 (x l ) obtained by the RR and PLS methods are given by Among these regression-based approaches, the PLS method can also update the interval forecasts.Let the one-step-ahead forecast errors of the principal component scores be given by { ξ 1,k , ξ 2,k , . . ., ξ n−K,k } can then be sampled with re- placement to give a bootstrap sample of β n+1|n,k : where ξ b * ,k denotes the bootstrap samples, and B is the number of bootstrap replications.Based on (3), the bootstrapped β b n+1|n leads to the bootstrapped β b,PLS n+1 , we obtain B replications of where b n+1 (x l ) is obtained by sampling with replacement from { 1 (x l ), 2 (x l ), . . ., n (x l )}.Hence, the 100(1 − α)% prediction interval for the updated forecasts are defined as α/2 and (1 − α/2) quantiles of y b,PLS n+1 (x l ).

Conclusion
This article described several techniques in the ftsa package, for modeling and forecasting functional time series.These methods centered on the FPCR, which is a common dimension reduction technique in the functional data analysis literature.FPCR reduces intrinsically infinite number of variables to several orthogonal regressors, which captures the main mode of variation in data.As illustrated by the Australian fertility rate data, FPCR is able to model and forecast annual Australian fertility rates through either multi-step-ahead forecasts or iterative one-step-ahead forecasts using the rolling origin approach.When the continuous variable in a functional time series is also a time variable, a new observation arrives sequentially.As shown using the monthly sea surface temperature data, the BM and PLS methods can update the point and interval forecasts based on the FPCR.
To sum up, the methods reviewed in this article focus on extracting patterns from a set of functional time series, and should be considered when the interest lies in modeling and forecasting the future realizations of a stochastic process.

Figure 1 :
Figure 1: The first two functional principal components and their associated principal component scores for the Australian fertility rate data from 1921 to 2006.

Figure 1
Figure1was produced by the following code.
Multiple-step-ahead forecasts.Based on the historical data from 1921 to 2006, we obtain 20step-ahead forecasts for 2007 to 2026.
birth (per 1,000 females) 2007 2026 (b) Iterative one-step-ahead forecasts.Based on the historical data from 1921 to 2006, we obtain iterative one-step-ahead forecasts for 2007 to 2026 using the rolling origin approach.

Figure 2 :
Figure 2: Forecasts of the Australian fertility rates from 2007 to 2026, based on the first two functional principal components and their associated principal component scores as an illustration.

Figure 3 :
Figure 3: Forecasts of fertility rates in 2007, along with the 80% prediction interval.

Figure 4 :Figure 5 :
Figure4: Dynamic update via the BM approach.The colored region shows the data loss in the first year.The forecasts for the remaining months in year n + 1 can be updated by the forecasts using the TS method applied to the upper block.As an illustration, suppose we observe the monthly sea surface temperature data from January 1950 to May 2011, we aim to update the point and in-

Figure 5
Figure 5 was created by the following code

Figure 6 :
Figure 6: Prediction interval of the sea surface temperature data between June 2011 and December 2011.By incorporating the sea surface temperature data between January 2011 and May 2011, the 80% prediction interval can be updated using the PLS method.