In this paper the tsfknn package for time series forecasting using KNN regression is described.
Time series forecasting has traditionally been performed using statistical methods such as ARIMA models or exponential smoothing (Hyndman and Athanasopoulos 2014). However, the last decades have witnessed the use of computational intelligence techniques to forecast time series. Although artificial neural networks are the most prominent machine learning technique used in time series forecasting (Zhang et al. 1998), other approaches, such as Gaussian Processes (Andrawis et al. 2011) or KNN (Martínez et al. 2017), have also been applied. Compared with classical statistical models, computational intelligence methods exhibit interesting features, such as their nonlinearity or the lack of an underlying model, that is, they are non-parametric.
Statistical methodologies for time series forecasting are implemented
in excellent R packages. For example, the
forecast package
(Hyndman and Khandakar 2008) includes implementations of ARIMA, exponential
smoothing, the Theta method (Hyndman and Billah 2003) or basic techniques that can
be used as benchmark methods—such as the random walk approach. On the
other hand, although a great variety of computational intelligence
approaches for regression are available in R, such as the
caret package
(Kuhn 2008), these approaches cannot be directly applied to time series
forecasting. Fortunately, some new packages are filling this gap. For
example, the nnfor package
(Kourentzes 2017) or the nnetar
function from the forecast package
allow users to predict time series using artificial neural networks.
KNN is a very popular algorithm used in classification and regression
(Wu et al. 2007). This algorithm simply stores a collection of examples. In
regression, each example consists of a vector of features describing the
example and its associated numeric target value. Given a new example,
KNN finds its k most similar examples, called nearest neighbors, according to a distance metric, and predicts the target value of the new example as an aggregation of the target values of its nearest neighbors.
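To make the procedure concrete, the following minimal sketch (not part of the tsfknn package) shows KNN regression with Euclidean distance and the mean as aggregation function; the names knn_regression, features, targets and new_x are ours:

knn_regression <- function(features, targets, new_x, k = 3) {
  # Euclidean distance from the new instance to every training example
  # (features has one training example per row)
  d <- sqrt(colSums((t(features) - new_x)^2))
  nn <- order(d)[1:k]  # indices of the k nearest neighbors
  mean(targets[nn])    # aggregate the targets of the neighbors
}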
The rest of the paper is organized as follows. Firstly, we explain how KNN regression can be applied in a time series forecasting context using the tsfknn package. Next, the different multi-step ahead strategies implemented in our package are explained. Some additional features of the package related to how the KNN model is specified are also discussed. The last sections explain how to assess the forecast accuracy of a model and compare the package with other R packages based on machine learning approaches.
In this section we first explain how KNN regression can be applied to forecast time series. Next, we describe how the tsfknn package can be used to forecast a time series.
As described above, KNN regression simply holds a collection of training
instances. The $i$-th training instance consists of a vector of $n$ features,
$(f_1^i, f_2^i, \ldots, f_n^i)$, describing the instance, and an associated
target $t^i$. Given a new instance, whose target has to be predicted, the
distance between the new instance and every training instance is computed,
and the $k$ training instances closest to the new instance, its $k$ nearest
neighbors, are selected. The prediction is then an aggregation, such as the
mean, of the targets of the $k$ nearest neighbors.
In short, KNN stores a collection of training instances described by their
features, and the target of a new instance is predicted as an aggregation of
the targets of its nearest neighbors.
Now, let us see how KNN can be applied to time series forecasting. In
this case, the target associated with a training instance is a
collection of values of the time series and the features describing the
instance are lagged values of the target—that is, we have an
autoregressive model. For example, let us start with a monthly time
series containing 132 observations, $x_1, x_2, \ldots, x_{132}$, i.e., 11
years. Assuming that lags 1:12 are used as features, the training examples
that can be built are:
Features | Target
---|---
$x_1, x_2, \ldots, x_{12}$ | $x_{13}$
$x_2, x_3, \ldots, x_{13}$ | $x_{14}$
... | ...
$x_{120}, x_{121}, \ldots, x_{131}$ | $x_{132}$
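For illustration, these examples can be built with base R's embed function. The following is a sketch, using 1, 2, ..., 132 as a stand-in series:

x <- as.numeric(1:132)  # stand-in for the 132 observations
m <- embed(x, 13)       # row i holds x[i+12], x[i+11], ..., x[i]
features <- m[, 13:2]   # lagged values 12, 11, ..., 1
targets  <- m[, 1]      # the value following each feature vector
dim(features)           # 120 examples with 12 features each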
Let us see now how the tsfknn package can be used to forecast a time
series. In our package, you can consult the training examples associated
with a KNN model with the knn_examples function:
> timeS <- window(nottem, end = c(1930, 12))
> pred <- knn_forecasting(timeS, h = 1, lags = 1:12, k = 2)
> head(knn_examples(pred))
Lag12 Lag11 Lag10 Lag9 Lag8 Lag7 Lag6 Lag5 Lag4 Lag3 Lag2 Lag1 H1
[1,] 40.6 40.8 44.4 46.7 54.1 58.5 57.7 56.4 54.3 50.5 42.9 39.8 44.2
[2,] 40.8 44.4 46.7 54.1 58.5 57.7 56.4 54.3 50.5 42.9 39.8 44.2 39.8
[3,] 44.4 46.7 54.1 58.5 57.7 56.4 54.3 50.5 42.9 39.8 44.2 39.8 45.1
[4,] 46.7 54.1 58.5 57.7 56.4 54.3 50.5 42.9 39.8 44.2 39.8 45.1 47.0
[5,] 54.1 58.5 57.7 56.4 54.3 50.5 42.9 39.8 44.2 39.8 45.1 47.0 54.1
[6,] 58.5 57.7 56.4 54.3 50.5 42.9 39.8 44.2 39.8 45.1 47.0 54.1 58.7
Before consulting the training examples with knn_examples, you have to
build the model. This is done with the function knn_forecasting, which
builds a model associated with a time series and uses the model to
predict the future values of the time series. Let us see the main
arguments of this function:
- timeS: the time series to be forecast.
- h: the forecasting horizon, that is, the number of future values to be predicted.
- lags: an integer vector indicating the lagged values of the target used as its describing features in the examples—for instance, 1:12 means that lagged values 1 to 12 should be used.
- k: the number of nearest neighbors used by the KNN model.
knn_forecasting is very handy because, as commented above, it builds
the KNN model and then uses the model to predict the time series. This
function returns a "knnForecast" object with information about the model
and its prediction. As we have seen above, you can use the function
knn_examples to see the training examples associated with the model.
You can also consult the prediction or plot it through the
"knnForecast" object:
> pred$prediction
Jan
1931 37.4
> plot(pred)
Figure 1 shows the result of plotting the prediction.
It is also possible to see how the prediction was made. That is, you can
consult the new instance whose target was predicted and its nearest
neighbors. This information is obtained with the nearest_neighbors
function applied to a "knnForecast" object:
> nearest_neighbors(pred)
$`instance`
Lag 12 Lag 11 Lag 10 Lag 9 Lag 8 Lag 7 Lag 6 Lag 5 Lag 4 Lag 3 Lag 2 Lag 1
41.6 37.1 41.2 46.9 51.2 60.4 60.1 61.6 57.0 50.9 43.0 38.8
$nneighbors
Lag 12 Lag 11 Lag 10 Lag 9 Lag 8 Lag 7 Lag 6 Lag 5 Lag 4 Lag 3 Lag 2 Lag 1 H1
1 40.8 41.1 42.8 47.3 50.9 56.4 62.2 60.5 55.4 50.2 43.0 37.3 34.8
2 39.3 37.5 38.3 45.5 53.2 57.7 60.8 58.2 56.4 49.8 44.4 43.6 40.0
Because we have used lags 1:12 as features, the features associated with the next future value of the time series are the last twelve values of the time series. The targets of the two most similar examples, or nearest neighbors, are 34.8 and 40.0. Their average is the prediction: 37.4. A nice plot including the new instance, its nearest neighbors and the prediction can be obtained as follows:
> library(ggplot2)
> autoplot(pred, highlight = "neighbors", faceting = FALSE)
The result of executing this code snippet is shown in Figure 2. To recapitulate, in order to specify a KNN model you have to set the autoregressive lags used as features, the number of nearest neighbors k and, as explained in the following sections, the multi-step ahead strategy and the function used to combine the targets of the nearest neighbors.
In the previous section we have seen an example of one-step ahead prediction with KNN. Nonetheless, it is very common to forecast more than one value into the future. To this end, a multi-step ahead strategy has to be chosen (Ben Taieb et al. 2012). Our package implements two common strategies: the MIMO—Multiple Input Multiple Output—approach and the recursive or iterative approach. It must be noted that when only one future value is predicted both strategies are equivalent. In the next subsections these strategies are explained, together with examples of how they can be used in our package.
The Multiple Input Multiple Output strategy is commonly applied with KNN and it is characterized by the use of a vector of target values. The length of this vector is equal to the number of periods to be forecast. For example, let us assume that we are working with a time series giving the monthly totals of car drivers killed in Great Britain and we want to forecast the number of deaths for the next 12 months. In this situation, a good choice for the lags used as features would be 1:12, i.e., the totals of car drivers killed in the previous 12 months—an explanation about why lags 1:12 are a good choice is given in the section about default parameters. If the MIMO strategy is chosen, then a training example consists of:

- a vector of 12 features with the values of the series in 12 consecutive months, and
- a target vector with the values of the series in the 12 months following those of the feature vector.
The new instance would be the number of car drivers killed in the last 12 months of the time series. This way, we would look for the 12 consecutive months with the numbers of deaths most similar to those of the last 12 months of the series, and we would predict an aggregation of the 12 months that follow them. In the following example we predict the next 12 months using the MIMO strategy:
> timeS <- window(UKDriverDeaths, end = c(1979, 12))
> pred <- knn_forecasting(timeS, h = 12, lags = 1:12, k = 2, msas = "MIMO")
> autoplot(pred, highlight = "neighbors", faceting = FALSE)
The forecast for the next 12 months can be seen in Figure 3. The last 12 values of the time series are the features of the new instance whose target has to be predicted. The two sequences of 12 consecutive values most similar to this instance are found—in blue—and their subsequent 12 values—in green—are averaged to obtain the prediction—in red.
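To make the structure of MIMO examples concrete, the following sketch (ours, not the package's internals) builds the feature and target matrices of this example with base R's embed function:

x <- as.numeric(window(UKDriverDeaths, end = c(1979, 12)))
m <- embed(x, 24)       # row i holds x[i+23], x[i+22], ..., x[i]
features <- m[, 24:13]  # 12 consecutive monthly values (lags 12 to 1)
targets  <- m[, 12:1]   # the 12 values that follow them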
The recursive or iterative strategy is the approach used by ARIMA or exponential smoothing to forecast several periods. In this strategy a model that only forecasts one-step ahead is used. Therefore, the model is applied iteratively to forecast all the future periods. When historical observations to be used as features of the new instance are unavailable, previous predictions are used instead.
Because the recursive strategy uses a one-step ahead model, this means that, in the case of KNN, the target of a training example only contains one value. For instance, let us see how the recursive strategy works with the following example in which the next two future months of a monthly time series are predicted:
> pred <- knn_forecasting(USAccDeaths, h = 2, lags = 1:12, k = 2, msas = "recursive")
> autoplot(pred, highlight = "neighbors")
In this example we have used lags 1:12 to specify the features of an example. To predict the first future point, the last 12 values of the time series are used as its features—see Figure 4. To predict the second future point, its features are the last eleven values of the time series and the prediction for the first future point—see Figure 5.
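The following sketch captures the essence of the recursive strategy; one_step is a hypothetical function that returns a one-step ahead forecast from the observations seen so far:

recursive_forecast <- function(x, h, one_step) {
  history <- as.numeric(x)
  preds <- numeric(h)
  for (i in seq_len(h)) {
    preds[i] <- one_step(history)    # one-step ahead prediction
    history <- c(history, preds[i])  # feed the prediction back as data
  }
  preds
}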
In this section several additional features of our package related to model selection are described. In order to select a KNN model the following parameters have to be chosen:

- the autoregressive lags used as features,
- the number of nearest neighbors k,
- the function used to combine the targets of the nearest neighbors, and
- the multi-step ahead strategy.
In the following subsections some features related to setting the KNN parameters are explained.
Our package uses the Euclidean distance to find the nearest neighbors, although other distance metrics might be implemented in the future.
Regarding the combination function, the targets of the nearest neighbors
can be aggregated using the mean, the median or a weighted combination.
To choose the combination function, the cb parameter of the
knn_forecasting function has to be used.
Next, we explain how the weighted combination is computed. The goal is
to give more weight to the closer neighbors. Let us denote as $d_i$ the
distance between the new instance and its $i$-th nearest neighbor, and as
$t^i$ the target of that neighbor. Each neighbor is assigned a weight
inversely proportional to its distance, $w_i = \frac{1/d_i}{\sum_{j=1}^{k} 1/d_j}$,
so that the weights add up to one, and the prediction is computed as the
weighted average $\sum_{i=1}^{k} w_i t^i$.
This scheme fails when the distance to a training example is 0. In this special case, the target of this training example, whose features are identical to the new instance, is selected as the prediction.
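A minimal sketch of this weighting scheme, including the special case of zero distances, could be written as follows (t holds the targets of the k nearest neighbors, d their distances; the function name is ours):

weighted_combination <- function(t, d) {
  if (any(d == 0)) {
    return(mean(t[d == 0]))  # identical instances: use their targets
  }
  w <- (1 / d) / sum(1 / d)  # weights inversely proportional to distance
  sum(w * t)                 # weighted average of the targets
}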
In order to specify a KNN model the k parameter, i.e., the number of nearest
neighbors used by the model, has to be chosen. A common heuristic is to set k
to the square root of the number of training examples; another popular
strategy is to select k on a validation set, keeping the value that produces
the most accurate forecasts. A third strategy explored in (Martínez et al. 2017)
is to use several KNN models with different k values and to combine their
forecasts by averaging them. Our package supports this last strategy,
allowing the user to specify a vector of k values:
> pred <- knn_forecasting(ldeaths, h = 12, lags = 1:12, k = c(2, 4))
> pred$prediction
Jan Feb Mar Apr May Jun Jul Aug Sep
1980 2865.375 2866.250 2728.875 2189.000 1816.000 1625.875 1526.250 1404.250 1354.000
Oct Nov Dec
1980 1541.250 1699.250 2198.750
In this example, two KNN models with k equal to 2 and 4 are built, and the prediction is the average of the two models' forecasts.
It also must be noted that if the targets are combined using weights,
then the choice of the k parameter is less critical, because distant
neighbors are assigned small weights and therefore have little influence
on the prediction.
Sometimes a great number of time series have to be forecast. In that situation, an automatic way of generating the forecasts is very useful. In this sense, our package is able to use sensible default parameters. If the user only specifies the time series and the forecasting horizon, the KNN parameters are selected as follows:

- the multi-step ahead strategy is MIMO,
- k is set to a combination of three models with 3, 5 and 7 nearest neighbors, whose forecasts are averaged, and
- for a seasonal series with period s, the autoregressive lags are set to 1:s (for instance, 1:12 for a monthly series).
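For example, relying only on these defaults, a forecast for the next 12 months of a monthly series can be obtained with a single call:

> pred <- knn_forecasting(USAccDeaths, h = 12)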
Once a model has been built, it is natural to want to assess its
forecast accuracy. In the tsfknn package this is done with the
rolling_origin
function. This function uses the classical approach of
dividing a dataset into training and test sets. In time series
forecasting the test set consists of the last observations of the time
series. For example:
> pred <- knn_forecasting(ldeaths, h = 12, lags = 1:12, k = 2)
> ro <- rolling_origin(pred, h = 6, rolling = FALSE)
As mentioned above, knn_forecasting
builds a KNN model and returns a
"knnForecast"
object with information about the model. The
rolling_origin
function takes a "knnForecast"
object as its first
parameter. From this, information about the time series and the
metaparameters of the KNN model is obtained; for example, the
autoregressive lags, the number of nearest neighbors or the multi-step
ahead strategy. The second parameter of rolling_origin
is the size of
the test set. In the example, the size is 6 and, therefore, the last 6
observations of the time series will be used as test set and the
remaining observations as training set. rolling_origin
returns a
"knnForecastRO"
object with information about the evaluation. For
example, the test set, predictions and errors can be consulted:
> print(ro$test_sets)
h=1 h=2 h=3 h=4 h=5 h=6
[1,] 1461 1354 1333 1492 1781 1915
> print(ro$predictions)
h=1 h=2 h=3 h=4 h=5 h=6
[1,] 1513.5 1363.5 1351.5 1567 1587.5 2392
> print(ro$errors)
h=1 h=2 h=3 h=4 h=5 h=6
[1,] -52.5 -9.5 -18.5 -75 193.5 -477
It is also possible to consult several forecasting accuracy measures about the predictions:
> ro$global_accu
RMSE MAE MAPE
213.613748 137.666667 7.747168
The forecasting accuracy measures are: root mean square error (RMSE), mean absolute error (MAE) and mean absolute percentage error (MAPE). A plot of the time series and the forecasts can be obtained:
> plot(ro)
The result of this plot can be seen in Figure 8. In this figure the last six observations of the time series are the test set and the forecasts are the red points.
A more sophisticated version of training/test sets is to use a rolling
origin evaluation. The idea is as follows. The last h observations of the
time series are used as test data. First, a model is trained with the
remaining observations and used to forecast the h test observations. Then,
the origin of the forecasts is rolled forward one period: the first test
observation is added to the training data and the remaining h - 1 test
observations are forecast. This process is repeated until the last test
observation is forecast with a one-step ahead prediction.
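The following sketch (ours, not the package's implementation) illustrates the procedure for the last n observations of a series x, given a generic forecasting function fc(train, h) that returns h predictions:

rolling_origin_eval <- function(x, n, fc) {
  len <- length(x)
  errors <- matrix(NA, nrow = n, ncol = n)
  for (i in 1:n) {
    train <- x[1:(len - n + i - 1)]  # training set grows at each step
    h <- n - i + 1                   # remaining test observations
    test <- x[(len - n + i):len]
    errors[i, 1:h] <- test - fc(train, h)
  }
  errors
}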
The function rolling_origin
uses rolling origin evaluation by default:
> pred <- knn_forecasting(ldeaths, h = 12, lags = 1:12, k = 2)
> ro <- rolling_origin(pred, h = 6)
> print(ro$test_sets)
h=1 h=2 h=3 h=4 h=5 h=6
[1,] 1461 1354 1333 1492 1781 1915
[2,] 1354 1333 1492 1781 1915 NA
[3,] 1333 1492 1781 1915 NA NA
[4,] 1492 1781 1915 NA NA NA
[5,] 1781 1915 NA NA NA NA
[6,] 1915 NA NA NA NA NA
> print(ro$predictions)
h=1 h=2 h=3 h=4 h=5 h=6
[1,] 1513.5 1363.5 1351.5 1567.0 1587.5 2392
[2,] 1363.5 1351.5 1567.0 1587.5 2392.0 NA
[3,] 1351.5 1567.0 1587.5 2392.0 NA NA
[4,] 1567.0 1587.5 2392.0 NA NA NA
[5,] 1587.5 2392.0 NA NA NA NA
[6,] 2392.0 NA NA NA NA NA
> print(ro$errors)
h=1 h=2 h=3 h=4 h=5 h=6
[1,] -52.5 -9.5 -18.5 -75.0 193.5 -477
[2,] -9.5 -18.5 -75.0 193.5 -477.0 NA
[3,] -18.5 -75.0 193.5 -477.0 NA NA
[4,] -75.0 193.5 -477.0 NA NA NA
[5,] 193.5 -477.0 NA NA NA NA
[6,] -477.0 NA NA NA NA NA
Each row in the output represents a different evaluation. Now, the
global_accu
field of the "knnForecastRO"
object stores measures of
the accuracy of all the forecasts in all the evaluations:
> ro$global_accu
RMSE MAE MAPE
274.19569 202.69048 11.09727
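These global measures pool the forecast errors of all the evaluations. As a sketch of how they relate to the errors and test sets shown above (assuming the ro object computed in this example):

> e <- ro$errors[!is.na(ro$errors)]        # pooled forecast errors
> a <- ro$test_sets[!is.na(ro$test_sets)]  # corresponding actual values
> sqrt(mean(e^2))                          # RMSE: 274.19569
> mean(abs(e))                             # MAE: 202.69048
> 100 * mean(abs(e / a))                   # MAPE: 11.09727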
The forecast accuracy measures for the different prediction horizons can also be consulted:
> ro$h_accu
h=1 h=2 h=3 h=4 h=5 h=6
RMSE 213.613748 232.821283 260.25877 300.33107 363.98575 477.00000
MAE 137.666667 154.700000 191.00000 248.50000 335.25000 477.00000
MAPE 7.747168 8.577916 10.54699 13.60004 17.88665 24.90862
As expected, the errors grow with the increasing horizon. Finally, a plot of any of the different forecasts performed with the rolling origin evaluation can be obtained. For example:
> plot(ro, h = 2)
produces Figure 10.
In this section our package is compared with other software for univariate time series forecasting in R. In the CRAN Task Views Time Series Analysis, Econometrics and Finance some models for time series forecasting can be found, from GARCH models to ensembles of models, such as the forecastHybrid package (Shaub and Ellis 2019). Most of the functions found in these packages use statistical models such as ARIMA or exponential smoothing. The forecast package is arguably the best package for time series forecasting. It implements all of the models that can be found in the stats package and many others, such as the Theta method or a multi-layer perceptron. Furthermore, this package includes additional tools for plotting seasonal time series, doing Box-Cox transformations, etc.
The packages most closely related to our package are those that are based on machine learning approaches. Below, a brief description of these packages is given:
- GMDH: In this package GMDH-type neural network algorithms are applied in order to perform short-term forecasting of a univariate time series (Dag and Yozgatligil 2016).
- NTS: This package contains a function, NNsetting, that allows users to create the examples needed to feed a neural network. However, the package does not allow forecasts to be generated directly.
- tsDyn: Allows users to predict a time series using a multi-layer perceptron with one hidden layer computed using the nnet function from the nnet package.
- nnfor: Uses the neuralnet package to build multi-layer perceptrons. It is also possible to use extreme learning machines through the elm function.
- forecast: This package contains the nnetar function, used to forecast a time series using a multi-layer perceptron with a hidden layer.
Table 2 shows a comparison of these packages in terms of the following features:
Feature | GMDH | tsDyn | nnfor | forecast | tsfknn
---|---|---|---|---|---
Arbitrary lags | no | no | yes | no | yes
Default parameters | yes | no | yes | yes | yes
Plotting facilities | no | yes | yes | yes | yes
Rolling origin evaluation | no | no | yes | yes | yes
Prediction intervals | yes | no | no | yes | no
Exogenous variables | no | no | yes | yes | no
We have also conducted a comparison of the methods found in these
packages based on forecast accuracy and running time. For this purpose,
data from the NN3 forecasting competition (Crone et al. 2011) has been used.
In this competition 111 monthly time series of industry data were used.
The length of the series ranges from 52 to 120 observations and there is
also a balanced mix of seasonal and non-seasonal series. As in the NN3
competition, the next 18 future months of every time series have to be
predicted. The MAPE has been used to assess the forecast accuracy. Given
the forecasts $F_t$ and the actual values $X_t$ over a forecast horizon of
$h$ periods, the MAPE is computed as
$\mathrm{MAPE} = \frac{100}{h}\sum_{t=1}^{h}\left|\frac{X_t-F_t}{X_t}\right|$.
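This measure can be computed in R as in the following sketch (the function name mape is ours):

mape <- function(actual, forecast) {
  100 * mean(abs((actual - forecast) / actual))
}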
Given a certain method, its MAPE is computed for the 111 time series and
averaged in order to obtain a global MAPE. This global MAPE appears in
the first row of Table 3 for the different
methods. In the comparison the package GMDH has not been included
because it allows users to forecast at most 5 steps ahead. elm
and
mlp
are functions from the nnfor package for computing extreme
learning machines and multi-layer perceptrons respectively. auto.arima
and ets
are functions belonging to the forecast package that
implement ARIMA and exponential smoothing. When calling the functions we
have specified as few parameters as possible, so that the function
selects automatically or by default the value of the parameters. The
statistical models have achieved the best results. Among the machine
learning approaches our package is the winner.
In the second row of Table 3 the time in seconds needed for fitting the model and generating the forecasts is shown. There are significant differences between the methods, with our package being one of the fastest methods.
Measure | tsDyn | elm | mlp | nnetar | auto.arima | ets | tsfknn
---|---|---|---|---|---|---|---
MAPE | 20.73 | 18.76 | 20.95 | 18.38 | 15.64 | 15.52 | 17.06
Time (s) | 2 | 3332 | 690 | 15 | 421 | 105 | 4
In this section a succinct description of all the functions and methods in the tsfknn package is given. Most of the functions have already been described above. For those functions not mentioned above a brief example of use is given:
- knn_forecasting: given a time series and some metaparameters, this function builds a KNN model for forecasting the time series. It also uses this model to make a prediction of the future values of the time series. Information about the model and the prediction is returned in a "knnForecast" object.
- knn_examples: shows the examples associated with a KNN model.
- nearest_neighbors: shows the new instance used in a prediction and its nearest neighbors.
- plot and autoplot: plot a time series and its forecast.
- rolling_origin: assesses the forecast accuracy of a KNN model.
- print and summary: show information about a model and its prediction.
- predict: generates new predictions for a given KNN model.
- n_training_examples: indicates the number of examples that a KNN model would have for a given time series and some metaparameters.
Now, a quick example of how to use the functions not explained
previously in the paper is given. The methods print
and summary
produce the expected result, i.e., they show some information about the
model and its prediction:
> pred <- knn_forecasting(mdeaths, h = 3)
> print(pred)
Call: knn_forecasting(timeS = mdeaths, h = 3)
Multiple-Step Ahead Strategy: MIMO
K (number of nearest neighbors): 3 models with 3, 5 and 7 neighbors respectively
Autoregressive lags: 1 2 3 4 5 6 7 8 9 10 11 12
Number of examples: 58
Targets are combined using the mean function.
> summary(pred)
Call: knn_forecasting(timeS = mdeaths, h = 3)
Multiple-Step Ahead Strategy: MIMO
K (number of nearest neighbors): 3 models with 3, 5 and 7 neighbors respectively
Autoregressive lags: 1 2 3 4 5 6 7 8 9 10 11 12
Number of examples: 58
Targets are combined using the mean function.
Forecasting horizon: 3
Forecast:
Jan Feb Mar
1980 1990.562 2106.390 1999.143
The method predict
is used to generate new forecasts using a
previously fitted model:
> pred <- knn_forecasting(mdeaths, h = 3, k = 2, msas = "recursive")
> new_pred <- predict(pred, h = 12)
> print(new_pred$prediction)
Jan Feb Mar Apr May Jun Jul Aug Sep Oct
1980 2141.0 2052.0 1894.0 1477.0 1570.5 1216.5 1130.0 1045.5 991.5 1049.5
Nov Dec
1980 1144.5 1520.5
Finally, n_training_examples is a handy function for knowing how many
training examples a model would have:
> n_training_examples(mdeaths, h = 3, lags = 1:12, msas = "MIMO")
[1] 58
There are hardly any packages in R for applying computational intelligence regression methods to time series forecasting. In this paper we have presented the tsfknn package, which allows users to forecast a time series using KNN regression. The interface of the package is quite simple, allowing users to specify a KNN model and to predict a time series with it. Furthermore, several plots can be generated to illustrate how the prediction has been computed.
Funds: This work was partially supported by the project TIN2015-68854-R (FEDER Funds) of the Spanish Ministry of Economy and Competitiveness.
We would like to thank the anonymous reviewers whose comments helped improve and clarify this manuscript.