Causality is an important concept that is widely studied in the literature and has many applications, especially when modelling dependencies within complex data such as multivariate time series. In this article, we present a theoretical description of the methods of the NlinTS package, with a focus on causality measures. The package contains the classical Granger causality test. To handle non-linear time series, we propose an extension of this test using an artificial neural network. The package also includes an implementation of Transfer entropy, which is considered a non-linear causality measure based on information theory. For discrete variables, we use the classical Shannon Transfer entropy, while for continuous variables, we adopt the k-nearest neighbors approach to estimate it.
The study of dependencies between variables is an important step in the analysis of multivariate time series. Not surprisingly, it can be exploited in causal discovery for financial and neuroscience datasets, in feature selection to determine the most important variables as inputs of prediction models, etc. Standard measures like correlation and mutual information are widely used for analyzing relationships between time series. Because these measures are symmetrical, they do not provide enough information concerning the transfer of information over time from one variable to another. Therefore, in cases where we are interested in approximating non-symmetrical dependencies between variables, causality measures are more adequate than correlation measures.
In the literature, two main causality measures have been well investigated in the field of time series analysis: the Granger causality test (Granger 1980) and the Transfer entropy (Schreiber 2000). The Granger causality test is based on the principle that a variable causes another variable if it contains useful information in terms of prediction; consequently, it is mainly linked to the idea of using a prediction model to test the causality. The Transfer entropy, on the other hand, is based on information theory and has gained increasing attention in recent years. It measures the flow of information between variables using the conditional Shannon entropy. Although these two measures seem radically different, an interesting result presented in (Barnett et al. 2009) shows that they are equivalent for variables that follow a normal distribution. In addition, Transfer entropy is considered a non-linear alternative to the Granger causality test, since it does not model the relationships between variables with a statistical model; instead, it is based on information theory.
This article covers a theoretical description of the methods implemented in the NlinTS package (Hmamouche 2020). In particular, we focus on methods and models that are related to causality measures. The package includes the Granger causality test. To deal with non-linear dependencies between time series, we propose a non-linear extension of the Granger causality test using feed-forward neural networks. The package also includes an implementation of Transfer entropy. Two versions are provided: one for discrete variables, and a second that is an estimate for continuous variables based on the k-nearest neighbors approach (Kraskov et al. 2004). We therefore detail the Granger causality test, the proposed non-linear Granger causality test, and the VARNN (Vector Auto-Regressive Neural Network) model, since it is used in the latter. Then, we present the Transfer entropy, including the original formulation and the continuous estimation, starting with the estimators of the entropy and the mutual information, because they are useful to understand the Transfer entropy estimator.
It is worth mentioning that several R packages contain an implementation of the Granger causality test, such as vars (Pfaff 2008) and lmtest (Zeileis and Hothorn 2002). However, for Transfer entropy, especially for the continuous estimation, we found only the RTransferEntropy package (Simon et al. 2019). The approach used there for estimating the Transfer entropy of continuous variables is based on discretization: continuous variables are transformed into discrete ones, and then the Shannon Transfer entropy is applied. In this paper, our approach is based on the same principle proposed in Kraskov et al. (2004) to estimate the mutual information, which inherits from the Kozachenko-Leonenko estimator of the Shannon entropy.
The organization of the paper is as follows: the first two sections present the theoretical formulation of the causality tests and the Transfer entropy measures. The third section provides R code examples of the presented measures, illustrating the usage of the implemented methods. Finally, the last section summarizes this paper.
The Granger causality test (Granger 1980) is the classical method to test the causality between time series. To test if a variable $X_2$ causes a variable $X_1$, it checks whether the past values of $X_2$ contain information that helps to predict $X_1$ beyond what is already contained in the past of $X_1$. Formally, two VAR(p) (Vector Auto-Regressive) models are considered. The first one uses only the precedent values of $X_1$, while the second uses the precedent values of both $X_1$ and $X_2$. To quantify the causality, we have to evaluate the variances of the errors of Model 1 and Model 2: the causality index compares these variances, and an F-test on the coefficients associated with the past of $X_2$ provides the statistical significance of the test.
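For reference, a standard way of writing these two models and the resulting index is the following (the notation is ours and may differ slightly from the original equations of the article):

$$
\text{Model 1:}\quad X_1(t) = \alpha_0 + \sum_{i=1}^{p} \alpha_i X_1(t-i) + e_1(t),
$$
$$
\text{Model 2:}\quad X_1(t) = \beta_0 + \sum_{i=1}^{p} \beta_i X_1(t-i) + \sum_{i=1}^{p} \gamma_i X_2(t-i) + e_2(t),
$$
$$
GCI = \log\left(\frac{\sigma_1^2}{\sigma_2^2}\right),
$$

where $\sigma_1^2$ and $\sigma_2^2$ are the variances of the residuals of Model 1 and Model 2, respectively. The associated F-test evaluates the null hypothesis that the coefficients $\gamma_1, \ldots, \gamma_p$ are jointly zero, i.e., that $X_2$ does not Granger-cause $X_1$.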
Using artificial neural networks (ANNs) can be very useful when computing causalities, especially for time series that change non-linearly over time. We take advantage of the characteristics of ANNs and propose an implementation of an extended version of the Granger causality test using the VARNN model. Before describing the proposed causality test, let us first briefly present the VARNN model, which is also available in the package as a prediction model.
Consider a training dataset that consists of a multivariate time series containing one target variable and, possibly, a set of explanatory variables. The VARNN(p) model is a feed-forward neural network that takes as input the p past values of all the variables and predicts the next value of the target variable.
Consider two variables $X_1$ and $X_2$. To examine whether $X_2$ causes $X_1$, two models are built: the first is a univariate model (ARNN) fed with the past values of $X_1$ only, and the second is a VARNN model fed with the past values of both $X_1$ and $X_2$. The difference compared to the classical test is that, instead of using linear models, the relationship between the target and the past values is learned by feed-forward neural networks, which makes it possible to capture non-linear dependencies. The causality index and the test statistic are then computed from the residual variances of the two models, in the same way as in the classical test.
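As a sketch, and using the same (illustrative) notation as above, the index compares the residual sums of squares of the two neural models:

$$
GCI = \log\left(\frac{RSS_1}{RSS_2}\right),
$$

where $RSS_1$ and $RSS_2$ are the residual sums of squares of the univariate and the bivariate VARNN models, respectively.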
Let us emphasize an important point about this causality measure. Computing causalities using ANNs has the classical drawback of an increased computational cost. However, this is not always a disadvantage. Suppose that we have a large number of time series and we have to compute the causalities between all variables, and that the relationships between variables change over time; in that case, the causalities need to be recalculated periodically or after each change. The basic formulations of the classical causality measures (the Granger causality test and Transfer entropy) are not adaptive: they do not make it possible to update previously computed values when new observations arrive. With ANNs, on the other hand, the first computation of causalities may be slow compared to the Granger test or Transfer entropy, but when new observations become available the model adapts more quickly thanks to the learning properties of ANNs.
Transfer entropy (Schreiber 2000) between two time series $X$ and $Y$ measures the amount of information transferred from $Y$ to $X$. It is defined as the difference between two conditional Shannon entropies: the entropy of the present of $X$ given its own past, and the entropy of the present of $X$ given the past of both $X$ and $Y$.
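In its standard form (our notation), the Transfer entropy from $Y$ to $X$ with lags $p$ and $q$ can be written as:

$$
TE_{Y \rightarrow X} = H\left(X_t \mid X_{t-1}^{(p)}\right) - H\left(X_t \mid X_{t-1}^{(p)}, Y_{t-1}^{(q)}\right),
$$

where $X_{t-1}^{(p)} = (X_{t-1}, \ldots, X_{t-p})$ and $Y_{t-1}^{(q)} = (Y_{t-1}, \ldots, Y_{t-q})$ denote the past vectors of $X$ and $Y$, and $H(\cdot \mid \cdot)$ is the conditional Shannon entropy.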
In this section, we describe the estimation of Transfer entropy based on the k-nearest neighbors. First, we present the entropy estimator as described in Kraskov et al. (2004). We then present the mutual information estimator, which extends the same principle, and finally we use this approach to estimate the Transfer entropy.
The basic approach for estimating the entropy of continuous variables is based on binning the data, in order to get back to the classical definition of Shannon entropy. However, more efficient approaches have been proposed that estimate the continuous (differential) entropy $H(X) = -\int p(x)\log p(x)\,dx$ directly, where $p(x)$ is the probability density of $X$. The main point of the Kozachenko-Leonenko estimator is to approximate $\log p(x)$ locally at each observed point, using the distance from that point to its $k$-th nearest neighbor, and to average these local estimates over all observations.
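The resulting estimator, as given in Kraskov et al. (2004) (reproduced here in our notation), is:

$$
\hat{H}(X) = -\psi(k) + \psi(N) + \log c_d + \frac{d}{N} \sum_{i=1}^{N} \log \epsilon(i),
$$

where $\psi$ is the digamma function, $N$ is the number of observations, $d$ is the dimension of $X$, $c_d$ is the volume of the $d$-dimensional unit ball, and $\epsilon(i)$ is twice the distance from the $i$-th observation to its $k$-th nearest neighbor.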
The mutual information between two variables $X$ and $Y$ can be expressed in terms of individual and joint entropies: $I(X;Y) = H(X) + H(Y) - H(X,Y)$. It is therefore possible to adopt the Kozachenko-Leonenko approach to estimate the mutual information. In this case, we need to estimate the individual entropy of each variable and the joint entropy. The joint entropy can be computed in the same way by considering the joint space spanned by $X$ and $Y$.
In Kraskov et al. (2004), two new methods have been proposed to improve the Kozachenko-Leonenko estimator for mutual information. The first method is based on the idea that when estimating $H(X)$, $H(Y)$, and $H(X,Y)$ separately, different distance scales are effectively used in the marginal and joint spaces, which introduces biases that do not cancel out. Instead, the $k$-th nearest neighbor is searched only in the joint space, and the resulting distance is used to count the neighbors in each marginal space, where these counts replace $k$ in the marginal entropy terms.
Following the same method and generalizing the previous formulation to more than two variables yields a multivariate estimator of the mutual information, where one neighbor count is computed in the marginal space of each variable; both forms are shown below.
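The resulting estimator for two variables, and its generalization to $m$ variables, take the following forms in Kraskov et al. (2004) (reproduced here with our notation):

$$
\hat{I}(X;Y) = \psi(k) + \psi(N) - \left\langle \psi\big(n_x(i)+1\big) + \psi\big(n_y(i)+1\big) \right\rangle,
$$
$$
\hat{I}(X_1;\ldots;X_m) = \psi(k) + (m-1)\,\psi(N) - \left\langle \psi\big(n_{x_1}(i)+1\big) + \cdots + \psi\big(n_{x_m}(i)+1\big) \right\rangle,
$$

where $n_x(i)$ (resp. $n_y(i)$, $n_{x_j}(i)$) is the number of points whose distance to point $i$ in the corresponding marginal space is strictly smaller than half of $\epsilon(i)$, the distance to the $k$-th nearest neighbor of point $i$ in the joint space, and $\langle \cdot \rangle$ denotes the average over all points $i$.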
The motivation behind the second estimator of mutual information presented in Kraskov et al. (2004) is that the Kozachenko-Leonenko estimation of the joint entropy $H(X,Y)$ uses a single length scale for both directions, although the distance to the $k$-th neighbor is generally realized in only one of them. In the second variant, the length scales are therefore determined separately in each marginal space, where the neighbor counts are taken with respect to these direction-wise distances.
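The second estimator then reads (again in our notation, following Kraskov et al. (2004)):

$$
\hat{I}(X;Y) = \psi(k) - \frac{1}{k} + \psi(N) - \left\langle \psi\big(n_x(i)\big) + \psi\big(n_y(i)\big) \right\rangle,
$$

where, in this variant, $n_x(i)$ and $n_y(i)$ count the points falling within distances $\epsilon_x(i)/2$ and $\epsilon_y(i)/2$ respectively, these distances being determined separately in each marginal space by the $k$-th nearest neighbor of point $i$.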
Let us use the first strategy used by Kraskov for mutual information estimation to estimate the Transfer entropy. Let $X$ and $Y$ be two time series, and denote by $X^-$ and $Y^-$ their past vectors (the $p$ and $q$ lagged values, respectively). The Transfer entropy defined above can then be rewritten as a combination of joint and marginal entropies of $X_t$, $X^-$, and $Y^-$. The largest joint space, spanned by $(X_t, X^-, Y^-)$, is the one in which the $k$-th nearest neighbor of each point is searched; the resulting distance is then used to count the neighbors in the lower-dimensional marginal spaces, and each Kozachenko-Leonenko term is evaluated with its corresponding count. By simplifying this expression, the Transfer entropy estimator can be expressed in terms of digamma functions of these neighbor counts.
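A commonly used form of this estimator (our reconstruction, following Vicente et al. (2011); the notation may differ from the original article) is:

$$
\widehat{TE}_{Y \rightarrow X} = \psi(k) + \left\langle \psi\big(n_{X^-}(i)+1\big) - \psi\big(n_{X_t X^-}(i)+1\big) - \psi\big(n_{X^- Y^-}(i)+1\big) \right\rangle,
$$

where $n_{X^-}(i)$, $n_{X_t X^-}(i)$, and $n_{X^- Y^-}(i)$ are the numbers of neighbors of point $i$ in the corresponding marginal spaces, counted within the distance to its $k$-th nearest neighbor in the full joint space spanned by $(X_t, X^-, Y^-)$.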
This is the classical Transfer entropy estimator investigated and discussed in Vicente et al. (2011), Lizier (2014), and Zhu et al. (2015).
The values obtained by the Transfer entropy (TE) are not normalized, and in practice it is hard to interpret the strength of the causality in this case. Normalizing the values of TE between 0 and 1 makes them easier to interpret and to compare across pairs of variables.
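A common form of this normalization (the exact expression used in the package may differ in details) divides TE by the entropy of the target given its own past:

$$
NTE_{Y \rightarrow X} = \frac{TE_{Y \rightarrow X}}{H\left(X_t \mid X_{t-1}^{(p)}\right)},
$$

which lies between 0 and 1 and can be read as the fraction of the uncertainty of $X_t$ (remaining after knowing its own past) that is explained by the past of $Y$.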
In Gourévitch and Eggermont (2007), a preparation step is added to compute NTE in order to handle data that contain noise. It consists of first subtracting from TE the average value obtained when the source variable is shuffled, before applying the normalization.
In the package, we implemented only the first normalization (the one given above), because the second one depends on the way the source variable is shuffled.
Concerning continuous Transfer entropy, the normalizing term is computed analogously using the continuous estimators presented previously.
In this section, we demonstrate worked examples of the usage of the methods implemented in the package and discussed theoretically in the two previous sections. We use financial time series from the package timeSeries (Wuertz et al. 2017). We present the classical Granger causality test, the VARNN prediction model, and the proposed non-linear Granger causality test; these functionalities are provided via Rcpp modules. We also present the functions associated with the Transfer entropy measures, including the discrete and continuous estimates. Since other entropy measures are implemented, we present them as well, namely the entropy and the mutual information.
The causality.test function is implemented as an Rcpp module. The first two arguments of the constructor of this module are two numerical vectors (the goal is to test if the second vector causes the first one). The third argument is the lag parameter, which is an integer value. The last argument is logical (false by default) for the option of making the data stationary (based on the Augmented Dickey-Fuller test) before performing the causality test.
library (timeSeries)
library (NlinTS)
data = LPP2005REC
# Construct the causality model from the second column to the first one,
# with a lag equal to 2, and without taking into account stationarity
model = causality.test (data[,1], data[,2], 2, FALSE)
The causality.test module has a summary method to show all the results of the test, and three attributes: the causality index (gci), the statistic of the test (Ftest), and the p-value of the test (pvalue), i.e., the probability of non-causality.
# Compute the causality index, the Ftest, and the pvalue of the test
model$summary ()
model$gci
model$Ftest
model$pvalue
The varmlp module represents the implementation of the VARNN model. It is an Rcpp module, where the constructor takes as its first argument a numerical dataframe, in which each column represents a variable and the first column is the target variable. Note that the dataframe may contain a single column; in this case, the model will be univariate (an ARNN model). The second argument is the lag parameter, followed by a numerical vector representing the sizes of the hidden layers of the network, and an integer argument for the number of iterations used to train the model. Other arguments with default values are available concerning the bias neuron, the activation functions to use in each layer, the learning rate, and the optimization algorithm. More details about these arguments can be found in the manual of the package (Hmamouche 2020).
library (timeSeries)
library (NlinTS)
# Load the data
data = LPP2005REC
# The lag parameter
lag = 1
# The training set
train_data = data[1:(nrow (data) - 1), ]
# Build and train the model
model = varmlp (train_data, lag, c(10,5), 100)
The varmlp module has a forecast method to compute predictions from an input dataframe (in other words, to test the model), and a train method to update the parameters of the model from new data.
# Predict the last row of the data
predictions = model$forecast (train_data)
# Show the predictions
print (predictions[nrow (predictions),])
# Update the model (at least two observations are required since lag = 1)
model$train (data[(nrow (data) - lag):nrow (data), ])
Similarly to the previous test, nlin_causality.test is an Rcpp module. The first two arguments of the constructor of this module are two numerical vectors (the goal is to test if the second causes the first). The third argument is the lag parameter. The next two arguments are two numerical vectors representing the sizes of the hidden layers of the two models (each model being a varmlp model). Other arguments with default values are available concerning the bias neuron, the activation functions, the learning rate, and the optimization algorithm. The manual of the package contains more details concerning these arguments (Hmamouche 2020). The following is an example of using the non-linear causality test:
library (timeSeries)
library (NlinTS)
data = LPP2005REC
# Build and train the model
model = nlin_causality.test (data[,1], data[,2], 2, c(2), c(4))
The nlin_causality.test module returns the same values as causality.test: a summary method to show all the results of the test, the causality index (gci), the statistic of the test (Ftest), and the p-value of the test (pvalue).
# Compute the causality index, the Ftest, and the pvalue of the test
model$summary ()
model$gci
model$Ftest
model$pvalue
The function entropy_disc computes the Shannon entropy, where the first argument is a discrete vector, and the second argument is the logarithm function to use:
library (NlinTS)
# The entropy of an integer vector
print (entropy_disc (c(3,2,4,4,3)))
The function entropy_cont computes the continuous estimation of the Shannon entropy, where the first argument is a numerical vector, and the second argument is the number of neighbors (see the section on the continuous estimation of the entropy):
library (timeSeries)
library (NlinTS)
# Load data
data = LPP2005REC
# The entropy of the first column with k = 3
print (entropy_cont (data[,1], 3))
The function mi_disc computes the Shannon multivariate mutual information, where the first argument is an integer dataframe, and the second argument is the logarithm function to use:
library (NlinTS)
# Construct an integer dataframe with 2 columns
df = data.frame (c(3,2,4,4,3), c(1,4,4,3,3))
# The mutual information between columns of df
mi = mi_disc (df)
print (mi)
The function mi_cont computes the continuous estimate of the mutual information between two variables. The first two arguments are two vectors, and the third argument is the number of neighbors (see the section on the continuous estimation of the mutual information):
library (timeSeries)
library (NlinTS)
# Load data
data = LPP2005REC
# The mutual information between the first two columns of the data with k = 3
print (mi_cont (data[,1], data[,2], 3))
The function associated with the discrete TE is named te_disc. The first two arguments are two integer vectors. Here we allow the two time series to have different lag parameters; therefore, the next two arguments are the lag parameters associated with the first and the second arguments, respectively. The next argument indicates the logarithm function to use. The te_disc function returns the value of Transfer entropy from the second variable to the first variable:
library (NlinTS)
# The transfer entropy between two integer vectors with lags equal to 1
te = te_disc (c(3,2,4,4,3), c(1,4,4,3,3), 1, 1)
print (te)
The associated function is named te_cont. The first two arguments are two vectors. The next two arguments are the associated lag parameters for the first and the second arguments, respectively. The fifth argument is the number of neighbors. The last argument is logical (false by default) for the option of normalizing the value of TE. The te_cont function returns the value of Transfer entropy from the second variable to the first one:
library (timeSeries)
library (NlinTS)
# Load data
data = LPP2005REC
# The transfer entropy between two columns with lag = 1 and k = 3
te = te_cont (data[,1], data[,2], 1, 1, 3)
print (te)
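To obtain a normalized value instead, the last (logical) argument described above can be set to true; a minimal variant of the previous example (assuming the argument is passed positionally) is:
# The normalized transfer entropy between the same two columns
te_norm = te_cont (data[,1], data[,2], 1, 1, 3, TRUE)
print (te_norm)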
In this paper, we have presented the methods of the NlinTS package for computing causalities in time series. We have considered two main measures well studied in the literature: the Granger causality test and the Transfer entropy. The Transfer entropy was originally formulated for discrete variables; for continuous variables, we adopted a k-nearest neighbors estimation based on the same strategy used to estimate the mutual information in Kraskov et al. (2004). To deal with non-linear time series, we have proposed another causality measure as an extension of the Granger causality test using an artificial neural network. Finally, we showed examples of the usage of these methods.