fnets: An R Package for Network Estimation and Forecasting via Factor-Adjusted VAR Modelling

The package fnets for the R language implements the suite of methodologies proposed by Barigozzi et al. (2022) for network estimation and forecasting of high-dimensional time series under a factor-adjusted vector autoregressive (VAR) model, which permits strong spatial and temporal correlations in the data. Additionally, we provide tools for visualising the networks underlying the time series data after adjusting for the presence of factors. The package also offers data-driven methods for selecting tuning parameters, including the number of factors, the VAR order and the thresholds for estimating the edge sets of the networks of interest. We demonstrate various features of fnets on simulated datasets as well as real data on electricity prices.


Introduction
Vector autoregressive (VAR) models are widely adopted for modelling time series datasets collected in many disciplines including economics (Koop, 2013), finance (Barigozzi and Brownlees, 2019), neuroscience (Kirch et al., 2015) and systems biology (Shojaie and Michailidis, 2010), to name a few. By fitting a VAR model to the data, we can infer dynamic interdependence between the variables and forecast future values. In particular, estimating the non-zero elements of the VAR parameter matrices recovers directed edges between the components of vector time series in a Granger causality network. Besides, by estimating the precision matrix (inverse of the covariance matrix) of the VAR innovations, we can define a network representing their contemporaneous dependencies by means of partial correlations. Finally, the inverse of the long-run covariance matrix of the data simultaneously captures lead-lag and contemporaneous co-movements of the variables. For further discussions on the network interpretation of VAR modelling, we refer to Dahlhaus (2000), Eichler (2007), Billio et al. (2012) and Barigozzi and Brownlees (2019).
Fitting VAR models to the data quickly becomes a high-dimensional problem, as the number of parameters grows quadratically with the dimensionality of the data. There exists a mature literature on ℓ1-regularisation methods for estimating VAR models in high dimensions under suitable sparsity assumptions on the VAR parameters (Basu and Michailidis, 2015; Han et al., 2015; Kock and Callot, 2015; Medeiros and Mendes, 2016; Nicholson et al., 2020; Liu and Zhang, 2021). Consistency of such methods is derived under the assumption that the spectral density matrix of the data has bounded eigenvalues. However, in many applications, the datasets exhibit strong serial and cross-sectional correlations, which leads to the violation of this assumption. As a motivating example, we introduce a dataset of node-specific prices in the PJM (Pennsylvania, New Jersey and Maryland) power pool area in the United States; see Energy price data for further details. Figure 1 demonstrates that the leading eigenvalue of the long-run covariance matrix (i.e. the spectral density matrix at frequency 0) increases linearly as the dimension of the data increases, which implies the presence of latent common factors in the panel data (Forni et al., 2000). Additionally, the left panel of Figure 2 shows the inadequacy of fitting a VAR model to such data under the sparsity assumption via ℓ1-regularisation methods, unless the presence of strong correlations is accounted for by a factor-adjustment step, as in the right panel.
Barigozzi et al. (2022) propose the FNETS methodology for factor-adjusted VAR modelling of high-dimensional, second-order stationary time series. Under their proposed model, the data is decomposed into two latent components such that the factor-driven component accounts for pervasive leading, lagging or contemporaneous co-movements of the variables, while the remaining idiosyncratic dynamic dependence between the variables is modelled by a sparse VAR process. Then, FNETS provides tools for inferring the networks underlying the latent VAR process and for forecasting.
In this paper, we present an R package named fnets which implements the FNETS methodology. It provides a range of user-friendly tools for estimating and visualising the networks representing the interconnectedness of time series variables, and for producing forecasts. In addition, fnets thoroughly addresses the problem of selecting tuning parameters, ranging from the number of factors and the VAR order to the regularisation and thresholding parameters adopted for producing sparse and interpretable networks. As such, a simple call of the main routine of fnets requires the input data only, and it outputs an object of S3 class fnets which is supported by a plot method for network visualisation and a predict method for time series forecasting.
There exist several packages for fitting VAR models and their extensions to high-dimensional time

Figure 2: Granger causal networks defined in (5) obtained from fitting a VAR(1) model to the energy price data analysed in Figure 1, without (left) and with (right) the factor adjustment step outlined in FNETS: Network estimation. Edge weights (proportional to the size of coefficient estimates) are visualised by the width of each edge, and the nodes are coloured according to their groupings; see Data example for further details.
series, see lsvar (Bai, 2021), sparsevar (Vazzoler, 2021), nets (Brownlees, 2020), mgm (Haslbeck and Waldorp, 2020), graphicalVAR (Epskamp et al., 2018), bigVAR (Nicholson et al., 2017), and bigtime (Wilms et al., 2021). There also exist R packages for time series factor modelling, such as dfms (Krantz and Bagdziunas, 2023) and sparseDFM (Mosley et al., 2023), and FAVAR (Bernanke et al., 2005) for Bayesian inference of factor-augmented VAR models. The package fnets is clearly distinguished from, and complements, the above list by handling strong cross-sectional and serial correlations in the data via a factor-adjustment step performed in the frequency domain. In addition, the FNETS methodology operates under the most general approach to high-dimensional time series factor modelling, termed the generalised dynamic factor model (GDFM), first proposed in Forni et al. (2000) and further investigated in Forni et al. (2015). Accordingly, fnets is the first R package to provide tools for high-dimensional panel data analysis under the GDFM, such as fast computation of spectral density and autocovariance matrices via the fast Fourier transform, but it is flexible enough to allow for more restrictive static factor models. While there exist some packages for network-based time series modelling (e.g. GNAR, Knight et al., 2020), we highlight that the goal of fnets is to learn the networks underlying a time series; it does not require a network as an input.

FNETS methodology
In this section, we introduce the factor-adjusted VAR model and describe the FNETS methodology proposed in Barigozzi et al. (2022) for network estimation and forecasting of high-dimensional time series. We limit ourselves to describing the key steps of FNETS and refer to the above paper for its comprehensive treatment, both methodological and theoretical.

Factor-adjusted VAR model
A zero-mean, p-variate process ξ_t follows a VAR(d) model if it satisfies

ξ_t = ∑_{ℓ=1}^{d} A_ℓ ξ_{t−ℓ} + Γ^{1/2} ε_t,  (1)

where the matrices A_ℓ ∈ R^{p×p}, 1 ≤ ℓ ≤ d, determine how future values of the series depend on their past. For the p-variate random vector ε_t = (ε_{1t}, . . ., ε_{pt})′, we assume that the ε_{it} are independently and identically distributed (i.i.d.) for all i and t with IE(ε_{it}) = 0 and Var(ε_{it}) = 1. Then, the positive definite matrix Γ ∈ R^{p×p} is the covariance matrix of the innovations Γ^{1/2} ε_t.
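To make the model concrete, the following base-R sketch (not part of fnets; all names are ours) simulates a stable VAR(1) process with Γ = I:

```r
# Simulate xi_t = A1 xi_{t-1} + e_t with i.i.d. N(0, 1) innovations (Gamma = I);
# the diagonal transition matrix A1 has spectral radius 0.5 < 1, so the process
# is stationary.
set.seed(1)
p <- 5
n <- 300
A1 <- diag(0.5, p)
xi <- matrix(0, n, p)
for (t in 2:n) xi[t, ] <- A1 %*% xi[t - 1, ] + rnorm(p)
```

Under this specification each component is a univariate AR(1) with coefficient 0.5, so its stationary variance is 1/(1 − 0.25) = 4/3.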
In the literature on factor modelling of high-dimensional time series, the factor-driven component exhibits strong cross-sectional and/or serial correlations by 'loading' finite-dimensional vectors of factors linearly. Among many time series factor models, the GDFM (Forni et al., 2000) provides the most general approach, where the p-variate factor-driven component χ_t admits the representation

χ_t = B(L) u_t with B(L) = ∑_{ℓ=0}^{∞} B_ℓ L^ℓ,  (2)

for some fixed q, where L stands for the lag operator. The q-variate random vector u_t = (u_{1t}, . . ., u_{qt})′ contains the common factors which are loaded across the variables and time by the filter B(L), and it is assumed that the u_{jt} are i.i.d. with IE(u_{jt}) = 0 and Var(u_{jt}) = 1. The model (2) reduces to a static factor model (Bai, 2003; Stock and Watson, 2002; Fan et al., 2013) when B(L) = ∑_{ℓ=0}^{s} B_ℓ L^ℓ for some finite integer s ≥ 0. Then, we can write

χ_t = Λ F_t where F_t = (u_t′, . . ., u_{t−s}′)′ and Λ = [B_0, . . ., B_s],  (3)

with r = q(s + 1) as the dimension of the static factors F_t. Throughout, we refer to the models (2) and (3) as unrestricted and restricted, respectively, to highlight that the latter imposes more restrictions on the model. Barigozzi et al. (2022) propose a factor-adjusted VAR model under which we observe a zero-mean, second-order stationary process X_t = (X_{1t}, . . ., X_{pt})′ for t = 1, . . ., n, that permits a decomposition into the sum of the unobserved components ξ_t and χ_t, i.e.

X_t = χ_t + ξ_t.  (4)
We assume that IE(ε_{it} u_{jt′}) = 0 for all i, j, t and t′, as is commonly assumed in the literature, such that IE(ξ_{it} χ_{i′t′}) = 0 for all 1 ≤ i, i′ ≤ p and t, t′ ∈ Z.

Networks
Under (4), it is of interest to infer three types of networks representing the interconnectedness of X_t after factor adjustment. Let V = {1, . . ., p} denote the set of vertices representing the p cross-sections. Then, the VAR parameter matrices A_ℓ = [A_{ℓ,ii′}, 1 ≤ i, i′ ≤ p] encode the directed network N_G = (V, E_G) representing Granger causal linkages, where the set of edges is given by

E_G = {(i, i′) ∈ V × V : A_{ℓ,ii′} ≠ 0 for some 1 ≤ ℓ ≤ d}.  (5)

Here, the presence of an edge (i, i′) ∈ E_G indicates that ξ_{i′,t−ℓ} Granger causes ξ_{it} at some lag 1 ≤ ℓ ≤ d (Dahlhaus, 2000).
The second network, denoted by N_C = (V, E_C), contains undirected edges representing contemporaneous cross-sectional dependence in the VAR innovations Γ^{1/2} ε_t. Writing Δ = Γ^{−1} = [δ_{ii′}, 1 ≤ i, i′ ≤ p], we have (i, i′) ∈ E_C if and only if the partial correlation between the i-th and i′-th elements of Γ^{1/2} ε_t is non-zero, which in turn is given by −δ_{ii′}/(δ_{ii} δ_{i′i′})^{1/2} (Peng et al., 2009). Hence, the set of edges for N_C is given by

E_C = {(i, i′) ∈ V × V : i ≠ i′, δ_{ii′} ≠ 0}.  (6)

Finally, we can summarise the aforementioned lead-lag and contemporaneous relations between the variables in a single, undirected network N_L = (V, E_L) by means of the long-run partial correlations of ξ_t. Let Ω = [ω_{ii′}, 1 ≤ i, i′ ≤ p] denote the inverse of the zero-frequency spectral density (a.k.a. long-run covariance) matrix of ξ_t, which is given by Ω = 2π A′(1) Δ A(1) with A(z) = I − ∑_{ℓ=1}^{d} A_ℓ z^ℓ. Then, the long-run partial correlation between the i-th and i′-th elements of ξ_t is obtained as −ω_{ii′}/(ω_{ii} ω_{i′i′})^{1/2} (Dahlhaus, 2000), so the edge set of N_L is given by

E_L = {(i, i′) ∈ V × V : i ≠ i′, ω_{ii′} ≠ 0}.  (7)
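The mapping from a precision matrix to the partial correlations used in (6) and (7) can be sketched in a few lines of base R (the function name is ours):

```r
# Partial correlations implied by a precision matrix: the (i, i') entry is
# -delta_{ii'} / sqrt(delta_{ii} * delta_{i'i'}); non-zero off-diagonal
# entries correspond to edges of the network.
pcorr <- function(Delta) {
  D <- diag(1 / sqrt(diag(Delta)), nrow = nrow(Delta))
  P <- -D %*% Delta %*% D
  diag(P) <- 1
  P
}
```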

FNETS: Network estimation
We describe the three-step methodology for estimating the networks N_G, N_C and N_L. Throughout, we assume that the number of factors, either q under the more general model in (2) or r under the restricted model in (3), and the VAR order d are known, and discuss their data-driven selection in Tuning parameter selection.
Step 1: Factor adjustment The autocovariance (ACV) matrices of ξ_t, denoted by Γ_ξ(ℓ) = IE(ξ_{t−ℓ} ξ_t′) for ℓ ≥ 0 and Γ_ξ(ℓ) = (Γ_ξ(−ℓ))′ for ℓ < 0, play a key role in network estimation. Since ξ_t is not directly observed, we propose to adjust for the presence of the factor-driven χ_t and estimate Γ_ξ(ℓ). For this, we adopt a frequency domain-based approach and perform dynamic principal component analysis (PCA). The spectral density matrix Σ_x(ω) of a time series {X_t}_{t∈Z} aggregates the information of its ACV Γ_x(ℓ), ℓ ∈ Z, at a specific frequency ω ∈ [−π, π], and is obtained by the Fourier transform

Σ_x(ω) = (2π)^{−1} ∑_{ℓ∈Z} Γ_x(ℓ) exp(−ιℓω).

Denoting the sample ACV matrix of X_t at lag ℓ by Γ̂_x(ℓ) = n^{−1} ∑_{t=ℓ+1}^{n} X_{t−ℓ} X_t′ when ℓ ≥ 0, and Γ̂_x(ℓ) = (Γ̂_x(−ℓ))′ when ℓ < 0, we estimate the spectral density of X_t by

Σ̂_x(ω_k) = (2π)^{−1} ∑_{|ℓ|≤m} K(ℓ/m) Γ̂_x(ℓ) exp(−ιℓω_k),  (8)

where K(·) denotes a kernel, m the kernel bandwidth (for its choice, see Tuning parameter selection) and ω_k = 2πk/(2m + 1) the Fourier frequencies. We adopt the Bartlett kernel as K(·), which ensures the positive semi-definiteness of Σ̂_x(ω) and of the estimator Γ̂_ξ(ℓ) of Γ_ξ(ℓ) obtained as described below.
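As an illustration of the lag-window estimator in (8), the following base-R sketch (ours, not the package implementation) computes the Bartlett-kernel estimate at frequency ω = 0:

```r
# Lag-window spectral density estimate at frequency 0: a Bartlett-weighted sum
# of sample autocovariances, divided by 2 * pi. One common Bartlett convention,
# K(l / m) = 1 - |l| / (m + 1), is used here.
spec.zero <- function(X, m) {
  n <- nrow(X)
  p <- ncol(X)
  Xc <- scale(X, center = TRUE, scale = FALSE)
  S <- matrix(0, p, p)
  for (l in -m:m) {
    w <- 1 - abs(l) / (m + 1)
    G <- if (l >= 0) {
      crossprod(Xc[1:(n - l), , drop = FALSE], Xc[(1 + l):n, , drop = FALSE]) / n
    } else {
      t(crossprod(Xc[1:(n + l), , drop = FALSE], Xc[(1 - l):n, , drop = FALSE]) / n)
    }
    S <- S + w * G
  }
  S / (2 * pi)
}
```

The Bartlett weights guarantee that the resulting matrix is positive semi-definite.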
Performing PCA on Σ̂_x(ω_k) at each ω_k, we obtain the estimator of the spectral density matrix of χ_t as Σ̂_χ(ω_k) = ∑_{j=1}^{q} μ̂_{x,j}(ω_k) ê_{x,j}(ω_k)(ê_{x,j}(ω_k))*, where μ̂_{x,j}(ω_k) denotes the j-th largest eigenvalue of Σ̂_x(ω_k), ê_{x,j}(ω_k) its associated eigenvector, and for any vector a ∈ C^n, we denote its transposed complex conjugate by a*. Then, taking the inverse Fourier transform of Σ̂_χ(ω_k), −m ≤ k ≤ m, leads to an estimator of Γ_χ(ℓ), the ACV matrix of χ_t, as

Γ̂_χ(ℓ) = 2π/(2m + 1) ∑_{k=−m}^{m} Σ̂_χ(ω_k) exp(ιℓω_k).

Finally, we estimate the ACV of ξ_t by

Γ̂_ξ(ℓ) = Γ̂_x(ℓ) − Γ̂_χ(ℓ).  (9)

When we assume the restricted factor model in (3), the factor-adjustment step is simplified as it suffices to perform PCA in the time domain, i.e. an eigenanalysis of the sample covariance matrix Γ̂_x(0). Denoting the eigenvector of Γ̂_x(0) associated with its j-th largest eigenvalue by ê_{x,j}, the estimators of Γ_χ(ℓ) and Γ_ξ(ℓ) are obtained analogously, with the projection onto the r leading eigenvectors in place of dynamic PCA.
Step 2: Estimation of N_G Recall from (5) that N_G, representing Granger causal linkages, has its edge set determined by the VAR transition matrices A_ℓ, 1 ≤ ℓ ≤ d. By the Yule-Walker equation, we have

β = [A_1, . . ., A_d]′ = (G(d))^{−1} g(d), where G(d) = [Γ_ξ(ℓ − ℓ′)]_{1≤ℓ,ℓ′≤d} and g(d) = [(Γ_ξ(1))′, . . ., (Γ_ξ(d))′]′.  (10)

We propose to estimate β by a regularised Yule-Walker estimator based on Ĝ(d) and ĝ(d), each of which is obtained by replacing Γ_ξ(ℓ) with Γ̂_ξ(ℓ) (see (9)) in the definition of G(d) and g(d). We consider two estimators of β. Firstly, we adopt a Lasso-type estimator which solves the ℓ1-regularised M-estimation problem

β̂_las = argmin_β tr(β′ Ĝ(d) β − 2 β′ ĝ(d)) + λ |β|_1  (11)

with a tuning parameter λ > 0. In the implementation, we solve (11) via the fast iterative shrinkage-thresholding algorithm (FISTA, Beck and Teboulle, 2009). Alternatively, we adopt a constrained ℓ1-minimisation approach closely related to the Dantzig selector (DS, Candes and Tao, 2007):

β̂_DS = argmin_β |β|_1 subject to |Ĝ(d) β − ĝ(d)|_∞ ≤ λ  (12)

for some tuning parameter λ > 0. We divide (12) into p sub-problems and obtain each column of β̂_DS via the simplex algorithm (using the function lp in lpSolve). Barigozzi et al. (2022) establish the consistency of both β̂_las and β̂_DS but, as is typically the case for ℓ1-regularisation methods, they do not achieve exact recovery of the support of β. Hence, we propose to estimate the edge set of N_G by thresholding the elements of β̂, where either β̂ = β̂_las or β̂ = β̂_DS, with some threshold t > 0, i.e.

β̂(t) = [β̂_{ij} · I{|β̂_{ij}| > t}].  (13)
We discuss cross-validation and information criterion-based methods for selecting λ, and a data-driven choice of t, in Tuning parameter selection.
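In low dimensions and without regularisation, the Yule-Walker system underlying Step 2 can be solved directly; the following base-R sketch (ours) recovers A_1 from a simulated VAR(1):

```r
# Unregularised Yule-Walker estimation for d = 1: beta = G(1)^{-1} g(1), with
# G(1) the lag-0 and g(1) the lag-1 sample autocovariance of xi, so that
# A1 = t(beta). FNETS instead solves regularised (Lasso or Dantzig selector)
# versions of this system, with factor-adjusted ACV estimates plugged in.
set.seed(2)
p <- 3
n <- 5000
A1 <- matrix(c(0.4, 0.1, 0, 0, 0.4, 0.1, 0, 0, 0.4), p, p)
xi <- matrix(0, n, p)
for (t in 2:n) xi[t, ] <- A1 %*% xi[t - 1, ] + rnorm(p)
G0 <- crossprod(xi) / n                          # Gamma_xi(0)
g1 <- crossprod(xi[1:(n - 1), ], xi[2:n, ]) / n  # Gamma_xi(1)
A1.hat <- t(solve(G0, g1))
```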
Step 3: Estimation of N_C and N_L From the definitions of N_C and N_L given in (6) and (7), their edge sets are obtained by estimating Δ = Γ^{−1} and Ω = 2π A′(1) Δ A(1). Given β̂ = [Â_1, . . ., Â_d]′, an estimator of the VAR parameter matrices obtained as in either (11) or (12), a natural estimator of Γ arises from the Yule-Walker equation:

Γ̂ = Γ̂_ξ(0) − β̂′ ĝ(d).  (14)

In high dimensions, it is not feasible or recommended to directly invert Γ̂ to estimate Δ. Therefore, we adopt a constrained ℓ1-minimisation method motivated by the CLIME methodology of Cai et al. (2011).
Specifically, the CLIME estimator of Δ is obtained by first solving

Δ̌ = argmin_{M ∈ R^{p×p}} |M|_1 subject to |Γ̂ M − I|_∞ ≤ η  (15)

for some tuning parameter η > 0, and then applying a symmetrisation step to Δ̌ = [δ̌_{ii′}, 1 ≤ i, i′ ≤ p], whereby each pair of entries (δ̌_{ii′}, δ̌_{i′i}) is replaced by the one with the smaller absolute value. Cai et al. (2016) propose ACLIME, which improves the CLIME estimator by selecting the parameter η in (15) adaptively. It first produces the estimators of the diagonal entries δ_{ii}, 1 ≤ i ≤ p, as in (15) with η_1 = 2√(log(p)/n) as the tuning parameter. Then, these estimates are used for adaptive tuning parameter selection in the second step. We provide the full description of the ACLIME estimator, along with the details of its implementation, in ACLIME estimator of the Appendix.
Given the estimators Â(1) = I − ∑_{ℓ=1}^{d} Â_ℓ and Δ̂, we estimate Ω by Ω̂ = 2π Â′(1) Δ̂ Â(1). In Barigozzi et al. (2022), Δ̂ and Ω̂ are shown to be consistent in ℓ∞- and ℓ1-norms under suitable sparsity assumptions. However, an additional thresholding step as in (13) is required to guarantee consistency in estimating the supports of Δ and Ω and, consequently, the edge sets of N_C and N_L. We discuss the data-driven selection of these thresholds and η in Tuning parameter selection.

FNETS: Forecasting
Following the estimation procedure, FNETS performs forecasting by estimating the best linear predictor of X_{n+a} given X_t, t ≤ n, for a fixed integer a ≥ 1. This is achieved by separately producing the best linear predictors of χ_{n+a} and ξ_{n+a} as described below, and then combining them.

Forecasting the factor-driven component
For given a ≥ 0, the best linear predictor of χ_{n+a} given X_t, t ≤ n, under (2) is

χ_{n+a|n} = ∑_{ℓ=0}^{∞} B_{ℓ+a} u_{n−ℓ}.

Forni et al. (2015) show that the model (2) admits a low-rank VAR representation with u_t as the innovations under mild conditions, and Forni et al. (2017) propose estimators of B_ℓ and u_t based on this representation which make use of the estimators of the ACV of χ_t obtained as described in Step 1. Then, a natural estimator of χ_{n+a|n} is

χ̂^unr_{n+a|n} = ∑_{ℓ=0}^{K} B̂_{ℓ+a} û_{n−ℓ}  (16)

for some truncation lag K. We refer to χ̂^unr_{n+a|n} as the unrestricted estimator of χ_{n+a|n}, as it is obtained without imposing any restrictions on the factor model (2).
When χ_t admits the static representation in (3), we can show that χ_{n+a|n} = Γ_χ(−a) E_χ M_χ^{−1} E_χ′ X_n, where M_χ ∈ R^{r×r} is a diagonal matrix with the r leading eigenvalues of Γ_χ(0) on its diagonal and E_χ ∈ R^{p×r} the matrix of the corresponding eigenvectors; see Section 4.1 of Barigozzi et al. (2022) and also Forni et al. (2005). This suggests the estimator

χ̂^res_{n+a|n} = Γ̂_χ(−a) Ê_χ M̂_χ^{−1} Ê_χ′ X_n,  (17)

where M̂_χ and Ê_χ are obtained from the eigendecomposition of Γ̂_χ(0). We refer to χ̂^res_{n+a|n} as the restricted estimator of χ_{n+a|n}. As a by-product, we obtain the in-sample estimators χ̂_{t|n} of χ_t, t ≤ n, with either of the two estimators in (16) and (17).
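For intuition, the in-sample version of the restricted estimator (the case a = 0) amounts to projecting the data onto the r leading principal components. A base-R sketch (ours, not the package implementation):

```r
# PCA-based in-sample estimate of the common component under the restricted
# model: project the (centred) data onto the r leading eigenvectors of the
# sample covariance matrix. A simplified analogue of the restricted estimator
# with a = 0.
chi.pca <- function(X, r) {
  Xc <- scale(X, center = TRUE, scale = FALSE)
  E <- eigen(crossprod(Xc) / nrow(Xc), symmetric = TRUE)$vectors[, 1:r, drop = FALSE]
  Xc %*% E %*% t(E)
}
```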

Forecasting the latent VAR process
Once the VAR parameters are estimated either as in (11) or (12), we produce an estimator of ξ_{n+a|n} = ∑_{ℓ=1}^{d} A_ℓ ξ_{n+a−ℓ|n} (with ξ_{t|n} = ξ_t for t ≤ n), the best linear predictor of ξ_{n+a} given X_t, t ≤ n, as

ξ̂_{n+a|n} = ∑_{ℓ=1}^{d} Â_ℓ ξ̂_{n+a−ℓ|n}.  (18)

Here, ξ̂_{n+1−ℓ|n} = ξ̂_{n+1−ℓ} = X_{n+1−ℓ} − χ̂_{n+1−ℓ} denotes the in-sample estimator of ξ_{n+1−ℓ}, which may be obtained with either of the two (in-sample) estimators of the factor-driven component in (16) and (17).
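For a VAR(1), the resulting one-step-ahead forecast is simply Â_1 applied to the last in-sample estimate of ξ; a minimal sketch (names ours):

```r
# One-step-ahead VAR(1) forecast: xi.hat_{n+1|n} = A1.hat %*% xi.hat_n,
# where the rows of xi are the in-sample estimates xi.hat_t.
# For d > 1, the analogous sum over lags 1, ..., d is taken.
var1.forecast <- function(A1.hat, xi) {
  as.vector(A1.hat %*% xi[nrow(xi), ])
}
```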

Tuning parameter selection
Factor numbers q and r The estimation and forecasting tools of the FNETS methodology require the selection of the number of factors, i.e. q under the unrestricted factor model in (2), and r under the restricted, static factor model in (3). Under (2), there exists a large gap between the q leading eigenvalues of the spectral density matrix of X_t, which diverge with p, and the remainder (see also Figure 1). We provide two methods for selecting the factor number q which make use of this postulated eigengap, using μ̂_{x,j}(ω_k), 1 ≤ j ≤ p, the eigenvalues of the spectral density estimator of X_t in (8) at a given Fourier frequency ω_k, −m ≤ k ≤ m. Hallin and Liška (2007) propose an information criterion for selecting the number of factors under the model (2) and, further, a methodology for tuning the multiplicative constant in the penalty. Define

IC(b, c) = log( p^{−1} ∑_{j=b+1}^{p} (2m + 1)^{−1} ∑_{k=−m}^{m} μ̂_{x,j}(ω_k) ) + b · c · pen(n, p),  (19)

where pen(n, p) = min(p, m^2, √n/m)^{−1/2} by default (for other choices of the information criterion, see Appendix A) and c > 0 is a constant. Provided that pen(n, p) → 0 sufficiently slowly, for an arbitrary value of c, the factor number q is consistently estimated by the minimiser of IC(b, c) over b ∈ {0, . . ., q̄}, with some fixed q̄ as the maximum allowable number of factors. However, this is not the case in finite samples, and Hallin and Liška (2007) propose to simultaneously select q and c. First, we identify q̂(n_l, p_l, c) = argmin_{0≤b≤q̄} IC(n_l, p_l, b, c), where IC(n_l, p_l, b, c) is constructed analogously to IC(b, c) except that it only involves the sub-sample {X_{it}, 1 ≤ i ≤ p_l, 1 ≤ t ≤ n_l}. Then, denoting the sample variance of q̂(n_l, p_l, c), 1 ≤ l ≤ L, by S(c), we select q̂ = q̂(n, p, c) with c corresponding to the second interval of stability with S(c) = 0 for the mapping c → S(c), as c increases from 0 to some c_max (the first stable interval is where q̄ is selected with a very small value of c).
Figure 3 plots q̂(n, p, c) and S(c) for varying values of c, obtained from a dataset simulated as in Data generation. In the implementation of this methodology, we set n_l = n − (L − l)⌊n/20⌋ and p_l = ⌊3p/4⌋ + ⌊lp/40⌋ with L = 10, and q̄ = min(50, ⌊√min(n − 1, p)⌋).

Figure 3: Plots of c against q̂(n, p, c) (in circles, y-axis on the left) and S(c) (in triangles, y-axis on the right) with the six ICs (see Appendix A) implemented in the function factor.number of fnets, on a dataset simulated as in Data generation (with n = 500, p = 50 and q = 2). With the default choice of the IC in (19) (IC_5), we obtain q̂ = q̂(n, p, c) = 2, correctly estimating q = 2.
Alternatively, we can adopt the ratio-based estimator q̂ = argmin_{1≤b≤q̄} ER(b) proposed in Avarucci et al. (2022), where

ER(b) = μ̄_{x,b+1} / μ̄_{x,b} with μ̄_{x,j} = (2m + 1)^{−1} ∑_{k=−m}^{m} μ̂_{x,j}(ω_k).  (20)

These methods are readily modified to select the number of factors r under the restricted factor model in (3), by replacing (2m + 1)^{−1} ∑_{k=−m}^{m} μ̂_{x,j}(ω_k) with μ̂_{x,j}, the j-th largest eigenvalue of the sample covariance matrix Γ̂_x(0). We refer to Bai and Ng (2002) and Alessi et al. (2010) for discussions of the information criterion-based method in this setting, and to Ahn and Horenstein (2013) for the eigenvalue ratio-based method.
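The time-domain (restricted-model) analogue of the eigenvalue ratio estimator can be sketched in base R as follows (function and variable names are ours):

```r
# Eigenvalue ratio-based factor number estimate from the sample covariance:
# ER(b) = mu_{b+1} / mu_b is minimised over b = 1, ..., q.max.
er.number <- function(X, q.max = 10) {
  mu <- eigen(cov(X), symmetric = TRUE, only.values = TRUE)$values
  which.min(mu[2:(q.max + 1)] / mu[1:q.max])
}

# Toy data with r = 2 strong factors:
set.seed(3)
n <- 500
p <- 30
F.mat <- matrix(rnorm(n * 2), n, 2)
Lam <- matrix(rnorm(p * 2, sd = 2), p, 2)
X <- F.mat %*% t(Lam) + matrix(rnorm(n * p), n, p)
```

With loadings this strong, the two leading eigenvalues dwarf the rest, so the ratio drops sharply at b = 2.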

Threshold t
Motivated by Liu et al. (2021), we propose a method for the data-driven selection of the threshold t, which is applied to the estimators of A_ℓ, 1 ≤ ℓ ≤ d, Δ or Ω for estimating the edge sets of N_G, N_C or N_L, respectively; see also (13).
Let B = [b_{ij}] ∈ R^{m×n} denote a matrix for which a threshold is to be selected, i.e. B may be either β̂ = [Â_1, . . ., Â_d]′, Δ̂_0 (Δ̂ with diagonals set to zero) or Ω̂_0 (Ω̂ with diagonals set to zero) obtained from Steps 2 and 3 of FNETS. We work with Δ̂_0 and Ω̂_0 since we do not threshold the diagonal entries of Δ̂ and Ω̂. As such estimators have been shown to achieve consistency in ℓ∞-norm, we expect there to exist a large gap between the entries of B corresponding to true positives and those corresponding to false positives. Further, it is expected that the number of edges reduces at a faster rate when increasing the threshold from 0 towards this (unknown) gap, compared to when increasing the threshold from the gap to |B|_∞. Therefore, we propose to identify this gap by casting the problem as that of locating a single change point in the trend of the ratio of edges to non-edges,

Ratio_k = #{(i, j) : |b_{ij}| > t_k} / (N − #{(i, j) : |b_{ij}| > t_k}), 1 ≤ k ≤ M,

where {t_k}_{k=1}^{M} denotes a sequence of candidate threshold values. We recommend using an exponentially growing sequence for {t_k}_{k=1}^{M}, since the size of the false positive entries tends to be very small. The quantity N in the denominator of Ratio_k is set as N = p^2 d when B = β̂, and N = p(p − 1) when B = Δ̂_0 or Ω̂_0. Then, we compute the cumulative sum (CUSUM) statistic from the sequence {Ratio_k}_{k=1}^{M} and select the threshold t = t_k at which the single change point in the trend of Ratio_k is detected, i.e. where the CUSUM statistic is maximised.
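A simplified version of this procedure can be sketched in base R (all names are ours; the package's threshold function implements the full method):

```r
# Ratio-based threshold selection: evaluate Ratio_k on an exponentially
# growing grid of candidate thresholds and locate a single change point in
# the sequence via a standard CUSUM statistic.
select.thr <- function(B, M = 100) {
  a <- abs(B[B != 0])
  tt <- exp(seq(log(0.5 * min(a)), log(max(a)), length.out = M))  # candidates
  N <- length(B)
  cnt <- sapply(tt, function(t) sum(abs(B) > t))
  ratio <- cnt / pmax(N - cnt, 1)                 # edges vs non-edges
  k <- 1:(M - 1)
  cusum <- abs(cumsum(ratio)[k] - k * mean(ratio)) / sqrt(k * (M - k) / M)
  tt[min(which.max(cusum) + 1, M)]
}
```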

VAR order d, λ and η
Step 2 and Step 3 of the network estimation methodology of FNETS involve the selection of the tuning parameters λ and η (see (11), (12) and (15)) and the VAR order d. While there exists a variety of methods for VAR order selection in fixed dimensions (Lütkepohl, 2005, Chapter 4), the data-driven selection of d in high dimensions remains largely unaddressed, with a few exceptions (Nicholson et al., 2020; Krampe and Margaritella, 2021; Zheng, 2022). We suggest two methods for jointly selecting λ and d for Step 2. The first method is also applicable for selecting η in Step 3.

Cross validation
Cross-validation (CV) methods are widely adopted for tuning parameter and model selection.
While some works exist which justify the use of the conventional CV procedure in time series settings in the absence of model mis-specification (Bergmeir et al., 2018), such arguments do not apply to our problem due to the latency of the component time series. Instead, we propose to adopt a modified CV procedure that bears resemblance to out-of-sample evaluation or rolling forecast validation (Wang and Tsay, 2021), for simultaneously selecting d and λ in Step 2. For this, the data is partitioned into L folds, each of which is split into a training set and a test set; for each candidate pair (λ, b), with b ranging over 1, . . ., d̄ where d̄ ≥ 1 is a pre-determined upper bound on the VAR order, the VAR parameters estimated from the training sets are evaluated on the test sets via a Yule-Walker-based measure of the forecast error. A similar approach is taken for the selection of η, with a Burg matrix divergence-based CV measure evaluating the divergence between Δ̂^train_l(η) and Γ̂^test_l. Here, Δ̂^train_l(η) denotes the estimator of Δ with η as the tuning parameter from {X_t, t ∈ I^train_l}, and Γ̂^test_l the estimator of Γ from {X_t, t ∈ I^test_l}; see Step 3 for descriptions of these estimators. In the numerical results reported in Simulations, the sample size is relatively small (ranging between n = 200 and n = 500 while p ∈ {50, 100, 200}, with the number of parameters increasing with p^2), and we set L = 1, which returns reasonably good performance. When a larger number of observations is available relative to the dimensionality, we may use a number of folds greater than one.

Extended Bayesian information criterion
Alternatively, to select the pair (λ, d) in Step 2, we propose to use the extended Bayesian information criterion (eBIC) of Chen and Chen (2008), originally proposed for variable selection in high-dimensional linear regression. Let β̂(λ, b, t_ada) denote the thresholded version of β̂(λ, b) as in (13), with the threshold t_ada chosen as described in Threshold t. Then, letting s(λ, b) = |β̂(λ, b, t_ada)|_0, we define the criterion eBIC_α(λ, b), which combines a Gaussian likelihood-based measure of the goodness of fit of the thresholded estimator with a penalty that increases with the number of non-zero entries s(λ, b). Then, we select (λ̂, d̂) = argmin_{λ∈Λ, 1≤b≤d̄} eBIC_α(λ, b). The constant α ∈ (0, 1) determines the degree of penalisation, which may be chosen based on the relationship between n and p. Preliminary simulations suggest that α = 0 is a suitable choice for the dimensions (n, p) considered in our numerical studies.

Other tuning parameters
Motivated by the theoretical results reported in Barigozzi et al. (2022), we select the kernel bandwidth for Step 1 of FNETS as m = ⌊4(n/log(n))^{1/3}⌋. In forecasting the factor-driven component as in (16), we set the truncation lag at K = 20, as the coefficient matrices B_ℓ are expected to decay rapidly as ℓ increases for short-memory processes.

Package overview
fnets is available from the Comprehensive R Archive Network (CRAN). The main function, fnets, implements the FNETS methodology for the input data and returns an object of S3 class fnets. fnets.var implements Step 2 of the FNETS methodology, estimating the VAR parameters only, and is applicable directly for VAR modelling of high-dimensional time series; its outputs are of class fnets. fnets.factor.model performs factor modelling under either of the two models (2) and (3), and returns an object of class fm. We provide predict methods for objects of classes fnets and fm, and a plot method for objects of class fnets. We recommend that the input time series be transformed to stationarity (if necessary) after a unit root test. In this section, we demonstrate how to use the functions included in the package.

Data generation
For illustration, we generate an example dataset with n = 500 and p = 50 following the model (4); fnets provides functions for this purpose. For given n and p, the function sim.var generates the VAR(1) process following (1) with d = 1, Γ as supplied to the function (Γ = I by default), and A_1 generated as described in Simulations. The function sim.unrestricted generates the factor-driven component under the unrestricted factor model in (2), with q dynamic factors (q = 2 by default) and the filter B(L) generated as in model (C1) of Simulations.
set.seed(111)
n <- 500
p <- 50
x <- sim.var(n, p)$data + sim.unrestricted(n, p)$data

Throughout this section, we use the thus-generated dataset in demonstrating fnets unless specified otherwise. There also exists sim.restricted, which generates the factor-driven component under the restricted factor model in (3). For all data generation functions, the default is to use the standard normal distribution for generating u_t and ε_t, while if the argument heavy = TRUE is supplied, the innovations are generated from √(3/5) · t_5, the t-distribution with 5 degrees of freedom scaled to have unit variance. The package also comes attached with the pre-generated datasets data.restricted and data.unrestricted.

Calling fnets with default parameters
The function fnets can be called with the n × p data matrix x as the only input, which sets all other arguments to their default choices. Then, it performs the factor adjustment under the unrestricted model in (2) with q estimated by minimising the IC in (19). The VAR parameter matrix is estimated via the Lasso estimator in (11) with d = 1 as the VAR order and the tuning parameters λ and η chosen via CV, and no thresholding is performed. This returns an object of class fnets whose entries are described in Table 1, and which is supported by a print method.

Calling fnets with optional parameters
We can also specify the arguments of fnets to control how Steps 1-3 of FNETS are to be performed. The full model call is as follows:

out <- fnets(x, center = TRUE, fm.restricted = FALSE, q = c("ic", "er"),
  ic.op = NULL, kern.bw = NULL,
  common.args = list(factor.var.order = NULL, max.var.order = NULL,
    trunc.lags = 20, n.perm = 10),
  var.order = 1, var.method = c("lasso", "ds"),
  var.args = list(n.iter = NULL,
    n.cores = min(parallel::detectCores() - 1, 3)),
  do.threshold = FALSE, do.lrpc = TRUE, lrpc.adaptive = FALSE,
  tuning.args = list(tuning = c("cv", "bic"), n.folds = 1,
    penalty = NULL, path.length = 10))

Here, we discuss a selection of the input arguments. The center argument determines whether to de-mean the input. fm.restricted determines whether to perform the factor adjustment under the restricted factor model in (3) or not. If the number of factors is known, we can specify q with a non-negative integer. Otherwise, it can be set as "ic" or "er", which selects the factor number estimator to be used between (19) and (20). When q = "ic", setting the argument ic.op as an integer between 1 and 6 specifies the choice of the IC (see Appendix A), where the default is ic.op = 5. kern.bw takes a positive integer which specifies the bandwidth to be used in Step 1 of FNETS. The list common.args specifies the arguments for estimating B_ℓ and u_t under (2), and relates to the low-rank VAR representation of χ_t under the unrestricted factor model.
var.order specifies a vector of positive integers to be considered in VAR order selection. var.method determines the method for VAR parameter estimation, which can be either "lasso" (for the estimator in (11)) or "ds" (for that in (12)). The list var.args takes additional parameters for Step 2 of FNETS, such as the number of gradient descent steps (n.iter, when var.method = "lasso") or the number of cores to use for parallel computing (n.cores, when var.method = "ds"). do.threshold selects whether to threshold the estimators of A_ℓ, 1 ≤ ℓ ≤ d, Δ and Ω. It is possible to perform Steps 1-2 of FNETS only, without estimating Δ and Ω, by setting do.lrpc = FALSE. If do.lrpc = TRUE, lrpc.adaptive specifies whether to use the non-adaptive estimator in (15) or the ACLIME estimator. The list tuning.args supplies arguments to the CV or eBIC procedures, including the number of folds L (n.folds), the eBIC parameter α (penalty, see (21)) and the length of the grid of values for λ and/or η (path.length). Finally, it is possible to set only a subset of the arguments of common.args, var.args and tuning.args, whereby the unspecified arguments are set to their default values.
The factor adjustment (Step 1) and VAR parameter estimation (Step 2) functionalities can be accessed individually by calling fnets.factor.model and fnets.var, respectively. The latter is equivalent to calling fnets with q = 0 and do.lrpc = FALSE. The former returns an object of class fm which contains the entries of the fnets object in Table 1 that relate to the factor-driven component only.

Network visualisation
Using the plot method available for objects of class fnets, we can visualise the Granger network N_G induced by the estimated VAR parameter matrices; see the left panel of Figure 5.

plot(out, type = "granger", display = "network")

With display = "network", it plots the network as an igraph object from igraph (Csardi et al., 2006). Setting the argument type to "pc" or "lrpc", we can instead visualise N_C, given by the partial correlations of the VAR innovations, or N_L, given by the long-run partial correlations of ξ_t. We can also visualise the networks as a heat map, with the edge weights colour-coded, by setting display = "heatmap". We plot N_L as a heat map in the right panel of Figure 5 using the following command.

plot(out, type = "lrpc", display = "heatmap")

[Figure 5 panels: Granger causal network (left); long-run partial correlation heat map (right).]
Extracting the underlying igraph object g and plotting it produces a plot identical to the left panel of Figure 5.
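Such an extraction can be sketched as follows, assuming the package exposes the igraph object via a network() accessor with a $network entry (an assumption on our part; only the plot interface is documented above). The object out is a fitted fnets object.

```r
# Hypothetical extraction of the igraph object behind the Granger network;
# the igraph package provides the plot method and layouts.
library(igraph)
g <- network(out, type = "granger")$network  # assumed accessor and entry name
plot(g, layout = layout_in_circle(g))        # circular layout, as in Figure 5
```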

Forecasting
The fnets objects are supported by the predict method, with which we can forecast the input data n.ahead steps ahead. For example, we can produce a one-step ahead forecast of X_{n+1} as
pr <- predict(out, n.ahead = 1, fc.restricted = TRUE)
pr$forecast
The argument fc.restricted specifies whether to use the estimator χ̂_{n+h|n}^res in (17), generated under a restricted factor model (3), or χ̂_{n+h|n}^unr in (16), generated without such a restriction. Table 2 lists the entries of the output from predict.fnets. We can similarly produce forecasts from fnets objects output from fnets.var, or fm objects from fnets.factor.model.

Factor number estimation
It is of independent interest to estimate the number of factors (if any) in the input dataset. The function factor.number provides access to the two methods for selecting q described in Factor numbers q and r.
The following code calls the information criterion-based factor number estimation method in (19):
fn <- factor.number(x, method = "ic")
Calling plot(fn) returns Figure 3, which visualises the factor number estimators from the six information criteria implemented. Alternatively, we call the eigenvalue ratio-based method in (20) as
fn <- factor.number(x, method = "er", fm.restricted = FALSE)
In this case, plot(fn) produces a plot of ER(b) against the candidate factor number b ∈ {1, . . ., q}.

Visualisation of tuning parameter selection procedures
The method for threshold selection discussed in Threshold t is implemented by the threshold function, which returns objects of class threshold, supported by print and plot methods.
th <- threshold(out$idio.var$beta)
th
Thresholded matrix
Threshold: 0.0297308643
Non-zero entries: 62/2500
The call plot(th) generates Figure 4. Additionally, we provide tools for visualising the results of the tuning parameter selection procedures adopted in Steps 2 and 3 of FNETS (see VAR order d, λ and η). These tools are accessible from both fnets and fnets.var by calling the plot method with the argument display = "tuning", e.g.
set.seed(111)
n <- 500
p <- 10
x <- sim.var(n, p)$data
out1 <- fnets(x, q = 0, var.order = 1:3, tuning.args = list(tuning = "cv"))
plot(out1, display = "tuning")
This generates the two plots reported in Figure 6, which visualise the CV errors computed as described in Cross validation; in particular, the left plot shows that the VAR order is correctly selected by this approach. When tuning.args contains tuning = "bic", the results from the eBIC method described in Extended Bayesian information criterion, adopted in Step 2, are similarly visualised in place of the left panel of Figure 6.
Simulations

Barigozzi et al. (2022) provide comprehensive simulation results on the estimation and forecasting performance of FNETS in comparison with competing methodologies. Therefore, in this paper we focus on assessing the performance of the methods for selecting tuning parameters, such as the threshold and the VAR order, discussed in Tuning parameter selection. Additionally, in Appendix B, we compare the adaptive and non-adaptive estimators of ∆ and investigate how their performance carries over to the estimation of Ω.

Settings
We consider the following data generating processes for the factor-driven component χ_t: (C1) Taken from Forni et al. (2017), χ_it is generated as a sum of AR processes driven by common shocks, with the AR coefficients drawn from a uniform distribution. Then, χ_t does not admit a static representation as in (3).
(C2) χ_t = 0, i.e. the VAR process is directly observed as X_t = ξ_t. For generating a VAR(d) process ξ_t, we first generate a directed Erdős-Rényi random graph N = (V, E) on V = {1, . . ., p} with the link probability 1/p, and set the entries of A_d such that A_{d,ii′} = 0.275 when (i, i′) ∈ E and A_{d,ii′} = 0 otherwise. Also, we set A_ℓ = O for ℓ < d. The VAR innovations are generated as below.
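The construction of A_d in (C2) can be sketched in base R as follows (illustrative only; the package's own sim.var generator, used earlier, implements the actual simulation):

```r
# (C2) sketch: a directed Erdos-Renyi graph with link probability 1/p
# determines the support of the lag-d coefficient matrix A_d; all
# lower-order lag matrices are set to zero.
set.seed(1)
p <- 20
E <- matrix(rbinom(p^2, 1, 1 / p), p, p)  # edge indicators of the random graph
A.d <- 0.275 * E                          # A_d[i, i'] = 0.275 iff (i, i') is an edge
```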

(E2) Gaussian with the covariance matrix
For each setting, we generate 100 realisations.
Then we estimate Ω using the thresholded Lasso estimator of A_1 (see (11) and (13)) with two choices of thresholds, t = t_ada generated as described in Threshold t, and t = 0. To assess the performance of Ω̂ = [ω̂_{ii′}] in recovering the support of Ω = [ω_{ii′}], i.e. {(i, i′) : ω_{ii′} ≠ 0}, we plot the receiver operating characteristic (ROC) curves of the true positive rate (TPR, the proportion of non-zero entries of Ω identified as non-zero in Ω̂) against the false positive rate (FPR, the proportion of zero entries of Ω incorrectly identified as non-zero in Ω̂). Figure 7 plots the ROC curves averaged over 100 realisations when t = t_ada and t = 0. When ∆ = I under (E1), we see little improvement from adopting t_ada, as the support recovery performance is already good even without thresholding. However, when ∆ ≠ I under (E2), the adaptive threshold leads to improved support recovery, especially when the sample size is large. Tables 3 and 4 in Appendix C additionally report the errors in estimating A_1 and Ω with and without thresholding, where we see that little change is brought by thresholding. In summary, we conclude that the estimators already perform reasonably well without thresholding, and that the adaptive threshold t_ada brings marginal improvement in support recovery, which is of interest in network estimation.
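The TPR and FPR above can be computed as follows (a minimal sketch; the function name is ours):

```r
# Compare the support of an estimate Omega.hat against the truth Omega:
# TPR = correctly recovered non-zeros / true non-zeros,
# FPR = spurious non-zeros / true zeros.
support.rates <- function(Omega.hat, Omega) {
  est <- Omega.hat != 0
  tru <- Omega != 0
  c(TPR = sum(est & tru) / sum(tru),
    FPR = sum(est & !tru) / sum(!tru))
}
```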

Results: VAR order selection
We compare the performance of the CV and eBIC methods proposed in VAR order d, λ and η for selecting the order of the VAR process. Here, we consider the case when χ_t = 0 (setting (C2)) and when ξ_t is generated under (E1) with d ∈ {1, 3}. We set (n, p) ∈ {(200, 10), (200, 20), (500, 10), (500, 20)}, where the range of p is in line with the simulation studies conducted in the relevant literature (see e.g. Zheng (2022)). We consider {1, 2, 3, 4} as the candidate VAR orders. Figure 8 and Table 5 in Appendix C show that CV works reasonably well regardless of d ∈ {1, 3}, with slightly better performance observed together with the DS estimator. On the other hand, eBIC tends to over-estimate the VAR order when d = 1 while under-estimating it when d = 3, and hence is less reliable than the CV method.

Energy price data
Electricity is more difficult to store than physical commodities, which results in high volatility and seasonality in spot prices (Han et al., 2022). Global market deregulation has increased the volume of electricity trading, which promotes the development of better forecasting and risk management methods. We analyse a dataset of node-specific prices in the PJM (Pennsylvania, New Jersey and Maryland) power pool area in the United States, accessed via dataminer2.pjm.com. There are four node types in the panel: Zone, Aggregate, Hub and Extra High Voltage (EHV); for their definitions, see Table 8, and for the names and types of the p = 50 nodes, see Table 9, both found in Appendix D. The series we model is the sum of the real-time congestion price and the marginal loss price or, equivalently, the difference between the spot price at a given location and the overall system price, where the latter can be thought of as an observed factor in the local spot price. These are obtained as hourly prices and then averaged over each day, as per Maciejowska and Weron (2013). We remove any short-term seasonality by subtracting a separate mean for each day of the week. Since the energy prices may take negative values, we adopt the inverse hyperbolic sine transformation, as in Uniejewski et al. (2017), for variance stabilisation.
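The demeaning and transformation steps above can be sketched in base R as follows (illustrative; the object prices stands in for the daily price panel):

```r
# Day-of-week demeaning followed by the inverse hyperbolic sine (asinh)
# transform for variance stabilisation, applied column by column.
prices <- matrix(rnorm(200 * 3), 200, 3)   # placeholder for the n x p daily panel
day <- factor(rep(1:7, length.out = nrow(prices)))
demeaned <- apply(prices, 2, function(z) z - ave(z, day))  # subtract day-of-week means
x <- asinh(demeaned)                        # asinh(z) = log(z + sqrt(z^2 + 1))
```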

Network estimation
We analyse the data collected between 01/01/2021 and 19/07/2021 (n = 200). The information criterion in (19) returns a single factor (q̂ = 1), and d̂ = 1 is selected by CV. See Figure 9 for the heat maps visualising the three networks N_G, N_C and N_L described in Networks, which are produced by fnets. The non-zero entries of the VAR parameter matrix estimates tend to take positive values, indicating that high energy prices are persistent and spill over to other nodes. Considering the node types, Hub-type nodes (blue) tend to have out-going edges to nodes of different types, which reflects the behaviour of the electrical transmission system. Some Zone-type nodes (red) have several in-coming edges from Aggregate-type (green) and Hub-type nodes, while EHV-type nodes (purple) have few edges in N_G; this carries forward to N_L, where we observe that those Zone-type nodes have strong long-run correlations with other nodes while EHV-types do not.

Summary
We introduce the R package fnets, which implements the FNETS methodology proposed by Barigozzi et al. (2022) for network estimation and forecasting of high-dimensional time series exhibiting strong correlations. It further implements data-driven methods for selecting tuning parameters, and provides tools for high-dimensional time series factor modelling under the GDFM which are of independent interest. The efficacy of our package is demonstrated on both real and simulated datasets.

The constraints in (22) incorporate the parameter in the right-hand side. To use linear programming software to solve this, we formulate the constraints for each 1 ≤ i ≤ p, where Q_i has entries q_{ii′} = η(γ̂_{ii} ∨ γ̂_{i′i′}) in column i and 0 elsewhere.

Threshold selection
Tables 3 and 4 report the errors in estimating A_1 and Ω when the threshold t = t_ada or t = 0 is applied to the estimator of A_1 obtained by either the Lasso (11) or the DS (12) estimator. With a matrix γ as an estimand, we measure the estimation error of its estimator γ̂ using the following (scaled) matrix norms:
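Assuming the scaled norms L_F and L_2 denote relative errors in the Frobenius and spectral norms (an assumption on our part), they can be computed in base R as:

```r
# Relative (scaled) Frobenius and spectral norm errors of an estimate
# against the estimand; norm(, "2") returns the largest singular value.
L.F <- function(est, truth) norm(est - truth, "F") / norm(truth, "F")
L.2 <- function(est, truth) norm(est - truth, "2") / norm(truth, "2")
```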

VAR order selection
Table 5 reports the results of VAR order estimation over 100 realisations.

CLIME vs. ACLIME estimators
We compare the performance of the adaptive and non-adaptive estimators for the VAR innovation precision matrix ∆ and their impact on the estimation of Ω, the inverse of the long-run covariance matrix of the data (see Step 3). We generate χ_t as in (C1), fix d = 1 and treat it as known, and consider (n, p) ∈ {(200, 50), (200, 100), (500, 100), (500, 200)}.
In Tables 6 and 7, we report the errors in estimating ∆ and Ω. We consider both the Lasso (11) and DS (12) estimators of the VAR parameters, and the CLIME and ACLIME estimators of ∆, which lead to four different estimators of ∆ and Ω, respectively. Overall, we observe that with increasing n, the performance of all estimators improves according to all metrics regardless of the scenario, (E1) or (E2), while increasing p has an adverse effect. The two methods perform similarly in setting (E1) when ∆ = I. A marginal improvement from adopting the ACLIME estimator is noticeable under (E2), particularly in TPR. Figures 10 and 11 show the ROC curves for the support recovery of ∆ and Ω when the Lasso estimator is used.
Table 6: Errors in estimating ∆ using the CLIME and ACLIME estimators, measured by L_F and L_2, averaged over 100 realisations (with standard errors reported in brackets). We also report the average TPR when FPR = 0.05 and the corresponding standard errors.

Figure 1 :
Figure 1: Box plots of the two largest eigenvalues (y-axis) of the long-run covariance matrix estimated from the energy price data collected between 01/01/2021 and 19/07/2021 (n = 200); see Data example for further details. Cross-sections of the data are randomly sampled 100 times for each given dimension p ∈ {2, . . ., 50} (x-axis) to produce the box plots.

Figure 4 :
Figure 4: Ratio_k (left) and CUSUM_k (right) plotted against t_k when B̂ = β̂_las, the Lasso estimator of the VAR parameter matrix obtained from the data simulated in Data generation with n = 500 and p = 50, with the selected t_ada denoted by the vertical lines.
and each fold is split into a training set I_l^train = {n_l + 1, . . ., ⌊(n_l + n_{l+1})/2⌋} and a test set I_l^test = I_l \ I_l^train. On each fold, β is estimated from {X_t, t ∈ I_l^train} as either the Lasso (11) or the Dantzig selector (12) estimator, with λ as the tuning parameter and some b as the VAR order, say β̂_l^train(λ, b), using which we compute CV(λ, b); here Γ̂_{ξ,l}^test(ℓ), G_l^test(b) and g_l^test(b) are generated analogously to Γ̂_ξ(ℓ), G(b) and g(b), respectively, from the test set {X_t, t ∈ I_l^test}. Although we do not directly observe ξ_t, the measure CV(λ, b) gives an approximation of the prediction error. Then, we select (λ̂, d̂) = arg min_{λ ∈ Λ, 1 ≤ b ≤ d̄} CV(λ, b), where Λ is a grid of values for λ,

Figure 5 :
Figure 5: Estimated networks for the data simulated as in Data generation. Left: Granger causal network N_G. A directed arrow from node i to node i′ indicates that variable i Granger causes variable i′, and the edge weights, proportional to the size of the estimated coefficients, are visualised by the edge width. Right: long-run partial correlation network N_L, where the edge weights (i.e. partial correlations) are visualised by the colour.

Figure 8 :
Figure 8: Box plots of d̂ − d over 100 realisations when the VAR order is selected by the CV and eBIC methods in combination with the Lasso (11) and the DS (12) estimators.

Figure 9 :
Figure 9: Heat maps of the three networks underlying the energy price data collected over the period 01/01/2021-19/07/2021. Left: N_G obtained with the Lasso estimator (11) combined with the adaptive threshold t_ada. Middle: N_C obtained with the ACLIME estimator of ∆. Right: N_L obtained by combining the estimators of the VAR parameters and ∆. In the axis labels, Zone-type nodes are coloured in red, Aggregate-types in green, Hub-types in blue and EHV-types in purple.

Figure 10 :
Figure 10: ROC curves of TPR against FPR for the CLIME and ACLIME estimators in recovering the support of ∆, averaged over 100 realisations. Vertical lines indicate FPR = 0.05.

Table 1: Entries of S3 objects of class fnets.
q: factor number (integer)
spec: spectral density matrices for X_t, χ_t and ξ_t (when fm.restricted = FALSE) (list)
acv: autocovariance matrices for X_t, χ_t and ξ_t (list)
loadings: estimates of B_ℓ, 0 ≤ ℓ ≤ K (when fm.restricted = FALSE) or Λ (when fm.restricted = TRUE) (array)
factors: estimates of {u_t} (when fm.restricted = FALSE) or {F_t} (when fm.restricted = TRUE) (array)
idio.var: estimates of A_ℓ, 1 ≤ ℓ ≤ d, and Γ, and the d̂ and λ̂ used (list)
lrpc: estimates of ∆, Ω, (long-run) partial correlations and the η̂ used (list)
mean.x: sample mean vector (vector)
var.method: estimation method for A_ℓ (input parameter) (string)
do.lrpc: whether to estimate the long-run partial correlations (input parameter) (Boolean)
kern.bw: kernel bandwidth (when fm.restricted = FALSE; input parameter) (double)

Table 2: Entries of the output from predict.fnets.
forecast: h × p matrix containing the h-step ahead forecasts of X_t (matrix)
common.predict: a list containing $is, an n × p matrix containing the in-sample estimator of χ_t; $fc, an h × p matrix containing the h-step ahead forecasts of χ_t; $h, the input parameter; and $r, the factor number (only produced when fc.restricted = TRUE) (list)
idio.predict: a list containing is, fc and h, see common.predict (list)
mean.x: sample mean vector (vector)

Table 3 :
Errors in estimating A_1 with t ∈ {0, t_ada} in combination with the Lasso (11) and the DS (12) estimators, measured by L_F and L_2, averaged over 100 realisations (with standard errors reported in brackets). We also report the average TPR when FPR = 0.05 and the corresponding standard error. See Results: Threshold selection in the main text for further information.

Table 4 :
Errors in estimating Ω with t ∈ {0, t_ada} applied to the estimator of A_1 in combination with the Lasso (11) and the DS (12) estimators, measured by L_F and L_2, averaged over 100 realisations (with standard errors reported in brackets). We also report the average TPR when FPR = 0.05 and the corresponding standard error. See Results: Threshold selection in the main text for further information.

Table 5 :
Distribution of d̂ − d over 100 realisations when the VAR order is selected by the CV and eBIC methods in combination with the Lasso (11) and the DS (12) estimators; see Results: VAR order selection in the main text for further information.

Table 7 :
Errors in estimating Ω using the CLIME and ACLIME estimators of ∆, measured by L_F and L_2, averaged over 100 realisations (with standard errors reported in brackets). We also report the average TPR when FPR = 0.05 and the corresponding standard errors.