fanplot: An R Package for Visualising Sequential Distributions

Fan charts, first developed by the Bank of England in 1996, have become a standard method for visualising forecasts with uncertainty. Through shading, fan charts focus attention on the whole distribution rather than on a single central measure. This article describes the basics of plotting fan charts using an R add-on package, alongside some additional methods for displaying sequential distributions. Examples are based on distributions of both estimated parameters from a time series model and future values with uncertainty.


Introduction
Probabilities are notoriously difficult to communicate effectively to lay audiences (Spiegelhalter et al., 2011). Fan charts provide one method of illustrating either forecasts or past results that are based on probabilistic distributions. Through shading, fan charts focus the reader's attention on the whole distribution rather than on a single central estimate. Visualising the distribution can help communicate to non-specialists the degree of underlying uncertainty in probabilistic forecasts, which might not be apparent in basic plots and summary statistics.
Fan charts were first introduced by the Bank of England for their inflation forecasts in February 1996 (Britton et al., 1998). Since their initial development, fan charts have become a standard method used by many central banks to display the uncertainty of future economic indicators (Julio, 2007). Their use has also spread to other fields such as climate science (McShane and Wyner, 2011) and demography (Gerland et al., 2014).
Fan charts can be created using various software. Within R, the vars package (Pfaff, 2008) has a fanchart function for plotting forecasts with confidence regions. It is based solely on "varpred" class objects, i.e., on the predictions of Vector Autoregressive models fitted using other functions within the vars package. Similarly, the forecast package (Hyndman and Khandakar, 2008) produces fan charts for forecasts based on time series models from the "forecast" class. Julio (2009) provides VBA code to plot fan charts for quarterly GDP data in Excel. Alternatively, one could use point and click methods in Excel to build customised fan charts based on stacked area charts. Buchmann (2010) provides MATLAB code to create fan charts for user supplied forecast distributions, with a limited amount of control over the plotted display.
In all of the aforementioned options, users are restricted either in their ability to adapt the properties of fan charts or in their ability to create plots based on alternative models, values or simulated data. The aim of this article is to illustrate R code in the fanplot package to create fan charts of different styles and from a range of input data. These are demonstrated on data from sequential Markov chain Monte Carlo (MCMC) simulated distributions of parameters in a stochastic volatility model and on expert based forecasts for the Consumer Price Index (CPI) of the Bank of England.

Fan charts for sequential simulated distributions
The fanplot package can be used to display any form of sequential distributions along a plot's x-axis. To illustrate, we use posterior density distributions of the estimated volatility of daily returns (y_t) from the Pound/Dollar exchange rate from 02/10/1981 to 28/6/1985. As Meyer and Yu (2002) show, posterior distributions for the volatility process can be estimated in WinBUGS by fitting the stochastic volatility model:

$$ y_t \mid \theta_t = \exp\left(\tfrac{1}{2}\theta_t\right) u_t, \qquad u_t \sim N(0, 1), \qquad t = 1, \ldots, n. $$

The latent volatilities θ_t, which are unknown states in state-space model terminology (Harvey, 1990), are assumed to follow a Markovian transition over time given by the state equations:

$$ \theta_t \mid \theta_{t-1}, \mu, \phi, \tau^2 = \mu + \phi(\theta_{t-1} - \mu) + v_t, \qquad v_t \sim N(0, \tau^2), \qquad t = 1, \ldots, n, $$

with θ_0 ∼ N(µ, τ²). A sample of the posterior distributions of θ_t is contained in the th.mcmc object of the fanplot package. It consists of 1000 rows corresponding to MCMC simulations and 945 columns corresponding to time points t. Example code to replicate this object using R2OpenBUGS (Sturtz et al., 2005) is given in the help file for the th.mcmc object. It is based on the BUGS model of Meyer and Yu (2002), replicated in the my1.R file of the fanplot package. Time ordered simulated distributions, such as th.mcmc, can be easily extracted from the sims.list element of an R2OpenBUGS bugs object.
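To make the state-space structure concrete, the following base R sketch simulates a single path from a stochastic volatility model, assuming the state equation θ_t = µ + φ(θ_{t−1} − µ) + v_t and the observation equation y_t = exp(θ_t/2) u_t. The parameter values mu, phi and tau are hypothetical and chosen only for illustration; they are not estimates from the exchange rate data.

```r
# Simulate one path from a simple stochastic volatility model (base R only).
# Assumed form: theta_t = mu + phi * (theta_{t-1} - mu) + v_t,  v_t ~ N(0, tau^2)
#               y_t     = exp(theta_t / 2) * u_t,               u_t ~ N(0, 1)
set.seed(1)
n   <- 945     # number of time points, matching the columns of th.mcmc
mu  <- -0.5    # hypothetical level of the log-volatility
phi <- 0.95    # hypothetical persistence
tau <- 0.2     # hypothetical standard deviation of the state innovations

theta <- numeric(n)
theta_prev <- rnorm(1, mean = mu, sd = tau)   # theta_0 ~ N(mu, tau^2)
for (t in seq_len(n)) {
  theta[t] <- mu + phi * (theta_prev - mu) + rnorm(1, sd = tau)
  theta_prev <- theta[t]
}
y <- exp(theta / 2) * rnorm(n)                # simulated returns
```

The posterior sample in th.mcmc differs in that every row is a draw of the whole θ path conditional on the observed returns, rather than a single forward simulation.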
A fan chart of the evolution of the distribution of θ_t in Figure 1 can be plotted using either the fan0 or fan function. The fan0 function, which we discuss first, provides the simplest representation:

    library("fanplot")
    fan0(data = th.mcmc)

The plotting function calculates the values of 100 equally spaced percentiles of each distribution when the default data.type = "simulations" is set. This allows 50 fans to be plotted from the heat.colors colour palette, providing darker shadings for the more probable percentiles. The axis limits are determined from the data argument. By default, the y-axis is limited to 85 percent of the range of the MCMC distributions to reduce white space in the plot.
Similar plots of sequential distributions from alternative Bayesian models can be easily plotted using the fan0 or fan functions. The data argument accepts objects from a range of classes including "mcmc", which is typically used to handle BUGS or JAGS results via the read.coda or read.jags commands.
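The percentile calculation that underlies a fan chart of simulations can be sketched in base R as below. This is an illustration of the idea only, not the package's internal code; the simulation matrix sims and the grid of probabilities probs are hypothetical stand-ins.

```r
# Column-wise percentiles of a simulation matrix
# (rows = simulations, columns = time points, as in th.mcmc).
set.seed(1)
sims  <- matrix(rnorm(1000 * 10), nrow = 1000, ncol = 10)  # hypothetical draws
probs <- seq(0.01, 0.99, by = 0.01)                        # assumed percentile grid

# one row per percentile, one column per time point
pct <- apply(sims, 2, quantile, probs = probs)

# a single fan is the band between a lower percentile and its mirror image,
# for example the 25th and 75th percentiles at every time point
lower <- pct["25%", ]
upper <- pct["75%", ]
```

Shading each such band, from the outermost pair of percentiles inwards, produces the graded appearance of the fan.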
The data in th.mcmc are based on trading day observations only. Irregular time series can be handled by passing a zoo time series object (Zeileis and Grothendieck, 2005) to the data argument. The trading days are given in the svpdx object of the tsbugs package (Abel et al., 2013).

    library("zoo")
    library("tsbugs")
    # create irregular multiple time series object
    th.mcmc2 <- zoo(th.mcmc, order.by = svpdx$date)
    # plot
    fan0(data = th.mcmc2, type = "interval", ln = c(0.5, 0.8, 0.95), llab = TRUE, rcex = 0.6)

Basing the fan chart on a zoo time series allows the x-axis in Figure 2 to be based on the trading dates. When type = "interval" is set, the probs argument corresponds to prediction intervals. Consequently, the default interval fan chart comprises three different shades, running from the darkest for the 50% prediction interval to the lightest for the 95% prediction interval. Contour lines are controlled by the ln argument, which is set to NULL by default for fan0. In Figure 2, changing the default of ln overlays lines on the fan chart for the upper and lower bounds of the 50%, 80% and 95% prediction intervals. A further line is plotted along the median of θ_t, controlled by the med.ln argument and shown when type = "interval". Labels on the right hand side are added by default and correspond to the upper and lower bounds of each plotted line. The text size of the labels is controlled by the rcex argument. This is set to 0.6 to incorporate the labels without extending the limits of the x-axis. The left labels are added by setting llab = TRUE and take the same text properties as the right labels. Users can customise many properties of the fan chart shading, contour lines, labels and axes through a range of arguments.
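The relationship between a central prediction interval and the pair of percentiles that bound it can be sketched in base R as follows. The intervals match the values passed to ln above, and the vector draws is a hypothetical stand-in for the simulations at a single time point.

```r
# Convert central prediction intervals to the percentiles that bound them.
intervals <- c(0.5, 0.8, 0.95)
lower <- (1 - intervals) / 2    # lower tail probability of each interval
upper <- 1 - lower              # upper tail probability of each interval
bounds <- data.frame(interval = intervals, lower = lower, upper = upper)

# bounds of each interval for one hypothetical set of simulations
set.seed(1)
draws <- rnorm(1000)
bounds$lower_value <- quantile(draws, probs = lower, names = FALSE)
bounds$upper_value <- quantile(draws, probs = upper, names = FALSE)
bounds
```

Wider intervals always contain narrower ones, which is why the shading runs from darkest in the centre to lightest at the edges.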
Spaghetti plots are a method of viewing data to visualise possible values through a system. They are commonly used on geographical data, such as meteorological forecasts (Sanyal et al., 2010), to show possible or realised paths, or over time, such as in longitudinal data analysis (Hedeker and Gibbons, 2006). Spaghetti plots can also be used to represent uncertainty, shown by a range of possible future trajectories or past estimates. For example, using the th.mcmc2 object, Figure 3 displays 20 random sets of θ_t simulations plotted by setting the argument style = "spaghetti":

    # transparent fan with visible lines
    fan0(th.mcmc2, ln = c(5, 20, 50, 80, 95), alpha = 0, ln.col = "darkorange", llab = TRUE)
    # spaghetti lines
    fan(th.mcmc2, style = "spaghetti", n.spag = 20, alpha = 0.3)

The initial fan chart is completely transparent as a result of setting the transparency argument alpha = 0. In order for the percentile lines to be visible, a non-transparent colour is used for the ln.col argument. Lines are plotted according to the user defined ln argument to provide underlying uncertainty measures for the posterior probability distribution. The spaghetti lines, which are semi-transparent, are based on a random selection of simulations. They are superimposed on a fan chart using the fan function, which operates in much the same way as fan0. The most important difference between the two is in the default setting of the add argument, which controls whether to create a new plot window for a fan chart or add it to an existing device. For the fan function, add is set to TRUE and hence it is more appropriately used to add a fan chart to an existing plotting device. The fan function also adds lines and labels on select contours by default, as illustrated in the next section.

Bank of England Fan Charts
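The Monetary Policy Committee (MPC) of the Bank of England produces fan charts of forecasts for Consumer Price Index (CPI) inflation and Gross Domestic Product in their quarterly Inflation Reports. Alongside the fan charts, the Bank of England provides data, in the form of central location, uncertainty and skewness parameters of a split-normal distribution, that underlie their fan charts. The probability density of the split-normal distribution is given by Julio (2007) as

$$ f(x; \mu, \sigma_1, \sigma_2) = \begin{cases} \dfrac{\sqrt{2}}{\sqrt{\pi}(\sigma_1 + \sigma_2)} \exp\left( -\dfrac{(x - \mu)^2}{2\sigma_1^2} \right) & \text{for } -\infty < x \le \mu, \\[2ex] \dfrac{\sqrt{2}}{\sqrt{\pi}(\sigma_1 + \sigma_2)} \exp\left( -\dfrac{(x - \mu)^2}{2\sigma_2^2} \right) & \text{for } \mu < x < \infty, \end{cases} $$

where µ represents the mode, and the two standard deviations σ_1 and σ_2 can be derived from the overall uncertainty parameter σ and the skewness parameter γ as

$$ \sigma_1^2 = \frac{\sigma^2}{1 + \gamma}, \qquad \sigma_2^2 = \frac{\sigma^2}{1 - \gamma}. $$

Functions for the probability density, cumulative distribution, quantiles and random generation for the split-normal distribution can be found in the fanplot package.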
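As a rough illustration of how the density and quantile function of a split-normal distribution can be computed, the base R sketch below assumes the parameterisation σ_1² = σ²/(1 + γ) and σ_2² = σ²/(1 − γ), with µ the mode. The names split_sd, dsn and qsn are hypothetical and are not the functions supplied by the fanplot package; the parameters are assumed to be scalars.

```r
# Base R sketch of a split-normal density and quantile function.
# split_sd(), dsn() and qsn() are hypothetical names, not fanplot functions.
split_sd <- function(sd, skew) {
  c(sd1 = sd / sqrt(1 + skew), sd2 = sd / sqrt(1 - skew))
}

dsn <- function(x, mode = 0, sd = 1, skew = 0) {
  s <- split_sd(sd, skew)
  const <- sqrt(2) / (sqrt(pi) * (s[["sd1"]] + s[["sd2"]]))
  side_sd <- ifelse(x <= mode, s[["sd1"]], s[["sd2"]])   # sd on each side of the mode
  const * exp(-(x - mode)^2 / (2 * side_sd^2))
}

qsn <- function(p, mode = 0, sd = 1, skew = 0) {
  s <- split_sd(sd, skew)
  s1 <- s[["sd1"]]
  s2 <- s[["sd2"]]
  p_mode <- s1 / (s1 + s2)     # probability mass at or below the mode
  # arguments are clamped at 0.5 so the unused branch stays a valid probability
  left  <- mode + s1 * qnorm(pmin(p * (s1 + s2) / (2 * s1), 0.5))
  right <- mode + s2 * qnorm(pmax((p * (s1 + s2) - (s1 - s2)) / (2 * s2), 0.5))
  ifelse(p <= p_mode, left, right)
}

# quantiles of a positively skewed distribution with mode 2 (illustrative values)
qsn(c(0.05, 0.5, 0.95), mode = 2, sd = 1, skew = 0.5)
```

With skew = 0 both standard deviations equal sd, so the sketch reduces to the ordinary normal distribution; a positive skew lengthens the upper tail.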
The boe data frame provides historical details on the forecasts of the MPC for CPI inflation from Q1 2004 to Q4 2013. The first column, time0, refers to the base year of the forecast; the second, time, indexes future projections; the remaining three columns provide values for the corresponding projected central location (µ), uncertainty (σ) and skew (γ) parameters:

> head(boe)
Bank of England style fan charts vary from quarter to quarter but follow a similar theme throughout, which can be replicated in R using the fanplot package. The input data given to the fan function to plot a fan chart differ from the simulations of the previous section. Rather than many simulations from distributions in each time period, we can pass a matrix object of time ordered values from the split-normal quantile function. As is the case when passing simulated values to the fan function, rows of the data object represent a set of user defined probabilities and columns represent a set of time points. For example, a subset of the Bank of England future parameters of CPI published in Q1 2013 is first selected. Then a vector of probabilities is created, corresponding to the percentiles for which we would ultimately like to plot different shaded fans. Finally, in a for loop, the qsplitnorm function calculates the values for which the time-specific (i) split-normal distribution will be less than or equal to the probabilities of p.
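The three steps described above might be written along the following lines. This is a sketch under stated assumptions rather than the original listing: it assumes that the parameter columns of boe are named mode, uncertainty and skew, that qsplitnorm accepts arguments of the same names as used here, and the particular probabilities in p are an illustrative choice.

```r
library("fanplot")

# select the projections published in Q1 2013
y0 <- 2013
boe0 <- subset(boe, time0 == y0)
k <- nrow(boe0)

# probabilities for the percentiles to be shaded (illustrative choice)
p <- c(0.01, seq(0.05, 0.95, by = 0.05), 0.99)

# split-normal quantiles: rows = probabilities, columns = future time points
cpival <- matrix(NA_real_, nrow = length(p), ncol = k)
for (i in seq_len(k)) {
  cpival[, i] <- qsplitnorm(p, mode = boe0$mode[i],
                            sd = boe0$uncertainty[i],
                            skew = boe0$skew[i])
}
```

Each column of cpival then holds the quantiles of one future quarter, ordered from the lowest to the highest probability in p.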
The cpival object can be used to add a fan chart to an active R graphics device. The plotting area of Figure 4 is set up when plotting the past CPI data, contained in the time series object cpi. The xlim argument is set to ensure space on the right hand side of the plotting area for the fan. Following as closely as possible the Bank of England style for plotting fan charts for Q1 2013, the plotting area is set to near square, the background for future values is a grey colour, the y-axis is plotted on the right hand side, a horizontal line is added for the CPI target and a vertical line is added for the two-year ahead point.
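A plot in this style might be assembled as sketched below. The listing is a reconstruction under assumptions, not the original code: it assumes the boe column names mode, uncertainty and skew, that cpi is a quarterly ts object covering the plotted period, and that fan accepts the data.type, probs, start, frequency, anchor, fan.col, ln and rlab arguments as used here. The colours and axis positions are illustrative choices.

```r
library("fanplot")

# split-normal quantiles for the Q1 2013 projections
y0 <- 2013
boe0 <- subset(boe, time0 == y0)
p <- c(0.01, seq(0.05, 0.95, by = 0.05), 0.99)
cpival <- sapply(seq_len(nrow(boe0)), function(i)
  qsplitnorm(p, mode = boe0$mode[i], sd = boe0$uncertainty[i],
             skew = boe0$skew[i]))

# past data only: truncate the cpi series before the forecast origin
cpi0 <- window(cpi, end = c(y0 - 1, 4))

# past data, leaving room on the right hand side for the fan
plot(cpi0, type = "l", col = "tomato", lwd = 2,
     xlim = c(y0 - 5, y0 + 3), ylim = c(-2, 7),
     xaxt = "n", yaxt = "n", ylab = "")
# grey background for the forecast period
rect(y0 - 0.25, par("usr")[3] - 1, y0 + 3, par("usr")[4] + 1,
     border = "gray90", col = "gray90")
# add the fan, anchored at the last observed value
fan(data = cpival, data.type = "values", probs = p,
    start = y0, frequency = 4, anchor = cpi0[length(cpi0)],
    fan.col = colorRampPalette(c("tomato", "gray90")),
    ln = NULL, rlab = NULL)
# tick marks on the left, labelled y-axis on the right
axis(2, at = -2:7, las = 2, tcl = 0.5, labels = FALSE)
axis(4, at = -2:7, las = 2, tcl = 0.5)
axis(1, at = 2008:2016, tcl = 0.5)
abline(h = 2)                    # CPI target of 2 percent
abline(v = y0 + 1.75, lty = 2)   # two-year ahead point
```

Passing data.type = "values" tells fan that the rows of cpival are already quantiles at the probabilities in probs, so no percentiles are computed from simulations.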
Box plots (Tukey, 1977) are commonly used as a simple descriptive tool to visualise data through their quartiles. The boxplot function in R has many options, including the display of multiple box plots based on sequential distributions. However, when data are based on a time series with multiple observations during a unit of time, such as quarterly data, fixing the location of the plot on the x-axis can be cumbersome. The fan function overcomes this problem when setting style = "boxplot". In Figure 6 the simulated future CPI data cpisim are passed to the data argument:

    # plot past data
    plot(cpi0, type = "l", xlim = c(y0 - 5, y0 + 3), ylim = c(-2, 7), lwd = 2)
    # box plots
    fan(cpisim, style = "boxplot", start = y0, frequency = 4, outline = FALSE)

The fan function allows users to easily locate sequential distributions of box plots on the x-axis using the start and frequency arguments. Additional arguments in the fan function are passed to boxplot. For example, in the code above, outliers are suppressed by setting outline = FALSE.

Summary
The fanplot package allows users to easily visualise uncertainty based on either simulations from sequential distributions or values based on pre-calculated quantiles of a distribution. Interactive visualisations of fan charts, as demonstrated in the net_elicit.R demo using the shiny package (RStudio and Inc., 2014), could potentially allow for an intuitive elicitation of expert forecasts. Data of various classes can be incorporated, including regular time series ("mts"), irregular time series ("zoo") and simulations from, say, MCMC via "matrix", "data.frame" or "mcmc" type objects. The fanplot package also has a range of options to adjust the colour shadings of fan charts, their lines and labels, and to specify whether to display prediction intervals or percentiles of distributions.

Figure 1: Stochastic Volatility Fan Chart for the Pound-Dollar Exchange Rate Data.

Figure 2: Stochastic Volatility Fan Chart for the Pound-Dollar Exchange Rate Data with Prediction Intervals.

Figure 3: Stochastic Volatility Spaghetti Plot for the Pound-Dollar Exchange Rate Data.

Figure 4: Fan chart, in the Bank of England style, for the MPC Q1 2013 forecast of the percentage increase in prices on year earlier.

Figure 5: Alternative fan chart for the MPC Q1 2013 CPI forecast.