This article describes two R packages for probabilistic weather forecasting, ensembleBMA, which offers ensemble postprocessing via Bayesian model averaging (BMA), and ProbForecastGOP, which implements the geostatistical output perturbation (GOP) method. BMA forecasting models use mixture distributions, in which each component corresponds to an ensemble member, and the form of the component distribution depends on the weather parameter (temperature, quantitative precipitation or wind speed). The model parameters are estimated from training data. The GOP technique uses geostatistical methods to produce probabilistic forecasts of entire weather fields for temperature or pressure, based on a single numerical forecast on a spatial grid. Both packages include functions for evaluating predictive performance, in addition to model fitting and forecasting.
Over the past two decades, weather forecasting has experienced a paradigm shift towards probabilistic forecasts, which take the form of probability distributions over future weather quantities and events. Probabilistic forecasts allow for optimal decision making for many purposes, including air traffic control, ship routing, agriculture, electricity generation and weather-risk finance.
Up to the early 1990s, most weather forecasting was deterministic, meaning that only one “best” forecast was produced by a numerical model. The recent advent of ensemble prediction systems marks a radical change. An ensemble forecast consists of multiple numerical forecasts, each computed in a different way. Statistical postprocessing is then used to convert the ensemble into calibrated and sharp probabilistic forecasts (Gneiting and A. E. Raftery 2005).
The ensembleBMA package (Fraley, A. E. Raftery, T. Gneiting, and J. M. Sloughter 2010) offers statistical postprocessing of forecast ensembles via Bayesian model averaging (BMA). It provides functions for model fitting and forecasting with ensemble data that may include missing and/or exchangeable members. The modeling functions estimate BMA parameters from training data via the EM algorithm. Currently available options are normal mixture models (appropriate for temperature or pressure), mixtures of gamma distributions (appropriate for wind speed), and Bernoulli-gamma mixtures with a point mass at zero (appropriate for quantitative precipitation). Functions for verification that assess predictive performance are also available.
The BMA approach to the postprocessing of ensemble forecasts was introduced by Raftery, T. Gneiting, F. Balabdaoui, and M. Polakowski (2005) and has been developed in Berrocal, A. E. Raftery, and T. Gneiting (2007), Sloughter, A. E. Raftery, T. Gneiting, and C. Fraley (2007), Wilson, S. Beauregard, A. E. Raftery, and R. Verret (2007), Fraley, A. E. Raftery, and T. Gneiting (2010) and Sloughter, T. Gneiting, and A. E. Raftery (2010). Detail on verification procedures can be found in Gneiting and A. E. Raftery (2007) and Gneiting, F. Balabdaoui, and A. E. Raftery (2007).
"ensembleData"
objectsEnsemble forecasting data for weather typically includes some or all of the following information:
ensemble member forecasts
initial date
valid date
forecast hour (prediction horizon)
location (latitude, longitude, elevation)
station and network identification
The initial date is the day and time at which initial conditions are provided to the numerical weather prediction model, to run forward the partial differential equations that produce the members of the forecast ensemble. The forecast hour is the prediction horizon or time between initial and valid dates. The ensemble member forecasts then are valid for the hour and day that correspond to the forecast hour ahead of the initial date. In all the examples and illustrations in this article, the prediction horizon is 48 hours.
For use with the
ensembleBMA package,
data must be organized into an "ensembleData"
object that minimally
includes the ensemble member forecasts. For model fitting and
verification, the corresponding weather observations are also needed.
Several of the model fitting functions can produce forecasting models
over a sequence of dates, provided that the "ensembleData"
are for a
single prediction horizon. Attributes such as station and network
identification, and latitude and longitude, may be useful for plotting
and/or analysis but are not currently used in any of the modeling
functions. The "ensembleData"
object facilitates preservation of the
data as a unit for use in processing by the package functions.
Here we illustrate the creation of an "ensembleData"
object called
srftData
that corresponds to the srft
data set of surface
temperature (Berrocal, A. E. Raftery, and T. Gneiting 2007):
data(srft)
<- c("CMCG", "ETA", "GASP", "GFS",
members "JMA", "NGPS", "TCWB", "UKMO")
<-
srftData ensembleData(forecasts = srft[,members],
dates = srft$date,
observations = srft$obs,
latitude = srft$lat,
longitude = srft$lon,
forecastHour = 48)
The dates
specification in an "ensembleData"
object refers to the
valid dates for the forecasts.
Forecast ensembles may contain members that can be considered exchangeable (arising from the same generating mechanism, such as random perturbations of a given set of initial conditions) and for which the BMA parameters, such as weights and bias correction coefficients, should be the same. In ensembleBMA, exchangeability is specified by supplying a vector that represents the grouping of the ensemble members. The non-exchangeable groups consist of singleton members, while exchangeable members belong to the same group. See Fraley, A. E. Raftery, and T. Gneiting (2010) for a detailed discussion.
Functions that rely on the
chron package
(James and K. Hornik 2010) are provided for converting to and from Julian dates.
These functions check for proper format (YYYYMMDD
or YYYYMMDDHH
).
BMA generates full predictive probability density functions (PDFs) for future weather quantities. Examples of BMA predictive PDFs for temperature and precipitation are shown in Figure 1.
As an example, we fit a BMA normal mixture model for forecasts of
surface temperature valid January 31, 2004, using the srft
training
data. The "ensembleData"
object srftData
created in the previous
section is used to fit the predictive model, with a rolling training
period of 25 days, excluding the two most recent days because of the 48
hour prediction horizon.
One of several options is to use the function ensembleBMA
with the
valid date(s) of interest as input to obtain the associated BMA fit(s):
<-
srftFit ensembleBMA(srftData, dates = "2004013100",
model = "normal", trainingDays = 25)
When no dates are specified, a model fit is produced for each date for which there are sufficient training data for the desired rolling training period.
The BMA predictive PDFs can be plotted as follows, with Figure 1 showing an example:
plot(srftFit, srftData, dates = "2004013100")
This steps through each location on the given dates, plotting the corresponding BMA PDFs.
Alternatively, the modeling process for a single date can be separated
into two steps: first extracting the training data, and then fitting the
model directly using the fitBMA
function. See Fraley, A. E. Raftery, T. Gneiting, and J. M. Sloughter (2007) for an
example. A limitation of this approach is that date information is not
automatically retained.
Forecasting is often done on grids that cover an area of interest,
rather than at station locations. The dataset srftGrid
provides
ensemble forecasts of surface temperature initialized on January 29,
2004 and valid for January 31, 2004 at grid locations in the same region
as that of the srft
stations.
Quantiles of the BMA predictive PDFs at the grid locations can be
obtained with the function quantileForecast
:
<- quantileForecast(srftFit,
srftGridForc quantiles = c( .1, .5, .9)) srftGridData,
Here srftGridData
is an "ensembleData"
object created from
srftGrid
, and srftFit
provides a forecasting model for the
corresponding date.1 The forecast probability of temperatures below
freezing at the grid locations can be computed with the cdf
function,
which evaluates the BMA cumulative distribution function:
<- cdf(srftFit, srftGridData,
probFreeze date = "2004013100", value = 273.15)
In the srft
and srftGrid
datasets, temperature is recorded in
degrees Kelvin (K), so freezing occurs at 273.15 K.
These results can be displayed as image plots using the plotProbcast
function, as shown in Figure 2, in which darker
shades represent higher probabilities. The plots are made by binning
values onto a plotting grid, which is the default in plotProbcast
.
Loading the fields
(Furrer, D. Nychka, and S. Sain 2010) and maps
(Brownrigg and T. P. Minka 2011) packages enables display of the country and state
outlines, as well as a legend.
The prcpFit
dataset consists of the fitted BMA parameters for 48 hour
ahead forecasts of daily precipitation accumulation (in hundredths of an
inch) over the U.S. Pacific Northwest from December 12, 2002 through
March 31, 2005, as described by Sloughter, A. E. Raftery, T. Gneiting, and C. Fraley (2007). The fitted models are
Bernoulli-gamma mixtures with a point mass at zero that apply to the
cube root transformation of the ensemble forecasts and observed data. A
rolling training period of 30 days is used. The dataset used to obtain
prcpFit
is not included in the package on account of its size.
However, the corresponding "ensembleData"
object can be constructed in
the same way as illustrated for the surface temperature data, and the
modeling process also is analogous, except that the "gamma0"
model for
quantitative precipitation is used in lieu of the "normal"
model.
The prcpGrid
dataset contains gridded ensemble forecasts of daily
precipitation accumulation in the same region as that of prcpFit
,
initialized January 13, 2003 and valid January 15, 2003. The BMA median
and upper bound (90th percentile) forecasts can be obtained and plotted
as follows:
data(prcpFit)
<- quantileForecast(
prcpGridForc date = "20030115",
prcpFit, prcpGridData, q = c(0.5, 0.9))
Here prcpGridData
is an "ensembleData"
object created from the
prcpGrid
dataset. The 90th percentile plot is shown in Figure
3.
The probability of precipitation and the probability of precipitation above \(.25\) inches can be obtained as follows:
<- 1 - cdf(prcpFit, prcpGridData,
probPrecip date = "20030115", values = c(0, 25))
The plot for the BMA forecast probability of precipitation accumulation exceeding \(.25\) inches is also shown in Figure 3.
The ensembleBMA functions for verification can be used whenever observed weather conditions are available. Included are functions to compute verification rank and probability integral transform histograms, the mean absolute error, continuous ranked probability score, and Brier score.
In the previous section, we obtained a gridded BMA forecast of surface
temperature valid January 31, 2004 from the srft
data set. To obtain
forecasts at station locations, we apply the function quantileForecast
to the model fit srftFit
:
<- quantileForecast(srftFit,
srftForc quantiles = c( .1, .5, .9)) srftData,
The BMA quantile forecasts can be plotted together with the observed
weather conditions using the function plotProbcast
as shown in Figure
4.
Here the R core function loess
was used to interpolate from the
station locations to a grid for surface plotting. It is also possible to
request image or perspective plots, or contour plots.
The mean absolute error (MAE) and mean continuous ranked probability
score [CRPS; e.g., Gneiting and A. E. Raftery (2007)] can be computed with the
functions CRPS
and MAE
:
CRPS(srftFit, srftData)
# ensemble BMA
# 1.945544 1.490496
MAE(srftFit, srftData)
# ensemble BMA
# 2.164045 2.042603
The function MAE
computes the mean absolute difference between the
ensemble or BMA median forecast2 and the observation. The BMA CRPS is
obtained via Monte Carlo simulation and thus may vary slightly in
replications. Here we compute these measures from forecasts valid on a
single date; more typically, the CRPS and MAE would be computed from a
range of dates and the corresponding predictive models.
Brier scores (see e.g., Jolliffe and D. B. Stephenson, editors 2003; Gneiting and A. E. Raftery 2007)
for probability forecasts of the binary event of exceeding an arbitrary
precipitation threshold can be computed with the function brierScore
.
Calibration refers to the statistical consistency between the predictive distributions and the observations (Gneiting, F. Balabdaoui, and A. E. Raftery 2007). The verification rank histogram is used to assess calibration for an ensemble forecast, while the probability integral transform (PIT) histogram assesses calibration for predictive PDFs, such as the BMA forecast distributions.
The verification rank histogram plots the rank of each observation
relative to the combined set of the ensemble members and the
observation. Thus, it equals one plus the number of ensemble members
that are smaller than the observation. The histogram allows for the
visual assessment of the calibration of the ensemble forecast
(Hamill 2001). If the observation and the ensemble members are
exchangeable, all ranks are equally likely, and so deviations from
uniformity suggest departures from calibration. We illustrate this with
the srft
dataset, starting at January 30, 2004:
<- ensembleValidDates(srftData) >=
use "2004013000"
<- verifRank(
srftVerifRank ensembleForecasts(srftData[use,]),
ensembleVerifObs(srftData[use,]))
<- ensembleSize(srftData)
k
hist(srftVerifRank, breaks = 0:(k+1),
prob = TRUE, xaxt = "n", xlab = "",
main = "Verification Rank Histogram")
axis(1, at = seq(.5, to = k+.5, by = 1),
labels = 1:(k+1))
abline(h=1/(ensembleSize(srftData)+1), lty=2)
The resulting rank histogram composites ranks spatially and is shown in Figure 5. The U shape indicates a lack of calibration, in that the ensemble forecast is underdispersed.
The PIT is the value that the predictive cumulative distribution
function attains at the observation, and is a continuous analog of the
verification rank. The function pit
computes it. The PIT histogram
allows for the visual assessment of calibration and is interpreted in
the same way as the verification rank histogram. We illustrate this on
BMA forecasts of surface temperature obtained for the entire srft
data
set using a 25 day training period (forecasts begin on January 30, 2004
and end on February 28, 2004):
<- ensembleBMA(srftData,
srftFitALL trainingDays = 25)
<- pit(srftFitALL, srftData)
srftPIT
hist(srftPIT, breaks = (0:(k+1))/(k+1),
xlab="", xaxt="n", prob = TRUE,
main = "Probability Integral Transform")
axis(1, at = seq(0, to = 1, by = .2),
labels = (0:5)/5)
abline(h = 1, lty = 2)
The resulting PIT histogram is shown in Figure 5. It shows signs of negative bias, which is not surprising because it is based on only about a month of verifying data. We generally recommend computing the PIT histogram for longer periods, ideally at least a year, to avoid its being dominated by short-term and seasonal effects.
The
ProbForecastGOP
package (Berrocal, Y. Gel, A. E. Raftery, and T. Gneiting 2010) generates probabilistic forecasts of
entire weather fields using the geostatistical output perturbation (GOP)
method of Gel, A. E. Raftery, and T. Gneiting (2004). The package contains functions for the GOP method,
a wrapper function named ProbForecastGOP
, and a plotting utility named
plotfields
. More detailed information can be found in the PDF document
installed in the package directory at
ProbForecastGOP/docs/vignette.pdf
or in the help files.
The GOP method uses geostatistical methods to produce probabilistic forecasts of entire weather fields for temperature and pressure, based on a single numerical weather forecast on a spatial grid. The method involves three steps:
an initial step in which linear regression is used for bias correction, and the empirical variogram of the forecast errors is computed;
an estimation step in which the weighted least squares method is used to fit a parametric model to the empirical variogram; and
a forecasting step in which a statistical ensemble of weather field forecasts is generated, by simulating realizations of the error field from the fitted geostatistical model, and adding them to the bias-corrected numerical forecast field.
In the first step of the GOP method, the empirical variogram of the forecast errors is found. Variograms are used in spatial statistics to characterize variability in spatial data. The empirical variogram plots one-half the mean squared difference between paired observations against the distance separating the corresponding sites. Distances are usually binned into intervals, whose midpoints are used to represent classes.
In
ProbForecastGOP,
four functions compute empirical variograms. Two of them, avg.variog
and avg.variog.dir
, composite forecast errors over temporal
replications, while the other two, Emp.variog
and EmpDir.variog
,
average forecast errors temporally, and only then compute variograms.
Alternatively, one can use the wrapper function ProbForecastGOP
with
argument out = "VARIOG"
.
The second step in the GOP method consists of fitting a parametric model
to the empirical variogram of the forecast errors. This is done by the
Variog.fit
function using the weighted least squares approach.
Alternatively, the wrapper function ProbForecastGOP
with entry out
set equal to "FIT"
can be employed. Figure 6
illustrates the result for forecasts of surface temperature over the
Pacific Northwest in January through June 2000.
The parametric models implemented in Variog.fit
are the exponential,
spherical, Gaussian, generalized Cauchy and Whittle-Matérn, with further
detail available in the help files. The function linesmodel
computes
these parametric models.
The final step in the GOP method involves generating multiple realizations of the forecast weather field. Each realization provides a member of a statistical ensemble of weather field forecasts, and is obtained by simulating a sample path of the fitted error field, and adding it to the bias-adjusted numerical forecast field.
This is achieved by the function Field.sim
, or by the wrapper function
ProbForecastGOP
with the entry out
set equal to "SIM"
. Both
options depend on the GaussRF
function in the
RandomFields
package that simulates Gaussian random fields (Schlather 2011).
The output of the functions is both numerical and graphical, in that
they return members of the forecast field ensemble, quantile forecasts
at individual locations, and plots of the ensemble members and quantile
forecasts. The plots are created using the image.plot
, US
and
world
functions in the
fields package
(Furrer, D. Nychka, and S. Sain 2010). As an illustration, Figure 7 shows a
member of a forecast field ensemble of surface temperature over the
Pacific Northwest valid January 12, 2002.
We have described two packages, ensembleBMA and ProbForecastGOP, for probabilistic weather forecasting. Both packages provide functionality to fit forecasting models to training data.
In ensembleBMA, parametric mixture models, in which the components correspond to the members of an ensemble, are fit to a training set of ensemble forcasts, in a form that depends on the weather parameter. These models are then used to postprocess ensemble forecasts at station or grid locations.
In ProbForecastGOP, a parametric model is fit to the empirical variogram of the forecast errors from a single member, and forecasts are obtained by simulating realizations of the fitted error field and adding them to the bias-adjusted numerical forecast field. The resulting probabilistic forecast is for an entire weather field, and shows physically realistic spatial features.
Supplementing model fitting and forecasting, both packages provide functionality for verification, allowing the quality of the forecasts produced to be assessed.
This research was supported by the DoD Multidisciplinary University Research Initiative (MURI) program administered by the Office of Naval Research under Grant N00014-01-10745, by the National Science Foundation under grants ATM-0724721 and DMS-0706745, and by the Joint Ensemble Forecasting System (JEFS) under subcontract S06-47225 from the University Corporation for Atmospheric Research (UCAR). We are indebted to Cliff Mass and Jeff Baars of the University of Washington Department of Atmospheric Sciences for sharing insights, expertise and data, and thank Michael Scheuerer for comments. This paper also benefitted from the critiques of two anonymous reviewers and the associate editor.
ensembleBMA, ProbForecastGOP, chron, fields, maps, RandomFields
This article is converted from a Legacy LaTeX article using the texor package. The pdf version is the official version. To report a problem with the html, refer to CONTRIBUTE on the R Journal homepage.
Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Raftery, et al., "Probabilistic Weather Forecasting in R", The R Journal, 2011
BibTeX citation
@article{RJ-2011-009, author = {Raftery, Chris Fraley, Adrian and Gneiting, Tilmann and Sloughter, McLean and Berrocal, Veronica}, title = {Probabilistic Weather Forecasting in R}, journal = {The R Journal}, year = {2011}, note = {https://rjournal.github.io/}, volume = {3}, issue = {1}, issn = {2073-4859}, pages = {55-63} }