Stochastic differential equations (SDEs) are useful to model continuous stochastic processes. When (independent) repeated temporal data are available, variability between the trajectories can be modeled by introducing random effects in the drift of the SDEs. These models are useful to analyze neuronal data, crack length data, pharmacokinetics, financial data, to cite some applications among other. The R
package focuses on the estimation of SDEs with linear random effects in the drift. The goal is to estimate the common density of the random effects from repeated discrete observations of the SDE. The package mixedsde proposes three estimation methods: a Bayesian parametric, a frequentist parametric and a frequentist nonparametric method. The three procedures are described as well as the main functions of the package. Illustrations are presented on simulated and real data.
Continuous stochastic processes are usually observed discretely in time (with equidistant time points or not) leading to times series, although their intrinsic nature is of continuous time. While discrete time stochastic models such as auto-regressive models (ARMA, GARCH, ...) have been widely developed for time series with equidistant times, more and more attention have been focused on Stochastic Differential Equations (SDEs). Examples of applications where SDEs have been used include dynamics of thermal systems (Bacher and Madsen 2011), solar and wind power forecasting (Iversen et al. 2014), neuronal dynamics (Ditlevsen and Samson 2014), pharmacokinetic/pharmacodynamic (PK/PD) (Hansen et al. 2014), crack growth (Hermann et al. 2016). Estimation for SDE is available in different softwares. We can cite among others the computer software CTSM with a (extended) Kalman filter approach (Kristensen and Madsen 2003), the sde package which proposes several tools for the simulation and the estimation of a variety of SDEs, and more recently the R-packages Sim.DiffProc (Guidoum and Boukhetala 2017) and yuima (Iacus 2018) (the last one proposes also some tools for quantitative finance).
Depending on the applications, independent repeated temporal measures might be available. For examples, drug concentration of several subjects is usually observed in PK; dynamics of several neurons is measured along time; time to crack lengths can be measured repeatedly in crack growth study. Each trajectory represents the behavior of a unit/subject. The functional form is similar for all the trajectories. Fitting the overall data simultaneously obviously improves the quality of estimation, but one has to take into account these variabilities between experiments. This is the typical framework of mixed-effects models where some parameters are considered as random variables (random effects) and proper to each trajectory. Hence the random effects represent the particularity of each subject. Some parameters can also be considered as common to all the trajectories (fixed effects).
In this work the model of interest is thus a mixed-effects stochastic
differential equation (MSDE), mixed-effects for both fixed and random
effects. The mixedsde package has been developed to estimate the
density of the random effects from the discrete observations of
More precisely, we focus on MSDE with linear drift. We consider
In the package, we restrict the models to the two famous SDEs with
linear drift, namely the Ornstein-Uhlenbeck model (OU) with
The random parameters are denoted
The
Context: To the best of our knowledge, this is the first package in
R
language dedicated to the estimation of MSDE. The main software
considering mixed models is (MONOLIX 2003) but methods for mixed stochastic
differential equations are not implemented for R
. One package named
PSM (Mortensen and Klim 2013) provides functions for estimation of linear and non-linear
mixed-effects models using stochastic differential equations. But the
model includes measurement noise and proposes only parameter estimation.
Moreover, there is no mathematical property about the used estimators.
In this context, the package presented is this paper is pioneer.
Estimation procedures for MSDE have been proposed in the non-parametric
and the parametric frameworks, with a frequentist and a Bayesian point
of view. The parametric approaches assume Gaussian random effects
Three estimation procedures are implemented in the mixedsde package: a
kernel nonparametric estimator (Dion and Genon-Catalot 2015), a parametric maximum likelihood
estimator (Delattre et al. 2013) and a parametric Bayesian estimator (Hermann et al. 2016). The
parametric frequentist and Bayesian approaches assume the random effects
Gaussian. The Bayesian approach seems the most appropriate method for a
small time of observation
This paper reviews in Section 2 the three estimation
methods. An overview of the mixedsde package is given in Section
3 through a description of the main functions and of other
related companion functions. The practical use of this package is
illustrated in Section 4 on simulated data and in
Section 5 on one real dataset in neuronal modeling.
We briefly recall the methodology of the three estimators implemented in the mixedsde package. We start with the nonparametric approach, then the frequentist parametric Gaussian method and finally the Bayesian parametric Gaussian method.
The first step of the nonparametric approach is to estimate the random
effects. The idea is to maximize the likelihood of the process
In the mixedsde package, Gaussian kernel estimators are implemented
with the R
-functions density
(available in package stats) when
kde2d
(available in package MASS (Venables and Ripley 2016)) when bw="ucv"
, or as the default value
given by the rule-of-thumb if the chosen bandwidth is too small. Note
that the estimator is unstable for small variance of the random effects.
It is important to notice that the two random effects are not assumed independent. When there is only one random effect, the fixed parameter has to be entered by the user.
The computation of
In this section and the following one, we assume that the random
parameters
For the bidimensional case
For the case
The likelihood function is defined as
optim
is used to maximize numerically the likelihood. The maximum is
generally not unique and depend on the initialization. A good
initialization is another estimator, for example the moment estimator of
optim
is thus initialized with the mean and the
variance of the estimators
Note that this parametric approach requires the knowledge of
Selection of (non-nested) models can be performed with the BIC criteria,
defined by
Theoretical results are obtained on these estimators in the continuous
observations context under the asymptotic regime
For the Bayesian approach we assume similarly to the frequentist
parametric estimation method a Gaussian distribution for
We now introduce prior distributions implemented in mixedsde package
for the parameters
The aim is to calculate the posterior distribution
Analogically to the frequentist approach, there is a first step: sample
from the full conditional posterior of the random effects
The second step is the estimation of the hierarchical parameters
The last step of the Gibbs sampler is sampling from the full conditional
posterior of
In the case of one random effect, there is one additional Gibbs sampler step for the fixed effect, that is also conducted through a MH algorithm.
In the package, the starting values for the Gibbs sampler are set equal
to the mean of the prior distributions. In all the MH algorithms, one
each has to choose a proposal density. In the package mixedsde, we use
a normal density for all location parameters with mean equal to the last
chain iteration and a proposal variance that has to be chosen. For the
CIR model, the proposal distribution for
In many cases, one is not only interested in parameter estimation but
also in the prediction for future observations. The first step is the
prediction of a future random effect
Given a new random effect
It is then interesting to compare the new trajectories with the real ones. If the number of new trajectories is large enough we compute an empirical confidence interval.
This Section presents an overview of the functions implemented in the package. Illustrations of the code are given in Section 4.
Data is a matrix X
of size times
of length
To lead a simulation study, the function mixedsde.sim
allows to
generate a list with a X
of times
with the equidistant times. This function leans on
function sde.sim
available via package sde (Iacus 2006) to
simulate SDE. One has to choose: model
either OU or CIR; random
that
fixes the position and the number of random effects: random = 1
for
random = 2
for random = c(1,2)
for invariant
, default value 0 means that density.phi
to choose the distribution of the random
effect (see package documentations).
Main function is mixedsde.fit
producing estimation of the random
effects and their common density. Inputs of mixedsde.fit
are
X
a times
The vector of observations times.model
The chosen model either OU or CIR.random
It fixes the position and the number of random effects:
random = 1
for random = 2
for random = c(1,2)
for estim.method
The estimation method: nonparam
(see Section
2.1), paramML
(see Section 2.2) or
paramBayes
(see Section 2.3).fixed
The value of the fixed effect random = 1
(resp. random = 2
), default 0. (Only for the
frequentist approaches).estim.fix
1 if the fixed effect is estimated, default 0. (Only for
the frequentist parametric approach when random=1
or 2).gridf
The x-axis grid on which the random effect distribution is
computed: we recommend a fine grid with at least 200 points, default
value is a sequence of length 500 starting in
prior
The list of prior parameters
m, v, alpha.omega, beta.omega, alpha.sigma,
beta.sigma
for paramBayes
method: Default values are calculated
based on the estimations nMCMC
The length of the Markov chain for paramBayes
method.
(Only for the Bayesian approach).Note that for the frequentist approach if there is only one random
effect, then the user has the choice: fix it to a value of the user
choice (using: fixed
= the value and estim.fix
=0) or estimate it
through the package (choosing estim.fix=1
. In the following we
describe the related methods, proposed in the package, they are
summarized in Table 1.
Class | Freq.fit | Bayes.fit |
---|---|---|
Method | out | out |
Method | plot | plot |
Method | – | plot2compare |
Method | ||
Method | summary | summary |
Method | pred | pred |
Method | valid | valid |
Output of mixedsde.fit
is a S4 class called Freq.fit
for the
frequentist approaches and Bayes.fit
for the Bayesian approach.
Results of the estimation procedure are available as a list applying
function out
to the Freq.fit
(resp. Bayes.fit
) object.
Elements of Freq.fit
are:
sigma2
Estimator estimphi
Estimator estimphi.trunc
The truncated estimator estim.fixed
The estimator of the fixed effect if random = 1
or
2, estim.method = paramML;
estim.fix = 1
, default 0.gridf
The x-axis grid on which the random effect distribution is
computed.estimf
The estimator of the density of the random effects (for
both paramML
method with Equation (7) and nonparam
method with Equation (6)).cutoff
Binary FALSE
when no truncation.estimf.trunc
The truncated estimation of the density of the random
effects.mu
Estimation of Gaussian mean of the random effects (only for
paramML
method from Equation (7)).omega
Estimation of Gaussian variance matrix of the random effects
(only for paramML
method method from Equation (7)).aic
and bic
AIC and BIC criteria (only for paramML
method).index
Indices of trajectories used for the estimation, excluded
are trajectories with CIR
model.Elements of Bayes.fit
are:
sigma2
Trace of the Markov chain simulated from the posterior of
mu
Trace of the Markov chain simulated from the posterior of omega
Trace of the Markov chain simulated from the posterior of
alpha
Trace of the Markov chain simulated from the posterior of
nMCMC
nMCMC
beta
Trace of the Markov chain simulated from the posterior of
nMCMC
nMCMC
burnIn
A proposal for the burn-in phase.thinning
A proposal for the thin rate.ind.4.prior
The indices used for the prior parameter calculation,
Outputs burnIn
and thinning
are only proposals for a burn-in phase
and a thin rate. The proposed burnIn
is calculated by dividing the
Markov chains into 10 blocks and calculate the 95% credibility intervals
and the respective mean. Starting in the first one, the block is taken
as burn-in as long as the mean of the current block is not in the
credibility interval of the following block or vice versa. The thinning
rate is proposed by the first lag which leads to a chain autocorrelation
of less than 80%. It is not easy to automate these choices, so it is
highly recommended by the authors to plot the chains and look at the
mixing property (the chain should not be piecewise constant).
Command plot()
applied to a Freq.fit
object produces a frequencies
histogram of plot()
is applied to a Bayes.fit
object, one can choose four
different options, named style
. The default value is chains
, it
plots the Markov chains for the different parameter values. acf
leads
to the corresponding autocorrelation functions, density
to the
approximated densities for each parameter and cred.int
leads to the
credibility intervals of the random parameters with the input parameter
level
with default 0.05. For all options, with the input parameter
reduced = TRUE
, the burn-in period is excluded and a thinning rate is
taken, default is FALSE
. There is also a possibility to include the
prior means in the plots by lines with plot.priorMean = TRUE
, default
is FALSE
.
In the Bayesian estimation the influence of prior parameters is
interesting, thus for the Bayes.fit
object, there is a second plot
method, named plot2compare
where three estimation objects can be
compared. For reasons of clarity, only the densities are compared, with
the default reduced = TRUE
. Here, there is also a possibility to
include true.values
, a list of the true parameters for the comparison
in a simulation example.
Command summary()
applied to a Freq.fit
object computes the kurtosis
and the skewness of the distribution, Bayes.fit
object, it
computes means and credibility interval (default level 95%) for each
parameter (burnIn
and thinning
.
Command print()
applied to a Freq.fit
object returns the use or not
of the cutoff and the vector of excluded trajectories. When applied to a
Bayes.fit
object, it returns the acceptance rates of the MCMC
procedure.
Validation of a mixed model, obtained with function valid
, is an
individual validation. Indeed, the validation of estimation of
trajectory number
Freq.fit
or Bayes.fit
object.plot.valid
1 to generate a figure (default value is 1).numj
A specific individual trajectory to validate (default:
randomly chosen between 1 and Mrep
The number of simulated trajectories (default value 100).Each observation Mrep
simulated values
Outputs are the list of the
plot.valid
=1, two plots are produced. Left: plot of the Mrep
new
trajectories (black) and the true trajectory number numj
(in
grey/red). Right: quantile-quantile plot of the quantiles of a uniform
distribution and the Mrep
simulated values
This is an empirical method. The recent work (Kuelbs and Zinn 2015) on depth and quantile regions for stochastic processes (see for example (Zuo and Serfling 2000) for depth functions definitions) should provide the theoretical context for a more extensive study. This could be done in further works.
Prediction (see Section 2.4) is implemented in function
pred
. Main inputs of the function are
Freq.fit
or Bayes.fit
object.invariant
TRUE if the new trajectories are simulated according to
the invariant distribution.level
The level of the empiric prediction intervals (default
0.05).plot.pred
TRUE to generate a figure (default TRUE).(and optional plot parameters). Function pred
applied to a Freq.fit
object returns a list with predicted random effects phipred
, predicted
trajectories Xpred
and indexes of the corresponding true trajectories
indexpred
(see Section 2.4 for details of simulation).
If plot.pred = TRUE
(default) three plots are produced. Left predicted
random effects versus estimated random effects. Middle: true
trajectories. Right predicted trajectories and their empirical 95%
prediction intervals (default value level=0.05
). The prediction can
also be done from the truncated estimator pred.trunc = 1
.
Function pred
applied to a Bayes.fit
object returns a S4 class
object Bayes.pred
. The first element of this class is Xpred
, which
depends on the input parameters. Including the input
trajectories = TRUE
, matrix Xpred
contains the trajectories = FALSE
which leads to the calculation of the predictive
distribution explained in Section 2.4. With the input
only.interval = TRUE
(default), only the quantiles for the 1- level
prediction interval are calculated, stored in qu.l
and qu.u
. Input
only.interval = FALSE
provides additionally Xpred
containing
sample.length
(default 500) samples from the predictive distribution
in each time point of the observations (except the first). In both
cases, with plot.pred = TRUE
, two figures are produced. On the left
side, the data trajectories are compared with the prediction intervals
and on the right side, the coverage rate is depicted which is stored in
entry coverage.rate
, namely the amount of series covered by the
prediction intervals for each time point. The last class entry estim
stores the results from the Bayes.fit
object in a list. Other input
parameters are burnIn
and thinning
which allow for the choice of
other burn-in phase and thinning rate than proposed in the Bayes.fit
object.
For the Bayes.pred
class object, two plot methods are available.
plot()
repeats the figures that are created with the
plot.pred = TRUE
command in the pred
method. plot2compare()
compares up to three Bayes.pred
objects, where in a first figure the
prediction intervals are presented in colors black, red and green and
the observed data series in grey and in a second figure the
corresponding coverage rates are compared. With the input parameter
names
a vector of characters to be written in a legend can be
indicated.
Note that to avoid over-fitting, we recommend to use only
In this part two simulated examples are given to illustrate the strengths of each proposed method. Two datasets are simulated according to:
CIR model with one non-Gaussian random effect
R> model1 <- "CIR"; random1 <- 2; fixed1 <- 1; sigma1 <- 0.1 ; M1 <- 200;
R> T1 <- 50; N1 <- 1000; X01 <- 1; density.phi1 <- "gamma";
+ param1 <- c(1.8,0.8);
R> simu1 <- mixedsde.sim(M = M1, T = T1, N = N1, model = model1,
+ random =random1, fixed = fixed1, density.phi = density.phi1,
+ param = param1, sigma = sigma1, X0 = X01)
R> X1<- simu1$X; phi1 <- simu1$phi; times1 <-simu1$times
OU model with one Gaussian random effect
R> model2 <- "OU"; random2 <- 1; sigma2 <- 0.1; fixed2 <- 5; M2 <- 50;
+ T2 <- 1;N2 <- 500; X02 <- 0; density.phi2 <- "normal";
+ param2 <- c(3, 0.5);
R> simu2 <- mixedsde.sim(M = M2, T = T2, N = N2, model = model2,
+ random = random2, fixed = fixed2, density.phi = density.phi2,
+ param = param2, sigma = sigma2, X0 = X02)
R> X2 <- simu2$X; phi2 <- simu2$phi; times2 <- simu2$times
Example 1 has non Gaussian random effect, the nonparametric method is
the most appropriate approach. Example 2 has
We illustrate nonparametric estimation on Example 1. Code for the nonparametric estimation is
R> estim.method <- 'nonparam'
R> estim_nonparam <- mixedsde.fit(times = times1, X = X1, model = model1,
+ random = random1, fixed = fixed1, estim.method = estim.method)
R> outputsNP <- out(estim_nonparam) # stores the results in a list
Summary function provides:
R> summary(estim_nonparam)
[,1] [,2]
[1,] "sigma" "0.099868"
Random effect:
[,1]
empiric mean 1.355403
empiric sd 0.939410
kurtosis 3.695013
skewness 1.083577
As expected kurtosis is larger than 3 and skewness is positive which means that the distribution is right-tail. Figure 1 is provided by
R> plot(estim_nonparam)
Nonparametric estimation fits well the histogram of
Because we are working on simulated data, we can compare the estimations
with the true random effects and the true
# Comparison of the true f and its estimation
R> gridf1 <- outputsNP$gridf
# True density function
R> f1 <- dgamma(gridf1, shape = param1[1], scale = param1[2])
# Nonparametric estimated density function
R> fhat <- outputsNP$estimf
R> plot(gridf1, f1, type='l', lwd=2, xlab='', ylab='')
R> lines(gridf1, fhat, col='red')
# Comparison of the true random effects and their estimations
# Estimated random effects
R> phihat1 <- outputsNP$estimphi
R> plot(phi1, phihat1, type = "p", pch = 18, xlab='', ylab='')
R> abline(0, 1)
This results in Figure 2. On the left plot, the
estimated density (dotted curve) is very close to the true density
Validation of the MSDE is produced by function valid
. The two graphs
on the right of Figure 5 are obtained by
R> validationCIR <- valid(estim_nonparam)
Prediction are obtained with pred
and similar Figure 6
(not shown) can be obtained with
R> predNPCIR <- pred(estim_nonparam)
We present the parametric estimation on Example 2. The code is
# Parametric estimation
R> estim.method<-'paramML';
R> estim_param <- mixedsde.fit(times2, X = X2, model = model2,
+ random = random2, estim.fix = 1, estim.method = 'paramML' )
# Store the results in a list:
R> outputsP <- out(estim_param)
Summary function provides:
R> summary(estim_param)
[,1] [,2]
[1,] "sigma" "0.109144"
Random and fixed effects:
[,1]
estim.fixed 4.914685
empiric mean 2.955582
MLE mean 2.955512
empiric sd 0.536956
MLE sd 0.519955
kurtosis 2.472399
skewness 0.427223
[,1] [,2]
[1,] "BIC" "-3780.383134"
[2,] "AIC" "-3795.335809"
Kurtosis is, as expected, close to 3 and skewness close to 0. The
diffusion parameter
R> plot(estim_param)
The small number of observations makes the estimation harder, nevertheless here, the histogram seems pretty well fitted by the parametrically estimated density.
Because we are working on simulated data, we can compare the estimations
with the true random effects and the true
# Comparison of the true f and its estimation
R> gridf2 <- outputsP$gridf
# True density
R> f2 <- dnorm(gridf2, param2[1], param2[2])
# Parametric estimated density
R> fhat_param <- outputsP$estimf
R> plot(gridf2, f2, type = 'l', lwd = 2, xlab = '', ylab = '')
R> lines(gridf2, fhat_param, col='red', lty = 2, lwd = 2)
# Comparison of the true random effects and their estimations
# Estimated random effects
R> phihat2 <- outputsP$estimphi
R> plot(phi2, phihat2, type="p", pch=18, xlab='', ylab='')
R> abline(0, 1)
This results in Figure 4. It shows that estimation of the density is satisfactory (left) and estimation of the random effects is very good (right).
Validation of the MSDE is produced by function valid
. For example the
individual validation of the first trajectory is plotted Figure
5, the first two graphs on the left, using
R> validationOU <- valid(estim_param)
valid
method. Two top plots: frequentist nonparametric
estimation on example 1 (CIR process). Two bottom plots: frequentist
parametric estimation on example 2 (OU process).
This illustrates the good estimation of the random effects: a beam of trajectories with the true one in the middle and the lining up of the quantiles.
Finally, we can predict some trajectories using pred
. Predictions are
shown on Figure 6, as a result of
R> predPOU <- pred(estim_param)
Beam of 32 predicted trajectories (right) is close to the true ones (middle). The lining up of the predicted random effects versus the estimated random effects (left) shows the goodness of the prediction from the estimated density, thus of the estimation of the density.
Bayesian method is applied to Example 2. Priors are constructed from the true values, but default values can be used.
R> prior2 <- list( m = c(param2[1], fixed2), v = c(param2[1], fixed2),
+ alpha.omega = 11, beta.omega = param2[2] ^ 2 * 10, alpha.sigma = 10,
+ beta.sigma = sigma2 ^ 2 * 9)
R> estim.method <- 'paramBayes'
R> estim_bayes <- mixedsde.fit(times = times2, X = X2, model = 'OU',
+ random = random2, estim.method = estim.method, prior = prior2, nMCMC = 10000)
R> outputsBayes <- out(estim_bayes)
Figure 7 is produced by
R> plot(estim_bayes)
Traces of the Markov chains of
Command print()
yields acceptance rates of the MH algorithm:
R> print(estim_bayes)
acceptance rates for random effect:
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.5569 0.5646 0.5676 0.5682 0.5718 0.5805
acceptance rate for fixed effect: 0.4248
The fixed effect
Predictions in the Bayesian framework and the corresponding Figure 8 is obtained by
R> pred.result <- pred(estim_bayes)
Figure 8 shows the beam of simulated data trajectories together with the 95% prediction interval. Coverage rates are shown on the right plot and we see that the intervals hold the level.
A real dataset is available (neuronal.data.rda
) through lists of a
matrix X
and a vector times
. We detail below the analysis of this
dataset, following the next steps: run the two random effects model with
both the parametric and nonparametric procedure; choose the number of
random effects depending on the variability of the estimators
These data are available thanks to Rune Berg and Jufang He. Details on data acquisition can be found in (Lansky et al. 2006).
Neurons are the basement of nervous system and each neuron is connected
with around 1000 other neurons. They are communicating through emission
of electrical signal. We focus on the dynamic of the neuron membrane
potential between two spikes emission measured in volts as the
difference of ions concentration between the exterior and the interior
of the cell. Data are obtained from one single neuron of a pig. Data are
composed of s
]
and observation time is s]
. Data are uploaded using
data("neuronal.data")
. They are presented on Figure 9.
These data have been previously analysed with a Ornstein-Uhlenbeck model
with one additive random effect (
In this new analysis, we assume that both
Our goal is also to compare the two models OU and CIR, both with two random effects, and two approaches: the nonparametric density estimation and the parametric density estimation. Let us remark that for the CIR model the algorithm removes two trajectories: 168 and 224, because they contain negatives values. For two random effects the command is
R> estim <- mixedsde.fit(times, X = X, model = model, random = c(1,2),
+ estim.method = estim.method)
and they can be found in the help data file (command ?neuronal.data
).
We first apply the two frequentist approaches on models with two random
effects. Kurtosis and skewness of the distribution of the estimation
mixedsde.fit
.
Figure 10 gives the 4 estimated marginals. The blue (black)
is for the OU model and the green (grey) for the CIR model. The dotted
lines are the estimations from the parametric method, the plain lines
for the nonparametric estimation. Parametric and nonparametric
estimators are close, except for the second random effect with the OU
model. Indeed, parametric estimation produces a small variance for the
second random effect, suggesting it could be fixed. Would this
assumption be valid, it explains the difference with the nonparametric
estimator which is not stable if the variance is to small. Estimation of
To compare with previous literature results, we focus on the OU model.
To select the number and the position of the random effects, we run the
code with one random effect, additive or multiplicative: random = 1
or
random = 2
, for both models estimating the common fixed parameter with
the parametric frequentist strategy. Estimators of the means
random = 1
and random = c(1,2)
are the best according to the BIC and
AIC criteria.
Finally, in Table 4 we compare the BIC and AIC criteria for
random = 1
when the value of the fixed effect is plugged in: the one
we obtained in Table 3 and to values obtained in
(Picchini et al. 2008) and (Picchini et al. 2010). The preferred model is the one minimizing
both criteria. Thus, the OU model with one additive random effect
The OU model with random = 1
is then validated with valid
function.
Figure 12 illustrates the result for a random trajectory
(number 141): 100 simulated trajectories (black) and true trajectory
Finally some prediction plots are performed (not shown) with the pred
method and they confirm that model OU with random = c(1,2)
with the
parameters obtain from the parametric estimation, and the OU model with
random = 1
and
We then apply the Bayesian procedure. As already mentioned, for the
Bayesian procedure, large data sets are a problem because of the very
long running time. Therefore, we thin the data set by 10. That means,
every 10th data point of the series is used for the estimation and also
for the prediction. Even with this thinning, one estimation with 20000
samples takes half an hour.
Based on the best model selected by the frequentist approach, the OU
model with one random effect
In Figure 14, we can see a comparison of the
prediction results for all three cases,
OU | CIR | |
---|---|---|
Aj1 Kurtosis | 6.17 | 11.70 |
Skewness | 0.96 | 2.32 |
Aj2 Kurtosis | 6.68 | 7.07 |
Skewness | 0.96 | 2.32 |
BIC | AIC | |||||
---|---|---|---|---|---|---|
random=c(1,2) |
0.38 | 0.06 | 37.30 | 1.10 | -3229.67 | -3247.59 |
random=2 |
0.37 | - - | 37.70 | 7.47 | -3082.36 | -3103.40 |
random=1 |
0.38 | 0.06 | 37.22 | - - | -3227.47 | -3248.51 |
BIC | AIC | ||||
---|---|---|---|---|---|
0.27 | 0.04 | 25.64 | -2971.59 | -2980.55 | |
0.47 | 0.08 | 47.00 | -3043.89 | -3052.86 | |
Previous estimator MLE of |
0.38 | 0.06 | 37.22 | -3240.55 | -3249.51 |
In this paper we illustrate the functionality of the package mixedsde
for inference of stochastic differential equations with random and/or
fixed effects. This package, and mainly the function misedsde.fit
, can
be used to choose the best model to fit some data. It allows to compare
two models: OU or CIR with one or two random effects. The three
estimation methods can be used to help the decision maker. Nevertheless
each method can be more appropriate to a specific situation, as
explained before: the Bayesian method is recommended for a small number
of observations, the frequentist nonparametric is a good tool with two
random effects and no prior available. In particular the frequentist
parametric proposes for a large sample, an estimation of the fixed
effect and of the parameters of the Gaussian distribution for the fixed
effect when there is only one. A neuronal dataset is studied with the
three methods. Furthermore, other real data should be investigated with
the present package.
Recently, the parameter estimation method developed in (Delattre et al. 2016) for random effects distributed according to a Gaussian mixture distribution has been implemented in the R package MseParEst (Delattre and Dion 2016).
The authors would like to thank Vincent Brault and Laurent Bergé for technical help on the package. This work has been partially supported by the LabExPERSYVAL-Lab(ANR-11-LABX-0025-01).
The second author, Simone Hermann, was financially supported by Project B5 “Statistical methods for damage processes under cyclic load” of the Collaborative Research Center “Statistical modeling of nonlinear dynamic processes” (SFB 823) of the German Research Foundation (DFG).
When there is one random effect, what is the likelihood function and the
MLE of the fixed effect?
Assume that we are in the case of random = 1
, thus the process is
random=c(1,2)
, where random=1
we get
random = 2
the roles of
This article is converted from a Legacy LaTeX article using the texor package. The pdf version is the official version. To report a problem with the html, refer to CONTRIBUTE on the R Journal homepage.
Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Dion, et al., "mixedsde: A Package to Fit Mixed Stochastic Differential Equations", The R Journal, 2019
BibTeX citation
@article{RJ-2019-009, author = {Dion, Charlotte and Hermann, Simone and Samson, Adeline}, title = {mixedsde: A Package to Fit Mixed Stochastic Differential Equations}, journal = {The R Journal}, year = {2019}, note = {https://rjournal.github.io/}, volume = {11}, issue = {1}, issn = {2073-4859}, pages = {44-66} }