Tackling Uncertainties of Species Distribution Model Projections with Package mopa

Species Distribution Models (SDMs) constitute an important tool to assist decision-making in environmental conservation and planning in the context of climate change. Nevertheless, SDM projections are affected by a wide range of uncertainty factors (related to training data, climate projections and SDM techniques), which limit their potential value and credibility. The new package mopa provides tools for designing comprehensive multi-factor SDM ensemble experiments, combining multiple sources of uncertainty (e.g. baseline climate, pseudo-absence realizations, SDM techniques, future projections) and allowing the assessment of their contribution to the overall spread of the ensemble projection. In addition, mopa is seamlessly integrated with the climate4R bundle and allows straightforward retrieval and post-processing of state-of-the-art climate datasets (including observations and climate change projections), thus facilitating the proper analysis of key uncertainty factors related to climate data.


Introduction
Species Distribution Models (SDMs) are statistical tools used for the generation of probabilistic predictions of the presence of biological entities in the geographical space (Guisan and Zimmermann, 2000; Elith et al., 2006). SDMs operate through the establishment of an empirical link between known presence locations and the physical characteristics of their environment. A particular case is that of Climate Envelope Models (CEMs), where appropriate climatic variables are used as predictors to characterize the climatic conditions where a species can potentially live, typically in the form of bioclimatic variables (Nix, 1986; Busby, 1991). In the context of climate change, SDMs have become a valuable tool for the vulnerability and impact assessment community, as a means of estimating distribution shifts due to climate variations, a problem of current interest in environmental conservation studies (see e.g.: Araújo et al., 2004; Hamann and Wang, 2006; Jeschke and Strayer, 2008). These studies require suitable climate products to produce models at an adequate spatial resolution and varying geographical extents (up to global), including historical climate databases (i.e. high-resolution gridded observations) and future climate projections for different emission scenarios. However, the intricacy of climate data retrieval and post-processing of the existing climate products (e.g. the global and regional climate change projections available from the Earth System Grid Federation, ESGF, Taylor et al., 2011) has resulted in a wide use of ready-to-use products without considering their limitations for a particular case study (see Bedia et al., 2013). In this paper we fill this gap with the package mopa (Species Distribution MOdeling with Pseudo-Absences), which has been developed in the framework of the climate4R bundle for climate data access and post-processing, thus facilitating the use of state-of-the-art global and regional climate data for SDM projections.
Despite the increased use of future SDM projections as a support tool for decision-making in biological conservation, the communication of the inherent uncertainties of these products remains an ongoing challenge (see, e.g. Araújo et al., 2005; Beaumont et al., 2008; Fronzek et al., 2011). A common approach to tackle different sources of uncertainty is based on producing ensembles of future SDM projections that encompass a wide range of variability by considering multiple choices of each of the factors/components involved in the modeling and projection chain (Araújo and New, 2007; Buisson et al., 2010; Bagchi et al., 2013). However, there are important sources of uncertainty that are rarely quantified, yet crucial, in order to assess the credibility of the future distributions, such as the training data (including the baseline climate) used to fit the SDMs characterizing the ecological niche (Mateo et al., 2010; Bedia et al., 2013; Baker et al., 2016), the varying extrapolation ability outside the training period/spatial extent of the different SDM techniques (known as SDM transferability in time/space; Bedia et al., 2011; Fronzek et al., 2011), the Global/Regional Climate Model (GCM/RCM) projections and biases (Turco et al., 2013) and others (see e.g.: Falloon et al., 2014, for an overview). Moreover, the ensemble approach also has limitations, since it assumes that all SDMs are equally transferable to climate change conditions, thus posing the risk of diluting insightful model signals with noise and error from less useful or defective SDMs forming the ensemble (Thuiller et al., 2004; Peterson et al., 2011).
The package mopa presented here has been designed to facilitate the design and analysis of comprehensive multi-factor SDM ensemble experiments, exploring different uncertainty factors such as presence data sets, pseudo-absence realizations, baseline climate, modeling algorithms, and future climate projections. Moreover, mopa provides variance partition tools to assess the contribution of the different factors to the overall uncertainty/spread of the ensemble projection. We illustrate the functionality of the package with the case-study presented in Iturbide et al. (2018), focusing on the impact of the pseudo-absence data on the future distribution of a specific Oak phylogenetic group in Europe resulting from an ensemble of SDM projections considering three factors: 1) different SDM techniques, 2) different realizations of randomly generated pseudo-absence data and 3) different climate projections produced over Europe from an ensemble of RCMs. The analyses undertaken with mopa reveal the sensitivity of SDMs to the pseudo-absence samples, affecting model stability and transferability to new climate conditions, with important implications for the construction of the final ensemble projections. We use and provide publicly available data to guarantee the reproducibility of the results.
The R Journal Vol. 10/1, July 2018 ISSN 2073-4859

mopa and the climate4R bundle for climate data access
The numerous climate databases available (both baselines and future projections) are scattered across many different repositories with various file formats, variable naming conventions, etc., sometimes requiring relatively complex, time-consuming data downloads and error-prone processing steps (e.g. bias correction) prior to SDM development. This is also a major barrier for research reproducibility and data exchange. The climate4R bundle is a set of R packages specifically designed to ease climate data access, analysis and processing in a straightforward manner, tailored to the needs of the impacts and vulnerability assessment community. Further details and references to worked examples and tutorials can be found for instance in Cofino et al. (2017), Bedia et al. (2017) and Frías et al. (2018). In this regard, mopa was developed as part of the climate4R ecosystem, so that typical climate data operations for SDM applications and conversion features to the data type handled by mopa are provided. Additionally, mopa includes a user guide with an end-to-end worked example of climate data retrieval, transformation and SDM development: help(package = "mopa").

The "niche" of mopa within the "SDM ecosystem" in R
The popularity of R and its excellent statistical modeling and spatial analysis support has favored the development of specific, well-established and actively maintained packages for SDM construction and analysis, such as sdm (Naimi and Araújo, 2016), biomod2 (Thuiller et al., 2016), dismo (Hijmans et al., 2017) and SDMTools (Van der Wal et al., 2014), some of them also implementing pseudo-absence data generation and ensemble building utilities. For instance, both sdm and biomod2 implement methods for building ensemble projections based on model performance in the calibration phase (e.g. by discarding or weighting the obtained results). In contrast, mopa is oriented towards the design and analysis of multi-factor ensembles of future SDM projections (considering as potential factors the presence data sets, the pseudo-absence realizations, the baseline climate, the modeling algorithms, and the future climate projections). The analysis of the resulting ensemble allows, for instance, assessing the problem of SDM transferability, which cannot be properly evaluated during model calibration.
Besides, unlike previously existing packages, mopa allows pseudo-absence data generation as an independent step prior to model fitting, thus providing finer control to the user for the analysis of several alternative methods and specific tuning options. In addition, the novel Three-Step method for pseudo-absence data generation is implemented (TS hereafter, Senay et al., 2013; Iturbide et al., 2015), providing a convenient interface that allows a fine tuning of the technique with simple arguments. Furthermore, mopa is also seamlessly integrated with standard R packages for spatial data manipulation like raster (Hijmans, 2015) and sp (Pebesma and Bivand, 2005; Bivand et al., 2013), allowing their usage at any stage of the modeling process (e.g. for data visualization and post-processing), and also extensibility to other SDM tools available in sdm, biomod2, etc., which handle the same spatial data classes.

Climate data
SDM predictor variables (in this case-study a number of bioclimatic variables, but not necessarily so) are introduced in the analysis as collections of raster objects of the classes rasterBrick or rasterStack, as in other SDM-oriented packages. For instance, here we use a set of present and future bioclimatic variables widely used in SDM applications based on precipitation and temperature climatologies (Busby, 1991), computed with the function biovars of package dismo. To this aim, we first exploit the climate4R functionalities to load and post-process observed precipitation and temperature climatologies from the E-OBS gridded observational dataset (Haylock et al., 2008) and the simulations of 7 Regional Climate Models (RCMs) of the project ENSEMBLES (van der Linden and Mitchell, 2009, http://www.ensembles-eu.org) for the control (20C3M, 1971-2000) and future (A1B, 2071-2100) scenarios, including the application of bias-correction ("delta" method, e.g. Winkler et al., 1997; Zahn and von Storch, 2010).

Species distribution data
Several impact studies indicate that species should be modeled by treating sub-specific groups of organisms independently (e.g.: distinct genetic lineages) due to their differing adaptive responses to changes in their environment (Hernandez et al., 2006; Beierkuhnlein et al., 2011; Serra-Varela et al., 2015). Although this is not always possible, because information on the distribution of sub-specific groups is rarely available for most species, mopa has been conceived with this idea in mind, being able to deal with several sets of presences simultaneously. This adds flexibility to the modeling process in order to carry out experiments considering different sub-collections of presences, not only for sub-specific analyses (Iturbide et al., 2015), but also to address the sensitivity of the modeled distributions to different characteristics of the training sample (e.g. the sample size, Hernandez et al., 2006; Mateo et al., 2010). Thus, the Oak_phylo2 mopa dataset contains a named list of length two, containing the geographical coordinates of presence localities for two different Oak phylogenies (H01 and H11, Petit et al., 2002). More details about the source data are provided in the help file of the dataset.

Geographic background
The geographic background is often defined as the spatial extent of the area considered in the SDM calibration stage. Here, we refer to the background as a regular, geo-referenced grid with a specific size and resolution, in which both the environmental variables and the presence localities are located, so its grid-points are the sampling units. Function backgroundGrid provides a simple way of generating a background using a raster-class object as reference. It also includes an additional argument (spatial.subset) for spatial subsetting, set by a raster::extent object or by one or several sets of bounding-box coordinates, providing great flexibility and ease of use for the analysis of SDM spatial aspects. For instance, it allows straightforward exploration of SDM geographical transferability or performing cross-validation experiments based on spatial folds (e.g.: Randin et al., 2006). As a result, when the object Oak_phylo2 is passed to backgroundGrid, two different backgrounds are created by default, each one spatially restricted by its phylogeny distribution (H11 and H01).

> bg <- backgroundGrid(raster = biostack$baseline$bio1)
A smaller domain than the previous one can be arbitrarily indicated by the user by providing a specific spatial extent:
> bg.subdomain <- backgroundGrid(raster = biostack$baseline$bio1, spatial.subset = extent(c(-10, 35, 45, 65)))
Similarly, the user might be interested in a background strictly constrained by the bounding box of the actual species localities, by just passing their coordinates to spatial.subset:
> bg.species <- backgroundGrid(raster = biostack$baseline$bio1, spatial.subset = presences)
Thus, the user has flexibility to perform further modifications of the background, so it would also be possible to discard specific areas based on expert knowledge (e.g. Serra-Varela et al., 2015). In this case study, we will retain the full background (bg) for further analyses.

Pseudo-absence generation
Most SDMs require data not only from known presences of the biological entity, but also absence data in order to model the binary response presence/absence as a function of the different environmental variables. While the sampling efforts are typically focused on recording presence localities (atlases, natural history collections, targeted samplings, etc.), in most cases there is no explicit information about the absence of the species. Therefore pseudo-absence generation is often required for SDM construction, by sampling the background of the study domain. Different methods have been proposed to this aim, whose choice has an important effect on the final SDM results, as highlighted in different previous studies (e.g.: Wisz and Guisan, 2009; Iturbide et al., 2015). However, there is no consensus on the best sampling design for generating pseudo-absences.
Pseudo-absence sampling in mopa is performed by the pseudoAbsences function. It implements a wide range of methodologies described in the literature (see Iturbide et al., 2015, for an overview and comparison of methods) for maximum user flexibility, but at the same time its arguments have been kept as simple as possible to ease its application (Table 1). Here, three methods are described: random sampling, random sampling with environmental profiling and the three-step method. Their main characteristics are briefly described next. A more extended explanation can be found in Iturbide et al. (2015).

Random Sampling (RS).
The RS method is the simplest and most frequent way of generating pseudo-absences (Iturbide et al., 2015). In the next example three times more pseudo-absences than presences are generated at random, keeping a 0.249° (approximately 30 km) exclusion buffer around known presence localities. Ten pseudo-absence realizations are considered:
> pa_RS <- pseudoAbsences(xy = presences, background = bg$xy, realizations = 10, exclusion.buffer = 0.249, prevalence = -0.5)
As an alternative to random sampling, a stratified sampling approach can be performed, based on homogeneous environmental conditions. To this aim, a clustering of the environmental space is applied following Senay et al. (2013) by setting argument kmeans to TRUE:
> pa_kmeans <- pseudoAbsences(xy = presences, background = bg$xy, exclusion.buffer = 0.249, prevalence = -0.5, kmeans = TRUE, varstack = biostack$baseline)

Random Sampling with Environmental Profiling (RSEP).
The RSEP method imposes restrictions on the environmental range of the background to be sampled for pseudo-absences. In mopa this is done by performing an environmental profiling of the background (function OCSVMprofiling) that, following Senay et al. (2013), applies a one-class support vector machine algorithm (OCSVM, implemented in package e1071, Meyer et al., 2017) returning a binary (presence/absence) classification of the background gridboxes based solely on the presence information (bg.profiled$presence and bg.profiled$absence in the example below). Only the predicted absence background is then retained for pseudo-absence generation.
> bg.profiled <- OCSVMprofiling(xy = presences, varstack = biostack$baseline, background = bg$xy)
> pa_RSEP <- pseudoAbsences(xy = presences, background = bg.profiled$absence, realizations = 10, exclusion.buffer = 0.249, prevalence = -0.5)

Three-step method (TS).
TS is based on imposing restrictions on both the environmental range and the spatial extent of the background from which pseudo-absences are sampled. This method has been shown to outperform other common approaches in terms of resulting SDM robustness (Iturbide et al., 2015). The TS method adds a further step to the RSEP method, consisting of the partition of the background space (as yielded by RSEP) into multiple bands using different radii from presence localities. In the example below, multiple distance bands with an increasing radius of 30 km between each other are created (argument by = 0.249, in degrees). The first one (with the shortest radius from presence localities) is at 30 km from the closest presence point (start = 0.249), and the largest one (the longest radius from presences) is set by default to half the length of the diagonal of the background bounding-box (see Iturbide et al., 2015, for more details).
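The idea of nested background extents defined by distance to the nearest presence can be illustrated with base R alone. The following is only a minimal sketch on synthetic coordinates, not the implementation used by backgroundRadius or pseudoAbsences: the grid, the presence points, the buffer and radius values, and all object names are hypothetical, plain Euclidean distance in degrees is used, and it is an assumption of this sketch that each extent keeps the grid points lying beyond the exclusion buffer and within the given radius.

```r
# Synthetic data: every value below is made up for illustration
set.seed(42)

# Regular background grid (decimal degrees) and hypothetical presences
bg_xy   <- expand.grid(x = seq(-10, 10, by = 0.5), y = seq(40, 55, by = 0.5))
pres_xy <- data.frame(x = c(-2, 0, 1.5), y = c(45, 47, 46))

# Euclidean distance (in degrees) from each grid point to its nearest presence
nearest_dist <- apply(bg_xy, 1, function(p) {
  min(sqrt((pres_xy$x - p[["x"]])^2 + (pres_xy$y - p[["y"]])^2))
})

# Illustrative settings (coarser than the 0.249 degrees used in the text)
exclusion_buffer <- 0.5   # no pseudo-absences this close to a presence
start <- 1                # shortest radius
step  <- 1                # radius increment between consecutive extents

# Largest radius: half the diagonal of the background bounding box
half_diag <- sqrt(diff(range(bg_xy$x))^2 + diff(range(bg_xy$y))^2) / 2
radii <- seq(start, half_diag, by = step)

# One background extent per radius (assumption: buffer < distance <= radius)
extents <- lapply(radii, function(r) {
  bg_xy[nearest_dist > exclusion_buffer & nearest_dist <= r, ]
})
names(extents) <- sprintf("r%.1f", radii)

# Draw as many pseudo-absences as presences from one of the extents
n_pa  <- nrow(pres_xy)
idx   <- sample(nrow(extents[["r3.0"]]), size = n_pa)
pa_xy <- extents[["r3.0"]][idx, ]
```

Because every extent contains the previous one, the number of candidate grid points can only grow with the radius, which is what makes the AUC-versus-radius analysis of the next sections possible.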

Model fitting
Once the pseudo-absence dataset(s) chosen by the user is (are) built, the mopaTrain function performs SDM fitting. The function is a wrapper for different statistical method implementations commonly used in SDM applications (see summary in Table 3). Moreover, mopaTrain adds extended functionality for cross-validation for each set of presence/absence data and for each different species contained in the presence dataset, as routinely done in SDM applications (see e.g.: Verbyla and Litvaitis, 1989). In the next line of code, the Oak H1 phylogeny is fitted using a generalized linear model (GLM, Guisan et al., 2002) and multivariate adaptive regression splines (MARS, Friedman, 1991), applying a 10-fold cross validation approach. Moreover, equal weighting of presences and pseudo-absences is indicated with the argument weighting = TRUE (see e.g.: Barbet-Massin et al., 2012).
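To make the combination of k-fold cross-validation and equal weighting concrete, the sketch below fits a weighted binomial GLM with base R on synthetic data. It is not the code run by mopaTrain: the data frame, the single predictor bio1 and the weighting rule (pseudo-absence weights scaled so that their total equals that of the presences) are assumptions made for illustration only.

```r
# Synthetic presence / pseudo-absence data (hypothetical values)
set.seed(1)
n_pres <- 50
n_abs  <- 150
df <- data.frame(
  pa   = c(rep(1, n_pres), rep(0, n_abs)),
  bio1 = c(rnorm(n_pres, mean = 12, sd = 2), rnorm(n_abs, mean = 8, sd = 4))
)

# Equal weighting (assumed rule): both classes carry the same total weight
df$w <- ifelse(df$pa == 1, 1, n_pres / n_abs)

# Random assignment of each record to one of k folds
k    <- 10
fold <- sample(rep(seq_len(k), length.out = nrow(df)))

# Fit on k - 1 folds, predict the held-out fold
cv_pred <- numeric(nrow(df))
for (i in seq_len(k)) {
  test <- fold == i
  # quasibinomial avoids the non-integer weight warning of family = binomial
  fit <- glm(pa ~ bio1, data = df[!test, ], weights = w,
             family = quasibinomial())
  cv_pred[test] <- predict(fit, newdata = df[test, ], type = "response")
}
```

The vector cv_pred then holds one out-of-sample probability per record, which is the kind of observed/predicted pairing that the performance measures discussed later operate on.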

The special case of model fitting with TS pseudo-absences
After the generation of TS pseudo-absences, multiple background extents exist as a result of the different distances defined by backgroundRadius. It has been noted that the background extent from which pseudo-absences are sampled is an important factor affecting not only model performance, but also its transferability and biological meaning (Van der Wal and Shoo, 2009). In this regard, Iturbide et al. (2015) propose a selection criterion based on the response of model performance as a function of distance radius, which is generalizable to different SDM characteristics and spatial scales. The performance criterion chosen is the Area Under the ROC Curve (AUC), one of the most widely used accuracy measures of binary classification systems (Swets, 1988). Essentially, the method performs a non-linear regression of the AUC obtained by each SDM against its background radius, considering three possible asymptotic models (Fig. 2):
1. Michaelis-Menten model: $v = \frac{a x}{K_m + x}$
2. 2-parameter exponential model: $v = a \left(1 - e^{-c x}\right)$
3. 3-parameter exponential model: $v = a - b e^{-c x}$
where v and x represent the AUC and the background extent respectively, a is the asymptotic AUC value achieved by the system and a − b is the intercept. K_m is the Michaelis constant (i.e. the extent at which the AUC is half of a), and c is the coefficient of the point where the curve is most pronounced. The asymptotic model that best fits the AUC response to the different background extents is automatically selected to extract the asymptotic AUC value. The minimum extent at which the AUC lies above the asymptote is retained as the optimal threshold radius, and the corresponding fitted SDM is returned. The asymptotic models are fitted internally by mopaTrain via the nls function from package stats whenever the TS method is used (this is automatically detected by the function). Optionally, a diagram displaying the results is also returned by setting the argument diagrams=TRUE (Fig. 2).
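The selection of an asymptotic model and of a threshold radius can be sketched with nls and the self-starting models shipped with package stats. This is a hedged illustration on synthetic AUC values, not the internal code of mopaTrain: the data, the use of SSmicmen, SSasympOrig and SSasymp as stand-ins for the three asymptotic models, and the use of AIC as the goodness-of-fit criterion are all assumptions of this sketch.

```r
# Synthetic AUC response to background radius (hypothetical values)
set.seed(123)
radius <- seq(30, 600, by = 30)                      # km
auc    <- 0.92 - 0.30 * exp(-0.01 * radius) +
  rnorm(length(radius), sd = 0.005)
d <- data.frame(radius = radius, auc = auc)

# Fit one model, returning NULL if nls fails to converge
safe_nls <- function(formula) {
  tryCatch(nls(formula, data = d), error = function(e) NULL)
}

# Self-starting equivalents of the three asymptotic forms:
#   SSmicmen:    a * x / (Km + x)
#   SSasympOrig: a * (1 - exp(-exp(lrc) * x))
#   SSasymp:     a + (R0 - a) * exp(-exp(lrc) * x), i.e. b = a - R0
fits <- list(
  michaelis_menten = safe_nls(auc ~ SSmicmen(radius, a, Km)),
  exp2             = safe_nls(auc ~ SSasympOrig(radius, a, lrc)),
  exp3             = safe_nls(auc ~ SSasymp(radius, a, R0, lrc))
)
fits <- Filter(Negate(is.null), fits)

# Keep the best-fitting model (lowest AIC, an assumed criterion)
best_name <- names(which.min(sapply(fits, AIC)))
a_hat     <- unname(coef(fits[[best_name]])["a"])    # asymptotic AUC

# Smallest radius whose observed AUC reaches the fitted asymptote, if any
reached        <- d$radius[d$auc >= a_hat]
optimal_radius <- if (length(reached) > 0) min(reached) else NA
```

Since the fitted curve approaches the asymptote from below, the threshold is necessarily evaluated on the observed AUC values, and the sketch returns NA when none of them reaches the asymptote.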

Model assessment
The object returned by mopaTrain is a list of several components generated in the model calibration and evaluation process. Several performance measures are included apart from the AUC, like the True Skill Statistic (TSS) and Cohen's Kappa obtained in the cross-validation, frequently used for the assessment of SDMs (Allouche et al., 2006). These and other components of the SDM fitted object can be accessed using extractFromModel. For instance, to extract the TSS:
> tss.RS <- extractFromModel(models = trainRS, value = "tss")
In addition, for maximum user flexibility, a matrix containing the observed and predicted probability values for each calibration point is returned, allowing other types of user-tailored model performance assessments. The fitted models are stored in the "model" (or "fold.models") component, required for subsequent model prediction.
> models.RS <- extractFromModel(models = trainRS, value = "model")
Additionally, variable importance may also be estimated. One straightforward possibility is to pass the fitted models to function varImp from package caret (Kuhn, 2017).
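As an illustration of a user-tailored assessment computed from observed and predicted values, the following base R sketch derives the TSS and the AUC for a toy example. The vectors obs and pred, the helper names and the 0.5 classification threshold are hypothetical; the source does not state which threshold mopa applies internally.

```r
# Toy observed (1 = presence, 0 = pseudo-absence) and predicted values
obs  <- c(1, 1, 1, 0, 0, 0)
pred <- c(0.9, 0.8, 0.4, 0.6, 0.2, 0.1)

# True Skill Statistic: sensitivity + specificity - 1
tss <- function(obs, pred, threshold = 0.5) {
  pred_bin <- as.integer(pred >= threshold)
  sens <- sum(pred_bin == 1 & obs == 1) / sum(obs == 1)
  spec <- sum(pred_bin == 0 & obs == 0) / sum(obs == 0)
  sens + spec - 1
}

# AUC from ranks (equivalent to the Mann-Whitney U statistic)
auc <- function(obs, pred) {
  r  <- rank(pred)
  n1 <- sum(obs == 1)
  n0 <- sum(obs == 0)
  (sum(r[obs == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

tss(obs, pred)  # 1/3: two of three presences and two of three absences correct
auc(obs, pred)  # 8/9: eight of the nine presence/absence pairs ranked correctly
```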

Model predictions
SDM predictions are obtained by passing a new set of predictors (e.g.: future bioclimatic variables) to the generated models. The model component corresponds to the models fitted using all available data for model training, while the SDM predictions for the k-fold cross-validation setup are generated from the component fold.models (instead of model). Thus, mopa allows handling both the cross-fitted models for flexible model performance assessment and the global model (fitted with all presences and pseudo-absences) for predicting distributions, accomplished through the use of the function mopaPredict. In the following example, models corresponding to the RS method are projected to reference climate conditions (biostack$baseline) and to 7 future climate projections (biostack$future):
> ensemble.present <- mopaPredict(models = models.RS, newClim = biostack$baseline)
> ensemble.future <- mopaPredict(models = models.RS, newClim = biostack$future)

Exploring the uncertainty in SDM projections
Projections returned by mopaPredict are structured in a nested list. Each depth or level in the list corresponds to a different component. These are: presence data sets (SP), pseudo-absence realizations (PA), modeling algorithms (SDM), baseline climate (baseClim), and the new climate (newClim) used to project models (e.g. future climate projections). The function used to extract components is extractFromPrediction. In the next example, projections corresponding to the first pseudo-absence realization (object rcms_run1) and to the future climate projection from the MPI RCM (object runs_rcm1) are extracted:
> rcms_run1 <- extractFromPrediction(ensemble.future, "PA01")
> runs_rcm1 <- extractFromPrediction(ensemble.future, "MPI")
Then, the function is again applied to object runs_rcm1 to extract the SDM results for MPI and GLM. The resulting object is of S4-class raster*, thus being straightforward to apply any of the plotting/analysis methods for spatial objects. Here, we use spplot from sp for output visualization (Fig. 3).
Thus, it is easy to explore the results by inspecting the different components of the mopaPredict outputs. For instance, the raster package can be particularly useful to this aim, allowing for a wide variety of map algebra operations through the function stackApply over user-defined subsets of SDM projections.

Partition of the uncertainty into components using ANOVA
The relative contribution of each component to the total ensemble spread/variability is implemented in mopa using an ANOVA approach, through the function varianceAnalysis, following the method in Déqué et al. (2012), also applied by San-Martín et al. (2016). For instance, in this example, the total variance V can be decomposed as the summation of the variance explained by the pseudo-absence realization P, the RCM R and the combination of both PR, so V = P + R + PR. Let i be the index of the pseudo-absence realization (i = 1, ..., 10), j the index of the RCM (j = 1, ..., 7), and X_ij the response (e.g.: the predicted distribution for the particular realization and climate projection). Then, with a dot denoting the average over the corresponding index,
$P = \frac{1}{10} \sum_{i=1}^{10} \left(\bar{X}_{i\cdot} - \bar{X}_{\cdot\cdot}\right)^2$ and $R = \frac{1}{7} \sum_{j=1}^{7} \left(\bar{X}_{\cdot j} - \bar{X}_{\cdot\cdot}\right)^2$
are the terms resulting from the realization alone (P) and the RCM alone (R), and
$PR = \frac{1}{10} \sum_{i=1}^{10} \frac{1}{7} \sum_{j=1}^{7} \left(X_{ij} - \bar{X}_{i\cdot} - \bar{X}_{\cdot j} + \bar{X}_{\cdot\cdot}\right)^2$
is the interaction term of the realization with the RCM (PR). The following example shows the analysis performed for the pseudo-absence realizations (component1 = "PA") and the climate projections (component2 = "newClim") in GLM projections (fixed = "glm"). In order to provide thorough information on the spread in the projected potential distributions, variance percentage maps are returned together with the maps of the mean and standard deviation. Again, the results can be conveniently visualized with function spplot (Figs. 4 and 5).
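The decomposition V = P + R + PR can be checked numerically for a single grid point with base R. The sketch below is not the code of varianceAnalysis: it uses a synthetic 10 x 7 matrix of predicted values and assumes the standard balanced two-way decomposition, in which P is the variance of the realization means, R the variance of the RCM means, and PR the mean squared residual interaction.

```r
# Synthetic predictions at one grid point: 10 realizations x 7 RCMs
set.seed(7)
n_pa  <- 10
n_rcm <- 7
X <- matrix(runif(n_pa * n_rcm), nrow = n_pa, ncol = n_rcm)

grand    <- mean(X)       # overall mean
pa_mean  <- rowMeans(X)   # mean per pseudo-absence realization
rcm_mean <- colMeans(X)   # mean per RCM

# Variance explained by each factor alone
P <- mean((pa_mean - grand)^2)
R <- mean((rcm_mean - grand)^2)

# Interaction: what remains after removing both main effects
inter <- X - outer(pa_mean, rep(1, n_rcm)) -
  outer(rep(1, n_pa), rcm_mean) + grand
PR <- mean(inter^2)

# Total variance and percentage contribution of each term
V   <- mean((X - grand)^2)
pct <- 100 * c(PA = P, newClim = R, interaction = PR) / V
```

Applying the same arithmetic to every grid cell of the ensemble would yield percentage maps of the kind described in the text.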
As a result, in Figure 8, we compare GLM and MARS (var.glm and var.mars) with regard to the variance proportion explained by the RCM choice (component = 2L), so that the percentage not explained by it is associated with the pseudo-absence realization. From this summary, we can confirm a considerably higher sensitivity of MARS to the pseudo-absence sample across all regions.
The much higher sensitivity of MARS to the pseudo-absence sample warns about its instability, while GLM reveals much better properties in terms of model stability and transferability. These findings are made possible by the ANOVA utilities included in mopa, which enable a flexible experimental setup with a simple user interface. Model transferability is thus not apparent during the SDM calibration stage and is not coupled to model performance (even with the application of the 10-fold cross validation approach): for instance, TSS among realizations was 0.82 for GLM and 0.85 for MARS, and the mean AUC 0.91 and 0.92 respectively. The uncertainty analysis results are extremely valuable for the construction of an ensemble of SDM projections that minimizes the risk of including unhelpful realizations, thus yielding more plausible results.
In the same vein, the contribution of pseudo-absences versus SDM techniques to the overall spread is obtained by passing a different component argument to varianceAnalysis, while the RCM projection (MPI in this example) is kept as a fixed factor:
> MPI.var <- varianceAnalysis(ensemble.future, component1 = "PA", component2 = "SDM", fixed = c("MPI"))
In case further uncertainty components are considered for predicting distributions (named in mopa as SP, baseClim and foldModels), these could also be analyzed by keeping several fixed factors, each corresponding to a component that is not being analyzed. This is explained in detail in the help document of function "varianceAnalysis".

Summary
The impacts of climate change on biological systems are of current concern worldwide, and future SDMs have become a key tool for the vulnerability and impact assessment community. Thus, the utilities in package mopa can help in the SDM production chain from the early stage (climate data retrieval and post-processing) to the final phase in which a final set of models is retained for ensemble generation and map production.
In this case-study, we illustrate the development of a set of SDM projections considering multiple combinations of climate change projections from a set of state-of-the-art RCMs, two popular statistical modeling methods (GLM and MARS) and pseudo-absence realizations, enabling the identification of those members of the ensemble yielding consistent and plausible future estimates for final SDM building. In this regard, the ability to quantitatively assess the individual contribution of each factor to the overall SDM spread, as implemented in function varianceAnalysis, proved crucial in the evaluation. While previously existing R packages already provide functionality for SDM building and their assessment during the calibration stage, we have shown that model performance, as evaluated by ordinary cross-validation, is not coupled to model transferability into future climate; the assessment of this essential feature is therefore specific to mopa. Other characteristic aspects introduced by the package consist of the novel methods for pseudo-absence generation, and the ability to perform a fine-tuning of these methods prior to model fitting. Furthermore, the inter-operability of mopa with other SDM-related R packages enables maximum flexibility and eases the use of R for SDM applications in the framework of complex modeling exercises, for which multiple aspects have a varying contribution to the overall uncertainty.

Figure 1: Pseudo-absence dataset maps, as generated by function pseudoAbsences. (a) Known presence locations of the Oak phylogeny H1 (red points) and initial background for pseudo-absence sampling (grey grid points). (b) Pseudo-absences generated using the RS method randomly (purple points) and with k-means clustering (yellow points). (c) Pseudo-absences generated with the RSEP method (purple), where the turquoise area corresponds to the discarded suitable background space as identified by the OCSVM profiling approach. (d) TS approach. Environmentally stratified as RSEP (c), but also spatially stratified background, the different strata (spatial extents) identified by the different gray-scale colors. Pseudo-absences for one of the background extents (3120 km) are depicted as example (purple points).

Figure 2: Asymptotic model fitting in SDMs using the TS approach for pseudo-absence generation. The blue points are the AUC values (y-axis) obtained by the SDMs for different background radius extents (x-axis). The curves are the non-linear fits to the three asymptotic models considered (Michaelis-Menten, 2-parameter and 3-parameter exponential). The vertical and horizontal lines indicate the optimal radius and resulting AUC value of the final mopaTrain SDM output.

Figures 4 and 5 depict the ensemble SDM projections and the variance analysis results, applied to the set of projections that correspond to the 10 pseudo-absence realizations and 7 climate projections (10 realizations x 7 RCMs). The mean suitability map and the standard deviation are shown in Figure 4, while Figure 5 shows the variance fraction maps (%), depicting the contribution of each component (realization, RCM and realization & RCM) to the overall variance. For instance, the results displayed in Figure 5 reveal that the RCM choice (component newClim) is by far the most important factor contributing to the ensemble spread, while pseudo-absence realization has some impact in areas that are outside the current domain of the Oak phylogeny H1 (e.g. Scandinavia).

Figure 6: Same as Fig. 4, but considering MARS instead of GLM as statistical modeling technique for SDM production (object var.mars$mean).

Figure 7: Same as Fig. 5, but considering MARS instead of GLM as the statistical modeling technique for SDM production (object var.mars$var).

Figure 9: Future ensemble forecast (mean and standard deviation) of the suitability of the oak phylogeny H11 under climate conditions given by 7 different RCMs.

Table 1: Arguments of function pseudoAbsences controlling the parameter values involved in pseudo-absence generation.

Table 2: Combinations of functions OCSVMprofiling and backgroundRadius for background definition. These are used prior to pseudo-absence data generation with function pseudoAbsences, which controls the different sampling methods.

Table 3: SDM techniques available in mopa through the function mopaTrain. The corresponding algorithm argument values are also indicated.
Figure 8: Summary of the variance analysis results generated by function varianceSummary, where GLM and MARS techniques (brown and blue respectively) are compared. Boxes account for the spatial spread of the results in each region. Empty boxes show the variance proportion explained by the RCM choice (component newClim) and filled boxes show the overall spread, that is, the standard deviation of the predicted probability expressed as a percentage. Thus, empty boxes show how the total spread (filled boxes) is distributed between components (PA and newClim). The x-axis corresponds to the regions shown in the map at the top.
climatic regions defined in the EU-funded PRUDENCE project