distr6: R6 Object-Oriented Probability Distributions Interface in R

distr6 is an object-oriented (OO) probability distributions interface leveraging the extensibility and scalability of R6, and the speed and efficiency of Rcpp. Over 50 probability distributions are currently implemented in the package with 'core' methods including density, distribution, and generating functions, and more 'exotic' ones including hazards and distribution function anti-derivatives. In addition to simple distributions, distr6 supports compositions such as truncation, mixtures, and product distributions. This paper presents the core functionality of the package and demonstrates examples for key use-cases. In addition, this paper provides a critical review of the object-oriented programming paradigms in R and describes some novel implementations for design patterns and core object-oriented features introduced by the package for supporting distr6 components.


Introduction
Probability distributions are an essential part of data science, underpinning models, simulations, and inference. Hence, they are central to computational data science. With the advent of modern machine learning and AI, it has become increasingly common to adopt a conceptual model where distributions are considered objects in their own right, as opposed to primarily represented through distribution defining functions (e.g., cdf, pdf), or random samples. distr6 is an object oriented implementation of these conceptual models, and allows manipulation, combination, and inspection of distributions as objects with defined properties and methods.
To appreciate the object oriented conceptual model for distributions in more detail, it is important to distinguish some mathematical concepts which are similar and thus often conflated:
• a random variable, distributed according to a certain distribution, e.g., X ∼ Normal(0, 1)
• the cdf of that random variable X, usually denoted by F_X, a function F_X : R → [0, 1]
• the pdf of that random variable X, often denoted by f_X, a function f_X : R → [0, ∞)
• the distribution according to which X is distributed, often called 'the law of' X. This can be represented by multiple mathematical objects, such as the cdf F_X or the pdf f_X. We will call this distribution d. Note that d is not identical to either of these representation functions.
A full mathematical definition of the conceptual model is given in the next section.
Critically, we highlight that random variables and distributions are neither identical objects nor concepts. A random variable X has distribution d, and multiple random variables may be distributed according to d. Further, random variables are sampled from, while the distribution is only a description of probabilities for X. Thus, X and d are not identical objects. Figure 1 visually summarizes these differences.
As a possible logical consequence of the above, we adopt the conceptual model that a distribution is an abstract object, which:
• Has multiple defining representations, for example through cdf and possibly through pdf, but is not identical with any of these representations.
• Possesses traits, such as being absolutely continuous over the Reals, and properties, such as skewness and symmetry.
• Can be used to define sampling laws of random variables, but is not conceptually identical with a random variable.
Abstracting distributions as objects, separate from their multiple, non-identical representations and from random variables, has major consequences for the conceptual model:
(i) It lends itself naturally to a class-object representation, in the computer scientific sense of object oriented programming. Abstract distributions become classes, concrete distributions are objects, and distribution defining functions are methods of these classes. Random variables are a separate type of object.
(ii) It strongly suggests adoption of mathematical conceptualization and notation which cleanly separates distributions from random variables and distribution defining functions, in contrast to common convention where random variables or random sampling take conceptual primacy above all.
(iii) It allows clean formulation of algorithmic manipulations involving distributions, especially higher-order constructs (truncation, huberization, etc.), as well as clean mathematical definitions.

Distributions as software objects and mathematical objects
In distr6, distributions are first-class objects subject to an object oriented class-object representation. For example, a discrete uniform distribution (fig. 1b) is a 'class' with traits such as type (Naturals) and variate form (univariate). With a given parametrization, this becomes an 'object' with properties including symmetry and support. An alternative definition of the conceptual model of distributions is now provided.
On the mathematical level, we again consider distributions as objects in their own right, not being identical with a cdf, pdf, or measure, but instead 'having' these as properties.
For a set Y (endowed with suitable topology), we define Distr(Y) as a set containing formal objects d which are in bijection to (but not identical with) probability measures over Y. Elements of Distr(Y) are called distributions over Y. We further define formal symbols which, in case of existence, denote 'aspects' that such elements have, in the following way: the symbol d.F, for example, denotes the cdf of d, which is to be read as the 'F' of d, with F in this case to be read as a modifier to a standard symbol d, rather than a fixed, bound, or variable symbol. In this way, we can define:
(i) d.F for the cdf of d. This typically exists if Y ⊆ R^n for some n, in which case d.F is a function of type d.F : R^n → [0, 1].
(ii) d.f for the pdf of d. This exists if Y ⊆ R^n, and the distribution d is absolutely continuous over Y. In this case, d.f is a function of type d.f : R^n → [0, ∞).
(iii) d.P for the probability measure that is in bijection with d. This is a function d.P : F → [0, 1] where F is the set of measurable sub-sets of Y.
We would like to point out that the above is indeed a full formal mathematical definition of our notion of distribution. While distributions, defined this way, are not identical with any of the conventional mathematical objects that define them (cdf, pdf, measures), they are conceptually, formally, and notationally well-defined. Similarly, the aspects (d.F, d.f, etc.) are also well-defined, since they refer to one of the conventional mathematical objects which are well-specified in dependence of the distribution (in case of existence).
This notation provides a more natural and clearer separation of distribution and random variables and allows us to talk about and denote concepts such as 'the cdf of any random variable following the distribution d' with ease (d.F), unlike classical notation that would see one define X ∼ d and then write F X . Our notation more clearly follows the software implementation of distributions.
For example, in distr6, the code counterpart to defining a distribution d which is Gaussian with mean 1 and variance 2 is the construction of a Normal object with these parameters. The pdf and cdf of this Gaussian distribution evaluated at 2 are obtained in code as
> d$pdf(2)
> d$cdf(2)
which evaluate to 'numerics' that represent the real numbers d.f(2) and d.F(2).
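As a cross-check that does not require distr6, the two real numbers d.f(2) and d.F(2) for a Gaussian with mean 1 and variance 2 can be computed with the stats 'dpqr' functions. This is a minimal sketch in base R; the variable names are illustrative, and note that stats parameterises the Normal by standard deviation rather than variance.

```r
# Gaussian with mean 1 and variance 2; stats expects the standard deviation
mu <- 1
sigma <- sqrt(2)

# d.f(2): value of the pdf at 2
pdf_at_2 <- dnorm(2, mean = mu, sd = sigma)

# d.F(2): value of the cdf at 2
cdf_at_2 <- pnorm(2, mean = mu, sd = sigma)

print(c(pdf = pdf_at_2, cdf = cdf_at_2))
```

The object-oriented calls d$pdf(2) and d$cdf(2) would be expected to return the same two values.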
The consideration of distributions as objects, and their separation from random variables as objects, is notably distinct from R stats, which implements both distribution and random variable methods by the 'dpqr' functions. Whilst this may allow very fast generation of probabilities and values, there is no support for querying and inspection of distributions as objects. By instead treating the dpqr functions as methods that belong to a distribution object, distr6 encapsulates all the information in R stats as well as distribution properties, traits, and other important mathematical methods. The object orientation principle that defines the architecture of distr6 is further discussed throughout this manuscript.
Treating distributions as objects is not unique to this package. Possibly the first instance of the object oriented conceptualization is the distr (Ruckdeschel et al., 2006) family of packages, of which distr6 is the 'official' upgrade. distr6 is the first such package to use the full object orientation paradigm R6 (Chang, 2018), with other distribution related packages using S3 or S4. The choice of R6 over S3 and S4 is discussed in detail in section 2.5.1. This choice allows distr6 to fully leverage the conceptual model and make use of core R6 functionality, as well as to introduce fundamental object-oriented programming (OOP) principles such as abstract classes, and tried and tested design patterns (Gamma et al., 1996) including decorators, wrappers, and compositors (see section 2.5.3).
Besides an overview of distr6's novel approach to probability distributions in R, this paper also presents a formal comparison of the different OOP paradigms, while detailing the use of relevant design patterns.

Motivating example: Higher-order distribution constructs
The strength of the object oriented approach, both on the algorithmic and mathematical side, lies in its ability to efficiently express higher-order constructs and operations: actions between distributions, resulting in new distributions. One such example is mixture distributions. In the distr6 software interface, a MixtureDistribution is a higher-order distribution depending on two or more other distributions. For example, take a uniform mixture, my_mixt, of two distributions distr1 and distr2. Internally, the dependency of the construct on its components is remembered so that my_mixt is not only evaluable for cdf (and other methods), but also carries a symbolic representation of its construction and definition history in terms of distr1 and distr2.
On the mathematical side, the object oriented formalism allows clean definitions of otherwise more obscure concepts. For example, the mixture distribution is now defined as follows: for distributions d_1, ..., d_m over R^n and weights w_1, ..., w_m, we define the mixture of d_1, ..., d_m with weights w_1, ..., w_m to be the unique distribution $\tilde{d}$ such that
$$\tilde{d}.F(x) = \sum_{i=1}^{m} w_i \, d_i.F(x)$$
for any x ∈ R^n. Note the added clarity of defining the mixture on the distributions d_i, i.e., as a first-order concept in terms of distributions.
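The definition of a mixture through its cdf, as a weighted sum of the component cdfs, can be sketched in base R. The helper mixture_cdf below is hypothetical and is not the MixtureDistribution implementation in distr6; it only illustrates that the mixture is defined directly in terms of the component distributions, here represented by their cdfs.

```r
# Hypothetical helper: cdf of a mixture as the weighted sum of component cdfs.
# 'cdfs' is a list of functions, 'weights' defaults to a uniform mixture.
mixture_cdf <- function(x, cdfs, weights = rep(1 / length(cdfs), length(cdfs))) {
  stopifnot(length(cdfs) == length(weights), abs(sum(weights) - 1) < 1e-12)
  # Evaluate each weighted component cdf at x, then add the results together
  Reduce(`+`, Map(function(cdf, w) w * cdf(x), cdfs, weights))
}

# Two Normal components with means 0 and 4 and unit variance
component_cdfs <- list(
  function(x) pnorm(x, mean = 0, sd = 1),
  function(x) pnorm(x, mean = 4, sd = 1)
)

# By symmetry around 2, the uniform mixture has cdf value 0.5 at x = 2
mixture_at_2 <- mixture_cdf(2, component_cdfs)
print(mixture_at_2)
```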

Related software
This section provides a review of other related software that implements probability distributions. It is focused on, but not limited to, software in R.

R stats, actuar, and extraDistr
The core R programming language consists of packages for basic coding and maths as well as the stats package for statistical functions. stats contains 17 common probability distributions and four lesser-known distributions. Each distribution consists of (at most) four functions: dX, pX, qX, rX, where X represents the distribution name. These correspond to the probability density/mass, cumulative distribution, quantile (inverse cumulative distribution), and simulation functions respectively. Each is implemented as a separate function, written in C, with both inputs and outputs as numerics. The strength of these functions lies in their speed and efficiency: there is no quicker way to find, say, the pdf of a Normal distribution than to run the dnorm function from stats. However, this is the limit of the package in terms of probability distributions. As there is no physical distribution object, there is no way to query results from the distributions outside of the 'dpqr' functions.
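For readers less familiar with the 'dpqr' convention, the four functions for the Normal distribution are shown below. This is a minimal illustration using only stats; the variable names are ours.

```r
# The four 'dpqr' functions for the standard Normal distribution
density_at_zero <- dnorm(0)     # d: probability density function
prob_below_zero <- pnorm(0)     # p: cumulative distribution function
median_value    <- qnorm(0.5)   # q: quantile function (inverse cdf)

set.seed(1)                     # make the simulation reproducible
draws <- rnorm(3)               # r: simulation of three random values

# Inputs and outputs are plain numerics; there is no distribution object
# that could be queried for, say, its mean or support.
print(c(density_at_zero, prob_below_zero, median_value))
```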
Several R packages implement dpqr functions for extra probability distributions. Of particular note are the extraDistr (Wolodzko, 2019) and actuar (Dutang et al., 2008) packages that add over 60 distributions between them. Both of these packages are limited to dpqr functions and therefore have the same limits as R stats.

distr
distr was the first package in R to implement an object-oriented interface for distributions, using the S4 object-oriented paradigm. distr tackles the two fundamental problems of stats by introducing distributions as objects that can be stored and queried. These objects include important statistical results, for example the expectation, variance, and moment generating functions of a distribution. The distr family of packages includes a total of five packages for object-oriented distributions in R. distr has two weaknesses that were caused by using the S4 paradigm. Firstly, the package relies on inheritance, which means that large class trees exist for every object and extensibility is therefore non-trivial. The second weakness is that S4 objects are not referred to by 'pointers' but instead by copies. This means that a simple mixture of two distributions is just under 0.5Mb in size (relatively quite large).

distributions3
distributions3 (Hayes and Moller-Trane, 2019) defines distributions as objects using the S3 paradigm. However, whilst distributions3 treats probability distributions as S3 objects, it does not add any properties, traits, or methods and instead uses the objects solely for dpqr dispatch. In terms of comparison to distr, distributions3 removes features and 'downgrades' the paradigm from S4 to S3.

mistr
mistr (Sablica and Hornik, 2019) is another recent distributions package, which is also influenced by distr. The sole focus of mistr is to add a comprehensive and flexible framework for composite models and mixed distributions. Similarly to distributions3, the package uses an S3 framework and also implements distributions as objects, so the two packages overlap in this respect.

Distributions.jl
Despite not being a package written in R, the Julia Distributions.jl (Lin et al., 2019) package provided inspiration for distr6. Distributions.jl implements distributions as objects with statistical properties including expectation, variance, moment generating and characteristic functions, and many more. This package uses multiple inheritance for 'valueSupport' (discrete/continuous) and 'variateForm' (univariate/multivariate/matrixvariate). Every distribution inherits from both of these, e.g. a distribution can be 'discrete-univariate', 'continuous-multivariate', 'continuous-matrixvariate', etc. The package provides a unified and user-friendly interface, which was a helpful starting point for distr6.

Design principles
distr6 was designed and built around the following principles.

D1) Unified interface
The package is designed such that all distributions, no matter how complex, have an identical user-facing interface. This helps make the package easy to navigate and the documentation simple to read. Moreover it minimises any confusion resulting from using multiple distributions. A clear inheritance structure also allows wrappers and decorators to have the same methods as distributions, which means even complex composite distributions should be intuitive to use. Whether a user constructs a simple Uniform distribution, or a mixture of 100 Normal distributions, the same methods and fields are seen in both objects.

D2) Separation of core/exotic and numerical/analytic
Via abstraction and encapsulation, core statistical results (common methods such as mean and variance) are separated from 'exotic' ones (less common methods such as anti-derivatives and p-norms). Similarly, implemented distributions only contain analytic results; users can impute numerical results using decorators. This separation has two benefits: 1) a less-technical user can guarantee precision of results as they are unlikely to use numerical decorators; 2) a user has access to the most important distribution methods immediately after construction but is not overwhelmed by many 'exotic' methods that they may never want to use. Use of decorators and wrappers allows the user to manually expand the interface at any time. For example, a user can choose between an undecorated Binomial distribution, with common methods such as mean and variance, or they can decorate the distribution to additionally gain access to survival and hazard functions.

D3) Inheritance without over-inheritance
The class structure stems from a small number of abstract classes with concrete child classes, which allows for a sensible, but not over-complicated, inheritance structure. For example, all implemented distributions inherit from a single parent class so that common methods can be unified and only coded once; however, there is no separation of distributions into a 'continuous distribution' and a 'discrete distribution' class. By allowing extension of classes by decorators and wrappers, and not solely inheritance, the interface is highly scalable and extensible. All decorators and wrappers in distr6 stem from abstract classes, which in turn inherit from the Distribution super-class. In doing so, any method of expanding an object's interface in distr6 (i.e. via decorators, wrappers, or inheritance) will automatically lead to an interface that inherits from the top-level class, maintaining the principle of a unified interface (D1).

D4) Inspection and manipulation of multiple parameterisations
The design process identified that use of distributions in R stats is inflexible in that, in the majority of cases, only one parameterisation of each distribution is allowed. This can isolate users who may be very familiar with one parameterisation but completely unaware of another. For example, the use of the precision parameter in the Normal distribution is typically more common in Bayesian statistics, whereas using the variance or standard deviation parameters is more common in frequentist statistics. distr6 allows the user to choose from multiple parameterisations for all distributions (where more than one parameterisation is possible/known). Furthermore, querying and updating of any parameter in the distribution is allowed, even if it was not specified in construction (section 2.4). This allows for a flexible parameter interface that can be fully queried and modified at any time.

D5) Flexible interfacing for technical and non-technical users
Throughout the design process, it was required that distr6 be accessible to all R users. This was a challenge as R6 is a very different paradigm from S3 and S4. To reduce the learning curve, the interface is designed to be as user-friendly and flexible as possible. This includes: 1) a 'sensible default principle' such that all distributions have justified default values; 2) an 'inspection principle' with functions to list all distributions, wrappers, and decorators. As discussed in (D2), abstraction and encapsulation allow technical users to expand any distribution's interface to be as arbitrarily complex as they like, whilst maintaining a minimal representation by default. Where possible defaults are 'standard' distributions, i.e. with location 0 and scale 1, otherwise sensible defaults are identified as realistic scenarios, for example Binomial(n = 10,p = 0.5).

D6) Flexible OO paradigms
Following from (D5), we identified that R6 is still relatively new in R with only 314 out of 16,050 packages depending on it (as of July 2020). This was acknowledged and taken into account when building the package. R6 is also the first paradigm in R with the dollar-sign notation (though S4 uses '@' notation) and with a proper construction method. Whilst new users are advised to learn the basics of R6, S3 compatibility is available for all common methods via R62S3 (Sonabend, 2019). Users can therefore decide on calling a method via dollar-sign notation or dispatch; the example below demonstrates 'piping' and S3.
As the core package is built on R6, the thin-wrappers provided by R62S3 do not compromise the above design principles.

Concrete distributions, kernels, and wrappers are the grandchildren of Distribution, and children of one of the mid-layer abstract classes:
• SDistribution, which models abstract, generic distributions. Concrete distributions, such as Normal which models the normal distribution, inherit from SDistribution.
• Kernel, which models probability kernels, such as Triangular and Epanechnikov. Probability kernels are absolutely continuous distributions over the Reals, with assumed mean 0 and variance 1.
• DistributionWrapper, which is an abstract parent for higher-order operations on distributions, including compositions, that is, operations that create distributions from other distributions, such as truncation or mixture.
• DistributionDecorator, whose purpose is supplementing methods to distributions in the form of a decorator design pattern. This includes methods such as integrated cdf or squared integrals of distribution defining functions.
Horizontal lines separate mathematical, property, parameter, and representation methods.
The UML diagram above visualises the key class structure of distr6 including the concrete Distribution parent class, from which all other classes in the package inherit (with the exception of the ParameterSet). These abstract classes allow simple extensibility for concrete sub-classes.

The Distribution interface
The base, or top-level, class in distr6 is the Distribution class. Its primary function is to act as a parent class for the implemented probability distributions and higher-order compositions; it is also utilised for creation of custom distributions. By design, any distribution already implemented in distr6 will have the same interface as a user-specified custom distribution, ensuring (D1) is upheld. The most important methods for a distribution are listed in Table 1 alongside their meaning and definitions (mathematical if possible). The two use-cases for the Distribution class are discussed separately.

Distribution for inheritance
It is anticipated that the majority of distr6 users will be using the package for the implemented distributions and kernels. With this in mind, the Distribution class defines all variables and methods common to all child classes. The most important of these are the common analytical expressions and the dpqr public methods. Every concrete implemented distribution/kernel has identical public dpqr methods that internally call private dpqr methods. This accounts for inconsistencies arising from packages returning functions in different formats and handling errors differently, a problem most prominent in multivariate distributions. Another example is the handling of non-integer values for discrete distributions: in some packages this returns 0, in others the value is rounded down, and in others an error is returned. The dpqr functions for all distributions have unified validation checks and return types (numeric or data.table). In line with base R and other distribution packages, distr6 implements a single pdf function to cover both probability mass and probability density functions.
(1:2)
[1] 0.24197072 0.05399097
> Binomial$new()$cdf(1:2, lower.tail = FALSE, log.p = TRUE, simplify = FALSE)
         Binom
1: -0.01080030
2: -0.05623972
A key design principle in the package is separation of analytical and numerical results (D2), which is ensured by only including analytical results in implemented distributions. Missing methods in a distribution therefore signify that no closed-form expression for the method is available; however, all can be numerically estimated with the CoreStatistics decorator (see section 2.4.2).
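The two console outputs in this subsection can be cross-checked against stats. The sketch below assumes that the first output is the standard Normal pdf evaluated at 1 and 2, and that the default Binomial has size 10 and probability 0.5, as stated under (D5); both are assumptions about the listing rather than facts taken from the package source.

```r
# Standard Normal pdf evaluated at 1 and 2
normal_pdf_values <- dnorm(1:2)

# Upper-tail Binomial(10, 0.5) cdf on the log scale, mirroring the
# lower.tail = FALSE and log.p = TRUE arguments in the listing
binomial_log_upper <- pbinom(1:2, size = 10, prob = 0.5,
                             lower.tail = FALSE, log.p = TRUE)

print(normal_pdf_values)
print(binomial_log_upper)
```

Under these assumptions the values agree with the printed outputs to the displayed precision, which illustrates that the unified public methods keep the familiar argument conventions of the stats functions.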
Ideally, all distributions will include analytical methods for the following: probability density/mass function (pdf), cumulative distribution function (cdf), inverse cumulative distribution function/quantile function (quantile), simulation function (rand), mean, variance, skewness, (excess) kurtosis, and entropy of the distribution (mean, variance, skewness, kurtosis, entropy), as well as the moment generating function (mgf), characteristic function (cf), and probability generating function (pgf). Speed is currently a limitation in distr6 but the use of Rcpp (Eddelbuettel and Francois, 2011) in all dpqr functions helps to mitigate this.
The fourth design principle of distr6 ensures that multiple parameterisations of a given distribution can be both provided and inspected at all times. For example, the Normal distribution can be parametrised in terms of variance, standard deviation, or precision. Its constructor takes this into account:
Normal$new(mean = 0, var = 1, sd = NULL, prec = NULL, decorators = NULL)
To avoid conflicting parameterisations, all distributions have a 'right-to-left' priority when multiple parameterisations are possible. This is best demonstrated by example:
# prec has priority
> Normal$new(var = 1, sd = 2, prec = 1/3)$getParameterValue("var")
[1] 3
# prec is not supplied so sd has priority
> Normal$new(var = 1, sd = 2)$getParameterValue("var")
[1] 4
# only var supplied
> Normal$new(var = 1)$getParameterValue("var")
The same principle is used for parameter setting with $setParameterValue. Possible parameterisations and their prioritisation are carefully documented and can also be seen from $parameters.
The example above utilised the getParameterValue and setParameterValue methods for getting and setting parameter values respectively. The former takes a single argument, the parameter name, and the latter a named list of arguments corresponding to the parameter name and the value to set. The example below demonstrates this for a Gamma distribution. Here the distribution is constructed, the shape parameter is queried, both shape and rate parameters are updated and the latter queried, and finally the scale parameter is set, which auto-updates the rate parameter.
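The 'right-to-left' priority rule described above for the Normal constructor can be mimicked in a few lines of base R. The function resolve_normal_var is a hypothetical helper written for illustration, not part of distr6, and it says nothing about how the package resolves parameters internally.

```r
# Hypothetical helper (not part of distr6): return the variance implied by
# the supplied arguments, giving priority to the right-most one.
resolve_normal_var <- function(var = 1, sd = NULL, prec = NULL) {
  if (!is.null(prec)) {
    return(1 / prec)   # precision is the inverse of the variance
  }
  if (!is.null(sd)) {
    return(sd^2)       # standard deviation is the square root of the variance
  }
  var                  # only the variance was supplied
}

var_from_prec <- resolve_normal_var(var = 1, sd = 2, prec = 1 / 3)
var_from_sd   <- resolve_normal_var(var = 1, sd = 2)
var_only      <- resolve_normal_var(var = 1)
print(c(var_from_prec, var_from_sd, var_only))
```

The first two calls reproduce the values 3 and 4 shown in the listing for the same combinations of arguments.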
Not all points in {-1} lie in the distribution domain (N0).
These domains and types are returned along with other important properties and traits in a call to properties and traits respectively; this is demonstrated below for the Arcsine distribution.
The Distribution constructor is given by
Distribution$new(name = NULL, short_name = NULL, type = NULL, support = NULL,
  symmetric = FALSE, pdf = NULL, cdf = NULL, quantile = NULL, rand = NULL,
  parameters = NULL, decorators = NULL, valueSupport = NULL, variateForm = NULL,
  description = NULL)
The name and short_name arguments are identification for the custom distribution used for printing. type is a trait corresponding to scientific type (e.g. Reals, Integers, ...) and support is the property of the distribution support. Distribution parameters are passed as a ParameterSet object, which defines each parameter in the distribution including the parameter default value and support. The pdf/cdf/quantile/rand arguments define the corresponding methods and are passed to the private .pdf/.cdf/.quantile/.rand methods; as above, the public methods are already defined and ensure consistency in each function. At a minimum, users have to supply the distribution name, type, and either pdf or cdf; all other information can be numerically estimated with decorators (see section 2.4.2).

DistributionDecorator
Decorators add functionality to classes in object-oriented programming. These are not natively implemented in R6 and this novel implementation is therefore discussed further in section 2.5.3. Decorators in distr6 are only 'allowed' if they have at least three methods and cover a clear use-case; this prevents too many decorators bloating the interface. However, by their nature, they are lightweight classes that will only increase the methods in a distribution if explicitly requested by a user. Decorators can be applied to a distribution in one of three ways:
In construction:
> N <- Normal$new(decorators = c("CoreStatistics", "ExoticStatistics"))
Using the decorate() function:
> N <- Normal$new()
> decorate(N, c("CoreStatistics", "ExoticStatistics"))
Using the $decorate method inherited from the DistributionDecorator super-class:
> N <- Normal$new()
> ExoticStatistics$new()$decorate(N)

The R Journal Vol. XX/YY, AAAA 20ZZ ISSN 2073-4859

The first option is the quickest if decorators are required immediately. The second is the most efficient once a distribution is already constructed. The third is the closest method to true OOP but does not allow adding multiple decorators simultaneously.
Three decorators are currently implemented in distr6; these are briefly described below.

CoreStatistics
This decorator imputes numerical functions for common statistical results that could be considered core to a distribution, e.g. the mean or variance. The decorator additionally adds generalised expectation (genExp) and moments (kthmoment) functions, which allow numerical results for functions of the form E[f(X)] and for crude/raw/central K moments. The example below demonstrates how the decorate function exposes methods from the CoreStatistics decorator to the Normal distribution object.
> n <- Normal$new(mean = 2, var = 4)
> n$kthmoment(3, type = "raw")
Error: attempt to apply non-function
> decorate(n, CoreStatistics)
> n$kthmoment(3, type = "raw")
[1] 32

ExoticStatistics
This decorator adds more 'exotic' methods to distributions, i.e. those that are unlikely to be called by the majority of users. For example, this includes methods for the p-norm of survival and cdf functions, as well as anti-derivatives for these functions. Where possible, analytic results are exploited. For example, this decorator can implement the survival function in one of two ways: either as i) 1 minus the distribution cdf, if an analytic expression for the cdf is available; or ii) via numerical integration of the distribution.

FunctionImputation
This decorator imputes numerical expressions for the dpqr methods. This is most useful for custom distributions in which only the pdf or cdf is provided. Numerical imputation is implemented via Rcpp.
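To make the idea of numerical imputation concrete, the sketch below computes a raw moment and a survival probability by numerical integration in base R. The helper raw_moment is hypothetical; the decorators in distr6 may use different numerical routines, so this only illustrates the quantities involved, using the Normal distribution with mean 2 and variance 4 from the CoreStatistics example.

```r
# Hypothetical helper: k-th raw moment E[X^k] by integrating x^k * pdf(x)
raw_moment <- function(pdf, k, lower = -Inf, upper = Inf) {
  integrate(function(x) x^k * pdf(x), lower, upper)$value
}

# Normal distribution with mean 2 and variance 4 (standard deviation 2)
normal_pdf <- function(x) dnorm(x, mean = 2, sd = 2)

# Third raw moment, numerically and via the closed form mu^3 + 3 * mu * sigma^2
third_raw_numeric  <- raw_moment(normal_pdf, 3)
third_raw_analytic <- 2^3 + 3 * 2 * 4

# Survival function at 3: analytically as 1 - cdf, numerically by integrating
# the pdf over the upper tail
survival_analytic <- 1 - pnorm(3, mean = 2, sd = 2)
survival_numeric  <- integrate(normal_pdf, 3, Inf)$value

print(c(third_raw_numeric, third_raw_analytic))
print(c(survival_analytic, survival_numeric))
```

The closed-form third raw moment equals 32, the value returned by kthmoment in the listing, and the numerical result agrees with it up to integration error.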

Composite distributions
Composite distributions, that is, distributions created from other distributions, are common in advanced usage. Examples of composites are truncation, mixture, or transformation of domain. In distr6, a number of such composites are supported. Implementation-wise, this uses the wrapper OOP pattern, which is not native to R6 but part of our extensions to R6, discussed in section 2.5.3.
As discussed above, wrapped distributions inherit from Distribution and thus have an identical interface to any child of SDistribution, with the following minor differences:
• The wrappedModels method provides a unified interface to access any component distribution.
• Parameters are still accessed via the same method but stored in a ParameterSetCollection object instead of a ParameterSet, thus allowing efficient representation of composite and nested parameter sets.
Composition can be iterated and nested any number of times. Consider the following example where a mixture distribution is created from two distributions that are in turn composites: a truncated Student T and a huberized exponential. Note too the parameter inspection and automatic prefixing of distribution 'short names' to the parameters for identification.
Example code to obtain a truncated or huberized distribution is below. Here, we construct a truncated normal with truncation parameters -1 and 1, and a huberized Binomial with bounding parameters 2 and 5.

Vectorization of distributions
A special feature of distr6 is that it allows vectorization of distributions, i.e., vectorized representation of multiple distributions in an array-like structure. This is primarily done for computational efficiency, in line with the general best R practice of vectorisation. Vectorisation of distr6 distributions is implemented via the VectorDistribution, which is logically treated as a compositor.

Mathematically, a VectorDistribution is simply a vector of component distributions d_1, ..., d_N that allows vectorized evaluation. Two kinds of vectorized evaluation are supported, paired and product vectorization, which we illustrate in the case of cdfs. VectorDistribution allows for efficient vectorisation across both the distributions and points to evaluate, which we believe is a feature unique to distr6 among distribution frameworks in R.
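The two kinds of vectorized evaluation can be sketched in base R for the cdfs of two Normal distributions. The sketch assumes that 'paired' means the i-th distribution is evaluated at the i-th point, giving a vector, and that 'product' means every distribution is evaluated at every point, giving a matrix; these readings, and all variable names, are our assumptions and not the VectorDistribution implementation.

```r
# Two Normal distributions d_1 = N(0, 1) and d_2 = N(5, 4), given by their
# means and standard deviations
dist_means <- c(0, 5)
dist_sds   <- c(1, 2)
eval_points <- c(0, 5)

# Paired evaluation: (d_1.F(x_1), d_2.F(x_2)), one value per distribution
paired_cdf <- pnorm(eval_points, mean = dist_means, sd = dist_sds)

# Product evaluation: matrix with entry [i, j] equal to d_i.F(x_j)
product_cdf <- outer(
  seq_along(dist_means), eval_points,
  function(i, x) pnorm(x, mean = dist_means[i], sd = dist_sds[i])
)

print(paired_cdf)
print(product_cdf)
```

Under these assumptions the paired result is the diagonal of the product result, which is one way to see how the two modes relate.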
Example code for vectorization via VectorDistribution is below. Here, we show firstly how to create a vector of two normal distributions evaluated at different points (product mode), and the second demonstrates evaluation at the same points (paired mode).
Two different choices of constructor are provided. The first, the 'distlist' constructor, passes distribution objects into the constructor, whereas the second passes a reference to the distribution class along with the parameterisations. The first therefore allows different types of distributions but is much slower, as the various methods have to be calculated individually, whereas the second only allows a single class of distribution at a time but is much quicker in evaluation. In the example below, the mixture uses the second constructor and the product uses the first.

Design patterns and object-oriented programming
This paper has so far discussed the API and functionality in distr6. This section discusses object-oriented programming (OOP): first a brief introduction to OOP and OOP in R, and then the package's contributions to the field.

S3, S4, and R6
R has four major paradigms for object-oriented programming: S3, S4, reference classes (R5), and most recently, R6. S3 and S4 are known as functional object-oriented programming (FOOP) paradigms, whereas R5 and R6 move towards class object-oriented programming (COOP) paradigms (Chambers, 2014). One of the main differences (from a user perspective) is that methods in COOP are associated with a class, whereas in FOOP methods are associated with generic functions. In the first case, methods are called by first specifying the object; in the second, a dispatch registry is used to find the correct method for a given object.
S3 introduces objects as typed lists in R, which can hold functions or variables. Functions are called via the dispatch system, and every function comprises both a generic and a method for each object. S3 has only a vague concept of constructing an object, as there is no formal definition of a class. S3 is embedded deep in the infrastructure of R: single dispatch is behind the vast majority of the base functionality and is one of the main reasons why R is easily readable. However, S3 is not a formal OOP language and lacks the concept of classes, constructors and thereby inheritance (although inheritance is possible, it is not well formalised).
S4 formalises S3 by introducing the basics of object-oriented programming, including the distinction between classes and objects as well as multiple inheritance. S4 formalises class-object separation by constructor functions that exist independently from the class definition. S4 has more syntax for the user to learn and a few more steps in class and method definitions; the syntax is quite clunky and not overly user-friendly. In practice, S3 is used vastly more than S4 (Chambers, 2014).
There is a big jump from S3 and S4 to R6, as it is a transition from functional to class object-oriented programming. This means new notation, semantics, syntax, and conventions. The key changes are: 1) methods and fields that are associated with classes, not functions; 2) mutable objects with reference semantics in place of R's usual copy-on-modify semantics; and 3) new dollar-sign notation. In the first case this means that when a class is defined, all the methods are defined as existing within the class, and these can be accessed at any time after construction. Methods are further split into public and private, as well as active bindings, which incorporates the abstraction part of OOP. The mutability of objects and the move away from copy-on-modify mean that, to create an independent copy of an object, the new method clone(deep = TRUE) has to be used; this will be familiar to users who know more classical OOP but is very different for most R users. Finally, methods are accessed via the dollar sign and not by calling a function on an object.
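The difference between assigning an R6 object and cloning it can be seen in a few lines. This is a minimal sketch assuming the R6 package is installed; the Counter class and the variable names a, alias, and snapshot are hypothetical and not part of distr6.

```r
library(R6)

# A hypothetical class with one mutable public field and one method.
Counter <- R6Class("Counter",
  public = list(
    n = 0,
    add = function() {
      self$n <- self$n + 1
      invisible(self)
    }
  )
)

a <- Counter$new()
alias <- a                        # a second name for the same object
snapshot <- a$clone(deep = TRUE)  # an independent copy

a$add()  # the method is called on the object with the dollar sign

alias$n     # the alias sees the modification made through a
snapshot$n  # the clone keeps the value it had when it was taken
```

Plain assignment therefore never copies an R6 object, which is why an explicit clone is needed whenever an independent copy is wanted.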
The three paradigms are contrasted with a toy example that creates a 'duck' class with a method 'quack':
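A minimal sketch of such a comparison follows, assuming the R6 package is installed. The S3 and S4 versions carry suffixed names (duck_s3, duck_s4, quack_s4) only so that all three versions can coexist in one session, and the duck names Arthur and Ford are placeholders chosen here.

```r
library(methods)
library(R6)

# S3: a constructor returning a classed list, a generic, and a method
# found by dispatch on the class attribute.
duck_s3 <- function(name) structure(list(name = name), class = "duck_s3")
quack <- function(x) UseMethod("quack")
quack.duck_s3 <- function(x) cat(x$name, "QUACK!\n")
quack(duck_s3("Arthur"))

# S4: a formal class definition, a separate constructor function,
# a generic, and a method registered for the class.
setClass("duck_s4", representation(name = "character"))
duck_s4 <- function(name) new("duck_s4", name = name)
setGeneric("quack_s4", function(x) standardGeneric("quack_s4"))
setMethod("quack_s4", "duck_s4", function(x) cat(x@name, "QUACK!\n"))
quack_s4(duck_s4("Ford"))

# R6: fields and methods live inside the class and are reached with `$`.
duck <- R6Class("duck",
  public = list(
    initialize = function(name) private$.name <- name,
    quack = function() cat(private$.name, "QUACK!\n")
  ),
  private = list(.name = character(0))
)
duck$new("Zaphod")$quack()
```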
The example clearly highlights the extra code introduced by S4 and the difference between the S3 dispatch and R6 method system.

Comparing the paradigms
There is no doubt that R6 is the furthest paradigm from conventional R usage, and as such there is a steep learning curve for the majority of R users. However, R6 will be most natural for users coming to R from more traditional OOP languages. In contrast, S3 is a natural FOOP paradigm that will be familiar to all R users (even if they are not aware that S3 is being used). S4 is an unfortunate midpoint between the two which, whilst very useful, is not particularly user-friendly in terms of programming classes and objects.
distr was developed soon after S4 was released and is arguably one of the best case-studies for how well S4 performs. Whilst S4 formalises S3 to allow a fully OO interface to be developed, its dependence on inheritance forces design decisions that quickly become problematic. This is seen in the large inheritance trees in distr, in which one implemented distribution can be nested five child classes deep. This is compounded by the fact that S4 does not use pointer objects but instead nests objects internally. As a result, composite distributions in distr quickly become very large: for example, a mixture of two distributions can easily be around 0.5Mb. In contrast, R6 introduces pointers, which means that a wrapped object simply points to its wrapped component and does not copy it needlessly.
Whilst a fully object-oriented interface can be developed in S3 and S4, they do not have the flexibility of R6, which means that in the long run, extensibility and scalability can be problematic. R6 forces R users to learn a paradigm that they may not be familiar with, but packages like R62S3 allow users to become acquainted with R6 on a slightly shallower learning curve.
Speed differences for the three paradigms are formally compared on the example above using microbenchmark (Mersmann, 2019); the results are in table 4.
The R6 example is compared both including construction of the class, duck$new("Zaphod")$quack(), and without construction, d$quack(), where d is the object constructed before comparison. A significant 'bottleneck' is noted when construction is included in the comparison but despite this S4 is still significantly the slowest.

Design patterns
In the simplest definition, 'design patterns' are abstract solutions to common coding problems. They are probably most widely known due to the book 'Design Patterns: Elements of Reusable Object-Oriented Software' (Design Patterns) (Gamma et al., 1996). distr6 primarily makes use of the following design patterns:
• Abstract Factory
• Strategy
Strategy The strategy pattern is common in modelling toolboxes, in which multiple algorithms can be used to solve a problem. This pattern defines an abstract class for a given problem and concrete classes that each implement different strategies, or algorithms, to solve the problem. For example in the context of mathematical integration (a common problem in R), one could use Simpson's rule, Kronrod's, or many others. These can be specified by an integrate abstract class with concrete sub-classes simpson and kronrod.
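The integration example can be sketched in R6 as follows. This is an illustrative sketch under stated assumptions rather than code from distr6: the class names Integrator, Simpson, and Kronrod and the helper area are hypothetical, the Simpson class implements composite Simpson's rule on an even number of intervals, and the Kronrod class simply delegates to stats::integrate, whose documentation describes it as based on Gauss-Kronrod quadrature.

```r
library(R6)

# Abstract strategy: fixes the interface, provides no algorithm.
Integrator <- R6Class("Integrator",
  public = list(
    integrate = function(f, lower, upper) stop("abstract method")
  )
)

# Concrete strategy 1: composite Simpson's rule on n (even) intervals.
Simpson <- R6Class("Simpson", inherit = Integrator,
  public = list(
    n = NULL,
    initialize = function(n = 100) {
      stopifnot(n %% 2 == 0)
      self$n <- n
    },
    integrate = function(f, lower, upper) {
      x <- seq(lower, upper, length.out = self$n + 1)
      h <- (upper - lower) / self$n
      # Weights 1, 4, 2, 4, ..., 2, 4, 1 over the n + 1 grid points.
      w <- c(1, rep(c(4, 2), length.out = self$n - 1), 1)
      sum(w * f(x)) * h / 3
    }
  )
)

# Concrete strategy 2: delegate to base R's adaptive quadrature.
Kronrod <- R6Class("Kronrod", inherit = Integrator,
  public = list(
    integrate = function(f, lower, upper) {
      stats::integrate(f, lower, upper)$value
    }
  )
)

# Client code depends only on the abstract interface.
area <- function(strategy, f, lower, upper) {
  strategy$integrate(f, lower, upper)
}

simpson_area <- area(Simpson$new(100), dnorm, -1, 1)
kronrod_area <- area(Kronrod$new(), dnorm, -1, 1)
```

Because area only calls the shared integrate method, a further strategy can be added without changing any client code.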
[Figure: class diagram of the abstract integrate class with concrete sub-classes simpson and kronrod]
Composite The composite pattern defines a collection of classes that have an identical interface when treated independently or when composed into a single class with constituent parts. To the user, this means that only one interface needs to be learnt in order to interact with composite or individual classes. A well-built composite pattern allows users to construct complex classes with several layers of composition, and yet still be able to make use of a single interface. By inheriting from a parent class, each class and composite share a common interface.
Decorator Decorators add additional responsibilities to an object without making any other changes to the interface. An object that has been decorated will be identical to its un-decorated counter-part except with additional methods. This provides a useful alternative to inheritance. Whereas inheritance can lead to large tree structures in which each sub-class inherits from the previous and contains all previous methods, decorators allow the user to pick and choose which responsibilities to add. The figure below demonstrates how this is useful in a shopping cart example. The top of the figure demonstrates using inheritance, in which each sub-class adds methods to the Cart parent class. By the Tax child class there are a total of five methods in the interface. In the bottom of the figure, the decorator pattern demonstrates how the functionality for adding items and tax is separated and can be added separately.

Contributions to R6
In order to implement distr6, several contributions were made to the R6 paradigm, to extend its abilities and to implement the design patterns discussed above.
Abstract classes R6 did not have a concept of abstract classes, which meant that patterns such as adapters, composites, and decorators could not be directly implemented without problems. distr6 provides this concept through the abstract function, which is placed in the first line of all abstract classes.
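One way such a guard could look is sketched below, assuming the R6 package is installed. This is not the distr6 source: abstract_sketch and get_class_name are hypothetical stand-ins for the package's abstract and getR6Class functions, and Shape and Circle are toy classes introduced only for the demonstration.

```r
library(R6)

# Hypothetical stand-in for getR6Class: the first class attribute of an
# R6 object is the name of the class it was constructed from.
get_class_name <- function(obj) class(obj)[1]

# Hypothetical stand-in for abstract: fail if the object being built is
# an instance of the abstract class itself rather than of a sub-class.
abstract_sketch <- function(obj, class) {
  if (get_class_name(obj) == class) {
    stop(sprintf("%s is an abstract class that can't be initialized.", class))
  }
  invisible(obj)
}

Shape <- R6Class("Shape",
  public = list(
    initialize = function() abstract_sketch(self, "Shape"),
    area = function() stop("not implemented")
  )
)

Circle <- R6Class("Circle", inherit = Shape,
  public = list(
    r = NULL,
    initialize = function(r) {
      super$initialize()  # passes the guard: the object is a Circle
      self$r <- r
    },
    area = function() pi * self$r^2
  )
)

# Constructing the abstract class fails; the error message is captured.
res <- tryCatch(Shape$new(), error = function(e) conditionMessage(e))
unit_circle_area <- Circle$new(1)$area()
```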
In the abstract function, obj expects the self argument from R6 classes, class is the name of the class, and getR6Class is a custom function that returns the name of the class of the given object.
When the $decorate method from a constructed decorator object is called, the methods are simply copied from the decorator environment to the object environment. The decorator() function simplifies this for the user.
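The idea of copying methods onto an existing object can be sketched with the shopping cart from the decorator discussion. This is an assumption-laden illustration, not the distr6 mechanism: the Cart class, the tax_decorator list, and decorate_sketch are hypothetical, the class is created with lock_objects = FALSE so that members can be added after construction, and each copied function is given the object's enclosing environment so that self resolves to the decorated object.

```r
library(R6)

# Undecorated class: it knows how to add items and total them, no more.
# lock_objects = FALSE allows new members to be attached later.
Cart <- R6Class("Cart", lock_objects = FALSE,
  public = list(
    items = numeric(0),
    add = function(price) {
      self$items <- c(self$items, price)
      invisible(self)
    },
    total = function() sum(self$items)
  )
)

# A hypothetical decorator, represented as a named list of functions
# that refer to `self` exactly as ordinary R6 methods do.
tax_decorator <- list(
  tax = function(rate = 0.2) self$total() * rate
)

# Copy each decorator function onto the object, re-pointing its
# environment so that `self` is found when the function is called.
decorate_sketch <- function(obj, decorator) {
  for (nm in names(decorator)) {
    fn <- decorator[[nm]]
    environment(fn) <- obj$.__enclos_env__
    obj[[nm]] <- fn
  }
  invisible(obj)
}

cart <- Cart$new()
cart$add(10)$add(5)
decorate_sketch(cart, tax_decorator)
cart_tax <- cart$tax()  # tax on the current total at the default rate
```

Only the decorated object gains the tax method; other Cart objects and the class definition are left unchanged.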

Composite and wrappers
The composite pattern is used in what distr6 calls 'wrappers'. Again, this is implemented via an abstract class (DistributionWrapper) with concrete sub-classes.

> DistributionWrapper
<DistributionWrapper> object generator
  Inherits from: <Distribution>
  Public:
    initialize: function (distlist = NULL, name, short_name, description, support,
    wrappedModels: function (model = NULL)
    setParameterValue: function (..., lst = NULL, error = "warn")
  Private:
    .wrappedModels: list
> TruncatedDistribution
<TruncatedDistribution> object generator
  Inherits from: <DistributionWrapper>
  Public:
    initialize: function (distribution, lower = NULL, upper = NULL)
    setParameterValue: function (..., lst = NULL, error = "warn")
  Private:
    .pdf: function (x, log = FALSE)
    .cdf: function (x, lower.tail = TRUE, log.p = FALSE)
    .quantile: function (p, lower.tail = TRUE, log.p = FALSE)
    .rand: function (n)
Wrappers in distr6 alter objects by modifying either their public or private methods. Therefore an 'unwrapped' distribution looks identical to a 'wrapped' one, despite inheriting from different classes. This is possible via two key implementation strategies: 1) on construction of a wrapper, parameters are prefixed with a unique ID, meaning that all parameters can be accessed at any time; 2) the wrappedModels public method allows access to the original wrapped distributions. These two factors allow any new method to be called either by reference to wrappedModels or by using $getParameterValue with the newly prefixed parameter ID. This is demonstrated in the .pdf private method of the TruncatedDistribution wrapper (slightly abridged):
.pdf = function (

return(pdf) }
As the public pdf is the same for all distributions, and this is inherited by wrappers, only the private .pdf method needs to be altered.
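The division of labour between a shared public pdf and a wrapper-specific private .pdf can be sketched in a few lines of R6. This is a toy illustration, not the distr6 classes: Dist, StdNormal, and Truncated are hypothetical names, the wrapped distribution is a standard Normal, and truncation is assumed to act on the half-open interval (lower, upper] by rescaling the density with the probability mass of that interval.

```r
library(R6)

# Shared parent: the public interface only forwards to private methods.
Dist <- R6Class("Dist",
  public = list(
    pdf = function(x) private$.pdf(x),
    cdf = function(x) private$.cdf(x)
  ),
  private = list(
    .pdf = function(x) stop("not implemented"),
    .cdf = function(x) stop("not implemented")
  )
)

# A simple distribution: only the private methods are supplied.
StdNormal <- R6Class("StdNormal", inherit = Dist,
  private = list(
    .pdf = function(x) dnorm(x),
    .cdf = function(x) pnorm(x)
  )
)

# A wrapper: stores the wrapped model and overrides the private methods.
Truncated <- R6Class("Truncated", inherit = Dist,
  public = list(
    initialize = function(distribution, lower, upper) {
      private$.wrapped <- distribution
      private$.lower <- lower
      private$.upper <- upper
    },
    wrappedModels = function() private$.wrapped
  ),
  private = list(
    .wrapped = NULL, .lower = NULL, .upper = NULL,
    # Probability mass of the wrapped distribution on (lower, upper].
    .mass = function() {
      private$.wrapped$cdf(private$.upper) - private$.wrapped$cdf(private$.lower)
    },
    .pdf = function(x) {
      inside <- x > private$.lower & x <= private$.upper
      ifelse(inside, private$.wrapped$pdf(x) / private$.mass(), 0)
    },
    .cdf = function(x) {
      p <- (private$.wrapped$cdf(x) - private$.wrapped$cdf(private$.lower)) /
        private$.mass()
      pmin(pmax(p, 0), 1)
    }
  )
)

tn <- Truncated$new(StdNormal$new(), lower = -1, upper = 1)
tn$pdf(c(-2, 0, 2))  # same call as for an unwrapped distribution
```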

Examples
This final section looks at concrete short examples for four key use-cases.

Constructing and querying distributions
The primary use-case for the majority of users will be in constructing distributions in order to query their results and visualise their shape.
Below a distribution (Binomial) is constructed and queried for its distribution-specific traits and parameterisation-specific properties.
Specific methods from the distribution are queried as well.
The distribution is visualised by plotting its density, distribution, inverse distribution, hazard, cumulative hazard, and survival function; the output is in figure 2.
> plot(b, fun = "all")

Analysis of empirical data
distr6 can also serve as a toolbox for analysis of empirical data by making use of the three 'empirical' distributions: Empirical, EmpiricalMV, and WeightedDiscrete.
First an empirical distribution is constructed with samples from a standard exponential distribution.

> E <- Empirical$new(samples = rexp(10000))
The summary function is used to quickly obtain key information about the empirical distribution.

The distribution is compared to a (standard) Normal distribution and then a (standard) Exponential distribution; output in figure 3.

Learning from custom distributions
Whilst empirical distributions are useful when data samples have been generated, custom distributions can be used to build an entirely new probability distribution -though here we use a simple discrete uniform distribution. This example highlights the power of decorators to estimate distribution results without manual computation of every possible method. The output demonstrates the precision and accuracy of these results.
Below, a custom distribution is created and by including the decorators argument, all further methods are imputed numerically. The distribution is summarised for properties, traits and common results (this is possible with the 'CoreStatistics' decorator). The summary is identical to the analytic DiscreteUniform distribution.
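The kind of numerical imputation involved can be illustrated in base R. This sketch does not use distr6 or its decorators: it assumes a discrete uniform distribution on the integers 1 to 10 (the support is an assumption made here), takes the probability mass function as the only analytic input, and compares numerically derived results with the textbook formulas for the mean and variance.

```r
# Assumed support of the discrete uniform distribution.
support <- 1:10

# The only analytic ingredient: the probability mass function.
pmf <- function(x) ifelse(x %in% support, 1 / length(support), 0)

# Results imputed numerically by summation over the support.
num_mean <- sum(support * pmf(support))
num_var <- sum((support - num_mean)^2 * pmf(support))
num_cdf <- function(q) {
  vapply(q, function(qi) sum(pmf(support) * (support <= qi)), numeric(1))
}

# Analytic counterparts for a discrete uniform on a, ..., b.
a <- min(support)
b <- max(support)
ana_mean <- (a + b) / 2
ana_var <- ((b - a + 1)^2 - 1) / 12

c(numeric = num_mean, analytic = ana_mean)
c(numeric = num_var, analytic = ana_var)
num_cdf(c(0, 5, 10))
```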

Composite distribution modelling
Composite distributions are an essential part of any distribution software. The following example demonstrates two types of composites: composition via distribution transformation (truncation), and composition via mixtures and vectors.
First, a Binomial distribution is constructed and truncated between 1 and 5; the CDF of the new distribution is queried.
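The cdf of such a truncated distribution can be cross-checked by hand in base R. The sketch below rests on assumptions that are not stated in the text: the Binomial is taken to have size 10 and probability 0.5, and truncation is taken to act on the half-open interval (1, 5], so that the truncated cdf is (F(x) - F(1)) / (F(5) - F(1)) clipped to [0, 1]. The names size, prob, lower, upper, mass, and trunc_cdf are chosen here for illustration.

```r
# Assumed parameters of the untruncated Binomial.
size <- 10
prob <- 0.5
# Truncation limits from the text; the interval is assumed to be (1, 5].
lower <- 1
upper <- 5

# Probability mass of the untruncated distribution on (lower, upper].
mass <- pbinom(upper, size, prob) - pbinom(lower, size, prob)

# Truncated cdf: shift and rescale the original cdf, then clip to [0, 1].
trunc_cdf <- function(q) {
  p <- (pbinom(q, size, prob) - pbinom(lower, size, prob)) / mass
  pmin(pmax(p, 0), 1)
}

trunc_cdf(0:6)
```

Under these assumptions the truncated distribution places all of its mass on 2, 3, 4, and 5, so the cdf is 0 up to 1 and 1 from 5 onwards.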
> V <- VectorDistribution$new(distribution = "Normal",
+   params = data.frame(mean = 1:2))
> V$parameters()
id value support
1: Norm1_mean
Vectorisation is possible across distributions, samples, and both. In the example below, the first call to $pdf evaluates both distributions at (1, 2), the second call evaluates the first at (1) and the second at (2), and the third call evaluates the first at (1, 2) and the second at (3, 4).
Finally a mixture distribution with uniform weights is constructed from a Normal(2, 1) distribution and an Exponential(1).
> MD <- MixtureDistribution$new(
+   list(Normal$new(mean = 2, sd = 1),
+     Exponential$new(rate = 1)
+   )
+ )
> MD$pdf(1:5)

Summary
distr6 introduces a robust and scalable object-oriented interface for probability distributions to R. It officially upgrades the distr family of packages and aims to be the first-stop for object-oriented probability distributions in R. By making use of R6, every implemented distribution is clearly defined with properties, traits, and analytic results. Whilst R stats is limited to very basic dpqr functions for representing evaluated distributions, distr6 ensures that probability distributions are treated as complex mathematical objects.
Future updates of the package will include adding further numerical approximation strategies in the decorators to allow users to choose different methods (instead of being forced to use one). Additionally, the extensions to R6 could be abstracted into an independent package in order to better benefit the R community.
distr6 is released under the MIT licence on GitHub and CRAN. Extended documentation, tutorials, and examples are available at https://alan-turing-institute.github.io/distr6/. Code quality is monitored and maintained by an extensive suite of unit tests on multiple continuous integration systems.