mmpp: A Package for Calculating Similarity and Distance Metrics for Simple and Marked Temporal Point Processes

A simple temporal point process (SPP) is an important class of time series, where the sample realization of the process is solely composed of the times at which events occur. Particular examples of point process data are neuronal spike patterns or spike trains, and a large number of distance and similarity metrics for those data have been proposed. A marked point process (MPP) is an extension of a simple temporal point process, in which a certain vector valued mark is associated with each of the temporal points in the SPP. Analyses of MPPs are of practical importance because instances of MPPs include recordings of natural disasters such as earthquakes and tornadoes. In this paper, we introduce the R package mmpp, which implements a number of distance and similarity metrics for SPPs, and also extends those metrics for dealing with MPPs.


Introduction
A random point process is a mathematical model for describing a series of discrete events (Snyder and Miller, 1991).Let X = {t; t 0 ≤ t ≤ t 0 + T} be the base space, on which an event occurs.The base space can be quite abstract, but here we will take X to be a semi-infinite real line representing time.A set of ordered points on X is denoted as x = {x 1 , x 2 , . . ., x n }, x i ≤ x i+1 , and called a sample realization, or simply realization of a point process.
Reflecting the importance of the analysis of point processes in a broad range of science and engineering problems, there are already some R packages for modeling and simulating point processes such as splancs (Rowlingson and Diggle, 1993), spatstat (Baddeley and Turner, 2005), PtProcess (Harte, 2010), and stpp (Gabriel et al., 2013).These packages support various approaches for the analysis of both simple and marked spatial or spatio-temporal point processes, namely, estimating an intensity function for sample points, visualizing the observed sample process, and running simulations based on the specified models.
To complement the above mentioned packages, in mmpp (Hino et al., 2015), we focus on the similarity or distance metrics between realizations of point processes.Similarity and distance metrics are fundamental notions for multivariate analysis, machine learning and pattern recognition.For example, with an appropriate distance metric, a simple k nearest neighbor classifier and regressor (Cover and Hart, 1967) works in a satisfactory way.Also, kernel methods (Shawe-Taylor and Cristianini, 2004) are a well known and widely used framework in machine learning, in which inferences are done solely based on the values of kernel function, which is considered as a similarity metric between two objects.
As for the distance and similarity metric for point processes, vast amount of methods are developed in the field of neuroscience (Kandel et al., 2000).In this field, neural activities are recorded as sequences of spikes (called spike trains), which is nothing but a realization of a simple point process (SPP).By its nature, the responses of neurons to the same stimulus can be different.To claim the repeatability and the reliability of experimental results, a number of different distance and similarity metrics between sequence of spikes are developed (Victor, 2005;Schreiber et al., 2003;Kreuz et al., 2007;Quian Quiroga et al., 2002;van Rossum, 2001).
The package mmpp categorizes commonly used metrics for spike trains and offers implementations for them.Since a spike train is a realization of a simple point process, the original metrics developed in the field of neuroscience do not consider marked point process (MPP) realizations.mmpp extends conventional metrics for SPPs to MPPs.We have two main aims in the development of mmpp: 1. to have a systematic and unified platform for calculating the similarities and distances between SPPs, and 2. to support MPPs to offer a platform for performing metric-based analysis of earthquakes, tornados, epidemics, or stock exchange data. The

Distances and similarities for point processes
Since realizations of temporal SPPs are ordered sets of the events, the commonly used Euclidean distance and inner product cannot be directly defined between them.Most of the metrics for SPP realizations first transform the realizations x = {x 1 , . . ., x n }, x i ≤ x i+1 and y = {y 1 , . . ., y m }, ; y j ≤ y j+1 to continuous functions v x (t) and v y (t), then define the distance or similarity metric between them.Based on the transformations, we categorize conventional methods for defining metrics on SPP realizations, and explain one by one in the following subsections.
We note that there are some attempts to directly define distances between SPP realizations.One of the most principled and widely used methods is based on the edit distance (Victor, 2005), and this method is extended to deal with MPP realizations by Suzuki et al. (2010).However, this approach is computationally expensive and prohibitive for computing the distance between spike trains with even a few dozen spikes.We exclude this class of metric from the current version of mmpp.
In the following, we often use kernel smoothers and step functions for transforming SPP realizations.For notational convenience, we denote a kernel smoother with parameter τ by h τ (t), and the Heaviside step function by Examples of smoothers include the Gaussian kernel smoother h g τ (t) = exp(−t 2 /(2τ 2 ))/ √ 2πτ 2 and the Laplacian kernel smoother h l τ (t) = exp(−|t|/τ)/(2τ).

Filtering to a continuous function
The most commonly used and intensively studied metrics for spike trains are based on the mapping of an event sequence to a real valued continuous function as Figure 1 illustrates the transformation of an event sequence to a continuous function by the smoothing method.
Then, the inner product of x and y is defined by the usual 2 inner product in functional space by and similarly the distance is defined by When we use the Laplacian smoother h l τ (t) = exp (−|t|/τ) /(2τ), the similarity and distance are The R Journal Vol.7/2, December 2015 ISSN 2073-4859 analytically given as and respectively.This distance eq. ( 6) is adopted by van Rossum (2001) for measuring the distance between spike trains.On the other hand, Schreiber et al. (2003) proposed to use the correlation defined by to measure the similarity between spike trains.This class of measures is extended to take into account the effect of burst, i.e., the short interval in which events occur in high frequency, and refractory period, i.e., the short interval in which events tend to be suppressed immediately after the previous events (Houghton, 2009;Lyttle and Fellous, 2011).These two effects, namely burst and refractory period, are commonly observed in neural activities.They are also observed in earthquake catalogues.
After a large main shock, usually we observe high frequent aftershocks.On the other hand, suppression of events is sometimes happening, possibly because after a big event, the coda is so large that one cannot detect smaller events under the large ongoing signal from the big event (Kagan, 2004;Iwata, 2008).
The filtering-based metric is computed by using the function fmetric in mmpp.The first two arguments S1 and S2 are the (marked) point process realizations in the form of a 'matrix' object.The first column of S1 and S2 are the event timings and the rest are the marks.The argument measure can be either "sim" or "dist", indicating similarity or distance, respectively.By default, the function assumes the Laplacian smoother.When the argument h of function fmetric is set to a function with scaling parameter τ as > fmetric(S1, S2, measure = "sim", h = function(x, tau) exp(-x^2/tau), tau = 1) the integrals in eq.(3) and eq.( 4) are numerically done using the R function integrate.The function h should be square integrable and non-negative.

Intensity inner products
For analysis of point processes, the intensity function plays a central roll.Paiva et al. (2009) proposed to use the intensity function for defining the inner product between SPP realizations.Let N(t) be the number of events observed in the interval (0, t].The intensity function of a counting process N(t) is defined by We note that we can also consider the conditional intensity function reflecting the history up to the current time t, but we only explain the simplest case in this paper.Assuming that the SPP to be analysed is well approximated by a Poisson process, the intensity function is estimated by using a smoother h τ (t) as in non-parametric manner (Reiss, 1993).Using the estimates of intensity functions for processes behind x and y, Paiva et al. ( 2009) defined a similarity metric by Particularly, when we use a Gaussian smoother h g The R Journal Vol.7/2, December 2015 ISSN 2073-4859 analytically computed and we obtain an explicit formula The distance metric is also defined as and it is also simplified when we use the Gaussian smoother.
This class of measures is most in alignment with the statistical model of point processes.We estimated the intensity function in a versatile non-parametric approach, but it is reasonable to use other models such as the Hawkes model (Hawkes, 1971) when we should include the self-exciting nature of the process.
The intensity inner product metric is computed by using the function iipmetric in mmpp.In the current version, the function assumes the Gaussian smoother, and its scaling parameter is specified by the argument tau as > iipmetric(S1, S2, measure = "sim", tau = 1)

Co-occurrence metric
For comparing two SPP realizations, it is natural to count the number of events which can be considered to be co-occurring.There are two metrics for SPP realizations based on the notion of co-occurrence.
The first one proposed by Quian Quiroga et al. (2002) directly counts near-by events.The closeness of two events are defined by adaptively computed thresholds, making the method free from tuning parameters.Suppose we have two SPP realizations x = {x 1 , . . ., x n } and y = {y 1 , . . ., y m }.For any events x i ∈ x and y j ∈ y, a threshold under which the two events are considered to be synchronous with each other is defined as half of the minimum of the four inter event intervals around these two events: We note that τ ij in the above definition depends on x and y, though, for the sake of notational simplicity, we simply denote it by τ ij .Then, the function that counts the number of events in y which is coincided with those in x is defined by Using this counting function, a similarity measure between x and y is defined as and a distance measure is obtained by the transformation (Lyttle and Fellous, 2011): The second metric based on the counting co-occurrence is proposed by Hunter and Milton (2003), which transforms x to a continuous function v x (t), and sums up the near-by events in proportion to their degree of closeness.Denoting the closest event time in y from an event x i ∈ x by y (x i ) , we define a function which measures degree of closeness by Then, a similarity metric between x and y is defined by The R Journal Vol.7/2, December 2015 ISSN 2073-4859 and the distance is naturally defined by The co-occurrence based metrics are computed by using the function coocmetric.By default, the function assumes the counting similarity in eq. ( 16).The smoothed counting similarity is computed by specifying the argument type = "smooth" as > coocmetric(S1, S2, measure = "sim", type = "smooth", tau = 1)

Inter event interval
Assume an SPP realization x = {x 1 , . . ., x n }, x n < T such that for every event time x i , 0 < x i < T, where T is the horizon of the time interval.In the inter event interval proposed by Kreuz et al. (2007), the SPP realization x is first modified to include artificial events corresponding to the beginning and end of the interval as Then each event is mapped to a function v x (t) as Two SPP realizations x and y are transformed to v x (t) and v y (t), then they are used to define an intermediate function This function takes value 1 when x is identical to y, and takes a smaller value when x and y are highly dissimilar.By using this intermediate function, the similarity measure is defined by and the distance is defined by which is originally defined in Kreuz et al. (2007).A simple example of transformation x ⇒ v x (t), y ⇒ v y (t) and x, y ⇒ 1 − I xy (t) is illustrated in Figure 2.

Extension to marked point processes
Sometimes events considered in point processes entail certain vector valued marks.For example, seismic events are characterized by the time point the earthquake happens, and a set of attributes such as magnitude, depth, longitude, and latitude of the hypo-center.To deal with marked point processes, we extend the base space X to X = {t; t 0 ≤ t ≤ t 0 + T} × R p , the Cartesian product of the time interval [t 0 , t 0 + T] and a region of the p dimensional Euclidean space corresponding to marks.Realizations of MPPs are denoted by, for example, x = {(x 1 , r 1 ), . . ., (x n , r n )} and y = {(y 1 , s 1 ), . . ., (y m , s m )}.
There might be many possible ways of extension.The packages mmpp takes the simplest way to deal with marks in a unified and computationally efficient manner, namely, the density or weight of marks are included in the metrics for SPPs by Gaussian windowing as shown in eq. ( 27).

Filtering to continuous function
In the same manner as eq.( 2), the marked point process realization x = {(x 1 , r 1 ), . . ., (x n , r n )} is transformed to a continuous function as The R Journal Vol.7/2, December 2015 ISSN 2073-4859 where where |M| is the determinant of a matrix M, and z 2 M = z Mz.Integrating with respect to both time t and mark z, we define the inner product by and the distance by By virtue of Gaussian the integral with respect to the mark is explicitly written as Furthermore, when we use Laplacian smoother for transforming temporal SPPs, we obtain The distance metric is also calculated in the same manner.
We note that the effect of marks depend on the units used for the various marks.It is reasonable to estimate the variance of each mark, and set the diagonal elements of M to be reciprocal of the variances, which is adopted as the default setting for M in mmpp.

Intensity inner product
Extending kernel estimation eq. ( 9) to multivariate kernel estimation as we obtain the estimate of the intensity function of the marked point process.We note that kernel density estimation for multidimensional variables is inaccurate in general, and we can instead estimate The R Journal Vol.7/2, December 2015 ISSN 2073-4859 the ground intensity function λ(t) and the density function for mark λ(z) separately.However, in many applications, the dimension of marks is not so high, and currently we adopt the kernel based estimator in eq. ( 32).The intensity inner product for MPP realizations is then defined by When we use the Gaussian smoother h g τ (t) = exp −t 2 /τ / √ 2πτ 2 , the integral is explicitly computed and we obtain The distance metric is also defined by For estimating the intensity function, a simple Poisson process is assumed.This assumption is relaxed with more flexible models such as the ETAS model (Ogata, 1988(Ogata, , 1998)), where the intensity function is estimated using the R packages SAPP (of Statistical Mathematics, 2014) and etasFLP (Chiodi and Adelfio, 2015), for example.The extension of the intensity-based metric to support other forms of intensity estimation such as the Hawkes and ETAS models remains our important future work.

Co-occurrence metric
To extend the co-occurrence metric based on counting the synchronous events, eq. ( 15) is replaced with a weighted counter To extend the co-occurrence metric based on the smoothed count of the synchronous events, eq. ( 18) is replaced with a weighted smoothed counter

Inter event interval
To weight the inter event interval by using marks associated with two MPP realizations x and y, we define index extraction operators as follows.We modify an MPP realization x = {(x 1 , r 1 ), . . ., (x n , r n )} to include artificial events and marks corresponding to the beginning and end of the interval as Then we define operators The intermediate function eq.( 23) is modified to take into account the dissimilarity of marks: The distance and similarity are then calculated using eq.( 25) and eq.( 24).
The usage of the functions fmetric, iipmetric, coocmetric, and ieimetric does not change for marked point process data, except for one additional argument M, which is the precision matrix M in The R Journal Vol.7/2, December 2015 ISSN 2073-4859 eq. ( 27).By default, it is automatically set to the diagonal matrix with the diagonal elements equal to the reciprocal of the variance of corresponding marks of S1 and S2.We can also specify the matrix M manually as > fmetric(S1, S2, measure = "sim", M = diag(3)) where the number of marks is assumed to be three.

An example with the Miyagi20030626 data set
This section illustrates the use of the package with a simple experiment.We use the Miyagi20030626 dataset contained within the package.

> library(mmpp) > data(Miyagi20030626)
The dataset is composed of 2305 aftershocks of the 26th July 2003 earthquake of M6.2 at the northern Miyagi-Ken Japan, which is a reparameterization of the main2006JUL26 dataset from the SAPP package.Each record has 5 dimensions, time, longitude, latitude, depth, and magnitude of its hypo-center.
The time is recorded in seconds from the main shock.
To illustrate the use of the package, we consider a simple prediction task.We first split the original dataset by a time-window of length 60 × 60 × 3, which means that the time interval of each partial point process split by this window is three hours.
> ## target of prediction is the maximum magnitude in the window > m <-NULL > for (i in 1:length(sMiyagi)) { + m <-c (m, max(sMiyagi[[i]]$magnitude)) + } The task we consider is the prediction of the maximum magnitude in the next three hours using the past one week of data.We formulate this problem as a regression problem.Let the partial point process realization in the i-th window be o i , and let the maximum magnitude in the i + 1-th window be m i .Then the problem is predicting m i+1 given o i+1 and the past 24

Figure 1 :
Figure 1: An example of continuation by smoothing.Left: event timings are marked with ×.Right: the corresponding continuous function v x (t).

Figure 2 :
Figure 2: Example of the transformation of two point process realizations x and y into the intermediate function I xy .The top panel shows x and corresponding continuous function v x (t).The middle panel shows y and v y (t).The bottom panel shows the intermediate function 1 − I xy (t).
× 7/3 = 56 hours of data {(o i− , m i− )} 55 =0 .See Figure 3 for an illustrative diagram of the problem setting.> m <-m[-1] > sMiyagi[[length(sMiyagi)]] <-NULL > ## number of whole partial MPPs split by a 3-hour time window > N <-length(sMiyagi) > ## training samples are past one week data > Ntr <-24*7/3 > ## number of different prediction methods > Nd <-10 The R Journal Vol.7/2, December 2015 ISSN 2073-4859 An illustrative diagram of the problem setting.The horizontal axis corresponds to time, and the vertical axis shows marks.Though the dimension of the mark is four, it is shown as onedimensional axis.The process is split by a three hour window, and each window is assigned an output variable, which is the maximum magnitude in the next time window.Using the past one week data, the output variable of the next time window is predicted by nearest neighbor regression.