Common Spatial Patterns (CSP) is a widely used method to analyse electroencephalography (EEG) data, concerning the supervised classification of the activity of brain. More generally, it can be useful to distinguish between multivariate signals recorded during a time span for two different classes. CSP is based on the simultaneous diagonalization of the average covariance matrices of signals from both classes and it allows the data to be projected into a low-dimensional subspace. Once the data are represented in a low-dimensional subspace, a classification step must be carried out. The original CSP method is based on the Euclidean distance between signals, and here we extend it so that it can be applied on any appropriate distance for data at hand. Both the classical CSP and the new Distance-Based CSP (DB-CSP) are implemented in an R package, called dbcsp.
Eigenvalue and generalized eigenvalue problems are very relevant
techniques in data analysis. The well-known Principal Component Analysis
with the eigenvalue problem in its roots was already established by the
late seventies (Mardia et al. 1979). In mathematical terms, Common Spatial
Patterns (CSP) is based on the generalized eigenvalue decomposition or
the simultaneous diagonalization of two matrices to find projections in
a low dimensional space. Although in algebraic terms PCA and CSP share
several similarities, their main aims are different: PCA follows a
non-supervised approach but CSP is a two-class supervised technique.
Besides, PCA is suitable for standard quantitative data arranged in
The original CSP method is based on the Euclidean distance between signals. However, as far as we know, a generalization allowing the use of any appropriate distance was not introduced. The aim of the present work is to introduce a novel Distance-Based generalization of it (DB-CSP). This generalization is of great interest, since these techniques can also offer good solutions in other fields where multivariate time series data arise beyond pure electroencephalography data (Poppe 2010; Rodríguez-Moreno et al. 2020).
Although CSP in its classical version is a very well-known technique in the field of BCI, it is not implemented in R. In addition, as DB-CSP is a new extension of it, it is worth building an R package that includes both CSP and DB-CSP techniques. The package offers functions in a user-friendly way for the less familiar users of R but it also offers complete information about its objects so that reproducible analysis can be carried out and more advanced and customised analysis can be performed taking advantage of already well-known packages of R.
The paper is organized as follows. First, we review the mathematical formulation of the Common Spatial Patterns method. Next, we present the core of our contribution describing both the novel CSP’ extension based on distances and the dbcsp package. Then, the main functions in dbcsp are introduced along with reproducible examples of their use. Finally, some conclusions are drawn.
Let us consider that we have
The goal is to classify a new unit
The main idea is to use a linear transform to project or filter data
into low-dimensional subspace with a projection matrix, in such a way
that each row consists of weights for signals. This transformation
maximizes the variance of two-class signal matrices. The method performs
a simultaneous diagonalization of the covariance matrices of both
classes. Given data
All matrices are standardized so that traces of
Compute average covariance matrices:
Look for directions
The solution is given by the generalized spectral decomposition
Vectors
Finally, the log-variability of these new and few
Following the commented ideas, the Distance-Based CSP (DB-CSP) is an
extension of the classical CSP method. In the same way as the classical
CSP, DB-CSP gives some weights to the original sources or signals and
obtains new and few
Once we have the covariance matrices related to the chosen distance
matrix, the directions are found as in classical CSP and new signals
It is important to note that if the chosen distance does not produce a positive definite covariance matrix, it must be replaced by a similar one that is positive definite.
When the selected distance is the Euclidean, then, DB-CSP reduces to classical CSP.
Once the
In this section, the structure of the package and the functions implemented are explained. The dbcsp package was developed for the free statistical R environment and it is available from the Comprehensive R Archive Network (CRAN) at https://cran.r-project.org/web/packages/dbcsp/index.html.
The input data are the corresponding x1
and x2
be two lists of
length NA
values are allowed. They
are imputed by interpolating with the surrounding values via the
na.approx
function in package
zoo. To ensure the user is
aware of the missing values and their imputation, a warning is printed.
We also consider that new items to be classified are in list xt
. The
aforementioned first step of the method is carried out by building the
object called "dbcsp"
.
dbcsp
objectThe dbcsp
object is an S4 class created to compute the projection
vectors
Slots
X1 = "list"
, X2 = "list"
, the lists X1
and X2
(lengths
q = "integer"
, to determine the number of pairs of eigenvectors
q=15
.
labels = "character"
, vector of two strings indicating labels
names, by default names of elements in X1
and X2
.
type = "character"
, to set the type of distance to be considered,
by default type=’EUCL’
. The supported distances are these ones:
infnorm, ccor, sts,...
bhattacharyya, bray,...
dcustom
which returns a scalar
providing the distance value
(type="dcustom"
). The parallelDist package also
allows the use of custom distances, but the distance function
has to be defined using the cppXPtr
function of the
RcppXPtrUtils
package, as is explained in the User-defined distance
functions section of the parallelDist package
documentation.mixture = "logical"
, logical value indicating whether to use
mixture of distances or not (EUCL + other), by default
mixture=FALSE
.
w = "numeric"
, weight for the mixture of distances
w=0.5
.
training = "logical"
, logical value indicating whether or not to
perform the classification, by default classification=FALSE
. If
classification=TRUE
, LDA discrimination based on the log-variances
of the projected sources is considered, following the classical
approach in CSP.
fold = "integer"
, integer value, by default fold=10
. It controls
the number of partitions for the
seed = "numeric"
, numeric value, by default seed=NULL
. Set a
seed in case you want to be able to replicate the results.
eig.tol = "numeric"
, numeric value, by default eig.tol=1e-06
. If
the minimum eigenvalue is below this tolerance, average covariance
matrices are replaced by the most similar matrix that is positive
definite. It is done via function nearPD
in
Matrix and a warning
message is printed to make the user aware of it.
out = "list"
, list containing elements of the output. Mainly,
matrix vectors
, log-variances of filtered signals in proy
and
partitions considered in the
Usage
Following the standard procedure in R
, an instance of a class
dbcsp
is created via the new()
constructor function:
new("dbcsp", X1 = x1, X1 = x2)
Slots X1
and X2
are compulsory since they contain the original
data. When a slot is not specified, the default value is considered.
First, the S4 object of class dbcsp
must be created. By default,
the Euclidean distance is used, nevertheless it can be changed. For
instance, "Dynamic Transform Distance" (Giorgino et al. 2009) can be set:
mydbcsp <- new('dbcsp', X1=x1, X2=x2, type='dtw')
```
or a mixture between this distance and the Euclidean can be
indicated by:
``` r
mydbcsp.mix <- new('dbcsp', X1=x1, X2=x2, labels=c("C1", "C2"),
mixture=TRUE, w=0.4,type="dtw")
Besides, a custom distance function can be defined and used when creating the object:
fn <- function(x, y) mean(1 - cos(x - y))
mydbcsp <- new("dbcsp", X1 = x, X2 = y, type="fn")
It is worth mentioning that it is possible to reduce the
computational time through parallelDist
custom distance option,
where the function is defined using C++ and by creating an external
pointer to the function by means of the cppXPtr
function:
customEucli <- RcppXPtrUtils::cppXPtr(
"double customDist(const arma::mat &A, const arma::mat &B) {
return sqrt(arma::accu(arma::square(A - B)));
}",
depends = c("RcppArmadillo")
)
mydbcsp <- new('dbcsp',x1,x2,type="customEucli")
The object contains all the information to carry out the classification task in a lower dimension space.
plot
and boxplot
For exploratory and descriptive purposes, the original signals
plot(mydbcsp)
x
, an object of class dbcsp
class
, integer to indicate which of both classes to access (1 or
2), by default class=1
.index
, integer to indicate which instance of the class to plot,
by default index=1
.vectors
, integer to indicate which pairs
logical, if TRUE
the pairs pairs=TRUE
.before
logical, if TRUE
the original signals are plotted, by
default before=TRUE
.after
logical, if TRUE
the signals after projection are
plotted, by default after=TRUE
.legend
logical, if TRUE
, a legend for filtered signals is
shown, by default legend=FALSE
.getsignals
logical, if TRUE
, the projected signals are returned.Besides, the log-variances of the projected signals of both classes can be shown in boxplots. This graphic can help to understand the discriminative power that is in the low-dimension space.
boxplot(mydbcsp)
x
, an object of class dbcsp
vectors
, integer or vector of integers, indicating the index of
the projected vectors to plot, by default index=1
.pairs
logical, if TRUE
the pairs pairs=TRUE
.show_log
logical, if TRUE
the logarithms of the variances are
displayed, otherwise the variances, by default show_log=TRUE
.It is worth taking into account that in the aforementioned functions,
values in argument vectors
must lie between 1 and dbcsp
object. Therefore, values 1 to pairs=TRUE
, it is
recommended that values in argument vectors
are in q=15
and
boxplot(object, vectors=16, pairs=FALSE)
, vector
selectQ
, Function train
and Function predict
The functions in this section help the classification step in the
procedure. Function selectQ
helps to find an appropriate dimension
needed for the classification. Given different values of dimensions, the
accuracy related to each dimension is calculated so that the user can
assess which dimension of the reduced space can be sufficient. A
train
performs the Linear Discriminant
classification based on the log-variances of the dimensions built in the
dbcsp
object. Since LDA has a geometric interpretation that makes the
classifier sensible for more general situations (Duda et al. 2001), not the
normality nor the homoscedasticity of data are checked. The accuracy of
the classifier is computed based on the predict
performs the classification of new
individuals.
selectQ
selectQ(mydbcsp)
object
, an object of class dbcsp
Q
, vector of integers which represents the dimensions to use, by
default Q=c(1,2,3,5,10,15)
.train_size
, float between 0.0 and 1.0 representing the proportion
of the data set to include in the train split, by default
train_size=0.75
.CV
, logical indicating whether a TRUE
, train_size
is not
used), by default CV=FALSE
.folds
integer, number of folds to use if CV
is performed.seed
numeric value, by default seed=NULL
. Set a seed in case you
want to be able to replicate the results.This function returns the accuracy values related to each dimension set
in Q
. If CV=TRUE
, the mean accuracy as well as the standard
deviation among folds is also returned.
Usage of train
train(mydbcsp)
or embedded as a parameter in:
new(’dbcsp’, X1=x1, X2=x2, training=TRUE, type="dtw")
Arguments
x
, an object of class dbcsp
selected_q
, integer value indicating the number of vectors to use
when training the model. By default all dimensions considered when
creating the object dbcsp
.
Besides, arguments seed
and fold
are available.
It is important to note that in this way a classical analysis can be carried out, in the sense of:
select_q
;However, it is evident that it may be of interest to use other
classifiers or other characteristics in addition to or different from
log-variances. This more advanced procedure is explained below. See the
basic analysis of the User guide with a real example section in
order to visualize and follow the process of a first basic/classic
analysis.
predict
predict(mydbcsp, X_test=xt)
object
, an object of class dbcsp
X_test
, list of matrices to be classified.true_targets
, optional, if available, vector of true labels of the
instances. Note that they must match the name of the labels used
when training the model.To show an example beyond pure electroencephalography data, Action
Recognition data is considered. Besides having a reproducible example to
show the use of the implemented functions and the results they offer,
this Action Recognition data set is included in the package. The data
set contains the skeleton data extracted from videos of people
performing six different actions, recorded by a semi-humanoid robot. It
consists of a total of 272 videos with 6 action categories. There are
around 45 clips in each category, performed by 46 different people. Each
instance is composed of 50 signals (
Come: gesture for telling the robot to come to you. There are 46 instances for this class.
Five: gesture of ‘high five’. There are 45 instances for this class.
Handshake: gesture of handshaking with the robot. There are 45 instances for this class.
Hello: gesture for saying hello to the robot. There are 44 instances for this class.
Ignore: ignore the robot, pass by. There are 46 instances for this class.
Look at: stare at the robot in front of it. There are 46 instances for this class.
The data set is accessible via AR.data
and more specific information
can be found in (Rodrı́guez-Moreno et al. 2020). Each class is a list of
matrices of
For example, two different classes can be accessed this way:
x1 <- AR.data$come
x2 <- AR.data$five
where, x1
is a list of 46 instances of x2
is a list of 45 instances of
Next, the use of functions in dbcsp
is shown based on this data set.
First a basic/classic analysis is performed.
Let us consider an analysis using 15-dimensional projections and the
Euclidean distance. At a first step the user can obtain vectors
x1 <- AR.data$come
x2 <- AR.data$five
mydbcsp <- new('dbcsp', X1=x1, X2=x2, q=15, labels=c("C1", "C2"))
summary(mydbcsp)
Creating the object mydbcsp
, the vectors q=15
, the first and last 15 eigenvectors are
retained. With summary
, the obtained output is:
There are 46 instances of class C1 with [50x92] dimension.
There are 45 instances of class C2 with [50x92] dimension.
The DB-CSP method has used 15 vectors for the projection.
EUCL distance has been used.
Training has not been performed yet.
Now, if the user knows from the beginning that 3 is an appropriate dimension, the classification step could be done while creating the object. Using classical analysis, with for instance 10-fold, LDA as classifier and log-variances as characteristics, the corresponding input and summary output are:
mydbcsp <- new('dbcsp', X1=x1, X2=x2, q=3, labels=c("C1", "C2"), training=TRUE, fold = 10, seed = 19)
summary(mydbcsp)
There are 46 instances of class C1 with [50x92] dimension.
There are 45 instances of class C2 with [50x92] dimension.
The DB-CSP method has used 3 vectors for the projection.
EUCL distance has been used.
An accuracy of 0.9130556 has been obtained with 10 fold cross validation and using 3 vectors when training.
If a closer view of the accuracies among the folds is needed, the user
can obtain them from the out
slot of the object:
# Accuracy in each fold
mydbcsp@out$folds_acc
# Intances belonging to each fold
mydbcsp@out$used_folds
Furthermore, it is clear that the optimal value of selectQ
function:
mydbcsp <- new('dbcsp', X1=x1, X2=x2, labels=c("C1", "C2"))
selectDim <- selectQ(mydbcsp, seed=30, CV=TRUE, fold = 10)
selectDim
Q acc sd
1 1 0.7663889 0.12607868
2 2 0.9033333 0.09428818
3 3 0.8686111 0.11314534
4 5 0.8750000 0.13289537
5 10 0.8797222 0.09513230
6 15 0.8250000 0.05257433
Since the
To visualize what is the representation in the reduced dimension space
function plot
can be used. For instance, to visualize the first unit
of the first class, based on projections along the 2 first and last
vectors (
plot(mydbcsp, index=1, class=1, vectors=1:2)
In the top graphic of Figure 3, the representation of the
first video of class
To have a better insight of the discriminating power of the new signals
in the reduced dimension space, we can plot the corresponding
log-variances of the new signals. Parameter vectors
in function
boxplot
sets which are the eigenvectors considered to plot.
boxplot(mydbcsp, vectors=1:2)
In Figure 4 it can be seen that variability of
projections on the first eigenvector direction
(x1
, but
small for elements in x2
. Analogously, projecting on the last
dimension (x1
and big variability in x2
. The same pattern holds when
projecting on vectors
Once the value of train()
) so that the user can proceed to predict the class a new
action held in a video belongs to, using the function predict
. For
instance, with only illustrative purpose, we can classify the first 5
videos which are stored in x1
.
mydbcsp <- train(mydbcsp, selected_q=2, verbose=FALSE)
xtest <- x1[1:5]
outpred <- predict(mydbcsp, X_test=xtest)
If the labels of the testing items are known, the latter function returns the accuracy.
outpred <- predict(mydbcsp, X_test=xtest, true_targets= rep("C1", 5))
Finally, notice that the user could use any other distance instead of
the Euclidean between the signals to compute the important directions
type="dtw"
:
# Distance DTW
mydbcsp.dtw <- new('dbcsp', X1=x1, X2=x2, labels=c("C1", "C2"), type="dtw")
In the previous section a basic workflow to use functions implemented in
dbcsp is presented. Nevertheless, it is straightforward to extend the
procedure. Once the interesting directions in dbcsp
, other summarizing characteristics beyond the variance could be
extracted from the projected signals, as well as other classifiers which
could be used in the classification step. For those purposes, dbcsp
is
used to compute the directions in dbcsp
, several characteristics could be extracted from the signals and
a new data.frame
can be built so that any other classification
technique could be applied. In this example we worked with
caret package to apply
different classifiers. It is important to pay attention to which the
train and test sets are, so that the vectors are computed based only on
training set instances.
# Establish training and test data
n1 <- length(x1)
trainind1 <- rep(TRUE, n1)
n2 <- length(x2)
trainind2 <- rep(TRUE, n2)
set.seed(19)
trainind1[sample(1:n1, 10, replace=FALSE)] <- FALSE
trainind2[sample(1:n2, 10, replace=FALSE)] <- FALSE
x1train <- x1[trainind1]
x2train <- x2[trainind2]
# Extract the interesting directions
vectors <- new('dbcsp', X1=x1train, X2=x2train, q=5, labels=c("C1", "C2"))@out$vectors
# Function to calculate the desired characteristics from signals
calc_info <- function(proj_X, type){
values <- switch(type,
'var' = values <- plyr::laply(proj_X, function(x){apply(x,1,var)}),
'max' = values <- plyr::laply(proj_X, function(x){apply(x,1,max)}),
'min' = values <- plyr::laply(proj_X, function(x){apply(x,1,min)}),
'iqr' = values <- plyr::laply(proj_X, function(x){
apply(x,1,function(y){
q <- quantile(y, probs = c(0.25, 0.75))
q[2] -q[1]
})
})
)
return(values)
}
By means of this latter function, besides the variance of the new signals, the maximum, the minimum, and the interquartile range can be extracted.
Next, imagine we want to perform our classification step with the interquartile range information along with the log-variance.
# Project units of class C1 and
projected_x1 <- plyr::llply(x1, function(x,W) t(W)%*%x, W=vectors)
# Extract the characteristics
logvar_x1 <- log(calc_info(projected_x1,'var'))
iqr_x1 <- calc_info(projected_x1,'iqr')
new_x1 <- data.frame(logvar=logvar_x1, iqr=iqr_x1)
# Similarly for units of class C2
projected_x2 <- plyr::llply(x2, function(x,W) t(W)%*%x, W=vectors)
logvar_x2 <- log(calc_info(projected_x2,'var'))
iqr_x2 <- calc_info(projected_x2,'iqr')
new_x2 <- data.frame(logvar=logvar_x2, iqr=iqr_x2)
# Create dataset for classification
labels <- rep(c('C1','C2'), times=c(n1,n2))
new_data <- rbind(new_x1,new_x2)
new_data$label <- factor(labels)
new_data_train <- new_data[c(trainind1, trainind2), ]
new_data_test <- new_data[!c(trainind1, trainind2), ]
# Random forest
trControl <- caret::trainControl(method = "none")
rf_default <- caret::train(label~.,
data = new_data_train,
method = "rf",
metric = "Accuracy",
trControl = trControl)
rf_default
# K-NN
knn_default <- caret::train(label~.,
data = new_data_train,
method = "knn",
metric = "Accuracy",
trControl = trControl)
knn_default
# Predictions and accuracies on test data
# Based on random forest classifier
pred_labels <- predict(rf_default, new_data_test)
predictions_rf <- caret::confusionMatrix(table(pred_labels,new_data_test$label))
predictions_rf
# Based on knn classifier
pred_labels <- predict(knn_default, new_data_test)
predictions_knn <- caret::confusionMatrix(table(pred_labels,new_data_test$label))
predictions_knn
Thus, it is easy to integrate results and objects that dbcsp builds so that they can be integrated with other R packages and functions. This is interesting for more advanced users to perform their own customized analysis.
In this work a new Distance-Based Common Spatial Pattern is introduced. It allows to perform the classical Common Spatial Pattern when the Euclidean distance between signals is considered, but it can be extended to the use of any other appropriate distance between signals as well. All of it is included in package the dbcsp. The package is easy to use for non-specialised users but, for the sake of flexibility, more advanced analysis can be carried out combining the created object and obtained results with already well-known R packages, such as caret, for instance.
This research was partially supported: IR by The Spanish Ministry of Science, Innovation and Universities (FPU18/04737 predoctoral grant). II by the Spanish Ministerio de Economia y Competitividad (RTI2018-093337-B-I00; PID2019-106942RB-C31). CA by the Spanish Ministerio de Economia y Competitividad (RTI2018-093337-B-I00, RTI2018-100968-B-I00) and by Grant 2017SGR622 (GRBIO) from the Departament d’Economia i Coneixement de la Generalitat de Catalunya. BS II by the Spanish Ministerio de Economia y Competitividad (RTI2018-093337-B-I00).
II and CA designed the study. IR and II wrote and debugged the software. IR, II and CA checked the software. II, CA, IR and BS wrote and reviewed the manuscript. All authors have read and approved the final manuscript.
zoo, TSdist, parallelDist, RcppXPtrUtils, Matrix, caret
Econometrics, Environmetrics, Finance, HighPerformanceComputing, MachineLearning, MissingData, NumericalMathematics, TimeSeries
This article is converted from a Legacy LaTeX article using the texor package. The pdf version is the official version. To report a problem with the html, refer to CONTRIBUTE on the R Journal homepage.
Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Rodríguez, et al., "dbcsp: User-friendly R package for Distance-Based Common Spatial Patterns", The R Journal, 2022
BibTeX citation
@article{RJ-2022-044, author = {Rodríguez, Itsaso and Irigoien, Itziar and Sierra, Basilio and Arenas, Concepción}, title = {dbcsp: User-friendly R package for Distance-Based Common Spatial Patterns}, journal = {The R Journal}, year = {2022}, note = {https://rjournal.github.io/}, volume = {14}, issue = {3}, issn = {2073-4859}, pages = {80-94} }