The Complex Multivariate Gaussian Distribution

Abstract:

Here I introduce package cmvnorm, a complex generalization of the mvtnorm package. A complex generalization of the Gaussian process is suggested and numerical results presented using the package. An application in the context of approximating the Weierstrass σ-function using a complex Gaussian process is given.

Cite PDF Tweet

Published

April 23, 2015

Received

Sep 18, 2014

Citation

Hankin, 2015

Volume

Pages

7/1

73 - 80


1 Introduction

Complex-valued random variables find applications in many areas of science such as signal processing , radio engineering , and atmospheric physics . In this short paper I introduce cmvnorm , a package for investigating one commonly encountered complex-valued probability distribution, the complex Gaussian.

The real multivariate Gaussian distribution is well supported in R by package mvtnorm , having density function

(1)f(x;μ,Σ)=e12(xμ)TΣ1(xμ)|2πΣ|xRn,

where |M| denotes the determinant of matrix M. Here, μ=E[X]Rn is the mean vector and Σ=E[(Xμ)(Xμ)T] the covariance of random vector X; we write XNn(μ,Σ). One natural generalization would be to consider ZNCn(μ,Γ), the complex multivariate Gaussian, with density function

(2)f(z;μ,Γ)=e(zμ)Γ1(zμ)|πΓ|zCn,

where z denotes the Hermitian transpose of complex vector z. Now μCn is the complex mean and Γ=E[(Zμ)(Zμ)] is the complex variance; Γ is a Hermitian positive definite matrix. Note the simpler form of ((2)), essentially due to Gauss’s integral operating more cleanly over the complex plane than the real line:

Cezzdz=xRyRe(x2+y2)dxdy=θ=02πr=0er2rdrdθ=π.

A zero mean complex random vector Z is said to be circularly symmetric  if E[ZZT]=0, or equivalently Z and eiαZ have identical distributions for any αR. Equation ((2)) clearly has this property.

Most results from real multivariate analysis have a direct generalization to the complex case, as long as “transpose” is replaced by “Hermitian transpose”. For example, XNn(0,Σ) implies BXNn(0,BTΣB) for any constant matrix BRm×n, and analogously ZNCn(0,Γ) implies BZNCn(0,BΓB), BCm×n. Similar generalizations operate for Schur complement methods on partitioned matrices.

Also, linear regression generalizes similarly. Specifically, consider yRn. If y=Xβ+ϵ where X is a n×p design matrix, βRp a vector of regression coefficients and ϵNn(0,Σ) is a vector of errors, then β^=(XTΣ1X)1XTΣ1y is the maximum likelihood estimator for β. The complex generalization is to write z=Zβ+ϵ, ZCn×p, βCp, ϵNCn(0,Γ) which gives β^=(ZΓ1Z)1ZΓ1z. Such considerations suggest a natural complex generalization of the Gaussian process.

This short vignette introduces the cmvnorm package which furnishes some functionality for the complex multivariate Gaussian distribution, and applies it in the context of a complex generalization of the emulator package , which implements functionality for investigating (real) Gaussian processes.

2 The package in use

Random complex vectors are generated using the rcmvnorm() function, analogous to rmvnorm():

> set.seed(1)
> library("cmvnorm", quietly = TRUE)
> cm <- c(1, 1i)
> cv <- matrix(c(2, 1i, -1i, 2), 2, 2)
> (z <- rcmvnorm(6, mean = cm, sigma = cv))
                     [,1]                  [,2]
[1,] 0.9680986+0.5525419i  0.0165969+2.9770976i
[2,] 0.2044744-1.4994889i  1.8320765+0.8271259i
[3,] 1.0739973+0.2279914i -0.7967020+0.1736071i
[4,] 1.3171073-0.9843313i  0.9257146+0.5524913i
[5,] 1.3537303-0.8086236i -0.0571337+0.3935375i
[6,] 2.9751506-0.1729231i  0.3958585+3.3128439i

Function dcmvnorm() returns the density according to ((2)):

> dcmvnorm(z, cm, cv)
[1] 5.103754e-04 1.809636e-05 2.981718e-03 1.172242e-03 4.466836e-03 6.803356e-07

So it is possible to determine a maximum likelihood estimate for the mean using direct numerical optimization

> helper <- function(x) c(x[1] + 1i * x[2], x[3] + 1i * x[4]) 
> objective <- function(x, cv) 
+   -sum(dcmvnorm(z, mean = helper(x), sigma = cv, log = TRUE))
> helper(optim(c(1, 0, 1, 0), objective, cv = cv)$par)
[1] 1.315409-0.447863i 0.385704+1.372762i

(helper functions are needed because optim() optimizes over Rn as opposed to Cn). This shows reasonable agreement with the true value of the mean and indeed the analytic value of the MLE, specifically

> colMeans(z)
[1] 1.315426-0.447472i 0.386068+1.372784i

3 The Gaussian process

In the context of the emulator, a (real) Gaussian process is usually defined as a random function η:RpR which, for any set of points {x1,,xn} in its domain D the random vector {η(x1), , η(xn)} is multivariate Gaussian.

It is convenient to specify E[η(x)|β]=h(x)β, that is, the expectation of the process at point xD conditional on the (unknown) vector of coefficients β. Here h:RpRq specifies the q known regressor functions of x=(x1,,xp)T; a common choice is h(x)=(1,x1,,xp)T [giving q=p+1], but one is in principle free to choose any function of x. One writes HT=(h(x1),,h(xn)) when considering the entire design matrix X; the R idiom is regressor.multi().

The covariance is typically given by cov(η(x),η(x))=V(xx), where V:RnR must be chosen so that the variance matrix of any finite set of observations is always positive-definite. Bochner’s theorem  shows that V() must be proportional to the characteristic function (CF) of a symmetric probability Borel measure.

uses techniques which have clear complex analogues to show that the posterior mean of η(x) is given by

(3)h(x)Tβ+(cov(x,x1),,cov(x,xn))TA1(yHβ^).

Here A is an n×n matrix of correlations between the observations, σ2Aij=cov(η(xi),η(xj)) where σ2 is an overall variance term; and β^=(XTA1X)1XTA1y is the maximum likelihood estimator for β.

Equation ((3)) furnishes a cheap approximation to η(x) and is known as the ‘emulator’.

Complex Gaussian processes

The complex case is directly analogous, with η:CpC and βCq. Writing cov(η(z1),,η(zn)) =Ω, so that element (i,j) of matrix Ω is cov(η(zi),η(zj)), we may relax the requirement that Ω be symmetric positive definite to requiring only Hermitian positive definiteness. This allows one to use the characteristic function of any, possibly non-symmetric, random variable Ψ with density function f:RpR and characteristic function ϕ:

Ωij=cov(η(zi),η(zj))=ϕ(zizj).

That Ω remains Hermitian positive definite may be shown by evaluating a quadratic form with it and arbitrary wCn and establishing that it is real and non-negative:

wΩw=i,jwicov(η(zi),η(zj))wj definition of quadratic form=i,jwiϕ(zizj)wj covariance function is the CF of Ψ=i,jwi[tCneiRet(zizj)f(t)dt]wj definition of CF of Ψ=tCn[i,jwieiRet(zizj)wjf(t)]dtintegration and summation commute=tCn[i,jwieiRe(tzi)wjeiRe(tzj)f(t)]dt expand and rearrange=tCn|iwieiRe(tzi)|2f(t)dtalgebra0.integral of sum of real positive functions

(This motivates the definition of the characteristic function of a complex multivariate random variable Z as E[eiRe(tZ)]). Thus the covariance matrix is Hermitian positive definite: although its entries are not necessarily real, its eigenvalues are all nonnegative.

In the real case one typically chooses Ψ to be a zero-mean Gaussian distribution; in the complex case one can use the complex multivariate distribution given in equation ((2)) which has characteristic function

(4)exp(iRe(tμ)14tΓt)

and following  in writing B=Γ/4, we can write the variance matrix as a product of a (real) scalar σ2 term and

(5)c(t)=exp(iRe(tμ)tBt).

Thus the covariance matrix Ω is given by

Ωij=cov(η(zi),η(zj))=σ2c(zizj).

In ((5)), B has the same meaning as in conventional emulator techniques and controls the modulus of the covariance between η(z) and η(z); μ governs the phase.

Given the above, it seems to be reasonable to follow  and admit only diagonal B; but now distributions with nonzero mean can be considered (compare the real case which requires a zero mean). A parametrization using diagonal B and complex mean vector requires 3p (real) hyperparameters; compare 2p if Cp is identified with R2p.

4 Functions of several complex variables

Analytic functions of several complex variables are an important and interesting class of objects; motivates and discusses the discipline. Formally, consider f:CnC, n2 and write f(z1,,zn). Function f is analytic if it satisfies the Cauchy-Riemann conditions in each variable separately, that is f/zj=0, 1jn.

Such an f is continuous (due to a “non-trivial theorem of Hartogs”) and continuously differentiable to arbitrarily high order. goes on to state some results which are startling if one’s exposure to complex analysis is restricted to functions of a single variable: for example, any isolated singularity is removable.

5 Numerical illustration of these ideas

The natural definition of complex Gaussian processes above, together with the features of analytic functions of several complex variables, suggests that a complex emulation of analytic functions of several complex variables might be a useful technique.

The ideas presented above, and the cmvnorm package, can now be used to sample directly from an appropriate complex Gaussian distribution and estimate the roughness parameters:

> val <- latin.hypercube(40, 2, names = c("a", "b"), complex = TRUE)
> head(val)
                  a              b
[1,] 0.7375+0.2375i 0.2375+0.7125i
[2,] 0.6875+0.5875i 0.1375+0.3375i
[3,] 0.4625+0.5375i 0.9875+0.5875i
[4,] 0.7875+0.0625i 0.0625+0.7875i
[5,] 0.3875+0.0375i 0.5875+0.7625i
[6,] 0.2125+0.5625i 0.7625+0.9625i

(function latin.hypercube() is used to generate a random complex design matrix). We may now specify a variance matrix using simple values for the roughness hyperparameters B=(1002) and μ=(1,i)T:

> true_scales <- c(1, 2)
> true_means <- c(1, 1i)
> A <- corr_complex(val, means = true_means, scales = true_scales)
> round(A[1:4, 1:4], 2)
           [,1]       [,2]       [,3]       [,4]
[1,] 1.00+0.00i 0.59-0.27i 0.25-0.10i 0.89+0.11i
[2,] 0.59+0.27i 1.00+0.00i 0.20+0.00i 0.42+0.26i
[3,] 0.25+0.10i 0.20+0.00i 1.00+0.00i 0.10+0.06i
[4,] 0.89-0.11i 0.42-0.26i 0.10-0.06i 1.00+0.00i

Function corr_complex() is a complex generalization of corr(); matrix A is Hermitian positive-definite:

> all(eigen(A)$values > 0)
[1] TRUE

It is now possible to make a single multivariate observation d of this process, using β=(1,1+i,12i)T:

> true_beta <- c(1, 1+1i, 1-2i)
> d <- drop(rcmvnorm(n = 1, mean = regressor.multi(val) %*% true_beta, sigma = A))
> head(d)
[1] 3.212719+1.594901i 1.874278+0.345517i 3.008503-0.767618i 3.766526+2.071882i
[5] 3.712913+0.800983i 3.944167+0.924833i

Thus d is a single observation from a complex multivariate Gaussian distribution. Most of the functions of the emulator package operate without modification. Thus betahat.fun(), which calculates the maximum likelihood estimate β^=(HA1H)1HA1y takes complex values directly:

> betahat.fun(val, solve(A), d)
              const                   a                   b 
0.593632-0.0128655i 0.843608+1.0920437i 1.140372-2.5053751i

However, because the likelihood function is different, the interpolant() functionality is implemented in the cmvnorm package by interpolant.quick.complex(), named in analogy to function interpolant.quick() of package emulator.

For example, it is possible to evaluate the posterior distribution of the process at (0.5,0.3+0.1i), a point at which no observation has been made:

> interpolant.quick.complex(rbind(c(0.5, 0.3+0.1i)), d, val, solve(A), 
+   scales = true_scales, means = true_means, give.Z = TRUE)
$mstar.star
[1] 1.706402-1.008601i

$Z
[1] 0.203295

$prior
[1] 1.608085-0.104419i

Thus the posterior distribution for the process is complex Gaussian at this point with a mean of about 1.711.01i and a variance of about 0.2.

Analytic functions

These techniques are now used to emulate an analytic function of several complex variables. A complex function’s being analytic is a very strong restriction; uses ‘rigidity’ to describe the severe constraint that analyticity represents.

Here the Weierstrass σ-function  is chosen as an example, on the grounds that  consider it to be a typical entire function in a well-defined sense. The elliptic package  is used for numerical evaluation.

graphic without alt text
Figure 1: Visualization of the Weierstrass σ-function, specifically σ(z;2+i,2.2+1.1i) in the region of the complex plane 4Re(z),Im(z)+4; visualization is scheme 13 of .
graphic without alt text
Figure 2: Visualization of the Weierstrass σ-function, specifically σ(6+1i;g1=z,g2=1) in the region of the complex plane 4Re(z),Im(z)+4; visualization is scheme 8 of .

The σ-function takes a primary argument z and two invariants g1,g2, so a three-column complex design matrix is required:

> library("elliptic")
> valsigma <- 2 + 1i + round(latin.hypercube(30, 3, 
+   names = c("z", "g1", "g2"), complex = TRUE)/4, 2)
> head(valsigma)
              z         g1         g2
[1,] 2.17+1.15i 2.09+1.22i 2.21+1.09i
[2,] 2.11+1.01i 2.04+1.03i 2.25+1.15i
[3,] 2.10+1.04i 2.15+1.00i 2.22+1.20i
[4,] 2.13+1.10i 2.24+1.21i 2.01+1.16i
[5,] 2.20+1.00i 2.20+1.06i 2.08+1.08i
[6,] 2.05+1.10i 2.19+1.04i 2.11+1.03i

(an offset is needed because σ(z,g1,g2)=z+O(z5)). The σ-function can now be evaluated at the points of the design matrix:

> dsigma <- apply(valsigma, 1, function(u) sigma(u[1], g = u[2:3]))

One way of estimating the roughness parameters is to use maximum likelihood. The likelihood for any set of roughness parameters is given by  as (σ2)nq2|A|1/2|HTA1H|1/2 with complex generalization (σ2)(nq)|A|1|HA1H|1 which is calculated in the package by function scales.likelihood.complex(); this can be used to return the log-likelihood for a specific set of roughness parameters:

> scales.likelihood.complex(scales = c(1, 1, 2), means = c(1, 1+1i, 1-2i), 
+   zold = valsigma, z = dsigma, give_log = TRUE)
[1] 144.5415

Numerical methods can then be used to find the maximum likelihood estimate. Because function optim() optimizes over Rn, helper functions are again needed which translate from the optimand to scales and means:

> scales <- function(x) exp(x[c(1, 2, 2)])
> means <-  function(x) x[c(3, 4, 4)] + 1i * x[c(5, 6, 6)]

Because the diagonal elements of B are strictly positive, their logarithms are optimized, following ; it is implicitly assumed that the scales and means associated with g1 and g2 are equal.

> objective <- function(x, valsigma, dsigma) 
+   -scales.likelihood.complex(scales = scales(x), means = means(x), 
+     zold = valsigma, z = dsigma)
> start <- c(-0.538, -5.668, 0.6633, -0.0084, -1.73, -0.028)
> jj <- optim(start, objective, valsigma = valsigma,  dsigma = dsigma, 
+   method = "SANN", control = list(maxit = 100))
> (u <- jj$par)
[1] -0.5380 -5.6680  0.6633 -0.0084 -1.7300 -0.0280

Function corr_complex() may now be used to calculate the covariance of the observations:

> Asigma <- corr_complex(z1 = valsigma, scales = scales(u), means = means(u))

So now we can compare the emulator against the “true” value:

> interpolant.quick.complex(rbind(c(2+1i, 2+1i, 2+1i)), zold = valsigma,
+   d = dsigma, Ainv = solve(Asigma), scales = scales(u), means = means(u))
[1] 3.078956+1.259993i
> sigma(2 + 1i, g = c(2 + 1i, 2 + 1i))
[1] 3.078255+1.257819i

showing reasonable agreement. It is also possible to test the hypothesis HR:μR2 (that is, the variance matrix A is real), by calculating the likelihood ratio of the unconstrained model ((5)) to that obtained by HR. This may be achieved by constraining the optimization to satisfy μR2:

> ob2 <- function(x, valsigma, dsigma) 
+   -scales.likelihood.complex(scales = scales(x), means = c(0, 0, 0), 
+     zold = valsigma, z = dsigma)
> jjr <- optim(u[1:2], ob2, method = "SANN", control = list(maxit = 1000), 
+   valsigma = valsigma, dsigma = dsigma)
> (ur <- jjr$par)
[1]  0.2136577 -4.2640825

so the test statistic D is given by

> LR <- scales.likelihood.complex(scales = scales(ur), means = c(0, 0, 0), 
+   zold = valsigma, z = dsigma)
> LC <- scales.likelihood.complex(scales = scales(u), means = means(u), 
+   zold = valsigma, z = dsigma)
> (D <- 2 * (LC - LR))
[1] 22.17611

Observing that D is in the tail region of its asymptotic distribution, χ32, the hypothesis HR may be rejected.

6 Conclusions

The cmvnorm package for the complex multivariate Gaussian distribution has been introduced and motivated. The Gaussian process has been generalized to the complex case, and a complex generalization of the emulator technique has been applied to an analytic function of several complex variables. The complex variance matrix was specified using a novel parameterization which accommodated non-real covariances in the context of circulary symmetric random variables. Further work might include numerical support for the complex multivariate Student t distribution.

CRAN packages used

cmvnorm, mvtnorm, emulator

CRAN Task Views implied by cited packages

Distributions, Finance

Note

This article is converted from a Legacy LaTeX article using the texor package. The pdf version is the official version. To report a problem with the html, refer to CONTRIBUTE on the R Journal homepage.

Footnotes

    References

    K. Chandrasekharan. Elliptic functions. Springer-Verlag, 1985.
    W. Feller. An introduction to probability theory and its applications. Wiley, 1971.
    A. Genz, F. Bretz, T. Miwa, X. Mi, F. Leisch, F. Scheipl and T. Hothorn. mvtnorm: Multivariate normal and t distributions. 2014. URL http://CRAN.R-project.org/package=mvtnorm. R package version 1.0-0.
    N. R. Goodman. Statistical analysis based on a certain multivariate complex Gaussian distribution (an introduction). The Annals of Mathematical Statistics, 34(1): 152–177, 1963.
    R. K. S. Hankin. Cmvnorm: The complex multivariate gaussian distribution. 2015. URL http://CRAN.R-project.org/package=cmvnorm. R package version 1.0-2.
    R. K. S. Hankin. Introducing BACCO, an R bundle for Bayesian analysis of computer code output. Journal of Statistical Software, 14(16): 1–20, 2005.
    R. K. S. Hankin. Introducing elliptic, an R package for elliptic and modular functions. Journal of Statistical Software, 15(7): 1–22, 2006.
    R. K. S. Hankin. Introducing multivator: A multivariate emulator. Journal of Statistical Software, 46(8): 1–20, 2012.
    S. Kay. Modern spectral analysis. Prentice-Hall, 1989.
    S. G. Krantz. What is several complex variables? The American Mathematical Monthly, 94(3): 236–256, 1987.
    J. E. Littlewood and A. C. Offord. On the distribution of zeros and a-values of a random integral function (II). The Annals of Mathematics, 49(4): 885–952, 1948.
    D. P. Mandic, S. Javidi, S. L. Goh, A. Kuh and K. Aihara. Complex-valued prediction of wind profile using augmented complex statistics. Renewable Energy, 34: 196–201, 2009.
    T. Needham. Visual complex analysis. Clarendon Press, Oxford, 2004.
    J. Oakley. Bayesian uncertainty analysis for complex computer codes. 1999.
    L. H. Ozarow. Information theoretic considerations for cellular mobile radio. IEEE Transactions on Vehicular Technology, 43(2): 359–378, 1994.

    Reuse

    Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

    Citation

    For attribution, please cite this work as

    Hankin, "The Complex Multivariate Gaussian Distribution", The R Journal, 2015

    BibTeX citation

    @article{RJ-2015-006,
      author = {Hankin, Robin K. S.},
      title = {The Complex Multivariate Gaussian Distribution},
      journal = {The R Journal},
      year = {2015},
      note = {https://rjournal.github.io/},
      volume = {7},
      issue = {1},
      issn = {2073-4859},
      pages = {73-80}
    }