The mpoly package is a general purpose collection of tools for symbolic computing with multivariate polynomials in R. In addition to basic arithmetic, mpoly can take derivatives of polynomials, compute bases of collections of polynomials, and convert polynomials into a functional form to be evaluated. Among other things, it is hoped that mpoly will provide an R-based foundation for the computational needs of algebraic statisticians.
At a basic level, R is not designed for symbolic computing. Unlike computer algebra systems founded on rational arithmetic and representations like Mathematica (Wolfram Research 2012) and Maple (Maplesoft 2012), R’s dependence on floating-point arithmetic naturally lends itself to its purpose—data analysis, numerical optimization, large(ish) data and so on. Nevertheless, the advent of algebraic statistics and its contributions to asymptotic theory in statistical models, experimental design, multi-way contingency tables, and disclosure limitation has increased the need for R – as a practical tool for applied statistics – to be able to do some relatively basic operations and routines with multivariate polynomials (Diaconis and Sturmfels 1998; Drton et al. 2009). mpoly is designed to try to fill this void entirely within R (Kahle 2012). This article is intended to demonstrate its capabilities to a discipline/specialty-neutral audience.
A bit more specifically, the mpoly package is a general purpose collection of tools for symbolic computing with multivariate polynomials in R. While new, it is not the first package proposed to deal with multivariate polynomials. We therefore begin with a discussion of the current package intended to fulfill this need, multipol, in order to motivate the need for and capabilities of mpoly (Hankin 2010).
R currently has three packages which deal with polynomials – polynom, PolynomF, and multipol. The first two (the second being a reboot of the first) deal exclusively with univariate polynomials and as such are not suitable for the general purpose needs cited in the introduction (Venables et al. 2009; Venables 2010). The third, multipol, implements multivariate polynomials in a natural way suitable for some traditional applications (e.g. simple interactions in generalized linear models) but exhibits two limitations which impede its more general use and suggest a fresh re-envisioning of multivariate polynomials in R is needed (Hankin 2008). The first limitation has to do with the impossibility of polynomial arithmetic, and the second has to do with storing sparse polynomials.
multipol is an implementation of multivariate polynomials based on the
S3 class object "multipol"
, an unnamed multidimensional array.
Consequently, multipol is fundamentally dependent on arrays as its
basic data structure. This is evident in the construction of a
"multipol"
object:
> library("multipol")
: abind
Loading required package
: 'multipol'
Attaching package
object(s) are masked from
The following 'package:base':
single
> a <- as.multipol(array(1:12, c(2,3,2)))
> a
^0
, , z
^0 y^1 y^2
y^0 1 3 5
x^1 2 4 6
x
^1
, , z
^0 y^1 y^2
y^0 7 9 11
x^1 8 10 12 x
In normal notation, a moment’s consideration reveals that the polynomial
defined above is
\[1 + 2x + 3y + 4xy + 5y^2 + 6xy^2 + 7z + 8xz + 9yz + 10xyz + 11y^2z + 12xy^2z\]
in the ring \(R[x,y,z]\). (A similar pretty printing can be enabled by
setting options(showchars = TRUE)
.) Thus, the elements of the array
are the coefficients on the terms appearing in the polynomial. From
here, the limitations are immediately accessible.
1. Notice that the definition of a multivariate polynomial in the
example above makes no reference to the variable names x
and
y
—they are determined when the print
method is called. In other
words, a "multipol"
object has no notion of variables whatsoever, the
variables are only generated when asked for via the print
method for
class "multipol"
. This has profound implications on polynomial
arithmetic in multipol. In particular, arithmetic is only possible
if two polynomials have the same number of dimensions (variables), and
is only meaningful if those dimensions represent the same variables in
the same order, the latter of which is never checked because there are
no names specified which can be checked.
For example, suppose one wants to add or multiply the polynomials
\[a = 1 + 2x + 3y + 4xy \tag{1} \]
and
\[b = 1 + 2x + 3y + 4xy + 5z + 6xz + 7yz + 8xyz, \tag{2} \]
both in \(R[x,y,z]\). Mathematically speaking, it is obvious that the sums and products are well-defined. However, they cannot be computed because they have a different number of variables:
> a <- multipol( array(1:4, c(2,2)) )
> b <- multipol( array(1:8, c(2,2,2)) )
> a + b
: length(dima) == length(dimb) is not TRUE
Error> a * b
: length(dim(a)) == length(dim(b)) is not TRUE Error
If the polynomials have the same number of variables, the arithmetic may be incorrect if those variables are different:
> options(showchars = TRUE)
> ( a <- multipol( as.array( c("x^0" = 0, "x^1" = 1) )))
1] 1*x^1
[> ( b <- multipol( as.array( c("y^0" = 0, "y^1" = 1) )))
1] 1*x^1
[> a + b
1] 2*x^1 [
The result would seem to indicate that \(a + b = x + y = 2x\), which is clearly in error. The location of the problem can easily be found however since we printed the polynomials at each step—regardless of our attempt to force (in a simple way) the labeling of the variables, we achieve the same incorrect result.
2. The array representation does not allow for sparse polynomials. For our purposes, a sparse (multivariate) polynomial of multidegree \(\mathbf{d}\) is one which has "few" terms \(c_{\mathbf{d}'}\mathbf{x}^{\mathbf{d}'}\) with multidegree \(\mathbf{d}' \leq \mathbf{d}\) (element-wise) with nonzero coefficients. As an example, consider the polynomial
\[a b^2 + b c^2 + c d^2 + \cdots + y z^2 + z a^2. \tag{3} \]
Since multipol represents multivariate polynomials with arrays, the
representation requires a 26 dimensional array (a dimension for each
variable) each with three levels (e.g. a^0
, a^1
, a^2
). This
amounts to an array with \(3^{26} = 2,541,865,828,329\) cells, all but 26
of which are nonzero—a whopping 20.33TB of space if the coefficients
are stored in a double-precision floating point format.
This section has introduced two shortcomings of multipol which significantly limit its potential use for most practical applications. It should be noted, however, that multipol was not intended for use with "large" polynomials. It was built with enumerative combinatorics in mind, and its inefficiency with larger polynomials was readily acknowledged in Hankin (2008), where it is suggested that perhaps a sparse array representation (as is available in Mathematica) or outsourcing to C++ routines could alleviate the burdens of the array representation. Its inefficiency for different purposes should therefore not be considered a flaw so much as outside the package’s scope. Nevertheless, the arithmetic and storage problems caused by the array representation are fatal limitations for any problem which deals with large polynomial rings (i.e., ones with many variables). Enter mpoly.
mpoly is a complete reenvisioning of how multivariate polynomials and symbolic computing with multivariate polynomials should be implemented in R. Unlike multipol, mpoly uses as its most basic data structure the list. This fundamental change allows us to dispense of both issues with multipol discussed in the previous subsection.
In mpoly, an "mpoly"
object is an S3 class object which is a list.
The elements of the list are each named numeric vectors, with unique
names including coef
. Naturally, each element of an "mpoly"
object
corresponds to a term in the polynomial. The names of the elements
correspond to the variables, and the values to the degrees; coef
is
the coefficient of the term. Constructing an "mpoly"
object in this
way is very straightforward using the constructor mpoly
.
> library("mpoly")
> termList <- list(
+ c(coef = 1),
+ c(x = 10, coef = 2),
+ c(x = 2, coef = 3),
+ c(y = 5, coef = 4),
+ c(x = 1, y = 1, coef = 5))
> termList
1]]
[[
coef 1
2]]
[[
x coef 10 2
3]]
[[
x coef 2 3
4]]
[[
y coef 5 4
5]]
[[
x y coef 1 1 5
> poly <- mpoly(termList)
> class(poly)
1] "mpoly" [
Unlike multivariate polynomials in multipol, those in mpoly not only
have variable names but also an intrinsic variable order which is taken
to be the unique names of elements of the "mpoly"
object minus
coef
.1 This can be seen with the vars
function.
> vars(poly)
1] "x" "y" [
Viewing a multivariate polynomial as a list is a cumbersome task. To
make things easier, a print
method for "mpoly"
objects exists and is
dispatched when the object is queried by itself.
> poly
1 + 2 x^10 + 3 x^2 + 4 y^5 + 5 x y
One of the important considerations in polynomial algebra is the
ordering of the terms of a multivariate polynomial. Notice the order of
the terms presented in the printed version of the "mpoly"
object; it
is the order in which the terms are coded in the "mpoly"
object
itself. This can be changed in either of two ways.
First, it can be changed via the print
method, which accepts arguments
order
for the total order used (lexicographic, graded lexicographic,
and graded reverse lexicographic) and varorder
for a variable order
different than the intrinsic order. When an order is requested but a
variable order is not specified, the method messages the user to alert
them to the intrinsic variable order being used.2
> print(poly, order = "lex")
- x, y
using variable ordering 2 x^10 + 3 x^2 + 5 x y + 4 y^5 + 1
> print(poly, order = "grlex")
- x, y
using variable ordering 2 x^10 + 4 y^5 + 3 x^2 + 5 x y + 1
> print(poly, order = "lex", varorder = c("y", "x"))
4 y^5 + 5 y x + 2 x^10 + 3 x^2 + 1
> print(poly, order = "glex", varorder = c("y", "x"))
2 x^10 + 4 y^5 + 5 y x + 3 x^2 + 1
Second, the elements of the "mpoly"
object can be reordered to create
a new "mpoly"
object using the reorder
method.
> poly
1 + 2 x^10 + 3 x^2 + 4 y^5 + 5 x y
> (poly2 <- reorder(poly, order = "lex"))
- x, y
using variable ordering 2 x^10 + 3 x^2 + 5 x y + 4 y^5 + 1
> unclass(poly2)
1]]
[[
x coef 10 2
2]]
[[
x coef 2 3
3]]
[[
x y coef 1 1 5
4]]
[[
y coef 5 4
5]]
[[
coef 1
mpoly
vs mp
The major workhorse of the package is the constructor mpoly
itself. In
particular, polynomial (combining of like terms) and (combining of
coefficients and like variables within terms) are both performed when
the multivariate polynomials are constructed with mpoly
.
> termList <- list( c(x = 1, coef = 1), c(x = 1, coef = 2) )
> mpoly(termList) # x + 2x
3 x
> termList <- list( c(x = 5, x = 2, coef = 5, coef = 6, y = 0) )
> mpoly(termList) # x^5 * x^2 * 5 * 6 * y^0
30 x^7
While the constructor mpoly
is nice, it is inconvenient to have to
keep specifying lists in order to define polynomials. The mp
function
was designed for this purpose and is intended to make defining
multivariate polynomials quick and easy by taking them in as character
strings and parsing them into "mpoly"
objects.
> ( p <- mp("10 x + 2 y 3 + x^2 5 y") )
10 x + 6 y + 5 x^2 y
> is(p, "mpoly")
1] TRUE
[> unclass(p)
1]]
[[
x coef 1 10
2]]
[[
y coef 1 6
3]]
[[
x y coef 2 1 5
This parsing is a nontrivial process and depends heavily on the
specification of the polynomial in the string. The mp
function must
first determine the variables that the user is specifying (which must be
separated by spaces for disambiguation) and then construct the list to
send to mpoly
to construct the object. Because it is passed through
mpoly
, the "mpoly"
object returned by mp
is reduced and
regularized.
> mp("x^2 + 10 x 6 x + 10 x 6 x y y 2")
61 x^2 + 120 x^2 y^2
mpoly has much more to offer than simply defining polynomials. Methods
are available for addition, subtraction, multiplication, exponentiation
and equality as well. Moreover, since "mpoly"
objects know their
variable names intrinsically, we can perform arithmetic with whichever
polynomials we like. For example, the arithmetic with \(a\) and \(b\) from
((1)) and ((2)) is easy –
> a <- mp("1 + 2 x + 3 y + 4 x y")
> b <- mp("1 + 2 x + 3 y + 4 x y + 5 z + 6 x z + 7 y z + 8 x y z")
> a + b
2 + 4 x + 6 y + 8 x y + 5 z + 6 x z + 7 y z + 8 x y z
> b - a
5 z + 6 x z + 7 y z + 8 x y z
> a * b
1 + 4 x + 6 y + 20 x y + 5 z + 16 x z + 22 y z + 60 x y z + 4 x^2
+ 16 x^2 y + 12 x^2 z + 40 x^2 y z + 9 y^2 + 24 x y^2 + 21 y^2 z
+ 52 x y^2 z + 16 x^2 y^2 + 32 x^2 y^2 z
Exponentiation and equality are also available.
> a^2
1 + 4 x + 6 y + 20 x y + 4 x^2 + 16 x^2 y + 9 y^2 + 24 x y^2
+ 16 x^2 y^2
> a == b
1] FALSE
[> ( c <- mpoly(a[c(2,1,4,3)]) ) # reorder a
4 y x + 3 y + 2 x + 1
> a == c
1] TRUE [
Here also each of the results are reduced and regularized. While the computations are done entirely in R, they are quite efficient; each of the above calculations is virtually instantaneous.
But mpoly does not stop there. The basic operations of the
differential calculus, partial differentiation and gradients, are also
available to the user and are efficient. A deriv
method exists for
"mpoly"
objects which can be dispatched,3 and the gradient
function is built on deriv
to compute gradients.
> deriv(b, "x")
8 y z + 4 y + 6 z + 2
> gradient(b)
8 y z + 4 y + 6 z + 2
8 x z + 4 x + 7 z + 3
8 x y + 6 x + 7 y + 5
The gradient is a good example of another class object in the mpoly
package, the "mpolyList"
. "mpolyList"
objects are simply lists of
"mpoly"
objects and are used to hold vectors of multivariate
polynomials. They can be easily specified using the mp
function on a
vector of character strings.
> ( ps <- mp(c("x + y + z", "x + z^2")) )
+ y + z
x + z^2
x > str(ps, 1)
2
List of $ :List of 3
- attr(*, "class")= chr "mpoly"
..$ :List of 2
- attr(*, "class")= chr "mpoly"
..- attr(*, "class")= chr "mpolyList"
The viewing of "mpolyList"
objects is made possible by a print
method for "mpolyList"
objects just like for "mpoly"
objects.
Moreover addition, subtraction, and multiplication are defined for
"mpolyList"
objects as well; they each operate element-wise.
In algebraic geometry, a basis of a polynomial ideal (or collection of
polynomials generating an ideal) is a collection of polynomials which
generates the same ideal and exhibits other nice properties, such as
zero remainders for polynomials in the ideal and unique remainders for
those outside it. They are foundational to the study of polynomial
ideals and their associated varieties. In addition to differentiation,
mpoly can also compute bases of collections of multivariate
polynomials ("mpolyList"
) by passing the proper objects in the proper
syntax to the rSymPy
package which has an implementation of Buchberger’s algorithm
(Grothendieck and Bellosta 2010).4 The computations are performed by a Java based Python
implementation and are quite fast once the Java Virtual Machine (JVM)
has been initialized. The basis is then returned as an "mpolyList"
object.
> polys <- mp(c("t^4 - x", "t^3 - y", "t^2 - z"))
> gb <- grobner(polys)
- t, x, y, z
using variable ordering : rJava
Loading required package> gb
-1 z + t^2
- z^2
t y -1 y + z t
- z^2
x ^2 - z^3
y> class(gb)
1] "mpolyList" [
Moreover, grobner
can calculate bases with respect to various monomial
orders and any variable ordering.
> polys <- mp(c("x^2 - 2 y^2", "x y - 3"))
> grobner(polys, varorder = c("x", "y"))
3 x - 2 y^3
-9 + 2 y^4
> grobner(polys, varorder = c("x", "y"), order = "grlex")
-3 x + 2 y^3
^2 - 2 y^2
x-3 + x y
The task of computing bases is the only job mpoly outsources from R. This is because implementations of bases algorithms are (1) complex and highly prone to error, (2) scarce and therefore difficult to emulate, and (3) highly iterative; thus it was thought best to leave the routine to more expert programmers and frameworks outside R. Unfortunately, there is currently no walk algorithm available to convert a basis in one monomial order to a basis in another, a technique often used to quickly compute bases in more difficult orders (e.g. lexicographic) from "easier" ones (e.g. graded reverse lexicographic), so there are still a number of improvements which can be made.
Apart from being interesting algebraic objects, polynomials can of
course be thought of as functions. To access this functional
perspective, we can convert an "mpoly"
or "mpolyList"
object to a
function
using the "mpoly"
method of as.function
. This is
particularly suited to R’s strengths since R is geared towards
evaluation and numerical optimization
> library("ggplot2"); theme_set(theme_bw())
> ( p <- mp("x") * mp("x - .5") * mp("x - 1") ) # x(x-.5)(x-1)
^3 - 1.5 x^2 + 0.5 x
x> f <- as.function(p)
f(x)
> f
function(x){x**3 - 1.5 * x**2 + 0.5 * x}
<environment: 0x1218bc270>
> f(10)
1] 855
[> s <- seq(-.1, 1.1, length.out = 201)
> df <- data.frame(x = s, y = f(s))
> qplot(x, y, data = df, geom = "path") + geom_hline(yintercept = 0, colour = I("red"))
The plot generated is included in Figure 1, where one can see that the function has the correct zeros.
For multivariate polynomials the syntax is the same, and as.function
can provide the function with (1) a vector argument (ideal for passing
to optimization routines such as optim
) or (2) a sequence of
arguments.
> mpoly <- mp("x + 3 x y + z^2 x")
>
> f <- as.function(mpoly)
f(.) with . = (x, y, z)
> f(1:3)
1] 16
[>
> f <- as.function(mpoly, vector = FALSE)
f(x, y, z)
> f(1, 2, 3)
1] 16 [
"mpolyList"
objects can be converted into functions in the same way.
Note that in both cases the user is messaged with the appropriate syntax
for the resulting function.
> polys <- mp(c("x + 1", "y^2 + z"))
> f <- as.function(polys)
f(.) with . = (x, y, z)
> f(1:3)
1] 2 7 [
Another neat example for illustrating this is visualizing Rosenbrock’s "banana" function, \[f(x,y) = (1-x)^2 + 100(y-x^2)^2,\] which is a common test example for optimization routines (Figure 2).
> f <- mp("1 - 2 x + x^2 + 100 x^4 - 200 x^2 y + 100 y^2")
> f <- as.function(f)
> df <- expand.grid(x = seq(-2, 2, .01), y = seq(-1, 3, .01))
> df$f <- apply(df, 1, f)
> library("scales")
> qplot(x, y, data = df, geom = c("raster", "contour"),
+ fill = f + .001, z = f, colour = I("white"), bins = 6) +
+ scale_fill_gradient2(
+ low = "blue", mid = "green", high = "red",
+ trans = "log10", guide = "none"
+ )
While other functionality is also provided (such as a terms
method, a
function to compute least common multiples, and some combinatorial
functions), the presented ones are the main contributions of the mpoly
package. Put together, they provide a package which is not only
user-friendly, efficient, and useful for polynomial algebra, but also a
solid foundation upon which further developments can be made to make
polynomials more accessible in R. In particular, it provides a nice
starting point for any future package dealing with algebraic statistics.
There are, however, several improvements which can be made. First, the
primary constructor function mpoly
can be made significantly faster by
outsourcing its job to C++, as previously suggested (Hankin 2008).
Second, improvements can be made to speed up some of the arithmetic, for
example addition-chain exponentiation for polynomial exponents (as
opposed to iterative multiplication). Last, improvements can be made
into the flexibility of the facilitated constructor mp
. In particular,
parenthetical expressions are not currently handled but would be of
great practical use.
The author would like to thank Martyn Plummer and two anonymous reviewers for useful suggestions which made this and future work better.
mpoly, multipol, polynom, PolynomF, rSymPy
This article is converted from a Legacy LaTeX article using the texor package. The pdf version is the official version. To report a problem with the html, refer to CONTRIBUTE on the R Journal homepage.
Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Kahle, "mpoly: Multivariate Polynomials in R", The R Journal, 2013
BibTeX citation
@article{RJ-2013-015, author = {Kahle, David}, title = {mpoly: Multivariate Polynomials in R}, journal = {The R Journal}, year = {2013}, note = {https://rjournal.github.io/}, volume = {5}, issue = {1}, issn = {2073-4859}, pages = {162-170} }