Our Google Summer of Code project “Improvements to `nls()`

” investigated rationalizing R tools for nonlinear regression and nonlinear estimation tools by considering usability, maintainability, and functionality, especially for a Gauss-Newton solver. The rich features of `nls()`

are weakened by several deficiencies and inconsistencies such as a lack of stabilization of the Gauss-Newton solver. Further considerations are the usability and maintainability of the code base that provides the functionality `nls()`

claims to offer. Various packages, including our `nlsr`

, provide alternative capabilities. We consider the differences in goals, approaches, and features of different tools for nonlinear least squares modeling in R. Discussion of these matters is relevant to improving R generally as well as its nonlinear estimation tools.

`nls()`

function`nls()`

is a Comprehensive R Archive Network (CRAN: *https://cran.r-project.org*)
tool that has remarkable and wide-ranging features for estimating nonlinear
statistical models that are expressed as **formulas**.
The function dates to the 1980s and the work related to Bates and Watts (1988) in S (see
*https://en.wikipedia.org/wiki/S_%28programming_language%29*).

In particular, we note that it

- can call calculations in other programming languages
- allows weighted or subset data
- can estimate bound-constrained parameters
- provides partially linear model handling mechanism
- permits parameters to be indexed over a set of related data
- produces measures of variability (i.e., standard error estimates) for the estimated parameters
- has related profiling capabilities for exploring the likelihood surface as parameters are changed
- links to many pre-coded (
`selfStart`

) models that do not require initial parameter values.

Due to its extensive range of features and prolonged history, the code has become untidy and overly patched, making it challenging to maintain and ripe for improvement in its underlying methods.

Besides the base-R `nls()`

function, we will pay particular attention
to **nlsr** (Nash and Murdoch (2023)),
**minpack.lm** (Elzhov et al. (2012)), and **gslnls** (Chau (2023)) which are
general nonlinear least-squares solvers in the CRAN repository.
While we will provide capsule comments for some other CRAN packages,
those in the Bioconductor (Gentleman et al. (2004)) collection are more
specialized, and those on repositories such as
GitHub (https://github.com) and Gitlab (https://about.gitlab.com),
while interesting, do not have the checking applied to CRAN packages.

Our work aimed at unifying nonlinear modeling functionality in R,
ideally in a refactored `nls()`

function. The primary messages from this work
are:

For R

**users**, we would advise that it is most efficient to carry out nonlinear modeling or least squares by adapting working scripts, preferably those with documentation and using recent tools. If there is a suspicion that there may be ill-conditioning, package`nlsr`

or the example we give in the section “Comparison notes for formula-setup solutions” below of how to find singular values of the Jacobian allow these diagnostics to be calculated.For R

**developers**, we invite and encourage discussion of the design choices, since these have downstream implications for ease of use, adaptation to new features, and efficiency of ongoing maintenance.

`onls`

(Spiess, Andrej-Nikolai (2022)) is used for optimising and estimating nonlinear models
by minimizing the sum of squares of **orthogonal** residuals rather than vertical
residuals. The objective is therefore different and involves nontrivial extra
calculation. A vignette with the package and the blog article
https://www.r-bloggers.com/2015/01/introducing-orthogonal-nonlinear-least-squares-regression-in-r/
give some description with illustrative graphs. `onls`

appears to be limited to
problems with one independent and one dependent variable. The Wikipedia article
https://en.wikipedia.org/wiki/Total_least_squares presents an overview of some
ideas, with references to the literature. The approach needs a wider
discussion and tutorial examples to allow its merits to be judged than can be
included here.

`crsnls`

(Tvrdík (2016)) – This package allows nonlinear estimation by
controlled random search via two methods. There is unfortunately no vignette.
A modest trial we carried out showed `nlsr::nlxb()`

gave the same results in
a small fraction of the time required by either of the methods in `crsnls`

. The
method discussed in Josef Tvrdík and Ivan Křivý and Ladislav Mišík (2007) claims better reliability in finding solutions
than a Levenberg-Marquardt code (actually from Matlab), but the tests were conducted
on the extreme NIST examples mentioned next.

`NISTnls`

(Bates (2012)) – This package provides R code and data for a set of
(numerically ill-conditioned) nonlinear least squares problems from the U.S. National
Institute for Standards and Technology. These may not represent real-world situations.

`nlshelper`

(Duursma (2017)) – This package, which unfortunately lacks a
vignette, provides a few utilities for summarizing, testing, and plotting
non-linear regression models estimated with nls(), nlsList() or nlme() that
are linked or grouped in some way.

`nlsic`

(Sokol (2022)) – This solves nonlinear least squares problems with optional
equality and/or inequality constraints. It is clearly not about modeling, and
the input and output are quite different from class `nls`

methods. However, there do
not appear to be other R packages with these capabilities.

`nlsMicrobio`

(Baty and Delignette-Muller (2014)) – Data sets and nonlinear regression models
dedicated to predictive microbiology, including a vignette, by authors of the
`nlstools`

package.

`nlstools`

(Baty and Delignette-Muller (2013)) – This package provides several tools for aiding the estimation
of nonlinear models, particularly using `nls()`

. The vignette is actually a journal
article, and the authors have considerable experience in the subject.

`nlsmsn`

(Prates et al. (2021)) – Fit univariate non-linear scale mixture of skew-normal (NL-SMSN)
regression, with details in Garay, Lachos, and Abanto-Valle (2011). The problem
here is to minimize an objective that is modified from the traditional sum of squared
residuals.

`nls.multstart`

(Padfield and Matheson (2020)) – Non-linear least squares regression using
AIC scores with the Levenberg-Marquardt algorithm using multiple starting values
for increasing the chance that the minimum found is the global minimum.

`nls2`

(Grothendieck (2022)) – Nonlinear least squares by brute force has similar
motivations to `nls.multstart`

, but uses `nls()`

within multiple trials. The author
has extensive expertise in R.

`nlstac`

(Rodriguez-Arias et al. (2020)) – A set of functions implementing the algorithm described in
Torvisco et al. (2018) for fitting separable nonlinear regression curves. The
special class of problem for which this package is intended is an important and
difficult one. No vignette is provided, unfortunately.

`easynls`

– Fit and plot some nonlinear models. Thirteen models are treated, but
there is minimal documentation and no vignette. Package `nlraa`

is to be preferred.

`nlraa`

(Miguez (2021)) – a set of nonlinear *selfStart* models, primarily from
agriculture. Most include analytic Jacobian code.

`optimx`

(Nash and Varadhan (2011)) – This provides
optimizers that can be applied to minimize a nonlinear function which could be
a nonlinear sum of squares. Not generally recommended if nonlinear
least squares programs can be easily used, but provides a check and alternative
solvers.

The Hobbs weed infestation problem (Nash (1979 120)) is a growth-curve modeling task. Its very succinct statement provides the “short reproducible example” much requested on R mailing lists. The problem is small and seemingly straightforward, yet presents such difficulties that optimization researchers have asked if it is contrived rather than a real problem from a field researcher. The data and graph follow.

```
weed <- c(5.308, 7.24, 9.638, 12.866, 17.069, 23.192, 31.443,
38.558, 50.156, 62.948, 75.995, 91.972)
tt <- 1:12
weeddf <- data.frame(tt, weed)
plot(weeddf, main="Hobbs weed infestation data")
```

While model estimation is increasingly automated, nonlinear regression
should use background knowledge and graphs to obtain an understanding of
the general magnitude of the parameters. Indeed, the “visual fitting” approach
(Nash and Velleman (1996)) mentioned later accords with this viewpoint, as does use
of the **Logistic3T** variant of the model, as well as discussions in Ross (1990),
Seber and Wild (1989), Bates and Watts (1988), and Gallant (1987).
Our emphasis on the software resilience to starting parameters that are, in the
sense of statistical modeling, “silly” comes from over half a century of dealing
with users whose interests and understanding are very far from those of statistical
modelers. Thus, we seek methods and codes that obtain reasonable answers under
highly unfavourable conditions. Nevertheless, a proper approach to
nonlinear modeling is to apply all available knowledge to the task.

Three suggested models for this data are (with names to allow for easy reference)

**Logistic3U**:
\[ y \approx b_1 / (1 + b_2 * exp(- b_3 * t)) \]

**Logistic3S**:
\[ y \approx 100 * c_1 / (1 + 10 * c_2 * exp(- 0.1 * c_3 * t)) \]

**Logistic3T**:
\[ y \approx Asym / (1 + exp((xmid - t)/scal)) \]

where we will use `weed`

for \(y\) and `tt`

for \(t\).
The functions above are equivalent, but the first is generally more awkward
to solve numerically due to its poor scaling. The parameters of the three forms
are related as follows:

\[ Asym = b_1 = 100 * c_1 \] \[ exp(xmid/scal) = b_2 = 10 * c_2 \] \[ 1/scal = b_3 = 0.1*c3 \] To allow for a simpler discussion, let us say that the parameters form a (named) vector \(p\) and the model function is called \(model(p)\) with \(r = y - model(p)\).

We wish to minimize the sum of squared residuals, which is our loss (or objective) function. Starting with some guess for the parameters, we aim to alter these parameters to obtain a smaller loss function. We then iterate until we can make no further progress.

Let us consider there are \(n\) parameters and \(m\) residuals.
The loss function is
\[ S(p) = r' r = \sum_{i=1}^m { r_i^2 }\]
The gradient of \(S(p)\) is
\[ g = 2 * J' r\]
where the Jacobian \(J\) is given by elements
\[ J_{i,j} = \partial r_i / \partial p_j \]
and the Hessian is defined by elements
\[ H_{i,j} = \partial ^2 S(p) / {\partial p_i \partial p_j} \]
If we expand the Hessian for nonlinear least squares problems, we find
\[ 0.5 * H_{i,j} = \sum_{k = 1}^m{ J_{k,i} J_{k,j}} + \sum_{k = 1}^m {r_k * \partial{r_k}/{\partial p_i \partial p_j}}\]
Let us use \(D_{i,j}\) for the elements of the second term of this expression.
What is generally called **Newton’s method** for function minimization tries
to set the gradient to zero (to find a stationary point of the function \(S(p)\)).
This leads to **Newton’s equation**
\[ H \delta = -g \]
Given a set of parameters \(p\), we solve this equation for \(\delta\), adjust \(p\)
to \(p + \delta\) and iterate, hopefully to converge on a solution.
Applying this to a sum of squares problem gives
\[ 0.5 * H \delta = (J'J + D) \delta = - J' r\]
In this expression, only the elements of \(D\) have second partial derivatives.
Gauss, attempting to model planetary orbits, had small residuals, and noted
that these multiplied the second partial derivatives of \(r\), so he approximated
\[ 0.5 * H \approx J' J\]
by assuming \(D \approx 0\). This results in the Gauss-Newton method where we
solve
\[ J' J \delta = - J' r\]
though we can avoid some loss of accuracy by not forming the inner product matrix
\(J' J\) and solving the linear least squares matrix problem
\[ J \delta \approx -r \]
by one of several matrix decomposition methods.

In reality, there are many problems where \(D\) should not be ignored, but the work to compute it precisely is considerable. Many work-arounds have been proposed, of which the Levenberg-Marquardt stabilization (Levenberg (1944), Marquardt (1963)) is the most commonly used. For convenience, we will use “Marquardt”, as we believe he first incorporated the ideas into a practical computer program.

The usual suggestion is that \(D\) be replaced by a multiple of the unit matrix or else a multiple of the diagonal part of \(J' J\). In low precision, some elements of \(J' J\) could underflow to zero (Nash (1977)), and a linear combination of both choices is an effective compromise. Various choices for \(D\), as well as a possible line search along the direction \(\delta\) rather than a unit step (Hartley (1961)), give rise to several variant algorithms. “Marquardt’s method” is a family of methods. Fortunately, most choices work well.

Let us specify in R the three model formulas and set some starting values for parameters. These starting points are not equivalent and are deliberately crude choices. Workers performing many calculations of a similar nature should try to provide good starting points to reduce computation time and avoid finding a false solution.

```
# model formulas
frmu <- weed ~ b1 / (1 + b2 * exp(-b3 * tt))
frms <- weed ~ 100 * c1/(1 + 10* c2* exp(-0.1 * c3* tt))
frmt <- weed ~ Asym / (1 + exp((xmid - tt) / scal))
#
# Starting parameter sets
stu1 <- c(b1 = 1, b2 = 1, b3 = 1)
sts1 <- c(c1 = 1, c2 = 1, c3 = 1)
stt1 <- c(Asym = 1, xmid = 1, scal = 1)
```

One of the useful features of `nls()`

is the possibility of a `selfStart`

model,
where starting parameter values are not required. However, if a `selfStart`

model
is not available, `nls()`

sets all the starting parameters
to 1. This is tolerable but could be improved by using a set of values
that are all slightly different, which, in the case of the example
model \(y \,\sim\, a \,*\, exp(-b \,*\, x) + c\,*\,exp(-d \,*\, x)\)
would avoid a singular Jacobian because \(b\) and \(d\) were equal in value.
Program modifications to give a sequence like 1.0, 1.1, 1.2, 1.3 for the four
parameters are fairly obvious.

It is also possible to provide R functions for the residual and Jacobian. This is usually much more work for the user if the formula setup is possible. To illustrate, we show the functions for the unscaled 3 parameter logistic. The particular form of these explicit residual and Jacobian functions comes from their translation from BASIC codes of the 1970s, as adapted in Nash and Walker-Smith (1987). The use of the Laplace form of the residual and the inclusion of data within the functions reflects choices of that era that are at odds with current practice. Some users still want or need to provide problems as explicit functions, particularly for problems that are not regressions. For example, the Rosenbrock banana-valley test problem can be provided this way, where the two “residuals” are different functional forms.

```
# Logistic3U
hobbs.res <- function(x){ # unscaled Hobbs weeds problem -- residual
if(length(x) != 3) stop("hobbs.res -- parameter vector n!=3")
y <- c(5.308, 7.24, 9.638, 12.866, 17.069, 23.192, 31.443,
38.558, 50.156, 62.948, 75.995, 91.972)
tt <- 1:12
res <- x[1] / (1 + x[2] * exp(-x[3] * tt)) - y
# Note: this form of the residual, from Laplace (1788) in the form
# of "fitted - observed" has been used in our software for half
# a century, but sometimes concerns users of the more traditional
# "observed - fitted" form
}
hobbs.jac <- function(x) { # unscaled Hobbs weeds problem -- Jacobian
jj <- matrix(0.0, 12, 3)
tt <- 1:12
yy <- exp(-x[3] * tt)
zz <- 1.0 / (1 + x[2] * yy)
jj[tt, 1] <- zz
jj[tt, 2] <- -x[1] * zz * zz * yy
jj[tt, 3] <- x[1] * zz * zz * yy * x[2] * tt
attr(jj, "gradient") <- jj
jj
}
```

Using a formula specification was a principal advantage made
with `nls()`

when it became available in S sometime in the 1980s. It uses a
Gauss-Newton (i.e., unstabilized) iteration with a step reduction line
search. This works very efficiently as long as \(J\) is not ill-conditioned.
Below we see `nls()`

does poorly on the example problem. To save page space,
we use 1-line result display functions from package `nlsr`

, namely `pnls()`

and `pshort()`

.

```
Error in nls(formula = frmu, start = stu1, data = weeddf) :
singular gradient
```

```
Error in nls(formula = frms, start = sts1, data = weeddf) :
singular gradient
```

```
Error in nls(formula = frmt, start = stt1, data = weeddf) :
singular gradient
```

Here we see the infamous “singular gradient” termination message of `nls()`

.
Users should, of course, be using the `SSlogis`

*selfStart* tool, but ignorance
of this possibility or a slight variant in the model can easily lead to outcomes
similar to those seen here.

```
unlx1:residual sumsquares = 2.5873 on 12 observations
after 19 Jacobian and 25 function evaluations
name coeff SE tstat pval gradient JSingval
b1 196.186 11.31 17.35 3.167e-08 -4.859e-09 1011
b2 49.0916 1.688 29.08 3.284e-10 -3.099e-08 0.4605
b3 0.31357 0.006863 45.69 5.768e-12 2.305e-06 0.04714
```

`snlx1 -- ss= 2.5873 : c1 = 1.9619 c2 = 4.9092 c3 = 3.1357; 34 res/ 23 jac`

`tnlx1 -- ss= 2.5873 : Asym = 196.19 xmid = 12.417 scal = 3.1891; 36 res/ 27 jac`

Though we have found solutions, the Jacobian is essentially singular as
shown by its singular
values. Note that these are displayed by package `nlsr`

in a single column in
the output to provide a compact layout, but the values do not correspond to
the individual parameters in
whose row they appear; they are a property of the whole problem.

`unlm1 -- ss= 2.5873 : b1 = 196.19 b2 = 49.092 b3 = 0.31357; 17 itns`

`snlm1 -- ss= 2.5873 : c1 = 1.9619 c2 = 4.9092 c3 = 3.1357; 7 itns`

```
Error in nlsModel(formula, mf, start, wts) :
singular gradient matrix at initial parameter estimates
```

`ugslnls1 -- ss= 2.5873 : b1 = 196.19 b2 = 49.092 b3 = 0.31357; 25 itns`

`sgslnls1 -- ss= 2.5873 : c1 = 1.9619 c2 = 4.9092 c3 = 3.1357; 9 itns`

`tgslnls1 -- ss= 9205.4 : Asym = 35.532 xmid = 20846 scal = -1745.2; 47 itns`

`nlsr::nlxb()`

uses `print()`

to output standard errors and
singular values of the Jacobian (for diagnostic purposes). By contrast,
`minpack.lm::nlsLM()`

and `nls()`

use `summary()`

, which does not display the sum of squares, while `print()`

gives the sum of squares, but not the standard error of the residuals.

The singular values allow us to gauge how “nearly singular” the Jacobian is at the solution, and the ratio of the smallest to largest of the singular values is a simple but effective measure. The ratios are 4.6641e-05 for Logistic3U, 0.021022 for Logistic3S, and 0.001055 for Logistic3T, so Logistic3S is the “least singular”.

The results from `nlsLM`

and `gsl_nls`

for the transformed model Logistic3T have a very
large sum of squares, which may suggest that these programs have failed.
Since `nls()`

, `nlsLM()`

, and `gsl_nls()`

do not
offer singular values, we need to extract the Jacobian and compute its
singular values. The following script shows how to do this, using as Jacobian what is called
the `gradient`

element in the returned solution for these solvers.

```
# for nlsLM
if (inherits(tnlm1, "try-error")) {
print("Cannot compute solution -- likely singular Jacobian")
} else {
JtnlsLM <- tnlm1$m$gradient() # actually the Jacobian
svd(JtnlsLM)$d # Singular values
}
```

`[1] "Cannot compute solution -- likely singular Jacobian"`

```
# for gsl_nls
if (inherits(tgslnls1, "try-error")) {
cat("Cannot compute solution -- likely singular Jacobian")
} else {
JtnlsLM <- tgslnls1$m$gradient()
svd(JtnlsLM)$d # Singular values
}
```

`[1] 3.4641e+00 9.8541e-09 3.8249e-11`

We see that there are differences in detail, but the more important result is that
two out of three singular values are essentially 0. Our Jacobian is singular, and no
method of the Gauss-Newton type should be able to continue. Indeed, from the parameters
reported at this saddle point, `nlsr::nlxb()`

cannot proceed.

```
stspecial <- c(Asym = 35.532, xmid = 43376, scal = -2935.4)
badstart <- try(nlxb(formula = frmt, start = stspecial, data = weeddf))
if (! inherits(badstart, "try-error")) print(badstart)
```

```
residual sumsquares = 9205.4 on 12 observations
after 2 Jacobian and 2 function evaluations
name coeff SE tstat pval gradient JSingval
Asym 35.5321 NA NA NA -9.694e-09 3.464
xmid 43376 NA NA NA -1.742e-09 2.61e-10
scal -2935.4 NA NA NA -2.4e-08 7.12e-16
```

We illustrate how to solve nonlinear least squares problems using
a function to define the residual. Note that `gsl_nls()`

requires
a vector `y`

that is the length of the vector returned by the
function supplied, e.g., `hobbs.res()`

. `gsl_nls`

uses
a numerical approximation for the Jacobian if the argument `jac`

is missing. Note function `nlsr::pnlslm()`

for a 1-line display of
the results of `minpack.lm::nls.lm()`

.

`hobnlfb<-nlfb(start=stu1, resfn=hobbs.res, jacfn=hobbs.jac)`

`hobnlfb -- ss= 2.5873 : b1 = 196.19 b2 = 49.092 b3 = 0.31357; 25 res/ 19 jac`

`hobnlm<-nls.lm(par=stu1, fn=hobbs.res, jac=hobbs.jac)`

`hobnlm -- ss= 2.5873 : b1 = 196.19 b2 = 49.092 b3 = 0.31357; 17 itns`

`hobgsln<-gsl_nls(start=stu1, fn=hobbs.res, y=rep(0,12))`

`hobgsln -- ss= 2.5873 : b1 = 196.19 b2 = 49.092 b3 = 0.31357; 25 itns`

`hobgsl<-gsl_nls(start=stu1, fn=hobbs.res, y=rep(0,12), jac=hobbs.jac)`

`hobgsl -- ss= 2.5873 : b1 = 196.19 b2 = 49.092 b3 = 0.31357; 25 itns`

The output object of `nlxb()`

is smaller than the class `nls`

object returned
by `nls()`

, `nlsLM()`

, and `gsl_nls()`

. Package `nlsr`

emphasizes the solution
of the nonlinear least squares problem rather than the estimation of a nonlinear
model that fits or explains the data. The object of class `nls`

allows for
a number of specialized modeling and diagnostic extensions. For compatibility,
the `nlsr`

package has the function `wrapnlsr()`

, for which `nlsr()`

is an alias.
This uses `nlxb()`

to find good parameters, then calls `nls()`

to return the class
`nls`

object. Unless particular modeling features are needed, the use of
`wrapnlsr()`

is unnecessary and wasteful of resources.

The design goals of the different tools may also be revealed in the so-called
“convergence tests” for the iterative solvers. In the manual page for `nls()`

in R 4.0.0, there was the warning:

> *Do not use nls on artificial “zero-residual” data*

with the suggested addition of small perturbations to the data. This admits

`nls()`

could not solve well-posed problems unless data is polluted
with errors. Zero-residual problems are not always artificial,
since problems in function approximation and nonlinear equations can be
approached with nonlinear least squares. Fortunately, a small adjustment
to the “termination test” for the program, rather than for the
“convergence” of the underlying algorithm, fixes the defect. The test
is the Relative Offset Convergence
Criterion (see Bates and Watts (1981)). This scales an estimated reduction in the loss
function by its current value. If the loss function is very small, we are close
to a zero-divide. Adding a small quantity to the divisor avoids trouble.
In 2021, one of us (J. Nash) proposed that `nls.control()`

have an
additional parameter `scaleOffset`

with a default value of zero. Setting it to a
small number – 1.0 is a reasonable choice – allows small-residual problems
(i.e., near-exact fits) to be dealt with easily. We call this the
`nlsr`

since it was introduced. The default value
gives the legacy behavior. This improvement has been in the R distributed code
since version 4.1.0.Additional termination tests can be used. `nlsr`

has a **small sum of
squares** test (**smallsstest**) that compares the latest evaluated sum of squared
(weighted) residuals to `e4`

times the initial sum of squares, where
`e4 <- (100*.Machine$double.eps)^4`

is approximately 2.43e-55.

Termination after what may be considered excessive computation is also important.
`nls()`

stops after `maxiter`

“iterations”. The meaning of “iteration” may require an
examination of the code for the different algorithms. `nlsr`

terminates execution
when the number of residual or Jacobian evaluations exceed set limits. Generally,
we prefer larger limits than the default `maxiter = 50`

of `nls()`

to avoid stopping
early, though this may result in some unnecessary computations.

As mentioned, the output of `nls()`

, `minpack.lm::nlsLM()`

, or
`gslnls::gsl_nls()`

is an object of class “nls” which has a quite rich structure
described in the manual files or revealed by applying the `str()`

function to the
result of `nls()`

. The complexity of this object is a challenge to users. Let us
use for example `result <- snlm1`

as the returned object from `nlsLM()`

for the
Logistic3S problem. The `data`

return element
is an R symbol. To actually access the data from this element, we need to use
the syntax:

`eval(parse(text = result$data))`

However, if the call is made with `model = TRUE`

, then there is a returned element
`model`

which contains the data, and we can list its contents using:

`ls(result$model)`

and if there is an element called `xdata`

, then it can be accessed as
`result$model$xdata`

.

By contrast, `nlsr::nlxb()`

returns a much simpler structure of 11 items
in one level. Moreover, `nlxb`

explicitly returns the sum of squares, the residual vector, Jacobian, and counts of evaluations.

Tools that produce a class `nls`

output object create a rich set of functions
and structures that are then used in a variety of modeling tasks, including the
least squares solution. By contrast,
`nlsr`

computes quantities as they are requested or needed, with additional
features in separate functions. For example, the singular values of the Jacobian
are actually computed in the `print`

and `summary`

methods for the result.
These two approaches
lead to different consequences for performance and how features are
provided. `nlsr`

has antecedents in the methods of Nash (1979), where storage for
data and programs was at a ridiculous premium in the small computers of the era.
Thus, the code in `nlsr`

is likely of value for workers to copy and modify
for customized tools.

Gauss-Newton/Marquardt methods all need a Jacobian matrix at each iteration.
By default, `nlsr::nlxb()`

will try to evaluate this using analytic
expressions using symbolic and automatic differentiation tools.
When using a formula specification of the model, `nls()`

, `minpack.lm::nlsLM()`

and `gslnls::gsl_nls()`

use a finite difference approximation to compute the
Jacobian, though `gsl_nls()`

does have an option to attempt symbolic expressions.
Package `nlsr`

provides, via appropriate calling syntax, four numeric approximation
options for the Jacobian, with a further control `ndstep`

for the size of the
step used in the approximation. These options allow programming choices to be
examined. Users can largely ignore them.

Using the “gradient” attribute of the output of the Jacobian function to hold
the Jacobian matrix lets us embed this in the residual function as well,
so that the call to `nlsr::nlfb()`

can be made with the same name used for
both residual and Jacobian function arguments. This programming trick saves a
lot of trouble for the package developer, but it can be a nuisance for users
trying to understand the code.

As far as we can understand the logic in `nls()`

, the Jacobian computation during
parameter estimation is carried out within the called C-language program
and its wrapper R code function `numericDeriv()`

, part of
`./src/library/stats/R/nls.R`

in the R distribution source code. This is used to provide Jacobian information in
the `nlsModel()`

and `nlsModel.plinear()`

functions, which are not exported for
general use. `gsl_nls()`

also appears to use `numericDeriv()`

.

`numericDeriv()`

uses a
simple forward difference approximation of derivatives, though a central
difference approximation can be specified in control parameters.
We are unclear why `numericDeriv()`

in base R calls `C_numeric_deriv`

,
as we were easily able to create a more compact version entirely in R.
See *https://github.com/nashjc/RNonlinearLS/tree/main/DerivsNLS*.

`minpack.lm::nlsLM()`

invokes `numericDeriv()`

in its local
version of `nlsModel()`

, but it appears to use an internal approximate Jacobian
code from
the original Fortran `minpack`

code, namely, `lmdif.f`

. Such differences in approach
can lead to different behavior, usually minor, but sometimes annoying with
ill-conditioned problems.

A pasture regrowth problem (Huet et al. (2004), page 1, based on Ratkowsky (1983)) has a poorly conditioned Jacobian and

`nls()`

fails with “singular gradient”. Worse, numerical approximation to the Jacobian gives the error “singular gradient matrix at initial parameter estimates” for`minpack.lm::nlsLM`

so that the Marquardt stabilization is unable to take effect, while the analytic derivatives of`nlsr::nlxb`

give a solution.Karl Schilling (private communication) provided an example where a model specified with the formula

`y ~ a * (x ˆ b)`

causes`nlsr::nlxb`

to fail because the partial derivative w.r.t.`b`

is`a * (x ^ b * log(x))`

. If there is data for which`x = 0`

, the derivative is undefined, but the model can be computed. In such cases, we observed that`nls()`

and`minpack.lm::nlsLM`

found a solution, though this seems to be a lucky accident.

Analytic Jacobian code can be provided to all the solvers discussed.
Most `selfStart`

models that automatically provide starting parameters
also include such code. There is documentation in R
of `selfStart`

models, but their construction is non-trivial. A number
of such models are included with base R in `./src/library/stats/R/zzModels.R`

,
with package `nlraa`

(Miguez (2021)) providing a richer set.
There are also some in the now-archived package `NRAIA`

.
These provide the Jacobian in the “gradient” attribute of the “one-sided” formula
that defines each model, and these Jacobians are often the analytic forms.

The `nls()`

function, after
computing the “right-hand side” or `rhs`

of the residual, checks to see if the
“gradient” attribute is defined, otherwise using `numericDeriv()`

to compute a
Jacobian into that attribute. This code is within the `nlsModel()`

or
`nlsModel.plinear()`

functions. The use of analytic Jacobians
almost certainly contributes to the good performance of `nls()`

on `selfStart`

models.

The use of `selfStart`

models with
`nlsr`

is described in the “Introduction to nlsr” vignette. However, since `nlsr`

generally can use very crude starting values, we have rarely needed them, though
it should be pointed out that our work is primarily diagnostic. If we were carrying
out a large number of similar estimations, such initial parameters are
critical to efficiency.

In considering `selfStart`

models, we noted that the base-R function `SSlogis`

is intended to solve problem **Logistic3T** above. When this function is used
via `getInitial()`

to find
starting values, it actually calls `nls()`

with the ‘plinear’ algorithm
and finds a (full) solution. It then passes the solution coefficients to the default
algorithm unnecessarily. Moreover, the implicit double call is, in our view, prone to creating
errors in code maintenance.
To provide simpler starting parameters, the function `SSlogisJN`

is now part of the package `nlsr`

, but is most useful for `nls()`

.

Users may also want to profit from the Jacobian code of `selfStart`

models
but supply explicit starting values other than those suggested by `getInitial()`

.
This does not appear to be possible with `nls()`

. `nlsr::nlxb()`

always requires
starting parameters, and can either use `getInitial()`

to find them from the `selfStart`

model or provide explicit values, but the formula used in the call to `nlxb()`

still involves the `selfStart`

function.

We are also surprised that the
analytic expressions for the Jacobian (“gradient”) in the `SSLogis`

function
and others save quantities in “hidden”
variables, i.e., with names preceded by “.”.
These are then not displayed by the `ls()`

command, making them difficult to access by users who may wish to create
their own `selfStart`

model via copy and edit.
Interactive tools, such as “visual fitting” (Nash and Velleman (1996)) might be
worth considering as another way to find starting parameters, but we know of no
R capability of this type.

As a side note, the introduction of `scaleOffset`

in R 4.1.1 to deal with the
convergence test for small residual problems now requires that the `getInitial()`

function have dot-arguments (`...`

) in its argument list. This illustrates the
entanglement of many features in `nls()`

that complicate its maintenance and
improvement.

For many problems, we know that parameters cannot be bigger or smaller than some externally known limits. Such limits should be built into models, but there are some important details for using the tools in R.

`nls()`

can only impose bounds if the`algorithm = "port"`

argument is used in the call. Unfortunately, the documentation warns us:*The algorithm = “port” code appears unfinished, and does not even check that the starting value is within the bounds. Use with caution, especially where bounds are supplied.*`gsl_nls()`

does not offer bounds.bounds are part of the default method for package

`nlsr`

.`nlsLM()`

includes bounds in the standard call, but we have observed cases where it fails to get the correct answer. From an examination of the code, we believe the authors have not taken into account all possibilities, though all programs have some weakness regarding constrained optimization in that programmers have to work with assumptions on the scale of numbers, and some problems will be outside the scope envisaged.

```
# Start MUST be feasible i.e. on or within bounds
anlshob1b <- try(nls(frms, start = sts1, data = weeddf, lower = c(0,0,0),
upper = c(2,6,3), algorithm = 'port'))
if (! inherits(anlshob1b, "try-error")) pnls(anlshob1b) # check the answer (short form)
```

`anlshob1b -- ss= 9.4726 : c1 = 2 c2 = 4.4332 c3 = 3; 12 itns`

```
# nlsLM seems NOT to work with bounds in this example
anlsLM1b <- try(nlsLM(frms, start = sts1, data = weeddf, lower = c(0,0,0), upper = c(2,6,3)))
if (! inherits(anlsLM1b, "try-error")) pnls(anlsLM1b)
```

`anlsLM1b -- ss= 881.02 : c1 = 2 c2 = 6 c3 = 3; 2 itns`

```
# also no warning if starting out of bounds, but gets a good answer!!
st4 <- c(c1 = 4, c2 = 4, c3 = 4)
anlsLMob <- try(nlsLM(frms, start = st4, data = weeddf, lower = c(0,0,0), upper = c(2,6,3)))
if (! inherits(anlsLMob, "try-error")) pnls(anlsLMob)
```

`anlsLMob -- ss= 9.4726 : c1 = 2 c2 = 4.4332 c3 = 3; 4 itns`

```
# Try nlsr::nlxb()
anlx1b <- try(nlxb(frms, start = sts1, data = weeddf, lower = c(0,0,0), upper = c(2,6,3)))
if (! inherits(anlx1b, "try-error")) pshort(anlx1b)
```

`anlx1b -- ss= 9.4726 : c1 = 2 c2 = 4.4332 c3 = 3; 12 res/ 12 jac`

Bounds on parameters raise some interesting questions about how
uncertainty in parameter estimates should be computed or reported. That is, the
traditional “standard errors” are generally taken to imply symmetric intervals about
the point estimate in which the parameter may be expected to be found with some
probability under certain assumptions.
Bounds change those assumptions and hence the interpretation of estimates of
uncertainty, whether by traditional approximations from the \(J' J\) matrix or from methods
such as profile likelihood or bootstrap. At the time of writing,
`nlsr::nlxb()`

does not compute standard errors nor their derived statistics
when bounds are active to avoid providing misleading information.

Let us try to fix (mask) the first parameter in the first two example problems.

```
# Hobbsmaskx.R -- masks with formula specification of the problem
require(nlsr); require(minpack.lm); traceval <- FALSE
stu <- c(b1 = 200, b2 = 50, b3 = 0.3) # a default starting vector (named!)
sts <- c(c1 = 2, c2 = 5, c3 = 3) # a default scaled starting vector (named!)
# fix first parameter
anxbmsk1 <- try(nlxb(frmu, start = stu, data = weeddf, lower = c(200, 0, 0),
upper = c(200, 60, 3), trace=traceval))
if (! inherits(anxbmsk1, "try-error")) print(anxbmsk1)
```

```
residual sumsquares = 2.6182 on 12 observations
after 4 Jacobian and 4 function evaluations
name coeff SE tstat pval gradient JSingval
b1 200U M NA NA NA 0 NA
b2 49.5108 1.12 44.21 8.421e-13 -2.887e-07 1022
b3 0.311461 0.002278 136.8 1.073e-17 0.0001635 0.4569
```

```
anlM1 <- try(nlsLM(frmu, start = stu, data = weeddf, lower = c(200, 0, 0),
upper=c(200, 60, 3), trace = traceval))
if (! inherits(anlM1, "try-error")) pnls(anlM1)
```

`anlM1 -- ss= 2.6182 : b1 = 200 b2 = 49.511 b3 = 0.31146; 4 itns`

```
anlsmsk1 <- try(nls(frmu, start = stu, data = weeddf, lower = c(200, 0, 0),
upper = c(200, 60, 3), algorithm = "port", trace = traceval))
if (! inherits(anlsmsk1, "try-error")) pnls(anlsmsk1)
```

`anlsmsk1 -- ss= 2.6182 : b1 = 200 b2 = 49.511 b3 = 0.31146; 5 itns`

```
# Hobbs scaled problem with bounds, formula specification
anlxmsks1 <- try(nlxb(frms, start = sts, data = weeddf, lower = c(2, 0, 0),
upper = c(2, 6, 30)))
if (! inherits(anlxmsks1, "try-error")) print(anlxmsks1)
```

```
residual sumsquares = 2.6182 on 12 observations
after 4 Jacobian and 4 function evaluations
name coeff SE tstat pval gradient JSingval
c1 2U M NA NA NA 0 NA
c2 4.95108 0.112 44.21 8.421e-13 -2.981e-06 104.2
c3 3.11461 0.02278 136.8 1.073e-17 1.583e-05 4.482
```

```
anlshmsk1 <- try(nls(frms, start = sts, trace = traceval, data = weeddf,
lower = c(2, 0, 0), upper = c(2, 6, 30), algorithm = 'port'))
if (! inherits(anlshmsk1, "try-error")) pnls(anlshmsk1)
```

`anlshmsk1 -- ss= 2.6182 : c1 = 2 c2 = 4.9511 c3 = 3.1146; 5 itns`

```
anlsLMmsks1 <- try(nlsLM(frms, start = sts, data = weeddf, lower = c(2,0,0),
upper = c(2,6,30)))
if (! inherits(anlsLMmsks1, "try-error")) pnls(anlsLMmsks1)
```

`anlsLMmsks1 -- ss= 2.6182 : c1 = 2 c2 = 4.9511 c3 = 3.1146; 4 itns`

```
# Test with all parameters masked
anlxmskall <- try(nlxb(frms, start=sts, data=weeddf, lower=sts, upper=sts))
if (! inherits(anlxmskall, "try-error")) print(anlxmskall)
```

```
residual sumsquares = 158.23 on 12 observations
after 0 Jacobian and 1 function evaluations
name coeff SE tstat pval gradient JSingval
c1 2U M NA NA NA NA NA
c2 5U M NA NA NA NA NA
c3 3U M NA NA NA NA NA
```

`nlsr`

has an output format that indicates the constraint status of the parameter
estimates. For `nlsr`

, we have chosen to suppress the calculation of
approximate standard errors in the parameters when constraints are active because
their meaning under constraints is unclear, though we believe this
policy worthy of discussion and further investigation.

All four major tools illustrated solve some variant of the Gauss-Newton
equations. `nls()`

uses a modification of an approach suggested by Hartley (1961),
while `nlsr`

, `gslnls`

, and `minpack.lm`

use flavors of Marquardt (1963). `gslnls`

offers an accelerated Marquardt method and three alternative methods which we
have not investigated.
Control settings for `nlxb()`

or `nlfb()`

allow exploration of Hartley and Marquardt
algorithm variants. In general, the Levenberg-Marquardt stabilization
is important in obtaining
solutions in methods of the Gauss-Newton family, as `nls()`

terminates
too frequently and unnecessarily with `singular gradient`

errors.

An important choice made in developing `nlsr`

was to code entirely within the R
programming language. `nls()`

uses a mix of R, C, and Fortran, as does `minpack.lm`

.
`gslnls`

is an R wrapper to various C-language routines in the GNU Scientific
Library (Galassi et al. (2009)).
Generally, keeping to a single programming language can allow for
easier maintenance and upgrades. It also avoids some work when there are changes
or upgrades to libraries for the non-R languages.
R is usually considered slower
than most computing environments because it keeps track of objects and
because it is usually interpreted. In recent years, the
performance penalty for using code entirely in R has been much reduced
with the just-in-time compiler and other improvements. All-R computation
may now offer acceptable performance. In `nlsr`

, the use of R may be a
smaller performance cost than the aggressive approach to a solution,
which can cause more iterations to be used.

`nls()`

can be called without specifying the `data`

argument. In this case, it will
search in the available environments (i.e., workspaces) for suitable data objects.
We do not like this approach, but it is “the R way”. R allows users to leave many
objects in the default (`.GlobalEnv`

) workspace. Moreover, users have to actively
suppress saving this workspace (`.RData`

) on exit, otherwise any such file in
the path will be loaded on startup. R users in our
acquaintance avoid saving the workspace because of lurking data and
functions that may cause unwanted results.

`nls()`

and other class `nls`

tools accept an argument `subset`

. This acts through
the mediation of `model.frame`

, which is not obvious in the
source code files
`/src/library/stats/R/nls.R`

and

`/src/library/stats/src/nls.C`

.
Having `subset`

at the level of the call to a function like
`nls()`

saves effort, but it does mean that the programmer of the
solver needs to be aware of the
origin (and value) of objects such as the data, residuals and Jacobian.
By preference,
we would implement subsetting by zero-value weights, with observation counts
(and degrees of freedom) computed via the numbers of non-zero weights. Alternatively,
we would extract a working dataframe from the relevant elements in the original.

`na.action`

is an argument to the `nls()`

function, but it does not appear
obviously in the source code, often being handled behind the scenes after
referencing the option `na.action`

. This feature also changes the data
supplied to our nonlinear least squares solver.
A useful, but possibly dated, description is given in:
*https://stats.idre.ucla.edu/r/faq/how-does-r-handle-missing-values/*.
The typical default action, which can be seen by using the command
`getOption("na.action")`

is `na.omit`

. This option removes from computations any observations
containing missing values (i.e. any row of a data frame containing an NA).
`na.exclude`

does much of the same for solver computations, but keeps the rows with
NA elements so that predictions are in the correct row position. We recommend that
workers test output to verify the behavior is as wanted.
See *https://stats.stackexchange.com/questions/492955/should-i-use-na-omit-or-na-exclude-in-a-linear-model-in-r*.
As with `subset`

, our concern with `na.action`

is that users may be unaware of the
effects of an option they may not even be aware has been set. Should `na.fail`

be the default?

`model`

is an argument to the `nls()`

and related functions, which is documented as:

modellogical. If true, the model frame is returned as part of the object. Default is FALSE.

Indeed, the argument only gets used when `nls()`

is about to return its result
object, and the
element `model`

is NULL unless the calling argument `model`

is TRUE. Using the
same name for both function argument and object element could be confusing.
Despite this, the model frame is used within the function code in
the form of the object
`mf`

. We feel that users could benefit from more extensive documentation and
examples of its use since it is used to implement features like `subset`

.

All four main tools we consider here allow a `weights`

argument that
specifies a vector of fixed weights the same length as the number of residuals.
Each residual is multiplied by the square root of the corresponding weight.
Where available, the values returned by the
`residuals()`

function are weighted, and the `fitted()`

or `predict()`

function are
used to compute raw residuals.

While fixed weights may be useful, there are many problems for which
we want weights that are determined at least partially from the model
parameters, for
example, a measure of the standard deviation of observations.
Such dynamic weighting situations are discussed in the vignette
“Introduction to nlsr” of package `nlsr`

in section
*Weights that are functions of the model parameters*.
`minpack.lm`

offers a function `wfct()`

to
facilitate such weighting. Care is advised
in applying such ideas.

The function `resid()`

(an alias for `residuals()`

) gives weighted residuals
like `result$m$resid()`

. The function `nlsModel()`

allows us to compute residuals for particular coefficient sets.
We have had to extract `nlsModel()`

from the base R code and include it via a code chunk
(not echoed here for space) because it is not exported to the working namespace,
We could also explicitly `source()`

this code.

```
wts <- 0.5 ^ tt # simple weights
frmlogis <- weed ~ Asym / (1 + exp((xmid - tt)/scal))
Asym <- 1; xmid <- 1; scal <- 1
nowt <- try(nls(weed ~ SSlogis(tt, Asym, xmid, scal))) # UNWEIGHTED
if (! inherits(nowt, "try-error")) {
rnowt <- nowt$m$resid() # This has UNWEIGHTED residual and Jacobian. Does NOT take coefficients.
attr(rnowt, "gradient") <- NULL
} else rnowt <- NULL
rnowt
```

```
[1] -0.011900 0.032755 -0.092030 -0.208782 -0.392634 0.057594
[7] 1.105728 -0.715786 0.107647 0.348396 -0.652592 0.287569
```

```
usewt <- try(nls(weed ~ SSlogis(tt, Asym, xmid, scal), weights = wts))
if (! inherits(usewt, "try-error")) {
rusewt <- usewt$m$resid() # WEIGHTED. Does NOT take coefficients.
attr(rusewt, "gradient") <- NULL
} else rusewt <- NULL
rusewt
```

```
[1] 0.0085640 0.0324442 -0.0176652 -0.0388479 -0.0579575 0.0163623
[7] 0.1042380 -0.0411766 0.0052509 0.0084324 -0.0194246 -0.0024053
```

```
## source("nlsModel.R") # or use {r nlsmodelsource, echo=FALSE} code chunk
nmod0 <- nlsModel(frmlogis, data = weeddf, start = c(Asym = 1, xmid = 1, scal = 1), wts = wts)
rn0 <- nmod0$resid() # Parameters are supplied in nlsModel() `start` above.
attr(rn0, "gradient") <- NULL; rn0 # weighted residuals at starting coefficients
```

```
[1] 3.3998 3.2545 3.0961 2.9784 2.8438 2.7748 2.6910 2.3474 2.1724
[10] 1.9359 1.6572 1.4214
```

```
nmod <- nlsModel(frmlogis, data = weeddf, start = coef(usewt), wts = wts)
rn <- nmod$resid()
attr(rn,"gradient")<-NULL; rn # same as rusewt
```

```
[1] 0.0085640 0.0324442 -0.0176652 -0.0388479 -0.0579575 0.0163623
[7] 0.1042380 -0.0411766 0.0052509 0.0084324 -0.0194246 -0.0024053
```

As the `nls()`

**man** page states, when the “port” algorithm is used with the
`trace`

argument TRUE, the iterations display the objective function value which
is 1/2 the sum of squares (or deviance). The trace display is likely
embedded in the Fortran of the `nlminb`

routine that is called to execute
the “port” algorithm, but the factor of 2 discrepancy is nonetheless
unfortunate for users.

If `nls()`

reaches a point where it cannot continue but has not found a point
where the relative offset convergence criterion is met, it may simply exit,
especially if a “singular gradient” (singular Jacobian) is found. However,
this may occur AFTER the function has made considerable progress in reducing
the sum of squared residuals.
Here is an abbreviated example:

```
time <- c( 1, 2, 3, 4, 6 , 8, 10, 12, 16)
conc <- c( 0.7, 1.2, 1.4, 1.4, 1.1, 0.8, 0.6, 0.5, 0.3)
NLSdata <- data.frame(time,conc)
NLSstart <- c(lrc1 = -2, lrc2 = 0.25, A1 = 150, A2 = 50) # a starting vector (named!)
NLSformula <- conc ~ A1 * exp(-exp(lrc1) * time) + A2 * exp(-exp(lrc2) * time)
tryit <- try(nls(NLSformula, data = NLSdata, start = NLSstart, trace = TRUE))
```

```
61216. (3.56e+03): par = (-2 0.25 150 50)
2.1757 (2.23e+01): par = (-1.9991 0.31711 2.6182 -1.3668)
1.6211 (7.14e+00): par = (-1.9605 -2.6203 2.5753 -0.55599)
Error in nls(NLSformula, data = NLSdata, start = NLSstart, trace = TRUE) :
singular gradient
```

Note that the sum of squares has been reduced from 61216 to 1.6211, but
unless `trace`

is invoked, the user will not get any information about this.
Changing this in the `nls()`

function could be useful to R users, and would
be almost trivial.

The variable projection method (Golub and Pereyra (1973), O’Leary and Rust (2013)) is
usually much more effective than general approaches in finding good solutions
to nonlinear least squares
problems when some of the parameters appear linearly. In our logistic examples,
the asymptote parameters are an illustration. However, identifying which parameters
are linear and communicating this information to estimating functions
is not a trivial task. `nls()`

has an option `algorithm = "plinear"`

that allows some partially linear
models to be solved. The other tools, as far as we are aware, do not offer any
such capability. The `nlstac`

package uses a different algorithm for similar goals.

Within `nls()`

itself we must, unfortunately, use
different specifications with different `algorithm`

options.
For example, the explicit model `y ~ a * x + b`

does not work with the linear
modeling function `lm()`

, which requires this model to be specified as `y ~ x`

.
Within `nls()`

, consider the following FOUR different specifications for the same
problem, plus an intuitive choice, labeled `fm2a`

, that does not work.
In this failed attempt, putting the `Asym`

parameter in the model causes the
`plinear`

algorithm
to try to add another term to the model. We believe this is unfortunate, and would
like to see a consistent syntax. At the time of writing, we do
not foresee a resolution for this issue. In the example, we have not evaluated
the commands to save space.

```
DNase1 <- subset(DNase, Run == 1) # select the data
## using a selfStart model - do not specify the starting parameters
fm1 <- try(nls(density ~ SSlogis(log(conc), Asym, xmid, scal), DNase1))
if (! inherits(fm1, "try-error")) summary(fm1)
## using conditional linearity - leave out the Asym parameter
fm2 <- try(nls(density ~ 1 / (1 + exp((xmid - log(conc)) / scal)),
data = DNase1, start = list(xmid = 0, scal = 1),
algorithm = "plinear"))
if (! inherits(fm2, "try-error")) summary(fm2)
## without conditional linearity
fm3 <- try(nls(density ~ Asym / (1 + exp((xmid - log(conc)) / scal)),
data = DNase1,
start = list(Asym = 3, xmid = 0, scal = 1)))
if (! inherits(fm3, "try-error")) summary(fm3)
## using Port's nl2sol algorithm
fm4 <- try(nls(density ~ Asym / (1 + exp((xmid - log(conc)) / scal)),
data = DNase1, start = list(Asym = 3, xmid = 0, scal = 1),
algorithm = "port"))
if (! inherits(fm4, "try-error")) summary(fm4)
## using conditional linearity AND Asym does not work
fm2a <- try(nls(density ~ Asym / (1 + exp((xmid - log(conc)) / scal)),
data = DNase1, start = list(Asym=3, xmid = 0, scal = 1),
algorithm = "plinear", trace = TRUE))
if (! inherits(fm2a, "try-error")) summary(fm2a)
```

Some models have several common parameters and
others that are tied to particular cases.
The **man** file for `nls()`

includes an example of a situation in which
parameters are indexed but which uses the “plinear” option as an added complication.
Running this example reveals that the answers for the parameters are not indexed
as in a vector.
That is, we do not see `a[1], a[2], a[3]`

but `a1, a2, a3`

.
This is no doubt because programming for indexed parameters is challenging.
We note that there are capabilities in packages for mixed-effects modeling
such as `nlme`

(Pinheiro et al. (2013)), `bbmle`

(Bolker and Team (2013)), and `lme4`

(Bates et al. (2015)) for
estimating models of such type.

Maintainers of packages need suitable tests and use-case examples in order

- to ensure packages work properly, in particular, giving results comparable to or better than the functions they are to replace.
- to test individual solver functions to ensure they work across the range of calling mechanisms, that is, different ways of supplying inputs to the solver(s);
- to pose “silly” inputs to see if these bad inputs are caught by the programs.

Such goals align with the aims of **unit testing**
(e.g., *https://towardsdatascience.com/unit-testing-in-r-68ab9cc8d211*, Wickham (2011), Wickham et al. (2021)
and the conventional R package testing tools).
In our work, one of us (AB) has developed a working prototype package at *https://github.com/ArkaB-DS/nlsCompare* .
A primary design objective of this is to allow the summarization of multiple tests in
a compact output. The prototype has a vignette to illustrate its use.

In our investigation, we built several resources, which are now part of
the repository *https://github.com/nashjc/RNonlinearLS/*. Particular items
are:

A BibTex bibliography for use with all documents in this project, but which has wider application to nonlinear least squares projects in general (

*https://github.com/nashjc/RNonlinearLS/blob/main/BibSupport/ImproveNLS.bib*).`MachID.R`

offers a suggested concise summary function to identify a particular computational system used for tests. A discussion of how it was built and the resources needed across platforms is given in at*https://github.com/nashjc/RNonlinearLS/tree/main/MachineSummary*.Nash and Bhattacharjee (2022) is an explanation of the construction of the

`nlspkg`

from the`nls()`

code in R-base.As the 2021 Summer of Code period was ending, one of us (JN) was invited to prepare a review of optimization in R. Ideas from the present work have been instrumental in the creation of Nash (2022).

Given its importance to R, it is possible that `nls()`

will remain more or less
as it has been for the past several decades. If so, the focus of discussion
should be the measures needed to secure its continued operation for legacy
purposes and how that may be accomplished.
We welcome an opportunity to participate in such conversations.

To advance the stability and maintainability of R, we believe the program
objects (R functions)
that are created by tools such as `nls()`

should have minimal
cross-linkages and side-effects.
The aspects of `nls()`

that concern us follow.

R tools presume data and parameters needed are available in an accessible environment. This provides a compact syntax to invoke the calculations, but the wrong data can be used if the internal search finds a valid name that is not the object we want, or if

`subset`

or`na.action`

settings modify the selection, or`weights`

are applied.Mixing of R, C and Fortran code adds to the burden of following the program logic.

The class

`nls`

structure simplifies calls with its rich set of functions, but also adds to the task of understanding what has been done. A design that isolates the setup, solution, and post-solution parts of complicated calculations reduces the number of objects that must be kept in alignment.

Given the existence of examples of good practices such as analytic derivatives, stabilized solution of Gauss-Newton equations, and bounds-constrained parameters, base R tools should be moving to incorporate them. These capabilities are available now by using several tools, but it would be helpful if they were unified.

Hans Werner Borchers was helpful in developing the GSoC project motivating this article
and in comments on this and related work. Heather Turner co-mentored the project
and helped guide the progress of the work. Exchanges with Fernando Miguez helped to
clarify aspects of `selfStart`

models and instigated the “Introduction to nlsr”
vignette. Colin Gillespie (package `benchmarkme`

) has been helpful in guiding our
attempts to succinctly summarize computing environments. We thank one of the referees
for pointing out that it is important to use background knowledge and graphs to
guide modeling. Simon Urbanek gave some important practical help with preparing
files for publication.

Supplementary materials are available in addition to this article. It can be downloaded at RJ-2023-091.zip

D. M. Bates. *NISTnls: Nonlinear least squares examples from NIST.* 2012. URL https://CRAN.R-project.org/package=NISTnls. R package version 0.9-13. Original source from National Institutes for Standards and Technology https://www.itl.nist.gov/div898/strd/nls/nls_main.shtml with R port by Douglas Bates.

D. M. Bates and D. G. Watts. A Relative Offset Orthogonality Convergence Criterion for Nonlinear least Squares. *Technometrics*, 23(2): 179–183, 1981.

D. M. Bates and D. G. Watts. *Nonlinear Regression Analysis and Its Applications.* Wiley, 1988.

D. Bates, M. Mächler, B. Bolker and S. Walker. Fitting linear mixed-effects models using lme4. *Journal of Statistical Software*, 67(1): 1–48, 2015. DOI 10.18637/jss.v067.i01.

F. Baty and M.-L. Delignette-Muller. *nlsMicrobio: Data sets and nonlinear regression models dedicated to predictive microbiology.* 2014. R package version 0.0-1.

F. Baty and M.-L. Delignette-Muller. *Nlstools: Tools for nonlinear regression diagnostics.* 2013. R package version 0.0-15.

B. Bolker and R. D. C. Team. *Bbmle: Tools for general maximum likelihood estimation.* 2013. URL http://CRAN.R-project.org/package=bbmle. R package version 1.0.13.

J. Chau. *gslnls: GSL Nonlinear Least-Squares Fitting.* 2023. URL https://CRAN.R-project.org/package=gslnls. R package version 1.1.2.

R. Duursma. *Nlshelper: Convenient functions for non-linear regression.* 2017. URL https://CRAN.R-project.org/package=nlshelper. R package version 0.2.

T. V. Elzhov, K. M. Mullen, A.-N. Spiess and B. Bolker. *minpack.lm: R interface to the Levenberg-Marquardt nonlinear least-squares algorithm found in MINPACK, plus support for bounds.* R Project for Statistical Computing, 2012. URL http://CRAN.R-project.org/package=minpack.lm. R package version 1.1-6.

M. Galassi, J. Davies, J. Theiler, B. Gough and G. Jungman. *GNU Scientific Library - Reference Manual, Third Edition, for GSL Version 1.12 (3. ed.).* Free Software Foundation, 2009.

A. R. Gallant. *Nonlinear statistical models.* Wiley, 1987.

R. C. Gentleman, V. J. Carey, D. M. Bates, B. Bolstad, M. Dettling, S. Dudoit, B. Ellis, L. Gautier, Y. Ge, J. Gentry, et al. Bioconductor: Open software development for computational biology and bioinformatics. *Genome Biology*, 5(R80): 2004. DOI 10.1186/gb-2004-5-10-r80.

G. H. Golub and V. Pereyra. The differentiation of pseudo-inverses and nonlinear least squares problems whose variables separate. *SIAM Journal of Numerical Analysis*, 10(2): 413–432, 1973.

G. Grothendieck. *nls2: Non-linear regression with brute force.* 2022. URL https://CRAN.R-project.org/package=nls2. R package version 0.3-3.

H. O. Hartley. The modified gauss-newton method for the fitting of non-linear regression functions by least squares. *Technometrics*, 3: 269–280, 1961.

S. Huet, A. Bouvier, M.-A. Poursat and E. Jolivet. *Statistical tools for nonlinear regression: A practical guide with S-PLUS examples, 2nd edition.* Berlin & New York: Springer-Verlag, 2004.

Josef Tvrdík and Ivan Křivý and Ladislav Mišík. Adaptive population-based search: Application to estimation of nonlinear regression parameters. *Computational Statistics & Data Analysis*, 52(2): 713–724, 2007. URL https://doi.org/10.1016/j.csda.2006.10.014.

K. Levenberg. A method for the solution of certain non-linear problems in least squares. *Quarterly of Applied Mathematics*, 2: 164–168, 1944.

D. W. Marquardt. An Algorithm for Least-Squares Estimation of Nonlinear Parameters. *SIAM Journal on Applied Mathematics*, 11(2): 431–441, 1963.

F. Miguez. *nlraa: Nonlinear Regression for Agricultural Applications.* 2021. URL https://CRAN.R-project.org/package=nlraa. R package version 0.89.

J. C. Nash. *Compact numerical methods for computers: Linear algebra and function minimisation.* Bristol: Adam Hilger, 1979. Second Edition, 1990, Bristol: Institute of Physics Publications.

J. C. Nash. Function minimization and nonlinear least squares in R. *WIREs Computational Statistics*, 14(e1580): 2022. URL https://wires.onlinelibrary.wiley.com/doi/abs/10.1002/wics.1580.

J. C. Nash. Minimizing a nonlinear sum of squares function on a small computer. *Journal of the Institute for Mathematics and its Applications*, 19: 231–237, 1977. JNfile: 77IMA J Appl Math-1977-NASH-231-7.pdf.

J. C. Nash and A. Bhattacharjee. Making a package from base R files. *R-bloggers*, 2022. URL https://www.r-bloggers.com/2022/01/making-a-package-from-base-r-files/.

J. C. Nash and D. Murdoch. *nlsr: Functions for Nonlinear Least Squares Solutions.* 2023. R package version 2023.8.31.

J. C. Nash and R. Varadhan. *Optimx*: A replacement and extension of the optim() function. Nash Information Services Inc.; Johns Hopkins University, 2011. R package version 2011-3.5.

J. C. Nash and P. Velleman. Nonlinear estimation combining visual fitting with optimization methods. In *Proceedings of the section on physical and engineering sciences of the american statistical association*, pages. 256–261 1996. American Statistical Association. Presented to the 1995 Spring Research Conference on Industrial Statistics, June 1995 in Waterloo, Ontario. JNfile: 96NashVellemanSRC95x.pdf, slides: 95SRC-nashvellemanvisualfittingslides.pdf.

J. C. Nash and M. Walker-Smith. *Nonlinear Parameter Estimation: An Integrated System in BASIC.* New York: Marcel Dekker, 1987. See http://www.nashinfo.com/nlpe.htm for a downloadable version of this plus some extras.

D. P. O’Leary and B. W. Rust. Variable projection for nonlinear least squares problems. *Computational Optimization and Applications*, 54(3): 579–593, 2013.

D. Padfield and G. Matheson. *nls.multstart: Robust Non-Linear Regression using AIC Scores.* 2020. URL https://CRAN.R-project.org/package=nls.multstart. R package version 1.2.0.

J. Pinheiro, D. Bates, S. DebRoy, D. Sarkar and R Core Team. *Nlme: Linear and nonlinear mixed effects models.* 2013. R package version 3.1-111.

M. Prates, V. Lachos and A. Garay. *nlsmsn: Fitting Nonlinear Models with Scale Mixture of Skew-Normal Distributions.* 2021. URL https://CRAN.R-project.org/package=nlsmsn. R package version 0.0-6.

D. A. Ratkowsky. *Nonlinear regression modeling: A unified practical approach.* New York; Basel: Marcel Dekker Inc., 1983.

M. Rodriguez-Arias, J. A. Fernandez, J. Cabello and R. Benitez. *Nlstac: An r package for fitting separable nonlinear models.* 2020. URL https://CRAN.R-project.org/package=nlstac. R package version 0.1.0.

G. J. S. Ross. *Nonlinear estimation.* New York etc.: Springer-Verlag, 1990.

G. A. F. Seber and C. J. Wild. *Nonlinear regression.* New York: Wiley, 1989.

S. Sokol. *Nlsic: Non linear least squares with inequality constraints.* 2022. URL https://CRAN.R-project.org/package=nlsic. R package version 1.0.1.

Spiess, Andrej-Nikolai. *Onls: Orthogonal nonlinear least-squares regression.* 2022. URL https://CRAN.R-project.org/package=onls. R package version 0.1-2.

J. A. F. Torvisco, M. Rodriguez-Arias and J. C. Sanchez. A new algorithm to fit exponential decays without initial guess. Faculty of Sciences and Mathematics, University of Nis, Serbia. 2018. DOI 10.2298/fil1812233t.

J. Tvrdík. *crsnls: Nonlinear Regression Parameters Estimation by ’CRS4HC’ and ’CRS4HCe’.* 2016. URL https://CRAN.R-project.org/package=crsnls. R package version 0.2.

H. Wickham. testthat: Get Started with Testing. *The R Journal*, 3: 5–10, 2011. URL https://journal.r-project.org/archive/2011-1/RJournal\_2011-1\_Wickham.pdf.

H. Wickham, J. Hester, W. Chang and J. Bryan. *devtools: Tools to Make Developing R Packages Easier.* 2021. URL https://CRAN.R-project.org/package=devtools. R package version 2.4.3.

Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

For attribution, please cite this work as

Nash & Bhattacharjee, "A Comparison of R Tools for Nonlinear Least Squares Modeling", The R Journal, 2024

BibTeX citation

@article{RJ-2023-091, author = {Nash, John C. and Bhattacharjee, Arkajyoti}, title = {A Comparison of R Tools for Nonlinear Least Squares Modeling}, journal = {The R Journal}, year = {2024}, note = {https://doi.org/10.32614/RJ-2023-091}, doi = {10.32614/RJ-2023-091}, volume = {15}, issue = {4}, issn = {2073-4859}, pages = {198-215} }