Changes in R

The ‘Changes in R’ article from the 2018-1 issue.

R Core Team
2018-06-01

CHANGES IN R 3.5.0 patched

BUG FIXES

  • file("stdin") is no longer considered seekable.

  • dput() and dump() are no longer truncating when options(deparse.max.lines = *) is set.

  • Calls with an S3 class are no longer evaluated when printed, fixing part of PR#17398, thanks to a patch from Lionel Henry.

  • Allow file argument of Rscript to include space even when it is first on the command line.

  • callNextMethod() uses the generic from the environment of the calling method. Reported by Hervé Pagès with well documented examples.

  • Compressed file connections are marked as blocking.

  • optim(*, lower = c(-Inf, -Inf)) no longer warns (and switches the method), thanks to a suggestion by John Nash.

  • predict(fm, newdata) is now correct also for models where the formula has terms such as splines::ns(..) or stats::poly(..), fixing PR#17414, based on a patch from Duncan Murdoch.

  • simulate.lm(glm(*, gaussian(link = <non-default>))) has been corrected, fixing PR#17415 thanks to Alex Courtiol.

  • unlist(x) no longer fails in some cases of nested empty lists. Reported by Steven Nydick.

  • qr.coef(qr(<all 0, w/ colnames>)) now works. Reported by Kun Ren.

  • The radix sort is robust to vectors with >1 billion elements (but long vectors are still unsupported). Thanks to Matt Dowle for the fix.

  • Terminal connections (e.g., stdin) are no longer buffered. Fixes PR#17432.

  • deparse(x), dput(x) and dump() now respect c()’s argument names recursive and use.names, e.g., for x <- setNames(0, "recursive"), thanks to Suharto Anggono’s PR#17427.

  • Unbuffered connections now work with encoding conversion. Reported by Stephen Berman.

  • .Renviron on Windows with Rgui is again by default searched for in user documents directory when invoked via the launcher icon. Reported by Jeroen Ooms.

  • printCoefmat() now also works with explicit right=TRUE.

  • print.noquote() now also works with explicit quote=FALSE.

  • The default method for pairs(.., horInd=*, verInd=*) now gets the correct order, thanks to reports by Chris Andrews and Gerrit Eichner. Additionally, when horInd or verInd contain only a subset of variables, all the axes are labeled correctly now.

  • agrep("..|..", .., fixed=FALSE) now matches when it should, thanks to a reminder by Andreas Kolter.

  • str(ch) now works for more invalid multibyte strings.

CHANGES IN R 3.5.0

SIGNIFICANT USER-VISIBLE CHANGES

  • All packages are by default byte-compiled on installation. This makes the installed packages larger (usually marginally so) and may affect the format of messages and tracebacks (which often exclude .Call and similar).

NEW FEATURES

UTILITIES

  • install.packages() for source packages now has the possibility to set a ‘timeout’ (elapsed-time limit). For serial installs this uses the timeout argument of system2(): for parallel installs it requires the timeout utility command from GNU coreutils.

  • It is now possible to set ‘timeouts’ (elapsed-time limits) for most parts of R CMD check via environment variables documented in the ‘R Internals’ manual.

  • The ‘BioC extra’ repository which was dropped from Bioconductor 3.6 and later has been removed from setRepositories(). This changes the mapping for 6–8 used by setRepositories(ind=).

  • R CMD check now also applies the settings of environment variables _R_CHECK_SUGGESTS_ONLY_ and _R_CHECK_DEPENDS_ONLY_ to the re-building of vignettes.

  • R CMD check with environment variable _R_CHECK_DEPENDS_ONLY_ set to a true value makes test-suite-management packages available and (for the time being) works around a common omission of rmarkdown from the VignetteBuilder field.

INSTALLATION on a UNIX-ALIKE

  • Support for a system Java on macOS has been removed — install a fairly recent Oracle Java (see ‘R Installation and Administration’ §C.3.2).

  • configure works harder to set additional flags in SAFE_FFLAGS only where necessary, and to use flags which have little or no effect on performance.

    In rare circumstances it may be necessary to override the setting of SAFE_FFLAGS.

  • C99 functions expm1, hypot, log1p and nearbyint are now required.

  • configure sets a -std flag for the C++ compiler for all supported C++ standards (e.g., -std=gnu++11 for the C++11 compiler). Previously this was not done in a few cases where the default standard passed the tests made (e.g. clang 6.0.0 for C++11).

C-LEVEL FACILITIES

  • ‘Writing R Extensions’ documents macros MAYBE_REFERENCED, MAYBE_SHARED and MARK_NOT_MUTABLE that should be used by package C code instead NAMED or SET_NAMED.

  • The object header layout has been changed to support merging the ALTREP branch. This requires re-installing packages that use compiled code.

  • ‘Writing R Extensions’ now documents the R_tryCatch, R_tryCatchError, and R_UnwindProtect functions.

  • NAMEDMAX has been raised to 3 to allow protection of intermediate results from (usually ill-advised) assignments in arguments to BUILTIN functions. Package C code using SET_NAMED may need to be revised.

DEPRECATED AND DEFUNCT

  • Sys.timezone(location = FALSE) is defunct, and is ignored (with a warning).

  • methods:::bind_activation() is defunct now; it typically has been unneeded for years.

    The undocumented ‘hidden’ objects .__H__.cbind and .__H__.rbind in package base are deprecated (in favour of cbind and rbind).

  • The declaration of pythag() in Rmath.h has been removed — the entry point has not been provided since R 2.14.0.

BUG FIXES

  • printCoefmat() now also works without column names.

  • The S4 methods on Ops() for the "structure" class no longer cause infinite recursion when the structure is not an S4 object.

  • nlm(f, ..) for the case where f() has a "hessian" attribute now computes \(LL' = H + \mu I\) correctly. (PR#17249).

  • An S4 method that “rematches” to its generic and overrides the default value of a generic formal argument to NULL no longer drops the argument from its formals.

  • Rscript can now accept more than one argument given on the #! line of a script. Previously, one could only pass a single argument on the #! line in Linux.

  • Connections are now written correctly with encoding "UTF-16LE". (PR#16737).

  • Evaluation of ..0 now signals an error. When ..1 is used and ... is empty, the error message is more appropriate.

  • (Windows mainly.) Unicode code points which require surrogate pairs in UTF-16 are now handled. All systems should properly handle surrogate pairs, even those systems that do not need to make use of them. (PR#16098)

  • stopifnot(e, e2, ...) now evaluates the expressions sequentially and in case of an error or warning shows the relevant expression instead of the full stopifnot(..) call.

  • path.expand() on Windows now accepts paths specified as UTF-8-encoded character strings even if not representable in the current locale. (PR#17120)

  • line(x, y) now correctly computes the medians of the left and right group’s x-values and in all cases reproduces straight lines.

  • Extending S4 classes with slots corresponding to special attributes like dim and dimnames now works.

  • Fix for legend() when fill has multiple values the first of which is NA (all colours used to default to par(fg)). (PR#17288)

  • installed.packages() did not remove the cached value for a library tree that had been emptied (but would not use the old value, just waste time checking it).

  • The documentation for installed.packages(noCache = TRUE) incorrectly claimed it would refresh the cache.

  • aggregate(<data.frame>) no longer uses spurious names in some cases. (PR#17283)

  • object.size() now also works for long vectors.

  • packageDescription() tries harder to solve re-encoding issues, notably seen in some Windows locales. This fixes the citation() issue in PR#17291.

  • poly(<matrix>, 3) now works, thanks to prompting by Marc Schwartz.

  • readLines() no longer segfaults on very large files with embedded ’\\0’ (aka ‘nul’) characters. (PR#17311)

  • ns() (package splines) now also works for a single observation. interpSpline() gives a more friendly error message when the number of points is less than four.

  • dist(x, method = "canberra") now uses the correct definition; the result may only differ when x contains values of differing signs, e.g. not for 0-1 data.

  • methods:::cbind() and methods:::rbind() avoid deep recursion, thanks to Suharto Anggono via PR#17300.

  • Arithmetic with zero-column data frames now works more consistently; issue raised by Bill Dunlap.

    Arithmetic with data frames gives a data frame for ^ (which previously gave a numeric matrix).

  • pretty(x, n) for large n or large diff(range(x)) now works better (though it was never meant for large n); internally it uses the same rounding fuzz (1e-10) as seq.default() — as it did up to 2010-02-03 when both were 1e-7.

  • Internal C-level R_check_class_and_super() and hence R_check_class_etc() now also consider non-direct super classes and hence return a match in more cases. This e.g., fixes behaviour of derived classes in package Matrix.

  • Reverted unintended change in behavior of return calls in on.exit expressions introduced by stack unwinding changes in R 3.3.0.

  • Attributes on symbols are now detected and prevented; attempt to add an attribute to a symbol results in an error.

  • fisher.test(*, workspace = <n>) now may also increase the internal stack size which allows larger problem to be solved, fixing PR#1662.

  • The methods package no longer directly copies slots (attributes) into a prototype that is of an “abnormal” (reference) type, like a symbol.

  • The methods package no longer attempts to call length<-() on NULL (during the bootstrap process).

  • The methods package correctly shows methods when there are multiple methods with the same signature for the same generic (still not fully supported, but at least the user can see them).

  • sys.on.exit() is now always evaluated in the right frame. (From Lionel Henry.)

  • seq.POSIXt(*, by = "<n> DSTdays") now should work correctly in all cases and is faster. (PR#17342)

  • .C() when returning a logical vector now always maps values other than FALSE and NA to TRUE (as documented).

  • Subassignment with zero length vectors now coerces as documented (PR#17344).
    Further, x <- numeric(); x[1] <- character() now signals an error replacement has length zero (or a translation of that) instead of doing nothing.

  • (Package parallel.) mclapply(), pvec() and mcparallel() (when mccollect() is used to collect results) no longer leave zombie processes behind.

  • R CMD INSTALL <pkg> now produces the intended error message when, e.g., the LazyData field is invalid.

  • as.matrix(dd) now works when the data frame dd contains a column which is a data frame or matrix, including a 0-column matrix/d.f. .

  • mclapply(X, mc.cores) now follows its documentation and calls lapply() in case mc.cores = 1 also in the case mc.preschedule is false. (PR#17373)

  • aggregate(<data.frame>, drop=FALSE) no longer calls the function on <empty> parts but sets corresponding results to NA. (Thanks to Suharto Anggono’s patches in PR#17280).

  • The duplicated() method for data frames is now based on the list method (instead of string coercion). Consequently unique() is better distinguishing data frame rows, fixing PR#17369 and PR#17381. The methods for matrices and arrays are changed accordingly.

  • Calling names() on an S4 object derived from "environment" behaves (by default) like calling names() on an ordinary environment.

  • read.table() with a non-default separator now supports quotes following a non-whitespace character, matching the behavior of scan().

  • parLapplyLB and parSapplyLB have been fixed to do load balancing (dynamic scheduling). This also means that results of computations depending on random number generators will now really be non-reproducible, as documented.

  • Indexing a list using dollar and empty string (l$"") returns NULL.

  • Using \\usage{ data(<name>, package="<pkg>") } no longer produces R CMD check warnings.

  • match.arg() more carefully chooses the environment for constructing default choices, fixing PR#17401 as proposed by Duncan Murdoch.

  • Deparsing of consecutive ! calls is now consistent with deparsing unary - and + calls and creates code that can be reparsed exactly; thanks to a patch by Lionel Henry in PR#17397. (As a side effect, this uses fewer parentheses in some other deparsing involving ! calls.)

Note

This article is converted from a Legacy LaTeX article using the texor package. The pdf version is the official version. To report a problem with the html, refer to CONTRIBUTE on the R Journal homepage.

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

Citation

For attribution, please cite this work as

Team, "Changes in R", The R Journal, 2018

BibTeX citation

@article{RJ-2018-1-ch,
  author = {Team, R Core},
  title = {Changes in R},
  journal = {The R Journal},
  year = {2018},
  note = {https://rjournal.github.io/},
  volume = {10},
  issue = {1},
  issn = {2073-4859},
  pages = {561-570}
}