We discuss how news in R and add-on packages can be disseminated. R 2.10.0 has added facilities for computing on news in a common plain text format. In R 2.12.0, the Rd format has been further enhanced so that news can be very conveniently recorded in Rd, allowing for both improved rendering and the development of new news services.
The R development community has repeatedly discussed how to best represent and disseminate the news in R and R add-on packages. The GNU Coding Standards (http://www.gnu.org/prep/standards/html_node/NEWS-File.html) say:
“In addition to its manual, the package should have a file named NEWS which contains a list of user-visible changes worth mentioning. In each new release, add items to the front of the file and identify the version they pertain to. Don’t discard old items; leave them in the file after the newer items. This way, a user upgrading from any previous version can see what is new. If the NEWS file gets very long, move some of the older items into a file named ONEWS and put a note at the end referring the user to that file.”
For a very long time R itself used a three-layer (series, version,
category) plain text NEWS
file recording individual news items in
itemize-type lists, using ‘o’ as item tag and aligning items to a common
column (as quite commonly done in open source projects).
R 2.4.0 added a function readNEWS()
(eventually moved to package
tools) for reading the R NEWS
file into a news hierarchy (a series
list of version lists of category lists of news items). This makes it
possible to compute on the news (e.g. changes are displayed daily in the
RSS feeds available from
http://developer.r-project.org/RSSfeeds.html), and to transform the
news structure into sectioning markup (e.g. when creating an HTML
version for reading via the on-line help system). However, the items
were still plain text: clearly, it would be useful to be able to
hyperlink to resources provided by the on-line help system
(Murdoch and Urbanek 2009) or other URLs, e.g., those
providing information on bug reports. In addition, it has always been
quite a nuisance to add LaTeX markup to the news items for the release
notes in the R Journal (and its predecessor, R News). The R Core team
had repeatedly discussed the possibilities of moving to a richer markup
system (and generating plain text, HTML and LaTeX from
this), but had been unable to identify a solution that would be
appropriate in terms of convenience as well as the effort required.
R add-on packages provide news-type information in a variety of places
and plain text formats, with a considerable number of NEWS
files in
the package top-level source directory or its inst
subdirectory, and
in the spirit of a one- or two-layer variant of the R news format.
Whereas all these files can conveniently be read by “users”, the lack of
standardization implies that reliable computation on the news items is
barely possible, even though this could be very useful. For example,
upgrading packages could optionally display the news between the
available and installed versions of packages (similar to the
apt-listchanges
facility in Debian-based Linux systems). One could
also develop repository-level services extracting and disseminating the
news in all packages in the repository (or suitable groups of these,
such as task views (Zeileis 2005)) on a regular basis. In
addition, we note that providing the news in plain text limits the
usefulness in package web pages and for possible inclusion in package
reference manuals (in PDF format).
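To make the upgrade scenario concrete, the following is a minimal sketch of how the news between an installed and an available version could be selected. It assumes the news have already been read into a data frame with a Version column; the data and the helper name news_between() are made up for illustration and are not part of R.

```r
# Toy news data base; the entries are invented for illustration.
toy_news <- data.frame(
  Version = c("1.0-0", "1.1-0", "1.2-0", "1.10-0"),
  Text = c("Initial release.",
           "New argument 'verbose'.",
           "Fixed a bug in the print method.",
           "Added a vignette."),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep the entries that are newer than the installed
# version and not newer than the available one.
news_between <- function(db, installed, available) {
  v <- package_version(db$Version)
  keep <- v > package_version(installed) & v <= package_version(available)
  db[keep, , drop = FALSE]
}

pending <- news_between(toy_news, installed = "1.1-0", available = "1.10-0")
print(pending$Text)
```

Comparing package_version objects rather than character strings matters here, since "1.10-0" would otherwise sort before "1.2-0".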
R 2.10.0 added a function news() (in package utils) to extract, where
possible, the news for R or add-on packages into a simple (plain text)
data base, and to use this for simple queries. The data base is a
character data frame with variables Version, Category, Date and Text,
where the last contains the entry texts read, and the other variables
may be NA if they were missing or could not be determined. These
variables can be used in the queries. E.g., we build a data base of all
currently available R news entries:
> db <- news()
This has 2950 news entries, distributed according to version as follows:
> with(db, table(Version))
Version
         2.0.1  2.0.1 patched          2.1.0
            56             27            223
         2.1.1  2.1.1 patched         2.10.0
            56             31            153
        2.10.1 2.10.1 patched         2.11.0
            43             35            132
        2.11.1          2.2.0          2.2.1
            28            188             63
 2.2.1 patched          2.3.0          2.3.1
            18            250             38
 2.3.1 patched          2.4.0          2.4.1
            28            225             60
 2.4.1 patched          2.5.0          2.5.1
            15            208             50
 2.5.1 patched          2.6.0          2.6.1
            16            177             31
         2.6.2  2.6.2 patched          2.7.0
            53             14            186
         2.7.1          2.7.2  2.7.2 patched
            66             45              6
         2.8.0          2.8.1  2.8.1 patched
           132             59             27
         2.9.0          2.9.1          2.9.2
           124             56             26
 2.9.2 patched
             5
(This shows “versions” such as ‘2.10.1 patched’, which correspond to news items in the time interval between the last patchlevel release and the next minor or major releases of R (in the above, 2.10.1 and 2.11.0), respectively. These are not valid version numbers, and are treated as the corresponding patchlevel version in queries.) To extract all bug fixes since R 2.11.0 that include a PR number from a formal bug report, we can query the news db as follows:
> items <- news(Version >= "2.11.0" &
+ grepl("^BUG", Category) &
+ grepl("PR#", Text),
+ db = db)
This finds 22 such items, the first one being
> writeLines(strwrap(items[1, "Text"]))
The C function mkCharLenCE now no longer
reads past 'len' bytes (unlikely to be a
problem except in user code). (PR#14246)
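The treatment of labels such as ‘2.10.1 patched’ can be illustrated with a small sketch. The helper normalize_version() below is hypothetical and only mimics the behaviour described above; it is not the code used by news(). It also shows why version labels are better compared as version objects than as character strings.

```r
# A few version labels as they appear in the news data base.
labels <- c("2.9.2", "2.10.1", "2.10.1 patched", "2.11.0")

# Assumed normalization: drop the " patched" suffix so that the label
# becomes a valid version number (the corresponding patchlevel version).
normalize_version <- function(x) {
  package_version(sub(" patched$", "", x))
}

# Character comparison is lexicographic, so "2.9.2" is wrongly treated
# as coming after "2.10.0".
as_strings <- labels >= "2.10.0"

# Version objects are compared component by component.
as_versions <- normalize_version(labels) >= "2.10.0"

print(data.frame(labels, as_strings, as_versions))
```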
Trying to extract the news for add-on packages involved a
trial-and-error process to develop a reasonably reliable default reader
for the NEWS
files in a large subset of available packages, and
subsequently documenting the common format (requirements). This reader
looks for version headers and splits the news file into respective
chunks. It then tries to read the news items from the chunks, after
possibly splitting according to category. For each chunk found, it
records whether (it thinks) it successfully processed the chunk. To
assess how well this reader actually works, we apply it to all NEWS
files in the CRAN packages. As of 2010-09-11, there are 427 such files
for a “news coverage” of about 20%. For 67 files, no version chunks are
found. For 40 files, all chunks are marked as bad (indicating that we
found some version headers, but the files really use a different
format). For 60 files, some chunks are bad, with the following summary
of the frequencies of bad chunks:
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
0.005747 0.057280 0.126800 0.236500 0.263800 0.944400
Finally, for 260 files, no chunks are marked as bad, suggesting a success rate of about 61 percent. Given the lack of formal standardization, this is actually “not bad for a start”, but certainly not good enough. Clearly, package maintainers could check readability for themselves and fix incompatibilities with the default format, or switch to it (or contribute readers for their own formats). But ideally, maintainers could employ a richer format allowing for more reliable computations and better processing.
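The splitting strategy described above can be sketched roughly as follows. This is a deliberately simplified illustration, not the default reader in R: the helper read_toy_news(), the header pattern and the sample file are all assumptions, categories are ignored, and no record is kept of whether a chunk was processed successfully.

```r
# A made-up NEWS file in a common layout; not taken from any real package.
news_lines <- c(
  "Changes in version 1.1-0",
  "",
  "  o Added argument 'verbose' to fit().",
  "  o Fixed a bug in the print method",
  "    for long objects.",
  "",
  "Changes in version 1.0-0",
  "",
  "  o Initial release."
)

# Hypothetical simplified reader returning a data frame with the
# variables Version and Text, or NULL if no version header is found.
read_toy_news <- function(lines) {
  header_re <- "^Changes in version[[:space:]]+([0-9][0-9.-]*)[[:space:]]*$"
  item_re <- "^[[:space:]]*o[[:space:]]+"
  is_header <- grepl(header_re, lines)
  if (!any(is_header)) return(NULL)            # no version chunks found
  versions <- sub(header_re, "\\1", lines[is_header])
  chunk_id <- cumsum(is_header)                # 0 before the first header
  chunks <- lapply(seq_along(versions), function(i) {
    body <- lines[chunk_id == i & !is_header]
    starts <- grepl(item_re, body)             # lines opening a new item
    item_id <- cumsum(starts)
    keep <- item_id > 0 & nzchar(trimws(body)) # drop blanks and preamble
    if (!any(keep)) return(NULL)
    body[starts] <- sub(item_re, "", body[starts])
    # Join each item's continuation lines into one string.
    items <- tapply(trimws(body[keep]), item_id[keep], paste, collapse = " ")
    data.frame(Version = versions[i], Text = unname(as.vector(items)),
               stringsAsFactors = FALSE)
  })
  do.call(rbind, Filter(Negate(is.null), chunks))
}

db_toy <- read_toy_news(news_lines)
print(db_toy)
```

A file without any recognizable header yields NULL here, corresponding to the case in which no version chunks are found.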
In R 2.12.0, the new Rd format was enhanced in several important ways.
The \subsection
and \newcommand
macros were added, to allow for
subsections within sections, and user-defined macros. One can now
convert R objects to fragments of Rd code, and process these. And
finally, the rendering (of text in particular) is more customizable.
Given the availability of \subsection, one can now conveniently map
one- and two-layer hierarchies of item lists into sections (and
subsections) of Rd \itemize lists. Altogether, the new Rd format
obviously becomes a very attractive (some might claim, the canonical)
format for maintaining R and package news information, in particular
given that most R developers will be familiar with the Rd format and
that Rd can be rendered as text, HTML (for on-line help)
and LaTeX (for PDF manuals).
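As an illustration of this mapping, the following sketch turns a small two-layer news data base into Rd markup, with one \section per version, one \subsection per category, and one \itemize list per group. The helper news_to_rd(), the package name and the exact wording of the titles are assumptions made for illustration; R itself does not provide this function.

```r
# Made-up news items for a hypothetical package 'foo'.
toy_news <- data.frame(
  Version  = c("1.1-0", "1.1-0", "1.0-0"),
  Category = c("NEW FEATURES", "BUG FIXES", "NEW FEATURES"),
  Text     = c("Added argument \\code{verbose}.",
               "Fixed the print method.",
               "Initial release."),
  stringsAsFactors = FALSE
)

# Hypothetical helper: returns the lines of an Rd news file.
news_to_rd <- function(db, package) {
  out <- c("\\name{NEWS}",
           sprintf("\\title{News for Package '%s'}", package))
  for (v in unique(db$Version)) {
    out <- c(out, sprintf("\\section{Changes in version %s}{", v))
    dv <- db[db$Version == v, , drop = FALSE]
    for (k in unique(dv$Category)) {
      out <- c(out,
               sprintf("  \\subsection{%s}{", k),
               "    \\itemize{",
               sprintf("      \\item %s", dv$Text[dv$Category == k]),
               "    }",
               "  }")
    }
    out <- c(out, "}")                          # close the \section
  }
  out
}

rd <- news_to_rd(toy_news, "foo")
writeLines(rd)
```

Because the item texts are passed through unchanged, they can carry further Rd markup such as \code or \url, which is where the richer rendering comes from.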
R 2.12.0 itself has switched to the Rd format for recording its news. To
see the effect, run help.start(), click on the NEWS link and note that
e.g. bug report numbers are now hyperlinked to the respective pages in
the bug tracking system. The conversion was aided by the (internal)
utility function news2Rd() in package tools which converted the
legacy plain text NEWS to Rd, but of course required additional effort
enriching the markup.
R 2.12.0 is also aware of inst/NEWS.Rd files in add-on packages, which
are used by news() in preference to the plain text NEWS and inst/NEWS
files. For NEWS files in a format which can successfully
be handled by the default reader, package maintainers can use
tools:::news2Rd(dir, "NEWS.Rd"), possibly with additional argument
codify = TRUE, with dir a character string specifying the path to a
package’s root directory. Upon success, the NEWS.Rd file can be
further improved and then be moved to the inst subdirectory of the
package source directory. For now, the immediate benefits of switching
to the new format are that news() will work more reliably, and that the
package news web pages on CRAN will look better.
As packages switch to Rd format for their news, additional services
taking advantage of the new format can be developed and deployed. For
the near future, we are planning to integrate package news into the
on-line help system, and to optionally integrate the news when
generating the package reference manuals. A longer term goal is to
enhance the package management system to include news computations in
apt-listchanges
style. We hope that repository “news providers” will
take advantage of the available infrastructure to enhance their news
services (e.g., using news diffs instead of, or in addition to, the
currently employed recursive diff summaries). But perhaps most
importantly, we hope that the new format will make it more attractive
for package maintainers to actually provide valuable news information.
Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Hornik & Murdoch, "What's New?", The R Journal, 2010
BibTeX citation
@article{RJ-2010-2-whats-new, author = {Hornik, Kurt and Murdoch, Duncan}, title = {What's New?}, journal = {The R Journal}, year = {2010}, note = {https://rjournal.github.io/}, volume = {2}, issue = {2}, issn = {2073-4859}, pages = {74-76} }