R Teaching Column

The revisit package, developed as a collaborative tool for scientists, also serves as a tool for teaching statistics, in a manner that can be highly motivating for students. Using either the included case studies or datasets/code provided by the instructor, students can explore several alternate paths of analysis, such as the effects of including/excluding certain variables, employing different types of statistical methodology and so on. The package includes features that help students follow modern statistical standards and avoid various statistical errors, such as “p-hacking” and lack of attention to outlier data.

Matthias Gehrke (FOM University of Applied Sciences), Reed Davis, Norman Matloff (University of California, Davis), Paul Thompson (Sanford Research), Tiffany Chen (University of California, Davis), Emily Watkins (University of California, Davis), Laurel Beckett (University of California, Davis)
2017-12-01

1 Invitation to collaborate

Beginning with this issue of the R Journal, there will be a column devoted to teaching R and teaching with R, as well as to empirical research on teaching R.

Teaching material

A GitHub repository will be set up where teaching material can be accessed. All material should be released under a Creative Commons licence to allow re-use. Please prepare a short text describing your material. For a sample, see the description below by Norman Matloff et al. of classroom material they are using.

Research on teaching R

Some research considers the relative success of different approaches to teaching R and to teaching statistics with R; see, for example, contributions at the recent useR! 2017 conference in Brussels (Gert Janssenwill et al., “The analysis of R learning styles with R”; Matthias Gehrke and Karsten Luebke, “A quasi-experiment for the influence of the user interface on the acceptance of R”). The R Journal is interested in receiving such submissions.

Technical details

This column will appear before this GitHub repository is established, so only links to an existing cooperating repository and mailing list can be provided at this time. Subsequent columns will provide full technical details or links to such details.

2 The ‘revisit’ Package As a Teaching Tool

Overview

The R package revisit was developed in response to the recent concern over reproducibility in research, especially problems related to statistical analysis (Baker 2016). The motivation and methods are detailed in Matloff, R. Davis, L. Beckett, and P. Thompson (2017), but in this paper we turn to teaching.

The package has both text and GUI versions, the latter being based on the RStudio integrated development environment for R. It is currently at an early stage of development, with more features and refinements being added continuously. It may be downloaded from https://github.com/matloff/revisit.

One can obtain a quick introduction to the package by starting it and running one of the case studies. To start the package in GUI form, click Addins in RStudio, and choose “Revisit”. The text version is not colorful, but is more flexible. To run it, simply run library(revisit). Use of the case studies will be introduced later in this paper.
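For example, a minimal way to start the text version is simply to attach the package in an ordinary R session (the GUI, as noted, is launched from the RStudio Addins menu instead):

# text version: attach the package in an ordinary R session; the GUI
# version is launched via RStudio's Addins menu ("Revisit"), not from code
library(revisit)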

An excellent motivator: the Reinhart/Rogoff study controversy

The package includes a number of examples for student examination and participation. One of the case studies involves the controversial paper by Harvard economists C. Reinhart and K. Rogoff (Cassidy 2016). They had found that nations with high budget deficits averaged a \(-0.1\)% growth rate in GDP. This finding had a major impact on policymakers, with the paper attracting particular interest from the “deficit hawks” in the U.S. Congress. The Washington Post took to describing the finding as “consensus among economists.”

But when later researchers tried alternate analyses, the picture changed entirely. Some data had been excluded from the initial published analysis, for what arguably were weak reasons. The original analysis also gave equal weight to all nations regardless of size, and had other aspects that some researchers considered flaws. When the alternate analysis was run, the figure \(-0.1\)% changed to \(+2.2\)%. Thus the original findings on deficit spending now seem questionable. This leads to the theme of the present paper: the use of revisit for teaching.

At a talk given by one of the authors of the present paper (Matloff 2017), the author was stunned at student reaction to the Reinhart/Rogoff example. The students, all doing graduate work in science and engineering, were captivated by the fact that the study, which had had such influence on policy, may have been seriously flawed, with alternate analyses of the same data yielding starkly different results. The potential of revisit as a teaching tool had been suggested earlier by one of the other authors of the present paper, but the strong student reaction here dramatized the point.

General structure of the package

The package is structured as described in the following subsections.

The case study approach

The package includes various case studies, consisting of data and code. The latter comprises a complete, though possibly brief, analysis of the data from start to finish — the code may include data configuration, data cleaning, preliminary graphical analysis, predictor variable/model selection, and so on. Students can then try their own alternative analyses. Most of the case studies are not as dramatic as Reinhart and Rogoff, but each illustrates important concepts in data analysis.

There are just a few case studies included in the package currently, but more are being added, and instructors will often prefer to use their own data anyway. In many cases, the code will be minimal, affording the students even more opportunity for nonpassive thought and exploration.

Assignments centered on the use of revisit can be very specific or more open-ended, according to the instructor’s preference. Here are possible samples:

We are continuing to add case studies. At this early stage, the list includes:

We anticipate that instructors will typically develop their own case studies. We encourage them to contribute these to revisit.

To load a case study in the GUI, just select the desired one from the Case Studies window. The code will then be loaded, ready to run. In the text version, this is currently done as in the following example, for the Currency case study:

fname <- system.file('CaseStudies/Currency/currency.R', package = 'revisit')
loadb(fname)

A “statistical audit”

One of the most important features of revisit is that it plays the role of a “statistical audit,” in much the same way that tax preparation software might warn of questionable claims in a tax return.

Much of this is accomplished by wrapper functions provided by revisit. For instance, if the student wishes to form a confidence interval for a mean in R, she might use t.test(). But in revisit she can instead use t.test.rv(). Instead of lm() she can call lm.rv(), a wrapper for lm() that adds “statistical audit” functionality.

The wrapper lm.rv()

As noted, this function calls lm() and returns an object of class "lm", but with additional functionality. At present this takes on two forms:
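Whatever those features look like in a given release, basic usage should resemble lm() itself. Here is a minimal, hypothetical sketch; the formula/data interface is an assumption on our part, not documented package behavior:

# hypothetical usage sketch; lm.rv() is assumed to accept lm()'s usual
# formula and data arguments and to return an object of class "lm"
fit <- lm.rv(mpg ~ wt + hp, data = mtcars)
summary(fit)   # the standard "lm" methods still apply
confint(fit)   # confidence intervals, in keeping with revisit's emphasis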

“Audits” of significance testing

The package aims to reduce p-hacking by monitoring the number of inference actions (p-values, confidence intervals) the student has accumulated in his/her analysis, and may issue a warning that the student should consider employing multiple-inference methods. For the latter, at present the package offers just Bonferroni’s method, but more will be added (Hsu 1996).
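The adjustment itself is simple arithmetic; as a sketch of the idea the package applies:

# Bonferroni: to keep the overall (familywise) error rate at alpha across
# m simultaneous inferences, each individual inference is carried out at
# level alpha/m
alpha <- 0.05
m <- 8                         # e.g. one confidence interval per predictor
alpha_each <- alpha / m        # 0.00625
conf_each <- 1 - alpha_each    # 0.99375 confidence level for each interval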

Moreover, the package responds to the dramatic 2016 announcement by the American Statistical Association (Wasserstein and N. A. Lazar 2016), which warned against the overuse of p-values. Though this problem had been common knowledge for many years (Freedman, R. Pisani, and R. Purves 1978; Jones and N. Matloff 1986; Ziliak and D. McCloskey 2008), the ASA announcement gave new urgency to the issue. The revisit package takes an active role in encouraging students not to rely much on p-values in the first place; confidence interval-based analysis is preferred.

Here is the code for t.test.rv():

> t.test.rv
function (x, y, alpha = 0.05, bonf = 1) 
{
    alpha <- alpha/bonf
    tout <- t.test(x, y, conf.level = 1 - alpha)
    muhat1 <- tout$estimate[1]
    muhat2 <- tout$estimate[2]
    tout$p.value <- tout$p.value * bonf
    rvenv$pcount <<- rvenv$pcount + 1
    if (tout$p.value < alpha && muhat1 != 0) {
        if (abs(muhat1 - muhat2)/abs(muhat1) < rvenv$smalleffect) 
            warning(paste("small p-value but effect size", 
            "could be of little practical interest"))
    }
    tout
}

Here the argument bonf is a multiplicative factor used to expand confidence interval widths for Bonferroni corrections, as seen in the above code.

Note the incrementing of rvenv$pcount. This is the global count (which includes potential confidence intervals) alluded to earlier, to be used in the warning that the user should consider multiple inference methods.1 Note too that the code may warn the student that there was a “small p-value but effect size could be of little practical interest.”
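As a small, self-contained illustration of calling the wrapper (the simulated data below are ours, purely for demonstration):

# simulated data, purely for illustration
set.seed(1)
x <- rnorm(50, mean = 10)
y <- rnorm(50, mean = 10.4)
t.test(x, y)$conf.int                # ordinary 95% interval
t.test.rv(x, y, bonf = 8)$conf.int   # widened as if 8 simultaneous intervals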

Other wrappers

Steering students (and their instructors) to confidence intervals instead of p-values can be difficult not only in terms of breaking habits, but also in technical terms. Consider, for instance, the log-linear model. Most packages approach log-linear analysis solely from a hypothesis-testing point of view; R’s stats::loglin() function, for example, will not provide standard errors, and provides point estimates only on request. Our package will go further, offering point estimates and standard errors for estimated cell probabilities. As the user steps through the model hierarchy, at a certain point it will become clear that the estimates are not changing in important ways, and one can stop the model selection process. This will be accomplished by the “Poisson trick” (Christensen 2013), in conjunction with R’s glm() function.
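To sketch the general idea (this is not the package’s own code), a log-linear model for a table of counts can be fit as a Poisson regression with glm(), which does report standard errors:

# the "Poisson trick," illustrated on a built-in 3-way table: fitting the
# log-linear model as a Poisson GLM yields standard errors that
# stats::loglin() does not report
tbl <- as.data.frame(UCBAdmissions)    # columns Admit, Gender, Dept, Freq
fit <- glm(Freq ~ Admit * Gender + Dept, family = poisson, data = tbl)
summary(fit)$coefficients[, 1:2]       # coefficient estimates and std. errors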

Generating and managing alternative analyses

After a student selects an example from the Case Study menu, the package loads the desired code and data, and enters the code into the package’s visual text editor. The student can now edit and run the revised code, including just a partial run, up to a given line.

The latter capability is especially useful. Say the code consists of 32 lines, and line 11 is of interest to the student. The student can direct revisit to run lines 1-10 of the code, then pause. At that point the student can try executing an alternative to line 11, by executing the alternative line in the Console box, which provides direct access to the RStudio R console in which execution takes place. The student can then resume execution of the code starting from line 12. This allows the student to quickly try alternative code without actually changing the contents of the text editor.

In generating various alternative analyses, the student can save the interesting ones, each version in a different file, but all managed conveniently by revisit. Borrowing from software engineering terminology, each of these files is called a branch.

Example: Pima diabetes study

Let’s start with a simple and quite well-known example, the Pima diabetes data. The study’s original investigators used a form of neural network to predict diabetes from variables such as BMI and insulin. To keep things simple, though, we will not engage in prediction analysis here.

Upon launching revisit in the GUI, a new window associated with the RStudio session pops up, and the screen looks like Figure 1. Pima is the first case study listed.

The included code here consists of just forming confidence intervals comparing diabetic and nondiabetic groups, on each of the eight predictor variables. The student can choose to run the entire code or just a part; say she chooses the former. The confidence intervals can then be seen in the R console portion of the RStudio window (Figure 2).
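To fix ideas, a rough sketch of the kind of loop involved is shown below; this is illustrative rather than the case study’s literal code, though the object names diab and nondiab are taken from the line quoted next:

# illustrative sketch: one confidence interval per predictor column,
# comparing the diabetic and nondiabetic groups
for (i in 1:8) {
    tmp <- t.test.rv(diab[, i], nondiab[, i])$conf.int
    print(tmp)
}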

At this point, the student may ask, “What if we make the Bonferroni correction here?” Although she could edit line 15 in the revisit text editor window to

tmp <- t.test.rv(diab[ ,i], nondiab[ ,i], bonf=8)$conf.int

and then re-run, if this is just a tentative change she could avoid changing the code by entering the above line into the Console box, as seen in Figure 3. As expected, the confidence intervals become wider (not shown). If the student decides to build on this change, she can then make it in the text editor, and possibly save the new code as a new branch.

After the Pima dataset was curated, there were reports of erroneous values in some data points. To investigate this, one might run the discparcoord() function from the package cdparcoord, included in revisit:

discparcoord(pima, k = 769)

Here the second argument specifies forming the graph on all 769 records in the data.2

The resulting graph will be displayed in the Viewer pane in the RStudio window. In the text version, the student would run the above directly, in which case the graph appears in the student’s Web browser.

The result is shown in Figure 4. This is a parallel coordinates plot (Inselberg 2009). Each data point is displayed as a segmented line connecting dots at heights given by the values of the variables in that data point.3

The results are rather striking. We see that there are data points having the impossible value of 0 for variables such as glucose and blood pressure. Indeed, there are some data points with multiple 0s. Clearly the student will need to remove some of the data points, and re-run the analyses.
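One way the student might drop the offending rows is sketched below; the column names follow the standard Pima data and are assumptions about the case study’s data frame:

# drop rows with impossible zero values; the column names here are
# assumed from the standard Pima data and may differ in the case study
pima_clean <- subset(pima, glucose > 0 & bloodpressure > 0)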

The student may wish to create multiple versions of the code, e.g. in this case, code with and without the erroneous points. The revisit package facilitates this, creating branches 0, 1, 2 and so on of the code.

Future development

As mentioned, a number of additional wrapper functions are planned, as is an expansion of the “statistical audit” features of existing wrappers. Much more graphical analysis is slated.

The number of case studies will continue to grow.

Conclusions

The revisit package facilitates nonpassive, hands-on exploration of statistical and data analytic methodology on real data. The students learn that even published data is not sacrosanct, and that alternate analysis can yield additional insight into the phenomena under study.

Statistical methodology pervades almost every conceivable aspect of our world today. In addition to the students’ possible future usage of statistics in professional roles, educators should prepare them to act as informed, critical thinkers in their roles as citizens and consumers. We hope that revisit can play a part in achieving such goals.

It is also hoped that instructors and students will contribute suggestions for improvement, including pull requests on GitHub. These, and of course new datasets, would be highly appreciated.

Acknowledgements

We are grateful to Bohdan Khomtchouk for inviting one of the authors to speak on revisit at the inaugural meeting of the Stanford R Group, and to Karen Wu for a careful reading of the manuscript.

Figure 1: Opening screen.
Figure 2: Ordinary CIs.
Figure 3: Console box.