Here I present the hyper2 package for generalized Bradley-Terry models and give examples from two competitive situations: single scull rowing, and the competitive cooking game show MasterChef Australia. A number of natural statistical hypotheses may be tested straightforwardly using the software.
The Bradley-Terry model for datasets involving paired comparisons has
wide uptake in the R community. However, existing functionality
Applications are legion. The technique is widely used in a competitive
sport context (Turner and Firth 2012), in which matches are held between two
opposing individuals or teams. It can also be applied to consumer choice
experiments in which subjects are asked to choose a favourite from among
two choices (Hatzinger and Dittrich 2012), in which case the
If player
and this fact may be used to estimate
Observing the winner
Consider, for example, the following inference problem. Suppose we wish
to make inferences about
The hyper2 package uses the obvious solution to this problem: work
with equation (1), rather than
equation (2) and store only nonzero
powers. However, this requires one to keep track of which subsets
of
One natural representation for the likelihood
function
One such mechanism is furnished by the C++
Standard Template Library’s
“map” class (Musser et al. 2009) to store and retrieve elements. In STL
terminology, a map is an associative container that stores values
indexed by a key, which is used to sort and uniquely identify the
values. In the package, the key is a (STL) set
of strictly positive
integers typedef
statements are:
typedef set<unsigned int> bracket;
typedef map<bracket, double> hyper2;
Thus a bracket
object is a set of (unsigned) integers—here a sum of
some hyper2
object is a function that maps bracket
objects to real numbers—here their power. The following C++
pseudocode shows how the aforementioned likelihood function would be
created:
const bracket b1.insert({1,2}); // b1 = (p1+p2)
const bracket b2.insert({1,3}); // b2 = (p1+p3)
hyper2 L; // L2 is the likelihood function
// first observation:
L[b1] = 1; // L = (p1+p2)
//second observation:
L[b1] += 1; // L = (p1+p2)^2 # updating of existing map element
L[b2] += 1; // L = (p1+p2)^2*(p1+p3)^1
In the STL
, a map object stores keys and associated values in whatever
order the software considers to be most propitious. This allows faster
access and modification times but the order in which the maps, and
indeed the elements of a set, are stored is not defined. In the case of
likelihood functions such as Equation (1), this
is not an issue because both multiplication and addition are associative
and commutative operations. One side-effect of using this system is that
the order of the bracket-power key-value pairs is not preserved.
Topalov | Anand | Karpov | total |
---|---|---|---|
22 | 13 | - | 35 |
- | 23 | 12 | 35 |
8 | - | 10 | 18 |
30 | 36 | 22 | 88 |
Consider the Chess
dataset of the hyperdirchlet package, in which
matches between three chess players are tabulated (Table 1).
The Bradley-Terry model (Bradley and Terry 1952) is appropriate here (Caron and Doucet 2012),
and the hyper2 package provides a likelihood function for the
strengths of the players,
Using the hyper2 package, the R idiom to create this likelihood function would be a two-stage process. The first step would be to implement the numerator, that is the number of games won by each player:
R> library("hyper2")
R> chess <- hyper2(list(1, 2, 3), c(30, 36, 22))
R> chess
p1^30 * p2^36 * p3^22
Thus the chess
object has the correct number of players (three), and
has the numerator recorded correctly. To specify the denominator, which
indicates the number of matches played by each pair of players, the
package allows the following natural idiom:
R> chess[c(1, 2)] <- -35
R> chess[c(2, 3)] <- -35
R> chess[c(1, 3)] <- -18
R> chess
p1^30 * (p1 + p2)^-35 * (p1 + p3)^-18 * p2^36 * (p2 + p3)^-35 * p3^22
Note how the terms appear in an essentially random order, a side-effect of the efficient map class. It is sometimes desirable to name the elements explicitly:
R> pnames(chess) <- c("Topalov", "Anand", "Karpov")
R> chess
Topalov^30 * (Topalov + Anand)^-35 * (Topalov + Karpov)^-18 * Anand^36
* (Anand + Karpov)^-35 * Karpov^22
The package can calculate log-likelihoods:
R> loglik(chess, c(1/3, 1/3))
[1] -60.99695
[the second argument of function loglik()
is a vector of length 2,
third element of gradient()
:
R> gradient(chess, c(1/3, 1/3))
[1] 24.0 16.5
Such functionality allows the rapid location of the maximum likelihood
estimate for
R> maxp(chess)
Topalov Anand Karpov
0.4036108 0.3405168 0.2558723
In this section, I will take results from the 2016 Summer Olympic Games
and create a likelihood function for the finishing order in Men’s single
sculling. In Olympic sculling, up to six individual competitors race a
small boat called a scull over a course of length 2 km; the object is to
cross the finishing line first. Note that actual timings are irrelevant,
given the model, as the sufficient statistic is the order in which
competitors cross the finishing line. The 2016 Summer Olympics is a case
in point: the gold and silver medallists finished less than 5
milliseconds apart, corresponding to a lead of
However, there is information in the whole of the finishing order, not
just the first across the line. Once the winner has been identified,
then the runner-up may plausibly be considered to be the winner among
the remaining competitors; and so on down the finishing order. Without
loss of generality, if the order of finishing were rowing.txt
). The first step to incorporating the whole
finishing order into a likelihood function is to define a hyper2
object which stores the names of the participants:
R> data("rowing")
R> H <- hyper2(pnames = allrowers)
R> H
(banna + bhokanal + boudina + cabrera + campbell + dongyong + drysdale
+ esquivel + fournier + gambour + garcia + grant + hoff + kelmelis +
khafaji + kholmirzayev + martin + memo + molnar + monasterio + obreno +
peebles + rivarola + rosso + saensuk + shcharbachenia + synek +
szymczyk + taieb + teilemb + yakovlev + zambrano)^0
Observe that the resulting likelihood function is uniform, as no
information has as yet been included. Incorporating the information from
Heat 1 into a likelihood function corresponding to
Equation (3) is straightforward
using the order_likelihood()
function:
R> heat1 <- c("fournier", "cabrera", "bhokanal", "saensuk",
+ "kelmelis", "teilemb")
R> H <- H + order_likelihood(char2num(heat1, allrowers))
R> H
bhokanal * (bhokanal + cabrera + fournier + kelmelis + saensuk +
teilemb)^-1 * (bhokanal + cabrera + kelmelis + saensuk + teilemb)^-1 *
(bhokanal + kelmelis + saensuk + teilemb)^-1 * cabrera * fournier *
kelmelis * (kelmelis + saensuk + teilemb)^-1 * (kelmelis + teilemb)^-1
* saensuk
(variable heat1
shows the finishing order for Heat 1). Again observe
that object H
includes its terms in no apparent order. Although it
would be possible to incorporate information from subsequent heats in a
similar manner, the package includes a ready-made dataset, sculls2016
:
R> head(sculls2016)
banna^4 * (banna + boudina + cabrera + molnar + obreno + rivarola)^-1 *
(banna + boudina + cabrera + molnar + rivarola)^-1 * (banna + boudina +
molnar + rivarola)^-1 * (banna + cabrera + campbell + grant + hoff)^-1
* (banna + cabrera + campbell + grant + hoff + szymczyk)^-1
Finding the maximum likelihood estimate for the
parameter maxp()
function, provided with the
package (Figure 1). The optimization routine has
access to derivatives which means that the estimate is found very
quickly.
R> dotchart(maxp(sculls2016))
Figure 1 shows very clearly that the competitor with the highest strength is Drysdale, the gold medallist for this event. The bronze and silver medallists were Synek and Martin respectively, whose estimated strengths were second and third highest in the field.
MasterChef Australia is a game show in which amateur cooks compete for a
title (Wikipedia 2017a). From a statistical perspective the
contest is interesting because the typical show format is to identify
the weakest player, who is then eliminated from the competition. Here,
results from MasterChef Australia
Series 6 (Wikipedia 2017b) will be analysed; an
extended discussion of the data used is given in the package
at masterchef.Rd
.
We wish to make inferences about the contestents’ generalized
Bradley-Terry strengths
R> team_red <- c("Jamie", "Tracy", "Ben", "Amy", "Renae", "Georgia")
R> team_blue <- c("Brent", "Laura", "Emelia", "Colin", "Kira", "Tash")
We may represent the fact that the red team won as
A plausible likelihood function can be generated using the standard
assumption (Hankin 2010) that the competitive strength of a team is the
sum of the strengths of its members. The likelihood function for the
observation given in Equation (4) would then be
To generate a likelihood function in R, we need to set up a hyper2
object with appropriate contestants:
R> H <- hyper2(pnames = c(
+ "Amy", "Ben", "Brent", "Colin", "Emelia",
+ "Georgia", "Jamie", "Kira", "Laura", "Renae",
+ "Sarah", "Tash", "Tracy"))
R> H
(Amy + Ben + Brent + Colin + Emelia + Georgia + Jamie + Kira + Laura +
Renae + Sarah + Tash + Tracy)^0
Object H
is a uniform likelihood function. The package R idiom for
incorporating likelihood from Equation (5) is
straightforward and natural:
R> H[team_red] <- +1
R> H[c(team_red, team_blue)] <- -1
R> H
(Amy + Ben + Brent + Colin + Emelia + Georgia + Jamie + Kira + Laura +
Renae + Tash + Tracy)^-1 * (Amy + Ben + Georgia + Jamie + Renae +
Tracy)
(Sarah did not take part). The above idiom makes it possible to define likelihoods for observations that have a peculiar probability structure, and I give two examples below.
One event involved eight competitors who were split randomly into four teams of two. The show format was specified in advance as follows: The teams were to be judged, and placed in order. The two top teams were to be declared safe, and the other two teams sent to an elimination trial from which an individual winner and loser were identified, the loser being obliged to leave the competition. The format for this event is also typical in MasterChef.
The observation was that Laura and Jamie’s team won, followed by Emelia
and Amy, then Brent and Tracy. Ben and Renae’s team came last:
Again assuming that the team strength is the sum of its members’
strengths, a likelihood function for this observation may be obtained by
using the order statistic technique of (Plackett 1975):
H
, which is a likelihood function for the two-team challenge
discussed above. The corresponding package idiom is natural:
R> blue <- c("Laura", "Jamie")
R> yellow <- c("Emelia", "Amy")
R> green <- c("Brent", "Tracy")
R> red <- c("Ben", "Renae")
(the teams were randomly assigned a colour). We may now generate a likelihood function for the observation that the order of teams was blue, yellow, green, red, as per Equation (6):
R> H[blue] <- 1
R> H[c(blue, yellow, green, red)] <- -1
R> H[yellow] <- 1
R> H[c(yellow, green, red)] <- -1
R> H[green] <- 1
R> H[c(green, red)] <- -1
R> H
(Amy + Ben + Brent + Colin + Emelia + Georgia + Jamie + Kira + Laura +
Renae + Tash + Tracy)^-1 * (Amy + Ben + Brent + Emelia + Jamie + Laura
+ Renae + Tracy)^-1 * (Amy + Ben + Brent + Emelia + Renae + Tracy)^-1 *
(Amy + Ben + Georgia + Jamie + Renae + Tracy) * (Amy + Emelia) * (Ben +
Brent + Renae + Tracy)^-1 * (Brent + Tracy) * (Jamie + Laura)
We may incorporate subsequent observations relating to the elimination
trial among the four competitors comprising the losing two teams. The
observation was that Laura won, and Renae came last, being eliminated.
We might write
R> L <- ggol(H,
+ winner = "Laura",
+ btm4 = c("Brent", "Tracy", "Ben"),
+ eliminated = "Renae")
Arguments to ggol()
are disjoint subsets of the players, the subsets
themselves being passed in competition order from best to worst. Object
L
includes information from the team challenge (via first argument
H
) and the elimination results. It is a list of length hyper2
, each of which gives a Luce (1959) likelihood function
for a consistent total ordering of the competitors.
A final example (taken from MasterChef series 8, week 10) is given as a
generalization of the Luce (1959) likelihood. The format was as follows.
Eight contestents were split randomly into four teams of two, the top
two teams being declared safe. Note that the likelihood interpretation
differs from the previous team challenge, in which the observation was
an unambiguous team ranking: here, there is only a partial ranking of
the teams and one might expect this observation to be less informative.
Without loss of generality, the result may be represented as
The package provides an overall likelihood function for all informative
judgements in the series on the final 13 players in object
masterchef_series6
. We may assess a number of related hypotheses using
the package. The first step is to calculate the likelihood for the
hypothesis that all players are of equal strength:
R> data("masterchef")
R> n <- 13
R> equal_strengths <- rep(1/n,n-1)
R> like_series(equal_strengths, masterchef_series6)
[1] -78.68654
The strengths of the 13 players may be estimated using standard maximum likelihood techniques. This requires constrained optimization in order to prevent the search from passing through inadmissible points in p-space:
R> UI <- rbind(diag(n-1), -1)
R> CI <- c(rep(0, n-1), -1)
R> constrOptim(
+ theta = equal_strengths,
+ f = function(p){-like_series(p, L)},
+ ui = UI, ci = CI,
+ grad = NULL)
R> pmax_masterchef6
Amy Ben Brent Colin Emelia Georgia
1.086182e-01 7.457970e-02 1.343553e-01 2.819606e-02 1.169766e-01 6.850455e-09
Jamie Kira Laura Renae Sarah Tash
1.065412e-01 2.055794e-02 2.750621e-01 4.070643e-02 2.803893e-02 1.142094e-09
Tracy
6.636755e-02
R> dotchart(pmax_masterchef6)
In the above code, UI
enforces CI
enforces pmax_masterchef6
in the package, is shown
pictorially in Figure 2. The support at the
precalculated evaluate is
R> like_series(indep(pmax_masterchef6), masterchef_series6)
[1] -66.19652
and this allows us to test the hypothesis of equal player strengths: by
Wilks’s theorem (Wilks 1938) the quantity
R> pchisq(2*(78.7-66.2), df = 12, lower.tail = FALSE)
[1] 0.01482287
showing that the observations do constitute evidence for differential
player strengths. Figure 2 suggests that Laura, the
runner-up, is actually a stronger competitor than the winner, Brent. We
may assess this statistically by finding the maximum likelihood
for
R> UI <- rbind(UI, c(0, 0, 1, 0, 0, 0, 0, 0, -1, 0, 0, 0, 0))
R> CI <- c(CI, 0)
R> ans2 <-
+ constrOptim(
+ theta = equal_strengths,
+ f = function(p){-like_series(p, masterchef_series6)},
+ grad = NULL,
+ ui = UI, ci = CI)
(updated object UI
represents the constraint that Brent’s strength
exceeds that of Laura). In the package, object
pmax_masterchef6_constrained
is included as the result of the above
optimization, at which point the likelihood is
R> like_series(indep(pmax_masterchef6_constrained), masterchef_series6)
[1] -67.37642
The two estimates differ by about
However, further work would be needed to make statistically robust
inferences from these findings. Suppose, for example, that all
competitors have equal competitive ability: then all the pmax()
. It
is not clear at this stage how to interpret likelihood functions for
players conditional on their competition performance. Another issue
would be the applicability of Wilks’s theorem (Wilks 1938) which states
only that the asymptotic distribution of
Several generalizations of Bradley-Terry strengths are appropriate to describe competitive situations in which order statistics are sufficient.
The hyper2
package is introduced, providing a suite of functionality
for generalizations of the partial rank analysis of (Critchlow 1985).
The software admits natural R idiom for translating commonly occurring
observations into a likelihood function.
The package is used to calculate maximum likelihood estimates for generalized Bradley-Terry strengths in two competitive situations: Olympic rowing, and MasterChef Australia. The estimates for the competitors’ strengths are plausible; and several meaningful statistical hypotheses are assessed quantitatively.
This article is converted from a Legacy LaTeX article using the texor package. The pdf version is the official version. To report a problem with the html, refer to CONTRIBUTE on the R Journal homepage.
Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Hankin, "Partial Rank Data with the hyper2 Package: Likelihood Functions for Generalized Bradley-Terry Models", The R Journal, 2017
BibTeX citation
@article{RJ-2017-061, author = {Hankin, Robin K. S.}, title = {Partial Rank Data with the hyper2 Package: Likelihood Functions for Generalized Bradley-Terry Models}, journal = {The R Journal}, year = {2017}, note = {https://rjournal.github.io/}, volume = {9}, issue = {2}, issn = {2073-4859}, pages = {429-439} }