Taming PITCHf/x Data with XML2R and pitchRx

XML2R is a framework that reduces the effort required to transform XML content into tables in a way that preserves parent-to-child relationships. pitchRx applies XML2R’s grammar for XML manipulation to Gameday data from Major League Baseball Advanced Media (MLBAM). With pitchRx, one can easily obtain Gameday data and store it in a remote database. The Gameday website hosts a wealth of XML data, of which PITCHf/x is perhaps the most interesting. Among other things, PITCHf/x data can be used to recreate a baseball’s flight path from the pitcher’s hand to home plate. With pitchRx, one can easily create animations and interactive 3D scatterplots of that flight path. PITCHf/x data is also commonly used to generate a static plot of baseball locations at the moment they cross home plate. Such a plot is sometimes called a strike-zone plot, a term that can also refer to a plot of event probabilities over the same region. pitchRx provides an easy and robust way to generate both kinds of strike-zone plot using the ggplot2 package.

Carson Sievert

CRAN packages used

pitchRx, XML2R, ggplot2, rgl, dplyr, mgcv, knitr

CRAN Task Views implied by cited packages

Graphics, Bayesian, Econometrics, Environmetrics, Multivariate, Phylogenetics, ReproducibleResearch, SocialSciences, SpatioTemporal, WebTechnologies


Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".


For attribution, please cite this work as

Sievert, "The R Journal: Taming PITCHf/x Data with XML2R and pitchRx", The R Journal, 2014

BibTeX citation

@article{RJ-2014-001,
  author = {Sievert, Carson},
  title = {The R Journal: Taming PITCHf/x Data with XML2R and pitchRx},
  journal = {The R Journal},
  year = {2014},
  note = {https://doi.org/10.32614/RJ-2014-001},
  doi = {10.32614/RJ-2014-001},
  volume = {6},
  issue = {1},
  issn = {2073-4859},
  pages = {5-19}
}