Content-Based Social Network Analysis of Mailing Lists

Social Network Analysis (SNA) provides tools to examine relationships between people. Text Mining (TM) makes it possible to capture the text they produce, for example in Web 2.0 applications, but it neglects their social structure. This paper applies an approach called “content-based SNA” that combines the two methods. Using the R mailing lists R-help and R-devel, we show how this combination can be used to describe people’s interests and to find out whether authors who have similar interests actually communicate. We find that the expected positive relationship between sharing interests and communicating gets stronger as the centrality scores of authors in the communication networks increase.

Angela Bohn, Ingo Feinerer, Kurt Hornik, Patrick Mair

CRAN packages used

tm.plugin.mail, car, tm, sna, igraph

CRAN Task Views implied by cited packages

NaturalLanguageProcessing, Optimization, SocialSciences, Bayesian, Econometrics, Finance, gR, Graphics, HighPerformanceComputing, Multivariate, Spatial


Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".


For attribution, please cite this work as

Bohn, et al., "Content-Based Social Network Analysis of Mailing Lists", The R Journal, 2011

BibTeX citation

@article{RJ-2011-003,
  author = {Bohn, Angela and Feinerer, Ingo and Hornik, Kurt and Mair, Patrick},
  title = {Content-Based Social Network Analysis of Mailing Lists},
  journal = {The R Journal},
  year = {2011},
  note = {},
  doi = {10.32614/RJ-2011-003},
  volume = {3},
  issue = {1},
  issn = {2073-4859},
  pages = {11-18}
}