Can Twitter be used to predict county excessive alcohol consumption rates?

Brenda Curtis, Salvatore Giorgi, Anneke E.K. Buffone, Lyle H. Ungar, Robert D. Ashford, Jessie Hemmons, Dan Summers, Casey Hamilton, H. Andrew Schwartz

Research output: Contribution to journal › Article › peer-review

56 Scopus citations

Abstract

Objectives: The current study analyzes a large set of Twitter data from 1,384 US counties to determine whether excessive alcohol consumption rates can be predicted from the words posted in each county.

Methods: Data from over 138 million county-level tweets were analyzed using predictive modeling, differential language analysis, and mediating language analysis.

Results: Twitter language data captures cross-sectional patterns of excessive alcohol consumption beyond those captured by sociodemographic factors (e.g., age, gender, race, income, education), and can be used to accurately predict rates of excessive alcohol consumption. Additionally, mediation analysis found that Twitter topics (e.g., 'ready gettin leave') can explain much of the association between socioeconomic factors and excessive alcohol consumption.

Conclusions: Twitter data can be used to predict public health concerns such as excessive drinking. Using mediation analysis in conjunction with predictive modeling explains a large portion of the variance associated with socioeconomic status.

Original language: English (US)
Article number: e0194290
Journal: PLoS ONE
Volume: 13
Issue number: 4
State: Published - Apr 2018
Externally published: Yes

ASJC Scopus subject areas

  • General
