TY - JOUR
T1 - Can Twitter be used to predict county excessive alcohol consumption rates?
AU - Curtis, Brenda
AU - Giorgi, Salvatore
AU - Buffone, Anneke E.K.
AU - Ungar, Lyle H.
AU - Ashford, Robert D.
AU - Hemmons, Jessie
AU - Summers, Dan
AU - Hamilton, Casey
AU - Schwartz, H. Andrew
N1 - Publisher Copyright: © 2018 Curtis et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
PY - 2018/4
Y1 - 2018/4
N2 - Objectives: The current study analyzes a large set of Twitter data from 1,384 US counties to determine whether excessive alcohol consumption rates can be predicted by the words being posted from each county. Methods: Data from over 138 million county-level tweets were analyzed using predictive modeling, differential language analysis, and mediating language analysis. Results: Twitter language data captures cross-sectional patterns of excessive alcohol consumption beyond that of sociodemographic factors (e.g. age, gender, race, income, education), and can be used to accurately predict rates of excessive alcohol consumption. Additionally, mediation analysis found that Twitter topics (e.g. 'ready gettin leave') can explain much of the variance associated between socioeconomics and excessive alcohol consumption. Conclusions: Twitter data can be used to predict public health concerns such as excessive drinking. Using mediation analysis in conjunction with predictive modeling allows for a high portion of the variance associated with socioeconomic status to be explained.
AB - Objectives: The current study analyzes a large set of Twitter data from 1,384 US counties to determine whether excessive alcohol consumption rates can be predicted by the words being posted from each county. Methods: Data from over 138 million county-level tweets were analyzed using predictive modeling, differential language analysis, and mediating language analysis. Results: Twitter language data captures cross-sectional patterns of excessive alcohol consumption beyond that of sociodemographic factors (e.g. age, gender, race, income, education), and can be used to accurately predict rates of excessive alcohol consumption. Additionally, mediation analysis found that Twitter topics (e.g. 'ready gettin leave') can explain much of the variance associated between socioeconomics and excessive alcohol consumption. Conclusions: Twitter data can be used to predict public health concerns such as excessive drinking. Using mediation analysis in conjunction with predictive modeling allows for a high portion of the variance associated with socioeconomic status to be explained.
UR - http://www.scopus.com/inward/record.url?scp=85045039191&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85045039191&partnerID=8YFLogxK
U2 - 10.1371/journal.pone.0194290
DO - 10.1371/journal.pone.0194290
M3 - Article
C2 - 29617408
AN - SCOPUS:85045039191
SN - 1932-6203
VL - 13
JO - PLOS ONE
JF - PLOS ONE
IS - 4
M1 - e0194290
ER -