Genetic and psychosocial predictors of aggression

Variable selection and model building with component-wise gradient boosting

Robert Suchting, Joshua L. Gowin, Charles E. Green, Consuelo Walss-Bass, Scott D. Lane

Research output: Contribution to journal › Article

2 Citations (Scopus)

Abstract

Rationale: Given datasets with a large or diverse set of predictors of aggression, machine learning (ML) provides efficient tools for identifying the most salient variables and building a parsimonious statistical model. ML techniques permit efficient exploration of data but have not been widely used in aggression research, and they may be useful to those seeking to predict aggressive behavior.

Objectives: The present study examined predictors of aggression and constructed an optimized model using ML techniques. Predictors were drawn from a dataset of demographic, psychometric, and genetic variables. The genetic variables were FK506 binding protein 5 (FKBP5) polymorphisms, which have been shown to alter response to threatening stimuli but have not been tested as predictors of aggressive behavior in adults.

Methods: The analysis used component-wise gradient boosting followed by model reduction via backward elimination to (a) select variables from an initial set of 20 to build a model of trait aggression, and then (b) reduce that model to maximize parsimony and generalizability.

Results: In a dataset of N = 47 participants, component-wise gradient boosting selected 8 of the 20 candidate predictors to model Buss-Perry Aggression Questionnaire (BPAQ) total score, with R² = 0.66. Backward elimination simplified this model to six predictors: smoking status, psychopathy (interpersonal manipulation and callous affect), childhood trauma (physical abuse and neglect), and the FKBP5_13 gene (rs1360780). The six-predictor model retained 99.4% of the R² of the initial eight-predictor model.

Conclusions: Using an inductive data science approach, the gradient boosting model identified predictors consistent with previous experimental work on aggression, specifically psychopathy and trauma exposure. Allelic variants in FKBP5 were also identified as predictors for the first time, but the relatively small sample size limits the generalizability of the results and calls for replication. This approach is useful for predicting aggressive behavior, particularly in the context of large multivariate datasets.
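
The two-step procedure named in the Methods (variable selection by component-wise gradient boosting, then backward elimination) can be illustrated with a minimal Python sketch on synthetic data. This is not the authors' analysis: the abstract does not state the loss function, base learners, stopping rule, software, or elimination criterion, so the squared-error loss, single-variable linear base learners, fixed step count, learning rate `nu`, and the R²-retention rule `keep_ratio` are all assumptions made for illustration, and the function names are hypothetical.

```python
import numpy as np


def r_squared(X, y):
    """In-sample R^2 of an ordinary least-squares fit with an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)


def componentwise_boost(X, y, n_steps=100, nu=0.1):
    """Component-wise boosting with squared-error loss (assumed setup).

    At each step, every standardized column is fitted to the current
    residuals on its own; only the best-fitting column is updated, by a
    fraction nu of its slope. Columns never chosen keep a coefficient of 0.
    """
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    n, p = Xs.shape
    coef = np.zeros(p)
    resid = y - y.mean()
    for _ in range(n_steps):
        # Least-squares slope of the residuals on each column separately
        # (columns have mean 0 and variance 1, so x.x == n).
        slopes = Xs.T @ resid / n
        # The largest absolute slope gives the smallest residual sum of squares.
        j = int(np.argmax(np.abs(slopes)))
        coef[j] += nu * slopes[j]
        resid = resid - nu * slopes[j] * Xs[:, j]
    return coef


def backward_eliminate(X, y, selected, keep_ratio=0.99):
    """Drop variables one at a time while R^2 stays at or above
    keep_ratio times the R^2 of the starting model (assumed criterion)."""
    current = list(selected)
    target = keep_ratio * r_squared(X[:, current], y)
    while len(current) > 1:
        # R^2 of each candidate model with one variable removed.
        trials = [
            (r_squared(X[:, [k for k in current if k != j]], y), j)
            for j in current
        ]
        best_r2, drop = max(trials)
        if best_r2 < target:
            break
        current.remove(drop)
    return current


# Synthetic stand-in for the study data: 47 rows, 20 candidate predictors,
# of which only columns 0 and 3 actually drive the outcome.
rng = np.random.default_rng(0)
X = rng.normal(size=(47, 20))
y = 2.0 * X[:, 0] - 1.5 * X[:, 3] + rng.normal(scale=0.5, size=47)

coef = componentwise_boost(X, y)
selected = np.flatnonzero(coef).tolist()
reduced = backward_eliminate(X, y, selected)

print("selected by boosting:", selected)
print("after backward elimination:", reduced)
print("share of R^2 retained:",
      r_squared(X[:, reduced], y) / r_squared(X[:, selected], y))
```

The last printed value plays the role of the 99.4% figure reported in the Results: the R² of the reduced model divided by the R² of the model that boosting selected. In practice the number of boosting steps would normally be tuned (for example by cross-validation) rather than fixed as it is here.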

Original language: English (US)
Article number: 89
Journal: Frontiers in Behavioral Neuroscience
Volume: 12
DOI: 10.3389/fnbeh.2018.00089
State: Published - May 7 2018
Externally published: Yes

Fingerprint

Aggression
Wounds and Injuries
Statistical Models
Psychometrics
Sample Size
Smoking
Demography
Datasets
Research
Genes
tacrolimus binding protein 5
Machine Learning

Keywords

  • Aggression
  • Boosting
  • Data science
  • FKBP5
  • Machine learning
  • Psychopathy
  • Trauma

ASJC Scopus subject areas

  • Neuropsychology and Physiological Psychology
  • Cognitive Neuroscience
  • Behavioral Neuroscience

Cite this

Genetic and psychosocial predictors of aggression: Variable selection and model building with component-wise gradient boosting. / Suchting, Robert; Gowin, Joshua L.; Green, Charles E.; Walss-Bass, Consuelo; Lane, Scott D.

In: Frontiers in Behavioral Neuroscience, Vol. 12, 89, 7 May 2018.

@article{7d1206128dd54108b88472c97db43500,
title = "Genetic and psychosocial predictors of aggression: Variable selection and model building with component-wise gradient boosting",
abstract = "Rationale: Given datasets with a large or diverse set of predictors of aggression, machine learning (ML) provides efficient tools for identifying the most salient variables and building a parsimonious statistical model. ML techniques permit efficient exploration of data, have not been widely used in aggression research, and may have utility for those seeking prediction of aggressive behavior. Objectives: The present study examined predictors of aggression and constructed an optimized model using ML techniques. Predictors were derived from a dataset that included demographic, psychometric and genetic predictors, specifically FK506 binding protein 5 (FKBP5) polymorphisms, which have been shown to alter response to threatening stimuli, but have not been tested as predictors of aggressive behavior in adults. Methods: The data analysis approach utilized component-wise gradient boosting and model reduction via backward elimination to: (a) select variables from an initial set of 20 to build a model of trait aggression; and then (b) reduce that model to maximize parsimony and generalizability. Results: From a dataset of N = 47 participants, component-wise gradient boosting selected 8 of 20 possible predictors to model Buss-Perry Aggression Questionnaire (BPAQ) total score, with R2 = 0.66. This model was simplified using backward elimination, retaining six predictors: smoking status, psychopathy (interpersonal manipulation and callous affect), childhood trauma (physical abuse and neglect), and the FKBP5_13 gene (rs1360780). The six-factor model approximated the initial eight-factor model at 99.4{\%} of R2. Conclusions: Using an inductive data science approach, the gradient boosting model identified predictors consistent with previous experimental work in aggression; specifically psychopathy and trauma exposure. 
Additionally, allelic variants in FKBP5 were identified for the first time, but the relatively small sample size limits generality of results and calls for replication. This approach provides utility for the prediction of aggression behavior, particularly in the context of large multivariate datasets.",
keywords = "Aggression, Boosting, Data science, FKBP5, Machine learning, Psychopathy, Trauma",
author = "Suchting, Robert and Gowin, Joshua L. and Green, Charles E. and Walss-Bass, Consuelo and Lane, Scott D.",
year = "2018",
month = "5",
day = "7",
doi = "10.3389/fnbeh.2018.00089",
language = "English (US)",
volume = "12",
journal = "Frontiers in Behavioral Neuroscience",
issn = "1662-5153",
publisher = "Frontiers Research Foundation",

}

TY - JOUR

T1 - Genetic and psychosocial predictors of aggression

T2 - Variable selection and model building with component-wise gradient boosting

AU - Suchting, Robert

AU - Gowin, Joshua L.

AU - Green, Charles E.

AU - Walss-Bass, Consuelo

AU - Lane, Scott D.

PY - 2018/5/7

Y1 - 2018/5/7

N2 - Rationale: Given datasets with a large or diverse set of predictors of aggression, machine learning (ML) provides efficient tools for identifying the most salient variables and building a parsimonious statistical model. ML techniques permit efficient exploration of data, have not been widely used in aggression research, and may have utility for those seeking prediction of aggressive behavior. Objectives: The present study examined predictors of aggression and constructed an optimized model using ML techniques. Predictors were derived from a dataset that included demographic, psychometric and genetic predictors, specifically FK506 binding protein 5 (FKBP5) polymorphisms, which have been shown to alter response to threatening stimuli, but have not been tested as predictors of aggressive behavior in adults. Methods: The data analysis approach utilized component-wise gradient boosting and model reduction via backward elimination to: (a) select variables from an initial set of 20 to build a model of trait aggression; and then (b) reduce that model to maximize parsimony and generalizability. Results: From a dataset of N = 47 participants, component-wise gradient boosting selected 8 of 20 possible predictors to model Buss-Perry Aggression Questionnaire (BPAQ) total score, with R2 = 0.66. This model was simplified using backward elimination, retaining six predictors: smoking status, psychopathy (interpersonal manipulation and callous affect), childhood trauma (physical abuse and neglect), and the FKBP5_13 gene (rs1360780). The six-factor model approximated the initial eight-factor model at 99.4% of R2. Conclusions: Using an inductive data science approach, the gradient boosting model identified predictors consistent with previous experimental work in aggression; specifically psychopathy and trauma exposure. 
Additionally, allelic variants in FKBP5 were identified for the first time, but the relatively small sample size limits generality of results and calls for replication. This approach provides utility for the prediction of aggression behavior, particularly in the context of large multivariate datasets.

KW - Aggression

KW - Boosting

KW - Data science

KW - FKBP5

KW - Machine learning

KW - Psychopathy

KW - Trauma

UR - http://www.scopus.com/inward/record.url?scp=85046690463&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85046690463&partnerID=8YFLogxK

U2 - 10.3389/fnbeh.2018.00089

DO - 10.3389/fnbeh.2018.00089

M3 - Article

VL - 12

JO - Frontiers in Behavioral Neuroscience

JF - Frontiers in Behavioral Neuroscience

SN - 1662-5153

M1 - 89

ER -