Knowledge acquisition from and semantic variability in schizophrenia clinical trial data

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


Recent federal requirements in the United States mandate sharing of research data, meaningful use of health information technology, and data standardization for regulatory review of marketed therapeutics. These requirements are predicated on the assumption that both healthcare organizations and the public will benefit from the enhanced secondary use of healthcare data. Because necessary standards are lacking across most clinical therapeutic areas, large-scale efforts are underway to create authoritative, consensus-based, and publicly available standard data element sets. Knowledge acquisition is a key component of such efforts to improve information quality by decreasing semantic and syntactic variability in clinical data, i.e., data standardization. The extent and impact of semantic variability have not previously been rigorously assessed in clinical research. Such a characterization informs data standardization efforts and provides metrics to support data governance efforts. This article 1) reports evaluative data describing a potentially more scalable process for the knowledge acquisition, synthesis, and definitional aspects of data element standardization and 2) characterizes the semantic variability component of information quality in data from pivotal clinical trials in schizophrenia. Semantic variability in clinical trials for schizophrenia compounds recently reviewed for marketing authorization was substantial, implicating semantic variability as a key information quality problem in secondary use of clinical research data. Based on the relatively high proportion of data elements that the synthesis and clinical review process marked for deletion, an appreciable amount of the semantic variability was unnecessary. The form-based knowledge acquisition method used achieved 95% domain coverage as adjudicated by clinical experts and outperformed knowledge acquisition from experts. Within mental health, form-based knowledge acquisition appears to provide a feasible production scale for data element standardization.

Original language: English (US)
Title of host publication: Proceedings of ICIQ 2012
Subtitle of host publication: 17th International Conference on Information Quality
Editors: Laure Berti-Equille, Isabelle Comyn-Wattiau, Monica Scannapieco
Number of pages: 12
ISBN (Electronic): 9781627483964
State: Published - 2012
Externally published: Yes
Event: 17th International Conference on Information Quality, ICIQ 2012 - Paris, France
Duration: Nov 16 2012 → Nov 17 2012

Publication series

Name: Proceedings of ICIQ 2012: 17th International Conference on Information Quality


Conference: 17th International Conference on Information Quality, ICIQ 2012


  • Clinical research
  • Data elements
  • Data governance
  • Data quality
  • Data standards
  • Information quality
  • Knowledge acquisition

ASJC Scopus subject areas

  • Safety, Risk, Reliability and Quality
  • Information Systems

