@conference{6cbbf79e06d84c7ea3aef3b4453a4374,
title = "Assessing the quality of electronic health record data and patient self-report data",
abstract = "Knowing the accuracy of self-reported medical data is critical to using the data in clinical decision-making and research. The same is true for data in Electronic Health Records (EHRs). For these data, accuracy reported in the literature varies widely, leaving little to guide researchers in selecting the most accurate data source. This study addresses this gap by comparing patient self-report and EHR data and is the most extensive study to date of the accuracy of clinical data. The study design, data collection, and preliminary results for race data are reported here. The initial comparison of race data in a small group of participating clinics showed a 33% discrepancy rate. Further, bias was evident in that all of the discrepant records were from patients reporting Hispanic ethnicity. Initial characterization of the results identified process differences among the clinics and lack of identification with the race categories among patients. The extent of variability in discrepancy rates across facilities and other data elements remains to be characterized, but the necessity for accuracy assessment has been demonstrated.",
keywords = "Data accuracy, Data quality, Data quality assessment, Electronic health record",
author = "Zozus, {Meredith N.} and Anita Walden and Marcia Byers and Thomas Powell and Pei Wang and Maryam Garza and {Del Fiol}, Guilherme and Jessica Tenenbaum and Matthew Nix and Carl Pieper",
note = "Funding Information: The importance of information quality in healthcare and health-related research is emphasized in Institute of Medicine (IOM) reports (Davis, Nolan, et al. 1999; Dick, Steen, et al. 1997; Stead and Lin 2009). The IOM (Davis, Nolan, et al. 1999) defines quality data as “data that support the same conclusions as do error free data”. Three major national efforts, including the Patient Centered Outcomes Research Institute (PCORI, www.pcori.org), the Agency for Healthcare Research and Quality funded Electronic Data Management (EDM) Forum (www.edm-forum.org), and the National Institutes of Health funded Healthcare Systems Research Collaboratory (http://www.rethinkingclinicaltrials.org), have all emphasized data quality in research either through policy or funding solicitations. The importance of data quality in research has long been recognized in federally funded clinical studies (Bagniewska, Black, et al. 1986; DuChene, Hultgren, et al. 1986; Greenberg 1967; Kronmal, Davis, et al. 1978; McBride and Singer 1995), in industry trials conducted to support applications for marketing authorization (Davis, Nolan, et al. 1999; SCDM 2013), in clinical registries (Arts, de Keizer, et al. 2002; Gliklich and Dreyer 2010), and recently in clinical studies relying on secondary use of healthcare data (NIH 2013; Zozus, Hammond, et al. 2014). The latter lists requirements for data quality in the solicitation with the goal of assuring that investigators demonstrate that data are capable of supporting research conclusions. The recent increased emphasis on research reproducibility and replication only heightens awareness and interest (Baker 2015; Collins and Tabak 2014; Freedman, Cockburn, et al. 2015; Ioannidis 2005; NATURE 2014; Plant and Parker 2013; Young, Karr, et al. 2011; Young and Miller 2014). With the almost ubiquitous adoption of Electronic Health Records (EHRs) in hospitals in the United States and office-based clinics not far behind, and with the aforementioned national efforts and the National Institutes of Health (NIH) funded Clinical and Translational Science Award (CTSA) program (www.ncats.nih.gov/ctsa), there is a strong emphasis on secondary use of EHR data for research. Because routine care is selective in the information documented, interest has also increased in patient self-reported information as an alternate data source and to supplement routine care data. All of the initial seven trials conducted on the Healthcare Systems Research Collaboratory relied on EHR data, and six of the seven augmented the EHR data with patient self-reported data. Together, these two data sources, patients and electronic health records, hold great promise for increasing the efficiency, generalizability, and cost effectiveness of clinical research. These potential benefits, however, are dependent on the capability of the data to support research conclusions, and thus on data quality assessment. Unfortunately, there is little generalizable knowledge about the quality of EHR and patient self-reported data. Funding Information: This research is funded through a contract (ME-1409-22573) from the Patient Centered Outcomes Research Institute (PCORI). Early work on the acquisition and linkage of the EHR data was supported by a gift to Duke University from the David H. Murdock Institute for Business and Culture and by Duke University{\textquoteright}s institutional commitment to grants K99LM011128 and R00LM011128 from the National Institutes of Health (NIH) National Library of Medicine (NLM). The ideas and opinions expressed here are those of the authors and not necessarily those of PCORI, the National Institutes of Health or any other organization. Publisher Copyright: {\textcopyright} 2017 MIT Information Quality Program.
All rights reserved. 22nd MIT International Conference on Information Quality, ICIQ 2017; Conference date: 06-10-2017 through 07-10-2017",
year = "2017",
language = "English (US)",
}