Developing standards for improving measurement and reporting of data quality in health research

Bibliographic Details
Main Authors: Kahn, Michael, Ong, Toan (Author), Barnard, Juliana (Author), Maertens, Julie (Author)
Corporate Author: Patient-Centered Outcomes Research Institute (U.S.)
Format: eBook
Language: English
Published: [Washington, D.C.] Patient-Centered Outcomes Research Institute (PCORI) 2018, [2018]
Series: Final research report
Collection: National Center for Biotechnology Information - Collection details see MPG.ReNa
Description
Summary: BACKGROUND: All data sets are flawed. Values may be missing where they are expected to be present, be present but contain values that do not represent reality, or be inconsistent compared with other values (eg, sex = male; diagnosis = pregnant). Currently, no standard practices and metrics exist to describe data quality (DQ). As a result, current DQ processes are ad hoc and nontransparent to data users and consumers.
OBJECTIVES: The goal of this project was to address the question, What aspects of DQ help users (eg, health researchers, patient advocates, and policy makers) and consumers (eg, patients and policy makers) have confidence in the results that are generated from a data set? This project focused on creating an agreed-on set of terminology definitions and recommendations to guide assessment and reporting of DQ findings.
SPECIFIC AIMS: 1. To develop community-driven recommendations for DQ measurement and reporting. 2. To define a DQ common data model (CDM) for storing DQ measures and to assess its viability within several large comparative effectiveness research (CER) data networks. 3. To create prototype DQ reports and visualizations that present results in an intuitive, informative, and understandable format to multiple patient-centered outcomes research (PCOR) stakeholders. 4. To understand technical, professional, and policy barriers to increased DQ transparency.
METHODS: For specific aims 1 and 4, four all-day, face-to-face stakeholder meetings plus monthly webinars and more than 10 national presentations allowed for continuous project engagement and assessment. We created 2 stakeholder communities: (1) patients, patient advocates, and health care policy makers; and (2) data stewards, informatics professionals, and clinical investigators. For specific aims 2 and 3, a 2.5-day in-person DQ Code-A-Thon brought together programmers and data users to create DQ visualization prototypes, which we used during the second set of stakeholder meetings.
RESULTS: Four separate publications capture the conclusions produced by the community (Section I): (1) a harmonized terminology for describing DQ dimensions, (2) recommendations for reporting DQ results, (3) an evaluation of more than 11 000 DQ checks from 8 networks, and (4) a survey exploring professional and organizational barriers to performing DQ assessment and reporting results. In addition, we created multiple DQ visualizations and a technical specification for a CDM for storing DQ results. All publications and technical materials are freely available through links on the Data Quality Collaborative website (http://repository.edm-forum.org/dqc/).
CONCLUSIONS: Data quality assessment remains a complex set of concepts, activities, computations, and reporting methods. A harmonized terminology unifies these disparate threads; a CDM for DQ measures brings technical computations under a single representation. Yet, the challenge of creating intuitive visualizations and easily interpreted reporting structures for technical and nontechnical stakeholders remains unsolved.
LIMITATIONS: This work focused on general measures of DQ ("intrinsic DQ"). Yet most data users are interested in the fitness of a subset of elements needed for a specific study. The next phase of this work must include "fitness-for-use" DQ assessment and reporting methods.
Physical Description: 1 PDF file (73 pages) illustrations