Clinical research informatics

Clinical research informatics (CRI) is a subdomain of biomedical and health informatics that focuses on the application of informatics to the discovery and management of new knowledge relating to health and disease. It includes management of information related to clinical trials and also involves informatics related to secondary research use of clinical data. Clinical research informatics and translational bioinformatics are the primary informatics domains that support translational research[1].

Background

The definition of CRI is in flux as it emerges as a subdiscipline. A 2009 definition focused CRI specifically on the domain of clinical research (human clinical trials and studies) but acknowledged that CRI also touches on the domain of translational research[2]. In medicine, translational research activities are those that precede and follow human clinical research activities, sometimes referred to as "bench to bedside" and "bedside to community," respectively.

A 2012 definition, however, took a wider view, suggesting that CRI "...focuses on developing new informatics theories, tools, and solutions to accelerate the full translational continuum: basic research to clinical trials, clinical trials to academic health center practice, diffusion and implementation to community practice, and 'real world' outcomes"[3]. If this broader definition becomes widely adopted, CRI could merge with another emerging informatics subdomain, translational research informatics (TRI).

CRI related standards

The Clinical Data Interchange Standards Consortium (CDISC) develops several standards, some of which the FDA has adopted for regulatory submissions. The Operational Data Model (ODM) is the most generic CDISC standard for CRI.
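As a rough illustration of the nested hierarchy that ODM uses for clinical data, the sketch below parses a simplified ODM-style fragment with Python's standard library. The OIDs and values are made up, and namespace declarations and other required attributes are omitted, so the fragment is not schema-valid ODM. It only shows how subject, study event, form, and item elements can be flattened into rows.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical ODM-style fragment (not schema-valid: namespaces
# and required attributes are left out for brevity).
ODM_FRAGMENT = """
<ODM>
  <ClinicalData StudyOID="ST.DEMO" MetaDataVersionOID="MDV.1">
    <SubjectData SubjectKey="S001">
      <StudyEventData StudyEventOID="SE.BASELINE">
        <FormData FormOID="F.VITALS">
          <ItemGroupData ItemGroupOID="IG.VITALS">
            <ItemData ItemOID="IT.SYSBP" Value="128"/>
            <ItemData ItemOID="IT.DIABP" Value="82"/>
          </ItemGroupData>
        </FormData>
      </StudyEventData>
    </SubjectData>
  </ClinicalData>
</ODM>
"""


def flatten_items(odm_xml):
    """Return one (subject, event, form, item, value) tuple per ItemData element."""
    root = ET.fromstring(odm_xml.strip())
    rows = []
    for subject in root.iter("SubjectData"):
        for event in subject.iter("StudyEventData"):
            for form in event.iter("FormData"):
                for item in form.iter("ItemData"):
                    rows.append((
                        subject.get("SubjectKey"),
                        event.get("StudyEventOID"),
                        form.get("FormOID"),
                        item.get("ItemOID"),
                        item.get("Value"),
                    ))
    return rows


for row in flatten_items(ODM_FRAGMENT):
    print(row)
```

Flattening to rows is only one possible design choice; a real pipeline would also read the study metadata to interpret each item's data type and units.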

CRI historical development

CRI is rapidly evolving and growing, in part due to increasingly complex clinical research workflow and information management challenges[2]. Underlying reasons for this evolution and growth include:

  • The rapid pace of biomedical science and the need for advances in medicine, which create pressure for clinical research to be conducted in a timely and efficient manner and also produce high-quality results[2]
  • The associated need to make clinical care data available for secondary use in support of clinical research[2]
  • The use of sophisticated biomedical research techniques that generate massive and ever-growing data sets (aka Big Data)[4]
  • The need for computer programs and other tools that can evaluate, combine, and visualize these large quantities of data not only on supercomputers, but also on PCs and workstations[4]
  • Challenges presented by the regulatory requirements associated with conducting clinical studies, including a trend toward conducting clinical trials in community practice settings instead of large academic health centers (AHCs)[2]

In addition to these factors, CRI development has been accelerated by an increase in the scope and pace of clinical and translational science advancements funded by programs such as the National Institutes of Health's (NIH) Roadmap for Medical Research initiative[2]. Roadmap programs related to CRI include:

  • Clinical and Translational Science Awards (CTSAs) - support a national consortium of research institutions that work together to enhance the efficiency and quality of clinical and translational research nationwide[5]
  • Bioinformatics and Computational Biology - supports the National Centers for Biomedical Computing (NCBCs), charged with paving an "information superhighway" dedicated to advancing medical research[4]

Applications of CRI

Interventional Research

Traditionally, one main focus of CRI has been supporting clinical trials that evaluate an intervention or treatment through randomized controlled trials (RCTs)[6]. With the recent widespread adoption of EHR systems, CRI may be able to better support other approaches to interventional research, such as pragmatic clinical trials (PCTs)[6]. In contrast to RCTs, pragmatic clinical trials aim to evaluate the effectiveness of new treatments and interventions in real-world conditions[7]. Research cohorts within PCTs are determined using patient features and/or clinical features identified through the EHR, which may offer a more accurate representation of the true patient population[7]. Despite the need to track the efficacy of new treatments after adoption into practice, there are still many challenges with conducting PCTs. Informatics hurdles include data integration across multiple databases, identification of appropriate population cohorts, and standardization of disease severity and progression[6].

Observational Research

In addition to interventional research, clinical research informatics can provide the infrastructure needed to support observational research efforts. Observational research objectives center on evaluating treatment and/or patient outcomes that occur as a result of routine healthcare delivery[6]. This non-experimental approach can be designed around cohort or cross-sectional grouping and can evaluate outcomes prospectively or retrospectively[6]. Essential research activities such as cohort identification, quality measurement, and treatment outcome assessment all rely on querying and extracting data from the EHR. The growing adoption of common data models, such as i2b2 and OMOP, has allowed for better data standardization across institutions and has created more opportunities for research based on "real world data" (RWD)[8]. Real world data often refers to data derived from real-world settings, such as healthcare delivery or health-related applications on mobile devices[8]. In addition, the use of large research networks such as the National Patient-Centered Clinical Research Network (PCORnet) has allowed more large-scale observational studies to be conducted across different research institutions. One notable research initiative extracted data from PCORnet to understand the relationship between antibiotic administration and growth patterns in children[9].
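The standardization idea behind a common data model can be sketched in a few lines: each site translates its local codes into a shared vocabulary, after which a single query runs over the pooled data. The example below is a minimal sketch, not the i2b2 or OMOP schema; the site names, local codes, and concept names are all hypothetical.

```python
from collections import Counter

# Hypothetical local-code-to-shared-concept maps, one per site.
SITE_MAPS = {
    "site_a": {"A-HF": "heart_failure", "A-T2D": "type_2_diabetes"},
    "site_b": {"B-0042": "heart_failure"},
}


def harmonize(site, records, code_map):
    """Translate (patient_id, local_code) pairs into shared concepts.

    Returns the harmonized rows and the list of codes with no mapping,
    so that unmapped data is reported rather than silently dropped.
    """
    harmonized, unmapped = [], []
    for patient_id, local_code in records:
        concept = code_map.get(local_code)
        if concept is None:
            unmapped.append((site, local_code))
        else:
            harmonized.append((site, patient_id, concept))
    return harmonized, unmapped


def patients_per_concept(rows):
    """Count distinct (site, patient) pairs for each shared concept."""
    return Counter(concept for _, _, concept in set(rows))


# Made-up source records; site_a contains a duplicate diagnosis row.
site_a_records = [(1, "A-HF"), (2, "A-T2D"), (1, "A-HF")]
site_b_records = [(1, "B-0042"), (7, "B-9999")]

rows_a, unmapped_a = harmonize("site_a", site_a_records, SITE_MAPS["site_a"])
rows_b, unmapped_b = harmonize("site_b", site_b_records, SITE_MAPS["site_b"])

counts = patients_per_concept(rows_a + rows_b)
print(counts["heart_failure"], counts["type_2_diabetes"], unmapped_b)
```

Patient identifiers are kept together with the site name because the same local identifier at two institutions does not necessarily refer to the same person.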

Cohort Discovery

Clinical trials routinely struggle to meet their patient recruitment targets, which can result in early trial termination[10]. Many clinical research informatics efforts have centered on improving patient identification in order to improve overall clinical trial participant recruitment. Improvements in phenotyping EHR data could be a promising approach to improving participant recruitment in clinical trials[8]. One recommendation published in 2018 by the Clinical Trials Transformation Initiative called for use of “electronic health record queries, ICD-9 and ICD-10 de-identified records and geo-targeting disease data” as an avenue to improve cohort discovery and ultimately increase participant recruitment[10].
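A cohort query of the kind described above can be sketched with an in-memory SQLite database. This is a minimal illustration under stated assumptions: the `patient` and `diagnosis` tables, their columns, and the sample rows are hypothetical and do not reflect any particular EHR schema. The only real-world detail is that ICD-10 codes beginning with I50 denote heart failure.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database with made-up, de-identified rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE patient (patient_id INTEGER PRIMARY KEY, birth_year INTEGER);
        CREATE TABLE diagnosis (patient_id INTEGER, icd10_code TEXT);
    """)
    conn.executemany(
        "INSERT INTO patient VALUES (?, ?)",
        [(1, 1950), (2, 1985), (3, 2015), (4, 1962)],
    )
    conn.executemany(
        "INSERT INTO diagnosis VALUES (?, ?)",
        [(1, "I50.9"), (1, "E11.9"), (2, "J45.909"),
         (3, "I50.22"), (4, "I50.1"), (4, "I50.9")],
    )
    return conn


def find_cohort(conn, code_prefix, min_age, as_of_year):
    """Return sorted IDs of patients with a matching diagnosis and minimum age."""
    rows = conn.execute(
        """
        SELECT DISTINCT p.patient_id
        FROM patient p
        JOIN diagnosis d ON d.patient_id = p.patient_id
        WHERE d.icd10_code LIKE ?
          AND (? - p.birth_year) >= ?
        ORDER BY p.patient_id
        """,
        (code_prefix + "%", as_of_year, min_age),
    ).fetchall()
    return [row[0] for row in rows]


conn = build_demo_db()
# Adults (18 or older in 2020) with any heart failure code.
print(find_cohort(conn, "I50", min_age=18, as_of_year=2020))
```

Real eligibility criteria are usually far richer (laboratory values, medications, exclusion diagnoses), so a prefix match on diagnosis codes should be read as a first-pass screen rather than a complete phenotype definition.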

Clinical Data Mining

In recent years, machine learning has become a prominent approach to mining clinical data for research purposes. In one notable paper, Rajkomar et al. applied deep learning techniques to EHR data to address basic research questions involving clinical outcomes, risk of death, quality of care, and risk of readmission[11]. Although these authors demonstrated significant progress in applying machine learning to retrospective research, significant challenges remain in using this type of AI approach to build predictive models for prospective research studies[8].

Additional efforts

Other major initiatives, programs, and activities related to CRI include:

  • AMIA Clinical Research Working Group - fosters interaction, discussion, and collaboration among individuals and groups involved or interested in the practice and study of CRI
  • Biomedical Research Integrated Domain Group (BRIDG)
  • Cancer Biomedical Informatics Grid (caBIG) - an initiative of the National Cancer Institute (NCI)
  • Clinical Data Interchange Standards Consortium (CDISC)

Related Articles

  • An electronic medical records system for clinical research and the EMR–EDC interface
  • Electronic medical records for clinical research: application to the identification of heart failure

Related concepts

  • Translational bioinformatics
  • Translational research informatics

References

  1. American Medical Informatics Association (AMIA). Informatics areas: clinical research informatics [Online]. 2012 [cited 2012 Nov 25]. Available from: http://www.amia.org/applications-informatics/clinical-research-informatics
  2. Embi PJ, Payne PR. Clinical research informatics: challenges, opportunities and definition for an emerging domain. J Am Med Inform Assoc. 2009;16:323, 325.
  3. Kahn MG, Weng C. Clinical research informatics: a conceptual perspective. J Am Med Inform Assoc. 2012 Apr [cited 2012 Nov 25];19(e1):e36-42. Available from: http://jamia.bmj.com.liboff.ohsu.edu/content/19/e1/e36.full.pdf+html
  4. National Institutes of Health (NIH). Common Fund makes new FY2010 awards for National Centers for Biomedical Computing [Online]. [cited 2012 Nov 25]. Available from: https://commonfund.nih.gov/bioinformatics/overview.aspx
  5. NIH National Center for Advancing Translational Sciences (NCATS). Clinical and Translational Science Awards [Online]. [cited 2012 Nov 25]. Available from: http://www.ncats.nih.gov/research/cts/ctsa/ctsa.html
  6. Richesson RL, Horvath MM, Rusincovitch SA. Clinical research informatics and electronic health record data. Yearb Med Inform. 2014;9(1):215-23.
  7. Kalkman S, van Thiel G, Grobbee DE, van Delden JJM. Pragmatic clinical trials: ethical imperatives and opportunities. Drug Discov Today. 2018;23(12):1919-21.
  8. Solomonides A. Review of Clinical Research Informatics. Yearb Med Inform. 2020;29(1):193-202.
  9. Block JP, Bailey LC, Gillman MW, Lunsford D, Boone-Heinonen J, Cleveland LP, et al. PCORnet Antibiotics and Childhood Growth Study: Process for Cohort Creation and Cohort Description. Acad Pediatr. 2018;18(5):569-76.
  10. Huang GD, Bull J, Johnston McKee K, Mahon E, Harper B, Roberts JN. Clinical trials recruitment planning: A proposed framework from the Clinical Trials Transformation Initiative. Contemp Clin Trials. 2018;66:74-9.
  11. Rajkomar A, Oren E, Chen K, Dai AM, Hajaj N, Hardt M, et al. Scalable and accurate deep learning with electronic health records. NPJ Digit Med. 2018;1:18.

External resources

  • AMIA CRI Year-in-Review - contains information about the Clinical Research Informatics (CRI) Year-In-Review sessions that Peter Embi has conducted at the conclusion of the annual AMIA Summits on Translational Science since 2011
  • Center for Clinical and Translational Sciences
  • Clinical Research Informatics - Rachel L. Richesson, James E. Andrews, editors
  • eMERGE Network (electronic medical records & genomics) - national consortium organized by the National Human Genome Research Institute (NHGRI) to develop, disseminate, and apply approaches to research
  • Institute of Translational Health Sciences
  • June 2012, Volume 19, Issue e1
  • December 2011, Volume 18, Suppl 1
  • NIH National Center for Advancing Translational Sciences (NCATS)
  • The NIH Roadmap - Science, Volume 302, Oct 3 2003, 63-72
  • CRIwiki - a wiki dedicated to topics in CRI
  • Science Translational Medicine - journal promoting "communication and cross-fertilization among basic, translational, and clinical research practitioners and trainees from all relevant established and emerging disciplines"
  • IMIA Yearbook - international journal that features a section on CRI every year

Submitted by Deb Woodcock

Applications of CRI submitted by Marisa Capachietti Lopez

  • BMI512-FALL-12
  • BMI512-FALL-20

  • This page was last modified on 2 November 2020, at 17:21.
  • This page has been accessed 56,723 times.
  • Content is available under GNU Free Documentation License 1.2 unless otherwise noted.

NLM Musings from the Mezzanine

Innovations in Health Information from the National Library of Medicine

Appreciating the Distinction: Clinical Informatics Research vs. Clinical Research Informatics

Guest post by Allison Dennis, PhD, Program Officer for the Division of Extramural Programs, National Library of Medicine.

The National Library of Medicine Division of Extramural Programs (EP) oversees NLM’s extramural research investments. One area that NLM invests in is using informatics methodologies and tools to understand and improve the way health care is delivered and health overall. In the ever-evolving health care landscape, the intersection of technology and research plays a pivotal role in advancing patient care and outcomes.

Two closely related yet different domains within this intersection are Clinical Informatics Research and Clinical Research Informatics. While their names may sound similar, these fields encompass different foci and methodologies. NLM funds research in both of these areas as they contribute to our mission in distinct ways.

Clinical Informatics Research

Clinical Informatics Research involves the study of information management and technology applications within health care settings. This research focuses on optimizing the use of information to improve patient care, streamline health care processes, and enhance overall system efficiency. Researchers in this field delve into topics such as electronic health records (EHRs), health information exchange, data interoperability, and the design and implementation of clinical decision support (CDS) systems. NLM’s interest in Clinical Informatics Research contrasts with that of other NIH institutes and centers because it seeks to unleash the broad potential of data and informatics to improve health care in general.

NLM supports many grants in the field of Clinical Informatics Research. These include NLM-supported research that is:

  • Establishing an informatics framework to improve and automate the referral process between primary care providers and specialists
  • Advancing automated documentation algorithms that interpret the dialogue between patients and clinicians, generate relevant encounter summaries, and develop new and improved ways to capture what happened during a visit
  • Developing an adaptive CDS system that can adapt to variations in clinician fatigue levels in emergency departments

These studies have considerable potential to make health care more efficient and equitable. Their innovative data science and informatics approaches may improve health care transitions for patients, alleviate clinicians’ documentation burden, offer new ways to study implicit bias in clinical encounters, provide foundational knowledge for creating health information technology that adapts to clinicians and patients, and inform guidelines for the design and implementation of more responsive systems. Through investments in Clinical Informatics Research projects like these and the new and exciting ways they can improve health care delivery, NLM is excited to continue advancing its strategic priorities for enabling a future of data-powered health.

Clinical Research Informatics

Clinical Research Informatics is centered around leveraging informatics methodologies to enhance the research processes by introducing new paradigms for discovery and knowledge management. This field aims to innovate how data are harnessed to characterize, predict, prevent, diagnose, and treat disease more efficiently and accurately. As such, many NIH institutes and centers invest in Clinical Research Informatics research as it relates to particular health topics. However, NLM is uniquely interested in projects that provide broad and generalizable insights applicable to and relevant across disease domains.

Some of the many NLM-supported Clinical Research Informatics projects are:

  • Tailoring innovative information retrieval methods to handle complex EHR data for cohort discovery
  • Improving the generalizability of clinical trial findings to real-world populations through the development of new causal and statistical methods that address biases in cluster trials
  • Creating risk prediction models that can learn from medical codes commonly found in the EHR and reduce the need for annotation from experts

These studies have the potential to enhance the data-driven capabilities of health-related research. They are developing domain-independent and reusable methods for leveraging data and models to design stronger clinical trials, better understanding and applying the knowledge generated from clinical trials in the real world, and using EHRs for precision medicine research. Through investments in Clinical Research Informatics, including projects like these, NLM continues to advance its strategic priorities for enabling a future of biomedical discovery.

Appreciating the Distinction, Funding Both

Both Clinical Informatics Research and Clinical Research Informatics play important roles in advancing health care delivery, improving patient outcomes, and driving innovation in health care. On one hand, Clinical Informatics Research focuses on optimizing the use of information technology in clinical settings to enhance workflow efficiency, patient safety, and communication among health care professionals. On the other, Clinical Research Informatics enables the efficient collection, analysis, and interpretation of vast amounts of data, facilitating evidence-based decision-making, and the development of new treatments and interventions. Together, these interdisciplinary fields contribute to the continuous evolution of health care practices, ultimately leading to better patient care and health outcomes. NLM is committed to supporting both Clinical Informatics Research and Clinical Research Informatics studies that elucidate and address the complex challenges facing modern health care systems and ensure the delivery of high-quality, patient-centered, and data-informed care.

Are You a Researcher with Innovative Ideas?

We encourage researchers interested in advancing data-driven capabilities and developing novel approaches to Clinical Informatics Research and Clinical Research Informatics to consider applying for NLM Research Grants in Biomedical Informatics and Data Science. Please reach out to an NLM Program Officer. We are always happy to discuss the scope of a potential project and appreciate the opportunity to review draft specific aims.

Now is the perfect time to become part of the NLM-supported community that is creating the cutting-edge technologies needed to improve patient care and enhance health care delivery while advancing our ability to study and understand human health.

Allison Dennis, PhD

Dr. Dennis serves as the scientific contact for Bioinformatics, Translational Informatics, Personal Health Informatics, and the SBIR/STTR program in the NLM Extramural Research Program. Prior to joining NLM, Dr. Dennis was a Technical Lead in the NIH Office of Data Science Strategy, where she oversaw initiatives in artificial intelligence, and a Health Informatics Officer with the Office of the National Coordinator for Health IT, where she advanced health IT standards for scientific discovery. Dr. Dennis holds a PhD in Biology from Johns Hopkins University. She has nearly a decade of experience conducting data-driven biomedical research in the NIH Intramural Research Program.

Clinical Research Informatics

The Clinical Research Informatics Working Group's mission is to advance the discipline of Clinical Research Informatics (CRI) by fostering interaction, discussion and collaboration among individuals and groups involved or interested in the practice and study of CRI, and to serve as the home for CRI professionals within AMIA.

Clinical Research Informatics (CRI), a subdomain of biomedical informatics, encompasses the technology, processes, principles, and practices that support the breadth of activities involved in executing clinical research with human subjects and their data.

CRI includes:

  • Selection, implementation, development, and maintenance of a technology ecosystem to support clinical research activities and associated regulatory needs
  • Optimization of electronic health record (EHR) systems and data to support research administration, participant recruitment and consenting, data capture, intervention implementation and other research activities supporting clinical research execution
  • Management and workflow of data in research data repositories, data registries, data marts, data warehouses, and electronic data capture systems, along with simplifying the use of these standardized repositories through reporting and analytics
  • Implementation science methods and translation of research into evidence-based practice
  • Standardization of tools, techniques, and processes to support reproducibility of clinical research results and outcomes
  • Providing informatics tools to address the ethical, legal, and social issues that affect clinical research

The goals of the CRI Working Group are to:

  • Increase awareness and interaction of CRI domain with the various subdomains that it encompasses
  • Provide a forum for discussion, collegial development, exchange, and information dissemination
  • Identify and help resolve common issues in the CRI domain, forming ad hoc groups for discussion as needed
  • Provide guidance and engage the community in discussion of the regulatory aspects of CRI, collect feedback, and report the resulting findings, understanding, and consensus
  • Provide education to AMIA members on the various aspects of CRI and its overlap with various domains/sub domains
  • Provide support for members in their institutional settings to advance clinical research informatics
  • Educate and inform the lay public regarding clinical research informatics to support enhanced people-centered research
  • Provide a platform and facilitate discussions for CRI leaders, e.g., the CRIO Discussion Forum

  • Yasir Tarabichi, MD, MSCR
  • Thomas Kingsley, MD MPH MS
  • Ann Wieben, PhD, BSN RN NI-BC
  • Deepak Gupta, MD, MS
  • Dorothy Bouldrick, Doctor of Health Administration
  • Humayera Islam, MS
  • James Blum, MD
  • James McClay, MD
  • Lillie Dash, M.S. Health Informatics
  • Satya Sahoo, PhD, FAMIA
  • Yun Jiang, PhD, MS, RN, FAMIA
  • Kavishwar Wagholikar, MD, PhD

  • Performing: Working Group has high level of engagement and output (workshops, papers, webinars)
  • Networking: Working Group has internal and external networking opportunities for members (mentorship programs, social events, collaboration)
  • Developing: New Working Group or revitalizing efforts to grow membership (recruitment efforts, leadership)

Featured Publication

Clinical Research Informatics - Third Edition

What is Clinical Informatics?

  • Published February 16, 2017
  • Updated June 28, 2023

Clinical informatics is the study of information technology and how it can be applied to the healthcare field. It includes the study and practice of an information-based approach to healthcare delivery in which data must be structured in a certain way to be effectively retrieved and used in a report or evaluation. Clinical informatics can be applied in a range of healthcare settings including hospitals, physician’s practices, the military, and other locations.

Clinical Informaticist Job Duties

Providers in today’s healthcare industry increasingly rely on data and technology to provide treatments for patients. Physicians, nurses, dentists, pharmacists, rehab therapists, assistants and a host of others collect and share data to formulate and implement a treatment plan for a patient. Along the way, they use the latest in technological equipment, computers, software, tablets, smartphones and even apps to gather and distribute information. All of this information must be collected, stored, interpreted, analyzed and implemented into a treatment plan.

A clinical informaticist may serve in a multitude of roles, depending on the size of the healthcare setting. Typically, these professionals evaluate the existing information systems and recommend improvements to functionality. Clinical informaticists may study a data entry or visual image storage system or interact with those who need access to records. They may train staff on system use, build interfaces, troubleshoot software and hardware issues, and work across multiple departments to integrate the sharing of information. They document and report their findings and work to implement improvements. The ultimate goal is to manage the costs while improving patient outcomes.

Clinical Informatics Jobs Outlook

Clinical informatics has been widespread in the healthcare industry since the 1970s. In 2009, a stimulus bill passed by Congress included a mandate that medical providers convert paper records to electronic data by 2014 to continue receiving Medicaid and Medicare payments.

The American Medical Informatics Association (AMIA) achieved one of its goals in 2011 when the American Board of Medical Specialties recognized clinical informatics as a subspecialty. The first board certifications were awarded late in 2013.

“This field is exploding,” Charles Friedman, director of the health informatics program at the University of Michigan-Ann Arbor, told U.S. News and World Report in 2014. “Access to health information on the Web is taking off at a meteoric pace. It’s creating enormous employment opportunities.”

Clinical Informatics Job Descriptions

In the growing field of clinical informatics, specialized roles are taking shape that focus on specific areas of healthcare, including positions in medical informatics, nursing informatics, pharmacy informatics, and nutrition informatics.

Medical Informatics

The U.S. National Library of Medicine defines medical informatics as the study of the design, development and adaptation of IT-based innovations in healthcare services delivery, management and planning. For example, when a patient goes for tests, medical informaticists ensure those results are quickly and securely accessible to doctors as part of the patient’s electronic health record (EHR). This technology can be applied to payment systems and transactions through government agencies and insurance companies.

Nursing Informatics

Doctors and patients discussing treatment options rely on data. The nursing informatics role, as defined by the American Nurses Association, serves to integrate data, information and knowledge to support the decision-making process of patients and their providers. A nurse in this position knows how to store and access medical information and how to keep the facility’s IT systems up to date.

Pharmacy Informatics

When it comes to prescribing and administering medications, electronic communication is rapidly replacing the pen-and-prescription pad ways of the past. In this emerging field, a pharmacy informaticist uses both medical and computer knowledge to improve the efficiency and accuracy of the medication process for the patient and the providers.

Nutritional Informatics

Food- and nutrition-related information may be an important part of a patient’s treatment plan. Nutrition informaticists assist in the storage, organization and retrieval of data that will help dieticians, doctors and patients make informed choices in this continually evolving area. A person in this role might be working with software that would create a checklist of considerations based on a diagnosis, medications, allergies and other values.

Clinical Informatics Salary Information

The impact of the federal mandate regarding EHRs is still being felt as many medical providers and facilities respond. The demand for applicants with both medical and technological knowledge is growing. Job growth in clinical informatics could reach about 21% through 2020, according to the U.S. Bureau of Labor Statistics.

The American Health Information Management Association (AHIMA) estimates mid-range salaries for those in clinical informatics roles to reach more than $85,000 a year. Managers in this profession may earn as much as $200,000. However, it is important to note that geographical location, position requirements, the employer and other factors may affect those numbers, so research is necessary.

Educational Requirements for Clinical Informatics Jobs

This field of study is most often pursued by healthcare professionals with a passion for technology. For example, nurses often transition into informatics through graduate-level informatics programs, which teach students how to build medical applications (EHRs), how to abide by patient privacy laws, and how to better understand healthcare policy and economics.

Clinical informaticists typically start with at least a bachelor’s degree and often earn a graduate degree that includes technical instruction combined with courses in medical practice and hands-on experience working with electronic patient records at healthcare facilities.

Many employers look for applicants with a Master’s degree in Health Informatics, Healthcare Management, or Quality Management.

Many colleges offer online video-based e-learning and interactive virtual classrooms, where students can gain the credentials to pursue a career in clinical informatics on their terms.

Biomedical Informatics & Data Science

Biomedical Informatics & Data Science engages new and existing faculty, students, and staff from all of Yale to promote equitable and sustainable health with informatics and data science.

  • We do top-notch research, linking across disciplines, and delivering in practical settings
  • We promote excellence in transdisciplinary training
  • We partner with organizations at Yale and beyond to create new initiatives, to write various grant proposals, to fund raise, and to impact global health
  • We make Precision Medicine a reality for all

About Biomedical Informatics & Data Science

This new academic unit at the intersection of health sciences and information technology develops new approaches to organize and analyze biomedical and healthcare data to promote health for all.

Jobs in Biomedical Informatics & Data Science

We have open faculty, staff, and postdoctoral positions. We are especially focused on using informatics to promote equity, diversity, inclusiveness and justice. If this sounds like you, please consider joining us.

Current Research Areas

What are we working on? Privacy-protecting Data Sharing, Distributed Analytics and AI Model Evaluation, Biomedical Data Index, and more.

Latest News

  • Yale study finds association between eczema and eating disorders
  • AI-based biomarker for aortic stenosis found by Yale researchers
  • Yale faculty present groundbreaking clinical research at the 2024 American College of Cardiology Scientific Sessions

If you are interested in collaborating or joining us, please contact us via email.

Department of Clinical Research Informatics

Clinical information systems.

The Department of Clinical Research Informatics (DCRI) provides technical, interface, and project management support for numerous clinical information systems and departmental applications throughout the NIH campus.

Major Clinical Information Systems

Department of Perioperative Medicine

  • Perioperative Services Information System (POIS/SIS)

Department of Laboratory Medicine (DLM)

Laboratory Information System (SCC SoftLab)

Lab Requests

Diagnostic Radiology Department (DRD)

Radiology Information System (RIS)

Picture Archiving and Communications System (PACS)

Department of Transfusion Medicine (DTM)

Blood Bank Control System

Blood Bank System (SoftBank)

Materials Management Systems

Material Safety Data Sheets (MSDS)

Pyxis Supply Station System

Visual Supply Catalog

Medical Records Systems

3M Medical Record Processing System

Electronic Signature Application (3M ESA)

Transcription Application (3M ChartScript)

Nursing and Patient Care Services

Acuity Plus

Nutrition Department

Nutrition System (CBORD)

Pharmacy Department Systems

CRIS Pharmacy (SMM)

Drug Formulary

Micro Medix

Custom Software Applications

Clinical Trials (Search the Studies)

Executive Information System (EIS)

Hospital Services

Master Test Guide

NIH Applications

This page last updated on 01/26/2023

Yearb Med Inform

Clinical Research Informatics for Big Data and Precision Medicine

1 Department of Biomedical Informatics, Columbia University, New York, NY 10032 USA

2 Department of Pediatrics, University of Colorado, Denver, CO 80045 USA

To reflect on the notable events and significant developments in Clinical Research Informatics (CRI) in 2015 and to discuss near-term trends impacting CRI.

We selected key publications that highlight not only important recent advances in CRI but also notable events likely to have significant impact on CRI activities over the next few years or longer, and consulted the discussions in relevant scientific communities and an online living textbook for modern clinical trials. We also related the new concepts with old problems to improve the continuity of CRI research.

The highlights in CRI in 2015 include the growing adoption of electronic health records (EHRs) and the rapid development of regional, national, and global clinical data research networks that use EHR data to integrate scalable clinical research with clinical care and generate robust medical evidence. Data quality, integration, and fusion, data access by researchers, study transparency, results reproducibility, and infrastructure sustainability are persistent challenges.

The advances in Big Data Analytics and Internet technologies together with the engagement of citizens in sciences are shaping the global clinical research enterprise, which is getting more open and increasingly stakeholder-centered, where stakeholders include patients, clinicians, researchers, and sponsors.

Introduction

Clinical Research Informatics (CRI), a recently defined subfield of biomedical informatics that focuses on informatics support for medical evidence generation [ 1 ], has continued to enlarge its scope and importance in supporting the broadening agendas in clinical and translational sciences [ 2 ]. Over the past decade, biomedical research has moved at an explosive pace into the era of massive-scale digitization of data and computationally intensive quantitative analytics spanning molecular, clinical, and population-level data, measuring events on time scales from picoseconds to decades. Novel digital devices, from high-throughput next-generation deep sequencing machines to continuous real-time bio-sensing tattoos [ 3 ], continue to challenge the CRI community to develop new infrastructure capacities in addition to data and knowledge discovery tools that can handle petabyte-size data stores. The “Information Commons” highlighted in the IOM report on Precision Medicine is envisioned to integrate vast amounts of data with constantly evolving biomedical knowledge [ 4 ]. The informatics underpinnings that enable and accelerate this transition to large-scale integrated data and knowledge systems have to respond with innovations across the CRI spectrum. At the same time, research and discovery at the scale that is technically possible presents new challenges, not only to CRI but also to data sharing and privacy policies, and to the regulatory bodies that must respond to this rapidly changing data-driven agenda [ 5 ].

In 2012, we presented a conceptual model intended to capture the CRI landscape of activities and challenges to contextualize eighteen new publications in a special supplement of the Journal of the American Medical Informatics Association focused exclusively on CRI research results and innovations [ 6 ]. The central thesis of our model was CRI’s unique role in enabling “informatics-enabled clinical research workflow” and the methods and tools needed to support early-stage translational discoveries and later-stage evidence generation and synthesis, personalized evidence application, and population surveillance. We ended that publication with the following predictions:

“We expect the CRI research agenda will continue to evolve to become more precise, predictive, preemptive, and participatory, in parallel with the development of P4 medicine. We anticipate more patient-centered research decision support and innovative consent programs to strengthen patient participation, including specifying how an individual’s research data will be used and by whom. We also expect more CRI research that is informed by and responsive to patient or population needs.”

We revisited our 2012 conceptual model and examined current advances in CRI against that model, adding new elements where needed and modifying those that have evolved. We evaluated our predictions from four years ago and updated them to reflect both the anticipated and unanticipated shifts in the translational research landscape and their impact on CRI in 2016 and into the immediate future.

We did not conduct an exhaustive formal literature review. Instead, we selected notable publications and events based on our personal weighing of their importance and by referring to the public expert opinions on the Internet, such as Dr. Peter Embi’s “CRI Year in Review” ( http://www.embi.net/cri.html ) and “Rethinking Clinical Trials” provided by Duke University ( http://sites.duke.edu/rethinkingclinicaltrials/ ), and the discussions within the AMIA CTSA community. We summarized our understanding of the state of the art and the recent trends in CRI and described them below.

Figure 1 illustrates our updated conceptual model of the state-of-the-art CRI methods and issues. Compared with the model that we presented in 2012 [ 6 ], changes have occurred in the overall workflow, the underlying data sources, and the CRI foundational components. New workflow components include the addition of Big Data Sciences as a source of new research questions and the expansion of evidence generation and synthesis to include evidence appraisal. Evidence appraisal involves critical and systematic review of medical evidence to judge its trustworthiness, value, and applicability in a particular context. For example, it uncovers potential biases in clinical research participant selection and examines factors such as internal validity, generalizability, and relevance [ 7-10 ]. It is particularly relevant given the rapid growth of new high-throughput data analytics and hypothesis generation methods that give rise to more controversial findings than ever [ 11-13 ]. New data sources are acknowledged in our conceptual framework with the addition of wearable devices, patient-reported outcomes, social media, and environmental sensors. We anticipate that this list of electronic data sources will continue to grow as portable, wearable, and always-connected devices become more widely used. The largest changes are reflected in the core informatics foundational components, which now include Big Data Analytics, Data Fusion, Workflow Support, and Phenotyping using electronic patient data. In addition, record linkage has been generalized to data linkage, information extraction now includes natural language processing and text mining, and knowledge management has been expanded to knowledge engineering. All these additions are preparing the CRI community to better advance the Precision Medicine [ 4 ] and Learning Health System [ 14 , 15 ] agendas.

Figure 1. Our conceptual framework of the field of clinical research informatics, updated and expanded from [ 6 ].

1 The Arrival of Big Data

The National Academy of Medicine (the former Institute of Medicine) predicted back in 2003 that the wide adoption of electronic health record (EHR) systems would eventually enable the collection and aggregation of large amounts of electronic patient data to facilitate clinical decision support and accelerate evidence generation [ 16 ]. With the continued widening adoption of EHR systems globally, this prediction has been partially realized. According to the latest statistics, 75% of the hospitals in the United States have adopted at least one basic EHR system and the adoption rate is still steadily rising [ 17 ]. Less successful has been the broad adoption of real-time clinical decision support and rapid-cycle evidence generation as envisioned by Embi [ 18 ].

The Big Data acquired by EHRs enables us to pursue a long-sought vision, a rapid learning healthcare system that integrates clinical research and clinical care, where clinical data are a basic staple of health learning [ 14 , 15 , 19 , 20 ], and enables large-scale observational studies and large pragmatic trials for rapid evidence generation and validation using EHR data [ 21 , 22 ]. Citizen engagement is central to the success of this learning health system [ 23 ]. Regional or national learning health systems, such as PaTH and PEDSnet, have been developed based on enterprise data warehouses or clinical data research networks [ 24-28 ]. These working examples help researchers continue to make the functionalities of learning health systems more specific and concrete [ 29-31 ].

Clearly, Big Data has been recognized as the foundation of a learning health system and a catalyst for optimizing clinical research design [ 32 ]. In order to harness the Big Clinical Data increasingly made available by EHRs, numerous clinical data research networks, loosely coupled or tightly coupled, have been developed across the world. In the United States, the National Center for Advancing Translational Sciences (NCATS) has established the Accrual to Clinical Trials (CTSA ACT) network (https://ncats.nih.gov/pubs/features/ctsa-act). The Patient Centered Outcomes Research Institute (PCORI) has launched PCORnet [ 33 ], which includes thirteen clinical data research networks and nineteen patient-powered research networks that cover most of the states in the United States (USA) and conduct both randomized trials and observational comparative effectiveness studies using EHR data. These networks expand existing large-scale data sharing networks such as the CDC-sponsored Vaccine Safety Datalink [ 34 ], the Health Care Systems Research Network (formerly called the HMORN) [ 35 ], the FDA-sponsored Mini-Sentinel drug surveillance network [ 36 ], and AHRQ-sponsored large-scale platforms that support multi-institutional comparative effectiveness research [ 37 , 38 ]. In September 2015, the Electronic Medical Records and Genomics (eMERGE) network sponsored by the National Human Genome Research Institute [ 39 ] also embarked on its third phase of research, with a particular focus on returning actionable pathogenic genetic variants to patients and families via genomic decision support in clinical care settings using EHRs or personal health records across 9 participating sites in the USA.

In Europe, the large-scale EHR4CR project has entered its 5th year as the flagship project for using EHR data to accelerate clinical trials [ 40 ]. A recent cost-effectiveness study of using EHR data for clinical trials, based on the EHR4CR project, has suggested that optimizing clinical trial design and execution with the EHR4CR platform would generate substantial added value for the pharmaceutical industry, as the main sponsor of clinical trials in Europe, and beyond [ 41 ]. Similarly, the EU-ADR project has established a multi-national drug safety surveillance system [ 42 ].

Internationally, a global collaborative network called Observational Health Data Sciences and Informatics (OHDSI) has enabled large-scale evidence aggregation using the electronic data of more than 680 million patients [ 43 ] and helped shape the emerging networked science for biomedical research based on interoperable data [ 44 ].

Although the proliferation and success of such networks are exciting, we should not forget the lessons learned from large-scale research networks that received heavy investment but were later abandoned or terminated because of impractical project goals and study designs, ineffective management, and failed oversight, such as the National Children’s Study network [ 45 , 46 ] and the cancer Biomedical Informatics Grid (caBIG) program [ 47 ] in the USA. The sustainability of large-scale data infrastructures remains a largely unresolved issue and a primary concern of the CRI community, which is heavily involved in the development and maintenance of such large and complex systems [ 48 , 49 ]. Unlike most tightly coupled networks, which operate on external funding support, the loosely coupled OHDSI network shows unusual promise of sustainability: only the very early experimental sites received funding, and nearly all of the existing data partners continue to be active in the OHDSI Collaborative without depending solely upon designated external funding. This illustrates the resilience of open community-based collaborations in contrast to the brittleness of top-down centralized collaborations. This phenomenon has been described by others who have achieved sustained adoption of widely deployed CRI tools [ 50 , 51 ].

The growing availability of Big Clinical Data promises to accelerate drug discovery [ 52 ]. It has also accelerated the knowledge engineering of reproducible and portable computerized clinical phenotypes [ 53 ], in the hope of using standards-based algorithms to achieve interoperability in genome-wide and phenome-wide association studies of determinants of disease risks and in clinical study cohort identification and recruitment. Big Data comes not only from EHRs but also from sensors, wearable devices, and consumer-generated content in social media [ 54 ] such as Facebook [ 55 ] and Twitter [ 56 ]. Weber et al. highlight that EHR data reflect only a small portion of the data relevant to understanding the full context of health and disease [ 57 ]. A recently published JAMA article pointed out that Twitter streams can be very effective for public health surveillance [ 56 ]. To support Big Data sciences, the National Institutes of Health also launched the Big Data to Knowledge (BD2K) Initiative [ 58 , 59 ] and funded a series of Centers of Excellence for using Big Data Analytics as well as a range of training and curriculum grants to train the next generation of data scientists [ 59 ].

2 Advances in Big Data Analytics

The era in which the integration, fusion, or linkage of data across the biological, clinical, patient, and environmental spectrums is increasingly common has arrived. Ma et al. linked public clinical trial summaries with a medical encyclopedia to identify questionable exclusion criteria [ 7 ]. Lorgelly et al. linked cancer data with commonwealth reimbursement data to infer which patient, disease, genomic, and treatment characteristics explain variation in health expenditure [ 60 ]. Data integration, also called data aggregation, is a process in which data of the same type from multiple sources, such as EHR data from multiple institutions, are integrated in a shared central data warehouse [ 61 ]. It is often used to increase sample size for clinical research. In contrast, data fusion emphasizes arriving at improved understanding using different but complementary data about the same object [ 62 , 63 ]. For example, Wu et al. developed a multi-omic data fusion approach to map the crosstalk between metabolic phenotype and microRNA data to understand the systemic consequences of Roux-en-Y gastric bypass surgery [ 64 ]. The major challenge for data integration is semantic interoperability, whereas the primary challenges facing data fusion include integrating data and knowledge representations from multiple different yet complementary perspectives and with different granularities and resolutions.
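The distinction between data integration and data fusion can be made concrete with a small sketch. The Python below is illustrative only: the record layout, the patient identifiers, and the `integrate` and `fuse` helpers are hypothetical names introduced here, not part of any system cited above. Integration pools same-type records from several sites; fusion attaches complementary views to the same patient.

```python
# Hypothetical EHR extracts from two institutions (same data type, same schema).
site_a_ehr = [
    {"patient_id": "A-001", "age": 54, "hba1c": 7.9},
    {"patient_id": "A-002", "age": 61, "hba1c": 6.4},
]
site_b_ehr = [
    {"patient_id": "B-001", "age": 47, "hba1c": 8.3},
]

# Hypothetical complementary data about the same patients.
wearable_steps = {"A-001": 3200, "B-001": 9100}
genotype = {"A-001": "risk allele", "A-002": "no risk allele"}


def integrate(*sources):
    """Data integration: pool same-type records into one shared store."""
    pooled = []
    for site_index, records in enumerate(sources):
        for record in records:
            # Keep provenance so site-level differences can be examined later.
            pooled.append({**record, "source": site_index})
    return pooled


def fuse(ehr_records, **views):
    """Data fusion: combine different views of the same object (the patient)."""
    fused = {}
    for record in ehr_records:
        pid = record["patient_id"]
        profile = dict(record)
        for view_name, view in views.items():
            # None marks a view that has no data for this patient.
            profile[view_name] = view.get(pid)
        fused[pid] = profile
    return fused


warehouse = integrate(site_a_ehr, site_b_ehr)
profiles = fuse(warehouse, daily_steps=wearable_steps, genotype=genotype)
```

In this toy setting integration only grows the number of rows, while fusion widens each patient's profile. Real integration would additionally require the semantic harmonization discussed in the text, which the sketch assumes away by giving both sites the same schema.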

One of the most striking developments in recent years has been the massive expansion in data storage capabilities, driven by cloud-based technologies developed to support the enormous data needs of search and e-commerce vendors, such as Google (now Alphabet) and Amazon, and rapidly adopted in biomedicine and health sciences [ 65-67 ]. With near-limitless storage, data previously difficult to access, such as geographic, climate, economic, and social media data sets, can now be linked to enable population-based analytics that have never before been possible. For life sciences research and CRI professionals, these new data storage and data retrieval architectures, coupled with on-demand, scalable computational resources, have enabled petabyte-size data sets to be stored and shared for worldwide access and analysis. For example, Amazon and Google provide access to The Cancer Genome Atlas, the International Cancer Genome Consortium, 1000 Genomes Project, and 3000 Rice Genomes data sets on their cloud services ( http://aws.amazon.com/public-data-sets/ ). Google has a similar library of large published data sets that can be accessed worldwide ( http://google-genomics.readthedocs.org/en/latest/use_cases/discover_public_data/genomic_data_toc.html ). In this respect, the public sharing of clinical data remains far behind the sharing of genomic data, due to widespread concerns about the growing ability to re-identify individuals [ 68 ]. An increasing body of literature suggests that re-identification risks also exist with genomic data [ 69 ].

With massive data sets that are far too large or too complex to analyze using traditional local computational methods, new approaches for performing analytics using distributed techniques that “bring analytics to the data” rather than “submit data to the analytics” are a new area of active CRI research [ 70-74 ]. These techniques have also been adapted to enable distributed analytics of sensitive clinical data without requiring data partners to release any patient-level data, offering a new approach for reducing concerns about patient re-identification. Systems that automate semantic harmonization and annotation for data and knowledge integration continue to evolve more slowly than anticipated [ 75 ]. Current methods remain difficult to use and rely mostly on human annotation. The promise of semantic web technologies has not materialized for general use, although some striking examples show the potential of these methods [ 76 , 77 ].
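As a minimal sketch of the “bring analytics to the data” idea, the Python below has each data partner compute only aggregate summaries locally, which a coordinator then combines into a pooled mean and standard deviation. The function names, the `SiteSummary` structure, and the blood pressure values are assumptions made for illustration; the distributed methods cited above are considerably more sophisticated.

```python
import math
from dataclasses import dataclass


@dataclass
class SiteSummary:
    """Aggregates that leave a site; no patient-level values are included."""
    n: int
    total: float
    total_sq: float


def summarize_locally(values):
    """Runs at the data partner; only counts and sums are returned."""
    return SiteSummary(
        n=len(values),
        total=sum(values),
        total_sq=sum(v * v for v in values),
    )


def pool(summaries):
    """Runs at the coordinator; combines site aggregates into pooled statistics."""
    n = sum(s.n for s in summaries)
    total = sum(s.total for s in summaries)
    total_sq = sum(s.total_sq for s in summaries)
    mean = total / n
    # Sample variance from sufficient statistics (requires n > 1).
    variance = (total_sq - n * mean ** 2) / (n - 1)
    return n, mean, math.sqrt(variance)


# Hypothetical systolic blood pressure values held at two separate sites.
site_1 = summarize_locally([120.0, 135.0, 128.0])
site_2 = summarize_locally([142.0, 118.0])
n, mean, sd = pool([site_1, site_2])
```

The pooled result is identical to what a central analysis of all five values would give, yet the coordinator never sees an individual measurement. Note that aggregates from very small sites can still leak information, so a practical deployment would need additional safeguards such as minimum cell sizes.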

3 Reproducibility, Generalizability, and Ethical Implications of Big Data

Research based on reuse of clinical data is frequently questioned for reproducibility [ 78 ]. Publishers and scientists have increasingly recognized the importance of sharing data for improving reproducibility [ 79 ]. The Scientific Data ( http://www.nature.com/sdata/ ) journal was launched this year in response to the rising need to help scientists permanently archive, share, and disseminate valuable research data. It is foreseeable that more journals will start accommodating data archiving needs for future publications. Related to archiving data to support the principles of Open Science and Reproducible Research is a newly funded BD2K effort, called bio-CADDIE (https://biocaddie.org), to develop a comprehensive set of descriptors of data sets to support the search and discovery of available sharable data resources.

Safran recently summarized the value of reuse of clinical data made available by EHRs, the potential problems with large aggregations of these data that do not necessarily have consistent meanings, the policy frameworks that have been formulated, and the major challenges in the coming years [ 80 ]. More recently, Hersh and colleagues expanded these concerns in the context of more recent large scale comparative effectiveness research networks [ 81 ].

Understanding of the ethical implications of Big Data lags behind [ 82 ], and the existing regulatory framework falls short of meeting the needs of the evolving data capabilities. Mittelstadt et al. identified five key areas of concern [ 82 ]: (1) informed consent, (2) privacy, (3) ownership, (4) epistemology and objectivity, and (5) ‘Big Data Divides’ created between those who have and those who lack the necessary resources to analyze increasingly large datasets. Data breaches remain a significant threat to organizations and individuals curating, using, and sharing these data. Protecting patient privacy and data confidentiality requires advanced network security safeguards. The conversation about privacy has shifted away from ensuring privacy to assessing risk instead: it is no longer possible to guarantee privacy, only to estimate and manage risk.
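One common way to estimate re-identification risk, offered here as an illustration rather than as the method used in the cited work, is to group records by their quasi-identifiers and treat the reciprocal of each group's size as that record's risk (the idea behind k-anonymity). The field names and records in this Python sketch are hypothetical.

```python
from collections import Counter


def reidentification_risk(records, quasi_identifiers):
    """Estimate risk from the sizes of groups sharing the same quasi-identifiers."""
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    class_sizes = Counter(keys)
    # A record that shares its key with k-1 others has risk 1/k.
    per_record = [1 / class_sizes[key] for key in keys]
    return {
        "k": min(class_sizes.values()),        # smallest group size
        "max_risk": max(per_record),           # worst-case record
        "mean_risk": sum(per_record) / len(per_record),
    }


# Hypothetical de-identified extract: three similar patients and one unique one.
extract = [
    {"birth_decade": 1970, "zip3": "100", "sex": "F"},
    {"birth_decade": 1970, "zip3": "100", "sex": "F"},
    {"birth_decade": 1970, "zip3": "100", "sex": "F"},
    {"birth_decade": 1980, "zip3": "100", "sex": "M"},
]
risk = reidentification_risk(extract, ["birth_decade", "zip3", "sex"])
```

Here the single patient with a unique combination drives the worst-case risk to 1, which is the kind of finding that would prompt further generalization or suppression before release. The estimate depends entirely on which fields are treated as quasi-identifiers, which is itself a judgment call.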

Six additional areas of concern were suggested to require much closer scrutiny in the immediate future: (6) the dangers of ignoring group-level ethical harms; (7) the importance of epistemology in assessing the ethics of Big Data; (8) the changing nature of fiduciary relationships that become increasingly data saturated; (9) the need to distinguish between ‘academic’ and ‘commercial’ Big Data practices in terms of potential harm to data subjects; (10) future problems with ownership of intellectual property generated from analysis of aggregated datasets; and (11) the difficulty of providing meaningful access rights to individual data subjects who lack necessary resources. On this last theme, data access by non-technical stakeholders such as clinical researchers, studies have found that data query mediation is a laborious and error-prone process that has not received adequate attention and can negatively affect the reliability of research based on these data [ 83-85 ]. As Mittelstadt et al. pointed out, “these themes provide a thorough critical framework to guide ethical assessment and governance of emerging Big Data practices.” New studies have shed light on borrowing ideas from library and information sciences or dialogue system research to improve query mediation for biomedical Big Data [ 86 , 87 ].

Meanwhile, continued progress on data interoperability and clinical research regulations has reached a new milestone this year. Richesson and Chute published a special issue for JAMIA on data interoperability standards and concluded that “data standards are finally down to business for enabling emerging interoperability” [ 88 ]. The Notice of Proposed Rulemaking (NPRM) for regulating clinical research was released this September to enhance protection of research participants while streamlining IRB review efficiency [ 89 ].

The above progress, together with technology readiness, prepares the CRI community well to engage and lead in the newly launched initiative for advancing Precision Medicine in the USA, which has an urgent need for Big Data and large representative population samples. Genetic variants discovered from small and unrepresentative population samples may mislead the public ( http://www.theatlantic.com/science/archive/2015/09/genome-big-data-disease-genes/404356/ ). In order to help verify genetic discoveries, libraries of genetic variants have been developed. ClinVar was created to address the need for transparency in genetic evidence [ 90 , 91 ]. The Food and Drug Administration has also launched OpenFDA (https://open.fda.gov/) and PrecisionFDA (https://precision.fda.gov) to help improve transparency and collaboration in safety data around drugs and devices.

The newly released strategic plan of the National Institutes of Health of the United States [ 92 ] also highlights the imperative for developing the “science of science” and “evaluating steps to enhance rigor and reproducibility”. It is foreseeable that even future funding decisions will be based on data-driven evidence of research quality and impact.

4 Data Quality Challenges

Unlike “traditional” prospective clinical trials that utilize detailed data collection tools and procedures and rely on trained data collection personnel, EHR and PHR databases contain data collected during routine clinical care by practitioners focused on patient care, or by patients focused on capturing their health care experiences, rather than on research. Differences in clinical workflows, practice standards, patient populations, available technologies, and referral resources affect what data are collected and how they are documented. Numerous studies have highlighted significant concerns about the quality of data in EHRs [ 34 , 93-100 ]. CER studies seek to exploit real-world diversity in order to detect and understand determinants of outcome variation. Data quality and completeness problems, however, may affect the validity of CER findings [ 101 , 102 ]. The importance of good quality data in clinical research is well accepted [ 103 , 104 ]. There are substantial efforts to develop robust analytic methods for extracting valid knowledge from observational data, but there are no formal data quality assessment guidelines, analytic methods, or reporting requirements, and methods for categorizing, analyzing, and reporting on data quality remain poorly developed. Most approaches to data quality (DQ) assessment are ad hoc, developed based on an intuitive understanding of data quality challenges, and focused on specific research questions [ 105-108 ]. Few systematic approaches to DQ assessment for the secondary use of clinically obtained data have been proposed. Current methods do not emphasize the need to improve the reporting of DQ results [ 109 ].
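To illustrate what a more systematic, reportable DQ check might look like, the Python sketch below computes two simple measures over a set of records: completeness per required field and counts of values outside a plausible range. The choice of these two dimensions, the field names, and the range limits are assumptions made for this example and are not drawn from a formal guideline.

```python
def assess_quality(records, required_fields, plausible_ranges):
    """Return a small, reportable data quality summary for a record set."""
    n = len(records)
    # Completeness: share of records with a non-missing value per field.
    completeness = {
        field: sum(r.get(field) is not None for r in records) / n
        for field in required_fields
    }
    # Plausibility: count of non-missing values outside the stated range.
    implausible = {}
    for field, (low, high) in plausible_ranges.items():
        observed = [r[field] for r in records if r.get(field) is not None]
        implausible[field] = sum(not (low <= v <= high) for v in observed)
    return {
        "n_records": n,
        "completeness": completeness,
        "implausible_counts": implausible,
    }


# Hypothetical EHR extract with one missing and one implausible value.
ehr_extract = [
    {"patient_id": "P1", "height_cm": 172.0, "weight_kg": 80.5},
    {"patient_id": "P2", "height_cm": None, "weight_kg": 64.0},
    {"patient_id": "P3", "height_cm": 17.5, "weight_kg": 71.2},
    {"patient_id": "P4", "height_cm": 160.0, "weight_kg": 58.3},
]
report = assess_quality(
    ehr_extract,
    required_fields=["height_cm", "weight_kg"],
    plausible_ranges={"height_cm": (40.0, 250.0), "weight_kg": (1.0, 400.0)},
)
```

Returning the results as a structured report, rather than silently dropping bad rows, is a deliberate choice: it makes the DQ findings something that can be published alongside the study results.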

In the four years since the publication of our initial conceptual model, clinical research informatics has continued to evolve and expand. Our previous work emphasized tools that support clinical research workflow and new clinical research data networks. A similar review of the CRI landscape by Embi and Payne described six core CRI activities: (1) data capture, collection and re-use, (2) standards, (3) tensions with regulatory and ethical issues, (4) research networks and team science, (5) improved user experiences, and (6) integration of clinical research and practice [ 110 ]. While these efforts have continued over the intervening period, a clear shift toward a more data-centric perspective permeates this update. A widening array of data sources, data sharing methods, and Big Data architectures, tools, and analytics dominates the current CRI agenda. Tied closely to this shift is the rapid development of large-scale data sharing networks and new distributed query and analytics infrastructures, including the appearance of a new common data model from PCORI [ 38 , 111 , 112 ]. New infrastructure and methods for record linkage, data fusion, natural language processing (NLP), and standardized phenotyping have enabled new data discovery opportunities that were being discussed but not widely implemented at the time of our previous CRI overview [ 6 ].

As CRI investigators implement these expansive data resources and develop new tools for linking, exploring, visualizing, and analyzing complex data sets, how will these data be used to accelerate translational research and new discoveries? “Traditional” uses include retrospective clinical research, study feasibility, and cohort selection or patient recruitment. New data sources also enable new capabilities, including the development of “deep clinical phenotypes” that use biomarkers, imaging results, and NLP to extract clinical features not available from typical databases based on “coded” data elements [ 113-116 ]. Data linkages that combine clinical and billing data allow analysis of longitudinal outcomes; linkages with environmental exposures add new dimensions to determining disease risks across broad patient populations [ 113 , 116 ]. The inclusion of diverse clinical practices allows assessment of the relationship between health system features and disease diagnosis, treatment patterns, and outcomes [ 117 ]. While these types of studies have been performed by investigators for many years, the new data infrastructures hold the promise of dramatically reducing the cost and effort required to do similar studies at population sizes and in diverse practice settings that were not previously available or affordable [ 49 , 118 ].

New opportunities also bring new challenges. We have noted the lack of clarity around the ethical use of large-scale, linked data, the growing gap between the regulatory restrictions and the ability to maintain patient privacy, the need to promote patient engagement in complex data sciences programs, the need to better understand the impact of data quality and biases across various data sources, and the lack of competent infrastructure to fully support the principles of Open Science / Reproducible Research. We have raised concerns about the need to improve transparency in the use of large-scale data sets and the analytical discoveries derived from them, especially in validating disease risks and predicted outcomes for both highly refined populations and individuals. Evidence of the profound negative consequences of not doing this well is beginning to appear as publications of false positives in genomic discoveries or chilling anecdotes about the misuse of genetic risk information [11-13, 119-122]. Guidelines for developing robust risk models do exist and should be adapted and incorporated into the analytics platforms that CRI investigators create [123]. Furthermore, the long-term financial sustainability of large-scale data networks and the associated administrative, regulatory, and technical infrastructure costs has yet to be demonstrated. While sustainability is not entirely under the control of CRI investigators, the CRI community must continue to seek novel value-based approaches to developing tools and infrastructures that have high, recognized value to organizations that would be willing to contribute to the financial stability of these significant investments. Each of these challenges represents a new area in which CRI investigators can both lead and contribute novel methodologies and tools to support evolving data governance and regulatory frameworks.

Four years ago, we published a conceptual model for Clinical Research Informatics that highlighted the importance of data sources, research workflows, and underlying core technologies. Our current update highlights the growth in the diversity and size of data resources and expands the underlying core technologies to include more data-sciences centered activities. As the predictive capabilities of Big Data Analytics become more precise, CRI, in partnership with colleagues in biostatistics, research ethics, patient empowerment, and community engagement, will need to include patients and policy makers in difficult conversations about validating and communicating the findings of these predictive models. Also high on our priority list is a significant investment in developing new incentives and methods for promoting data sharing while protecting privacy and confidentiality, including analytic methods to create a true reproducible research / open science culture. With the rise of the “citizen scientist” [124], “quantified self” [125], and engaged patients as research partners and co-investigators, the timing is right for engaging and empowering all these stakeholders and communities in establishing how best to leverage these new opportunities to generate robust medical evidence faster than ever.

Acknowledgements

This study was funded by National Library of Medicine grant R01LM009886 (PI: Weng) and Patient Centered Outcomes Research Institute award ME-1303-5581 (PI: Kahn).

Conflict of Interest Notification

CW declares no conflict of interest. MGK is a member of the external advisory board for TriNetX Corporation, which provides data tools for clinical trial design and recruitment.

Data Quality in Clinical Research

  • First Online: 15 June 2023

  • Meredith Nahm Zozus,
  • Michael G. Kahn &
  • Nicole G. Weiskopf

Part of the book series: Health Informatics (HI)

Every scientist knows that research results are only as good as the data upon which the conclusions were formed. However, most scientists receive no training in methods for achieving, assessing, or controlling the quality of research data—topics central to clinical research informatics. This chapter covers the basics of acquiring or collecting and processing data for research given the available data sources, systems, and people. Data quality dimensions specific to the clinical research context are used, and a framework for data quality practice and planning is presented. Available research is summarized, providing estimates of data quality capability for common clinical research data collection and processing methods. This chapter provides researchers, informaticists, and clinical research data managers basic tools to assure, assess, and control the quality of data for research.

Author information

Authors and affiliations.

University of Texas Health Science Center at San Antonio, San Antonio, TX, USA

Meredith Nahm Zozus

Department of Pediatrics and the Colorado Clinical and Translational Sciences Institute, University of Colorado Anschutz Medical Campus, Aurora, CO, USA

Michael G. Kahn

Department of Medical Informatics and Clinical Epidemiology, School of Medicine, Oregon Health & Science University, Portland, OR, USA

Nicole G. Weiskopf

Corresponding author

Correspondence to Meredith Nahm Zozus .

Editor information

Editors and affiliations.

Learning Health Sciences, University of Michigan School of Medicine, Ann Arbor, MI, USA

Rachel L. Richesson

School of Information, University of South Florida, Tampa, FL, USA

James E. Andrews

Medical Informatics and Clinical Epidemiology, Oregon Health and Science University, Portland, OR, USA

Kate Fultz Hollis

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Zozus, M.N., Kahn, M.G., Weiskopf, N.G. (2023). Data Quality in Clinical Research. In: Richesson, R.L., Andrews, J.E., Fultz Hollis, K. (eds) Clinical Research Informatics. Health Informatics. Springer, Cham. https://doi.org/10.1007/978-3-031-27173-1_10

DOI : https://doi.org/10.1007/978-3-031-27173-1_10

Published : 15 June 2023

Publisher Name : Springer, Cham

Print ISBN : 978-3-031-27172-4

Online ISBN : 978-3-031-27173-1

eBook Packages : Medicine Medicine (R0)


    Introduction. Clinical Research Informatics (CRI), a recently defined subfield of biomedical informatics that focuses on informatics support for medical evidence generation, has continued to enlarge its scope and importance in supporting the broadening agendas in clinical and translational sciences. Over the past decade, accelerating at an explosive pace, biomedical research has moved ...

  20. Data Quality in Clinical Research

    1. Describe the importance of data quality in clinical research and its impact on results and conclusions. 2. Define the linkages between data quality and informatics, and the implications for research and practice. 3. Explore, through sample scenarios, different situations where data quality problems might occur. 4.