Blog

Natural Language Processing in Healthcare

Author Image
The Centaur Blogging Team
August 19, 2022

The volume of medical text data, especially digitized and unstructured narrative text, has taken off with the adoption of EHRs and other digital tools used by clinicians, researchers, and patients. All of this medical text is fueling a surge of NLP model development to turn those medical text datasets into insights that impact care delivery, biomedical research, and much more. As every healthcare, medical device, and pharmaceutical company deepens their investments in AI, we’re seeing a growing number of clients developing NLP and an increased need for medical text annotations.

What is NLP, and how is it being applied in healthcare?

Natural Language Processing (NLP) is a branch of artificial intelligence that enables computers to understand, interpret, and generate human language. It combines computational linguistics with machine learning and deep learning models to process and analyze large amounts of natural language data—such as text or speech. NLP is being applied in most parts of the healthcare value chain.

From accelerating subject recruitment in clinical trials to supporting decisions and diagnoses at the point of care to helping consumers find the accurate health information they need, there’s a lot of impact to be excited about, and we’ll give many detailed examples in a later post in this series.

What is medical text?

At Centaur Labs we use ‘medical text’ as shorthand for multiple types of health-related data stored as text. Medical text encompasses 3 subcategories of text: clinical text, biomedical text, and ‘other’ health text. 

Clinical text

Clinical text is defined as text collected during the course of ongoing patient care in the formal healthcare system or as part of a formal clinical trial program. We often think of clinical text as written by clinicians themselves, for example, when making clinical notes in the electronic health record (EHR), in a prescription write-up, or in a pathology report.

However, patients also generate an increasing volume of clinical text. They may describe the beginning of their healthcare journey, i.e., a discussion with a chatbot automating part of patient intake or a typical in-person clinical visit with a therapist. Patients may also describe the middle or end of their healthcare journey, i.e., a transcript of a follow-up call with a clinician, an insurance claim, or a discharge description.

Clinical text can also be written by others in the healthcare ecosystem who participate in patient care, such as insurance claims processors or family members. 

Biomedical text

Biomedical text is any data stored as text collected or created throughout the course of medical research. Often this text is written by research scientists in the form of scientific papers published in academic journals or as unpublished intellectual property. This unpublished text is owned and managed by the academic institution or company that employed the scientist and funded the research the paper summarizes.

Other health text

There are many types of health-related text generated outside patient care delivery and scientific research. For example, users of consumer wellness applications share health information in text formats as they interact with private groups or company ‘coaches’. Consumers also share health information in public forums like Twitter and Reddit. Government agencies publish public health-related text to their populations, and regulatory bodies publish guidelines and rulings to their constituents. Companies in the healthcare ecosystem train their workforces - whether clinicians or pharmaceutical sales reps - with platforms that deliver content containing health-related text. 

Medical text is difficult to interpret

These sub-categories of medical text have one thing in common - they contain health concepts and terminology and subtle relationships between words and phrases that are difficult for an untrained eye to identify. Those with interest in medicine, healthcare, and biology - whether a seasoned clinician, a medical student, or a medical enthusiast - are best suited to interpret medical text. In this blog series, we’ll focus on NLP that leverages medical text - whether clinical, biomedical or ‘other’ - as this is where some of the most compelling innovation is taking place. 

To learn more, see the following:

  1. Paige Enhances Algorithm with High-Quality Data Annotations
  2. Centaur Labs and the FDA Clearance for VUNO
  3. Richer Medical Data Labels by knowing the power of metadata

Related posts

August 1, 2022

Centaur Labs partners with Lucem Health to advance medical AI

Learn about our partnership with Mayo Clinic spin out Lucem Health, and how clinical AI development teams can access high quality medical data annotations at scale.

Continue reading →
July 8, 2021

Centaur Labs partners with Brigham and Women's on funded project

Learn more about how Centaur Labs is working with the Brigham and Women's Hospital team to develop multiple AI applications for point of care ultrasound.

Continue reading →
August 30, 2022

6 Weekly Practices to Strengthen Your Hybrid Team’s Culture

In the era of hybrid work, creativity and thoughtfulness are key to team success. Learn how we’re helping our team thrive, no matter where they work.

Continue reading →