Unlocking Better AI with Cognitive-Inspired Data Engineering

Gunnar Epping, Research Scientist
February 20, 2025

At Centaur, we’re always seeking ways to improve data quality in medical AI. In a groundbreaking new study, “Improving Human and Machine Classification Through Cognitive-Inspired Data Engineering,” researchers have taken a significant step forward by addressing one of AI’s biggest hurdles: human bias in crowdsourced data.

The Problem: Human Bias in Machine Learning

Crowdsourcing platforms like Amazon Mechanical Turk (MTurk) and DiagnosUs are essential for rapidly labeling large datasets. However, these datasets can carry the biases of the annotators who provide the labels: overconfidence, distorted probability estimates, or deeper systemic issues. Machine learning models trained on data that reflects these biases inherit them, which undermines how well the models function in real-world applications. This is particularly concerning in critical domains like medical diagnosis, where getting it right can be a matter of life and death.

The question is: How can we reduce this bias to create more accurate and reliable data?

The Solution: Cognitive-Inspired Data Engineering

The study leverages a technique called recalibration, which adjusts subjective probability judgments made by human annotators. The researchers use a model called the Linear Log Odds (LLO) function to transform biased judgments into more objective data. This process is part of cognitive-inspired data engineering, which applies cognitive science principles to improve data quality and, by extension, ML model performance.

The Core Idea Behind Recalibration

Cognitive science tells us human judgment is often flawed, particularly when assigning probabilities to uncertain events. For example, people tend to be overconfident in their classifications or to systematically underweight rare events. The LLO function addresses this by applying a linear adjustment in log-odds space, pulling distorted probability judgments back toward better-calibrated values. The result is more reliable labels for training machine-learning models, and better overall results.
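To make the idea concrete, here is a minimal sketch of the LLO transform in Python. The standard linear-in-log-odds form is logit(q) = γ · logit(p) + ln δ, where γ controls how strongly extreme judgments are flattened and δ captures overall response bias. The parameter values below are illustrative only, not the study's fitted values.

```python
import math

def llo_recalibrate(p, gamma, delta, eps=1e-6):
    """Map a reported probability p to a recalibrated probability.

    The LLO transform is linear in log-odds space:
        logit(q) = gamma * logit(p) + ln(delta)
    gamma < 1 flattens overconfident judgments toward 0.5;
    delta shifts the overall response bias.
    """
    p = min(max(p, eps), 1 - eps)  # avoid taking the logit of 0 or 1
    log_odds = gamma * math.log(p / (1 - p)) + math.log(delta)
    return 1 / (1 + math.exp(-log_odds))

# An overconfident 95% judgment is pulled toward 0.5 when gamma < 1:
print(llo_recalibrate(0.95, gamma=0.6, delta=1.0))  # ≈ 0.854
```

Note that with gamma = 1 and delta = 1 the transform is the identity, so fitting these parameters to data tells you directly how distorted the annotators' judgments were.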

What the Study Found

To test this approach, the research team conducted two experiments to evaluate how recalibration affects data quality and machine learning model performance. The two experiments used the exact same set of images.

Experiment 1: Novice Annotators on MTurk

  1. Task: Novice participants labeled medical images, such as peripheral blood cells.
  2. Process: Participants provided probability-based answers that were recalibrated using the LLO function.
  3. Results:
    • Recalibrated crowd labels improved overall accuracy from 81.6% to 85.1%.
    • The recalibration process successfully reduced overconfidence and systematic biases.
    • However, recalibration had little effect on individual classification accuracy.

Experiment 2: Skilled Annotators on DiagnosUs

  1. Task: Skilled medical annotators labeled the same type of images.
  2. Process: Their responses were also subjected to recalibration using the LLO function.
  3. Results:
    • The accuracy of crowd labels improved from 88.3% to 96.7%.
    • The impact of recalibration was much more pronounced for skilled annotators than for novices.
    • This suggests that recalibration is particularly effective in high-expertise domains like medical AI.

Key Insights from the Study

  1. Recalibration Works: By adjusting probability judgments, researchers significantly improved the accuracy of crowdsourced labels.
  2. Efficiency Gains: More judgments typically lead to higher accuracy, but recalibrated labels reached peak accuracy with fewer annotations than raw labels required.
  3. Better ML Training Data: Models trained on recalibrated datasets outperformed models trained on non-recalibrated data, especially when the number of judgments was low. This is particularly relevant in real-world applications where annotations are costly and time-consuming.
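The study's exact fitting procedure isn't reproduced here, but a common way to estimate the LLO parameters is to fit γ and δ on a small calibration set of judgments with known ground truth by minimizing cross-entropy, then apply the fitted transform to all remaining judgments before averaging them into a crowd label. The sketch below (function names and the gradient-descent fit are our own illustration, not the paper's code) shows that pipeline end to end:

```python
import math

def logit(p, eps=1e-6):
    p = min(max(p, eps), 1 - eps)
    return math.log(p / (1 - p))

def llo(p, gamma, delta):
    """LLO transform: linear adjustment in log-odds space."""
    return 1 / (1 + math.exp(-(gamma * logit(p) + math.log(delta))))

def fit_llo(probs, truths, steps=2000, lr=0.05):
    """Fit gamma and delta by gradient descent on cross-entropy
    against a small calibration set with ground-truth labels (0/1)."""
    gamma, log_delta = 1.0, 0.0
    for _ in range(steps):
        g_grad = d_grad = 0.0
        for p, y in zip(probs, truths):
            q = 1 / (1 + math.exp(-(gamma * logit(p) + log_delta)))
            err = q - y  # gradient of cross-entropy w.r.t. the log-odds
            g_grad += err * logit(p)
            d_grad += err
        gamma -= lr * g_grad / len(probs)
        log_delta -= lr * d_grad / len(probs)
    return gamma, math.exp(log_delta)

def crowd_label(judgments, gamma, delta):
    """Recalibrate each judgment, then average and threshold at 0.5."""
    recal = [llo(p, gamma, delta) for p in judgments]
    return sum(recal) / len(recal) >= 0.5

# Overconfident annotators: some 90% judgments are simply wrong.
probs  = [0.9, 0.9, 0.1, 0.9, 0.1, 0.1]
truths = [1,   0,   1,   1,   0,   0]
gamma, delta = fit_llo(probs, truths)
# The fitted gamma < 1 flattens these overconfident judgments toward 0.5.
```

Once γ and δ are fitted, every subsequent judgment can be recalibrated before aggregation, which is how recalibrated crowd labels can reach peak accuracy with fewer annotations than raw ones.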

Why This Matters for AI and Medical Applications

The findings of this study have far-reaching implications for the future of machine learning, particularly in domains where data accuracy is critical. Medical AI systems that learn from crowdsourced data stand to benefit significantly from recalibration techniques, which can lead to more accurate diagnostic tools, better decision-support systems for healthcare professionals, and, ultimately, better patient outcomes.

1. More Reliable Medical Diagnoses

AI models used for detecting diseases, interpreting radiology scans, and classifying pathology images rely heavily on labeled training data. If those labels contain systematic biases, the model will inherit and amplify those errors. Recalibration helps mitigate this issue by refining the labels before training even begins.

2. More Efficient Use of Annotators

Medical professionals’ time is valuable. If we can achieve higher-quality labeled data with fewer annotations, we can reduce annotation costs while maintaining or improving model quality. This efficiency is crucial for startups and research teams operating under resource constraints.

3. Reducing Bias in Other High-Skill Domains

While this study focuses on medical AI, the principles of cognitive-inspired data engineering apply broadly to other fields, including:

  • Financial risk modeling (reducing cognitive biases in credit assessments)
  • Legal AI applications (improving document classification and case law research)
  • Autonomous vehicles (refining human-annotated driving behavior datasets)
  • Defense and security (enhancing intelligence analysis through bias reduction)

The Future of Cognitive-Inspired Data Engineering

This study represents a major step forward in tackling bias in AI training data, but it also raises further questions for future research:

  1. Can we develop even more sophisticated recalibration models beyond LLO?
  2. How do different cognitive biases affect labeling accuracy across various fields?
  3. Can active learning techniques be combined with recalibration to optimize data collection further?
  4. What ethical considerations arise when modifying human-provided labels?

Conclusion

Cognitive-inspired data engineering is an innovative method that boosts the reliability of labels gathered from crowdsourcing. This significantly enhances the performance of machine learning models that rely on these labels. We can systematically reduce biases by leveraging techniques like recalibration, leading to more efficient data collection, better model performance, and more accurate AI applications.

At Centaur, we believe in the transformative power of high-quality data. As AI evolves, integrating cognitive science into data engineering will be essential for unlocking its full potential, especially in critical fields like medical diagnostics.

For AI to truly benefit humanity, it must be trained on data that objectively reflects reality. Cognitive-inspired data engineering helps make this possible.

