Copyright © 2025. All rights reserved by Centaur Labs.
Open source data can be very valuable when you start thinking through how to build a medical data labeling pipeline (more on that in our free white paper).
We thought it would be helpful to organize some of our favorite open source datasets into a list and share it with the community.
In our list, you can explore dozens of datasets by size, category, modality (including X-ray, ultrasound, whole slide images, CT scans, and ECGs), and more. Each entry also includes a brief description to help you quickly understand the specific abnormalities of interest, the balance of the data, and the annotations included, such as medical image classifications or segmentations.
Access the full collection here.
If you know of any datasets that should be added to this list, please let us know!