Our collection of open source datasets for medical AI

The Centaur Blogging Team
December 3, 2020

We live and breathe medical datasets for AI!

Open source data can be very valuable when you are starting to think through how to build a medical data labeling pipeline (more on that in our free white paper).

We thought it would be helpful to organize some of our favorite open source datasets into a list and share it with the community.

In our list, you can explore dozens of datasets by size, category, modality (including X-ray, ultrasound, whole slide images, CT scans, and ECGs), and more. We have also included a brief description of each dataset to help you quickly understand the specific abnormalities of interest, the balance of the data, and the annotations included, such as medical image classifications or segmentations.

Access the full collection here.

If you know of any datasets that should be added to this list, please let us know!
