Data Winter is Coming : Impact on Healthcare AI

By Jan De Backer

July 22, 2024

Healthcare AI has the potential to be the biggest leap in medical technology since the invention of the stethoscope or the discovery of penicillin. It promises to transform patient care, streamline diagnostics, and personalize treatment plans. But here’s the kicker: none of this happens without data. And right now, data is getting harder to come by. The Data Provenance Initiative, a research group led by MIT, estimates that 25% of the highest-quality data on the web can already no longer be used to train AI algorithms because of use restrictions, a rapid increase compared to even last year [1]. Figure 1 shows that for some domains the percentage is even higher. This trend could be catastrophic for the future of healthcare AI, a very young but rapidly emerging field.

Figure 1: percentage of active physicians who are age 55 or older in 2017 (left) and 2021 (right) Source: AMA Physician Masterfile (December 2017 and 2021).

AI’s Achilles’ Heel: Data Dependence

AI systems are like my three growing boys: hungry… but for data. They need massive, diverse datasets to learn, adapt, and predict. Without enough data, AI models can become biased and unreliable. Imagine an AI trained primarily on data from middle-aged, Caucasian men trying to diagnose diseases in young African-American women. The resulting inaccuracies and biases can lead to misdiagnoses and ineffective treatments, as eloquently discussed by Andreotta et al. [2] and Murdoch [3].

Trust Issues and Patient Safety

If data access is restricted, the reliability of AI systems plummets, and so does patient trust. The complexity of AI already makes it hard for doctors to explain how decisions are made. Now, add the uncertainty of incomplete or biased data, and you’ve got a recipe for mistrust. It is like an airplane captain telling passengers he’s not sure the autopilot is using the right maps for navigation. When patients start questioning the safety and accuracy of AI-driven diagnostics, they’ll hesitate to embrace these technologies, potentially leading to poorer health outcomes, as discussed by Laura M. Cascella [4]. Furthermore, it will give the incumbents, who profit greatly from the current sub-optimal sick-care system, an opportunity to maintain the status quo.

Innovation on Ice

AI thrives on fresh, real-world data. When data is scarce, innovation stalls. Synthetic data, created by generative models, can fill some gaps, but it’s not a perfect solution. These models still need real data to train on, and their efficacy depends on the quality of that initial data. Without continuous data flow, we risk turning healthcare AI into a stagnant pond rather than a flowing river of innovation.

Ethical and Legal Quagmire

The ethical implications of restricted data access are enormous. Healthcare providers must ensure AI technologies do not worsen health disparities or violate patient privacy. Legal liabilities soar when data usage is contentious. Transparent data policies and clear consent processes are crucial to navigating these murky waters [2,3].

The Path Forward

So, what’s the game plan? The healthcare industry must strike a balance between comprehensive data access and stringent privacy protections. Emphasizing patient consent, responsibly leveraging synthetic data, and ensuring ongoing regulatory compliance are non-negotiable. But perhaps even more important, tech companies need to team up with care providers, preferably in novel outpatient networks of clinical centers, to ensure a steady stream of quality data for AI. Integrating tech with healthcare delivery gives access to real-time, diverse data, which is essential for accurate AI algorithms. This collaboration isn’t just smart; it’s necessary. It fuels innovation, improves patient outcomes, and keeps AI systems sharp and unbiased. Without this synergy, we risk stagnation, bias, and continued inefficiency. Tech and healthcare together can transform patient care, driving forward the next wave of medical breakthroughs.

In sum, healthcare AI’s future hangs in a delicate balance. Data is the lifeblood of AI, and as access becomes restricted, we face a critical juncture. By prioritizing ethical practices and regulatory compliance in an integrated technology/care delivery system, we can navigate this landscape and harness AI’s transformative potential for the betterment of all patients.

The stakes are high, and the time to act is now.

[1] https://www.dataprovenance.org/Consent_in_Crisis.pdf

[2] Andreotta, A.J., Kirkham, N. & Rizzi, M. AI, big data, and the future of consent. AI & Soc 37, 1715–1728 (2022). https://doi.org/10.1007/s00146-021-01262-5

[3] Murdoch, B. Privacy and artificial intelligence: challenges for protecting health information in a new era. BMC Med Ethics 22, 122 (2021). https://doi.org/10.1186/s12910-021-00687-3

[4] https://www.medpro.com/artificial-intelligence-informedconsent