Machine learning (ML) programs computers to learn the way we do—through the continual assessment of data and identification of patterns based on past outcomes. ML can quickly pick out trends in big datasets, operate with little to no human interaction and improve its predictions over time. Due to these abilities, it is rapidly finding its way into medical research.
People with breast cancer may soon be diagnosed through ML faster than through a biopsy. ML may also help paralyzed people regain autonomy using prosthetics controlled by patterns identified in brain scan data. ML research promises these and many other possibilities to help people lead healthier lives. But while the number of ML studies grows, its actual use in doctors’ offices has not kept pace.
The limitations lie in medical research’s small sample sizes and unique datasets. Small data makes it hard for machines to identify meaningful patterns: the more data, the more accurate ML diagnoses and predictions become. Many diagnostic applications would require thousands of subjects, yet most studies enroll only dozens.
But there are ways to coax significant results out of small datasets if you know how to manipulate the numbers. Running statistical tests over and over on different subsets of the data can make what are in reality random outliers look like meaningful signal.
This tactic, known as p-hacking (or, in ML, feature hacking), produces predictive models that are too narrowly fitted to be useful in the real world. What looks good on paper doesn’t translate to a doctor’s ability to diagnose or treat us. These statistical mistakes, often made unknowingly, can lead to dangerous conclusions.
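To see why repeated testing is so treacherous, consider a minimal sketch (not from the study itself, just an illustration): we run a standard two-group significance test many times on pure random noise, where no real effect exists. The `welch_p` helper below is a simplified Welch t-test using a normal approximation to the p-value, written here only to keep the example self-contained.

```python
import random
import statistics
from math import erf, sqrt

def welch_p(a, b):
    """Two-sided Welch t-test p-value via a normal approximation.
    Adequate for illustration; real analyses should use a proper
    t-distribution (e.g. scipy.stats.ttest_ind)."""
    se = sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    z = (statistics.mean(a) - statistics.mean(b)) / se
    # Convert |z| to a two-sided p-value using the normal CDF.
    return 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))

random.seed(42)
trials, hits = 200, 0
for _ in range(trials):
    # Each "study": two groups of pure noise, so any detected
    # difference is spurious by construction.
    a = [random.gauss(0, 1) for _ in range(20)]
    b = [random.gauss(0, 1) for _ in range(20)]
    if welch_p(a, b) < 0.05:
        hits += 1

print(f"{hits} of {trials} comparisons on pure noise reached p < 0.05")
```

Roughly 5% of the comparisons come out "significant" even though nothing real is there. A researcher who tests many subsets or features and reports only the comparison that crossed the threshold has, in effect, published one of these false positives.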
To help scientists avoid these mistakes and push ML applications forward, Konrad Kording, a Penn Integrates Knowledge University Professor with appointments in the Department of Neuroscience in the Perelman School of Medicine and in the Departments of Bioengineering and Computer and Information Science in the School of Engineering and Applied Science, is leading an aspect of a large, NIH-funded program known as CENTER – Creating an Educational Nexus for Training in Experimental Rigor. Kording will lead Penn’s cohort by creating the Community for Rigor, which will provide open-access resources on conducting sound science. Members of this inclusive scientific community will be able to engage with ML simulations and discussion-based courses.
“The reason for the lack of ML in real-world scenarios is due to statistical misuse rather than the limitations of the tool itself,” says Kording. “If a study publishes a claim that seems too good to be true, it usually is, and many times we can track that back to their use of statistics.”
To make meaningful advancements in the field of ML in biomedical research, it will be necessary to raise awareness of these issues, help researchers understand how to identify them and limit them, and create a stronger culture around scientific rigor in the research community.
Kording aims to communicate that just because incorporating machine learning into biomedical research can introduce room for bias doesn’t mean scientists should avoid it. They just need to understand how to use it in a meaningful way.
The Community for Rigor plans to address these challenges with a module on machine learning in biomedical research that will guide participants through datasets and statistical tests, pinpointing exactly where bias is commonly introduced.
This story is by Melissa Pappas. Read more at Penn Engineering Today.