Doctors unlock medical mysteries in mounds of data

Dan Browning / Minneapolis Star Tribune

Published Aug 11, 2013 at 05:00AM

MINNEAPOLIS — It’s hard to see the future of medicine through the scabs, blisters and scars that torment 7-year-old Charlie Knuth as he makes his way haltingly to a checkup at the University of Minnesota’s Amplatz Children’s Hospital. But the boy from Appleton, Wis., is helping doctors perfect a pioneering intervention called gene editing, a procedure that could lend hope to thousands of people suffering from hundreds of diseases.

They include epidermolysis bullosa, the disorder that causes Charlie’s skin to shear off and his eyes to blister. Charlie’s case also illustrates the power of an emerging field called “biomedical and health care informatics” that’s beginning to revolutionize every aspect of medicine, from laboratory research to clinical treatments.

The doctors helping Charlie — a team that includes scientists at the university, in Massachusetts and in Germany — couldn’t have done their work without mining a massive genomic database that enabled them to interpret millions of bits of data in the boy’s DNA, according to Dr. Jakub Tolar, director of UM’s Stem Cell Institute.

That, in turn, allowed them to cut out a single, defective gene and splice in a correction without damaging side effects.

‘Big data’

The procedure, which they described in a recent issue of the journal Molecular Therapy, is part of a larger movement that has medical professionals collaborating with physicists, mathematicians, statisticians, social scientists and computer engineers in an effort to create and mine “big data” centers. Much as Google, Facebook and Amazon mine massive amounts of data to discern consumer preferences, these researchers are sifting huge quantities of medical data to diagnose, understand and cure diseases.

UM, the Mayo Clinic and several Minnesota businesses are well-positioned to take advantage of the trend. Five years ago, the university launched a special graduate program in Biomedical Informatics and Computational Biology. Partners include its Twin Cities and Rochester campuses, the Hormel Institute, Mayo, IBM, the National Marrow Donor Program and a brain research center at the Minneapolis Veterans Medical Center. And three years ago the university received a $5.1 million federal grant specifically to train health professionals in informatics.

Biomedical informatics starts from a simple premise: The human body represents a databank of stunning depth and complexity.

By 2015, the average hospital will have nearly 450 terabytes of patient data — most of it in the form of large, complex images from CT scans, MRIs and similar imaging techniques, according to researchers at IBM and Wayne State University.

Beyond that are myriad other digital streams that could be tapped, such as Facebook and Twitter posts, which have proved useful in epidemiological studies, or monitoring devices such as Microsoft Kinect, which is being studied to understand movement disorders such as Parkinson’s disease.

And the stock of digital data will roughly double in volume every two years, according to a recent study sponsored by EMC Corp., a Massachusetts data storage and computing company.
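A doubling every two years compounds quickly. As a back-of-the-envelope illustration (the arithmetic is ours, not the study's), here is what that growth rate does to the 450-terabyte per-hospital figure cited above over a decade:

```python
# Back-of-the-envelope sketch: data that doubles every two years.
# The 450 TB starting point is the per-hospital figure cited above;
# the ten-year horizon is chosen only for illustration.
start_tb = 450
years = 10
size_tb = start_tb * 2 ** (years / 2)  # one doubling per two years
print(size_tb)  # 14400.0 -- 450 TB grows 32-fold in ten years
```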

Analyzing the information

Yet only a small fraction of existing data has been analyzed, which creates a huge job growth opportunity.

“We go from data, to information, to knowledge, to wisdom,” Tolar said. “And unless we have a very systematic way of looking at the data, we will not only lose a lot of the information, but also, we will do harm, in my opinion.”

The Obama administration put up $200 million last year for an initiative to improve medical care and cut costs by mining the growing stores of health data. At the National Institutes of Health, a program called Big Data to Knowledge underwrites projects such as mapping every neuron connection in the brain and large-scale genome sequencing of cancerous tumors.

“The goal is to develop new tools to analyze, organize and standardize all this data, so that it is easy for scientists to share and access,” NIH director Dr. Francis Collins explained.

Connie Delaney, dean of the university’s School of Nursing in Minneapolis and acting director of the Institute for Health Informatics, says the application of data mining to health care represents “a fundamental paradigm shift” that affects every scientific discipline and requires unprecedented collaboration to tap the breadth of skills required.

The university’s new BICB graduate program has 50 students enrolled; more than half are health care professionals who’ve recognized the need to acquire data analysis skills, said Claudia Neuhauser, a distinguished mathematician who directs the program.

Massive data sets require new tools of analysis, like the predictive modeling that Amazon uses to recommend certain books to customers, she said.

Biologists should learn “enough of the quantitative tools that they can analyze the data in a meaningful way,” Neuhauser said. “The onus is on (us) to develop ways of teaching so that biologists can fruitfully use the tools.”

In the past, scientists started with a hypothesis, then collected and analyzed the data to test the question that they asked, Neuhauser said; now they wade into massive data sets they already have, looking for ways to optimize treatment.

Analyzing the 3 billion base pairs, written in a four-letter alphabet, that make up the human genome may seem complicated enough, but even more challenging, she said, “is the whole imaging piece.” Digital images from scans and high-tech processes like X-ray crystallography require huge databanks. And they are difficult to link to other data types, Neuhauser said.
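The genome's scale can be put in perspective with a rough sizing calculation. This arithmetic is ours, assumed for illustration, not from the article:

```python
# Hypothetical sizing sketch for a single human genome.
base_pairs = 3_000_000_000   # the 3 billion base pairs cited above
bits_per_base = 2            # four letters (A, C, G, T) fit in 2 bits
raw_bytes = base_pairs * bits_per_base // 8
print(raw_bytes / 1e6)       # 750.0 -- roughly 750 MB of raw sequence
```

Even at that compact encoding, one genome is a sizable file, and imaging data runs far larger still.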

But electronic health records are already being analyzed to ensure that patient care is cost-effective, said Bonnie Westra, a former software company founder who coordinates the informatics specialty within the university’s Twin Cities nursing program.

One study of 500,000 patients proved that certified nurses “absolutely” make a difference in the quality of care for patients suffering from incontinence, pressure ulcers and surgical wounds, she said.

The same database is now being mined for ways to predict which patients are likely to be readmitted after being released from a hospital.
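One common form such prediction takes is a risk score that weights patient features and maps the result to a probability. The sketch below is a generic illustration with invented feature names and weights, not the model actually applied to that database:

```python
import math

# Toy readmission-risk sketch. Every feature name and weight here is
# invented for illustration; the real study's model is not described
# in the article.
WEIGHTS = {"prior_admissions": 0.8, "age_over_65": 0.5, "pressure_ulcer": 1.2}
BIAS = -2.0

def readmission_risk(patient):
    """Logistic score in [0, 1] from a dict of 0/1 flags or counts."""
    z = BIAS + sum(WEIGHTS[k] * patient.get(k, 0) for k in WEIGHTS)
    return 1 / (1 + math.exp(-z))

print(round(readmission_risk({"prior_admissions": 2, "pressure_ulcer": 1}), 2))
# 0.69 -- a patient with two prior admissions and a pressure ulcer
```

In practice the weights would be fitted to historical records rather than set by hand, which is exactly where the mined database comes in.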

Helping Charlie

In Charlie Knuth’s case, Big Data helped unlock the genetic code so that researchers could use molecular scissors to precisely cut out a single letter in his faulty genome and replace it with the correct one. Mark Osborn, an assistant professor at the university’s Pediatric Blood and Marrow Transplant Center, was the lead author of a recent peer-reviewed article in the journal Molecular Therapy describing the procedure.

The result: For the first time, Charlie’s skin cells began producing the “Type VII collagen” fibers that act like Velcro to anchor the skin in place.
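Conceptually, the single-letter correction described above can be pictured as a targeted find-and-replace on a very long string. The sketch below is an analogy only; the sequence and position are invented placeholders, not Charlie’s DNA or the team’s actual method:

```python
# Conceptual analogy: gene editing as a one-letter edit in a string.
# The sequence and edit site are invented placeholders.
genome = "ACGTTAGCA"
position, correct_base = 4, "C"   # site of the hypothetical defect
edited = genome[:position] + correct_base + genome[position + 1:]
print(edited)  # ACGTCAGCA -- one letter swapped, rest untouched
```

The hard part, of course, is not the swap itself but locating the right letter among billions, which is what the genomic databases made possible.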

Tolar said his team used the “heavy guns” of biomedical informatics and an advanced German genomics databank to demonstrate that the procedure would meet federal clinical standards as effective and safe. He now plans to seek approval to try it in humans.

“What I’m engaging is the DNA repair system that’s already operational in the cell,” Tolar said. “I’m just offering it some tools to repair itself, and that’s why it’s efficacious, right? Because most elegant things come from nature.”