DENVER, Colo. – Data. Depending on who you ask, it’s either the new gold or the new oil. With the amount of it in existence currently doubling every two years, it’s also far more abundant than either of these resources.
According to digital health luminaries like Dr. Eric Topol, data will empower clinicians to detect, diagnose, and cure disease with more speed, accuracy, and power than ever before. But with providers still learning how to use the enormous datasets now being produced, its promise has yet to be fulfilled.
At last Wednesday’s Talk Data to Me, a digital health entrepreneur, a medical imaging engineer, and an artificial intelligence expert shared how they were using data to improve care. The trio discussed the problems they had faced in gathering and analyzing their data, as well as the solutions they had devised.
“As the healthcare industry has moved from paper-based to electronic records, a ton of data has been produced,” said Kevin Riddleberger, chief strategy officer of DispatchHealth. “Are we leveraging that data as best we can? Probably not.”
A mobile acute care provider, DispatchHealth sends physicians and nurses into the homes of patients, where they treat ailments like bronchitis, nausea, and joint pain. According to Riddleberger, caring for patients in their homes gives DispatchHealth access to an important but often-overlooked source of data.
“We are treating individuals for their acute problems,” said Riddleberger. “But we’re also exposed to a lot of other information about their health.”
“Do they have food in the fridge? Can they bathe themselves? Do they have social support? Do they have the financial means to take care of themselves?”
Known as social determinants of health data, this type of information is critical to the welfare of patients but is rarely collected by physicians. Doctors don’t always have the time to ask these questions, and when they do, patients aren’t always willing to answer them. To ensure that providers have easy access to this data, DispatchHealth has started gathering it during visits and storing it electronically.
“We are capturing this information so we can present it to primary care providers, care managers, and payers,” said Riddleberger. “Now they can start acting on it.”
“I tend to think of patients as puzzles,” said Lindsay Quandt, a medical imaging engineer at CereScan. “In order to make the best diagnosis possible, you need to be able to complete the puzzle.”
“Without all the pieces, you can’t make as informed diagnoses, it’s harder to make treatment decisions, and there’s a lot more guesswork involved.”
Before training her machine learning algorithm to analyze the data stored in CereMetrix, CereScan’s brain imaging database, Quandt tried to collect as many puzzle pieces as possible. CereScan already had patients refrain from caffeine and alcohol prior to being imaged to ensure the accuracy of their scans.
To further refine the information in CereMetrix, Quandt had to repair any fragmentary data she found. She also had to flag patients who had diagnoses that didn’t seem to match their scans. Once the database was ready, Quandt’s algorithm began analyzing it in search of previously undiscovered correlations.
“We look at the characteristics of the patient and try to determine what’s relevant for the radiologist,” said Quandt. “Then we use those characteristics to find other patients who share them, and bring back a cohort of patients to provide new information to the radiologist.”
“Maybe there was a treatment path the radiologist was not aware of that similar patients had gone down. Maybe there was a diagnosis they had not considered because they had never seen a patient with it before.”
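The cohort lookup Quandt describes can be illustrated with a minimal sketch. It assumes each patient is reduced to a short list of numeric characteristics and that similarity is measured with cosine similarity. The article does not say how CereScan represents patients or compares them, so the feature vectors, the patient IDs, and the `find_cohort` helper are hypothetical.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a vector with no signal is treated as dissimilar
    return dot / (norm_a * norm_b)


def find_cohort(query, patients, k=2):
    """Return the IDs of the k patients most similar to the query vector."""
    scored = [
        (cosine_similarity(query, features), patient_id)
        for patient_id, features in patients.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [patient_id for _, patient_id in scored[:k]]


# Hypothetical characteristics; a real system would derive these from scans
# and clinical records.
patients = {
    "patient_a": [1.0, 0.0, 1.0],
    "patient_b": [0.9, 0.1, 0.8],
    "patient_c": [0.0, 1.0, 0.0],
}

cohort = find_cohort([1.0, 0.0, 0.9], patients, k=2)
print(cohort)
```

In this toy example the two patients whose characteristics most resemble the query come back as the cohort, and their diagnoses and treatment paths are what a radiologist would then review.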
“How do you train a machine learning application when you don’t have the right data?” asked Jay Swartz, chief science officer of BlackBoxAI. “To train a machine learning application, you have to have unbiased data that reflects the real world.”
Swartz was describing a challenge he had faced as a data scientist at Welltok. The Denver-based digital health company wanted an AI agent that could use natural language processing to answer questions about health insurance benefits. The only problem? Swartz didn’t have enough data to train the agent.
“That’s when I realized that the data doesn’t have to be completely natural,” Swartz said. “It can be approximate to being natural.”
To train the agent, Swartz began creating synthetic data. He started with common benefits questions like, “Where can I get a flu shot?” Then he generated as many permutations of these questions as the English language would allow. (As well as some it didn’t.) The approach created a huge amount of training data.
“Ultimately, the agent achieved 97% accuracy, which is all but unheard of with this kind of technology,” said Swartz. “The lesson here is that you can actually generate your own data if you know how.”
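The permutation approach Swartz describes can be sketched in a few lines. This is a simplified illustration, not his actual method: it assumes a question can be split into an opener, an action, and a service, and it combines every option for each slot. The phrase lists and the `generate_questions` helper are hypothetical.

```python
from itertools import product


def generate_questions(openers, actions, services):
    """Build one question for every combination of the three phrase lists."""
    return [
        f"{opener} {action} {service}?"
        for opener, action, service in product(openers, actions, services)
    ]


# Hypothetical phrase lists seeded from one common benefits question.
openers = ["Where can I", "How do I", "Where do I go to"]
actions = ["get", "find", "schedule"]
services = ["a flu shot", "a flu vaccine"]

questions = generate_questions(openers, actions, services)
print(len(questions))
print(questions[0])
```

Because the number of questions is the product of the list sizes, adding a handful of phrases to each list quickly multiplies the volume of training data. Some combinations will read awkwardly, which matches the article's note that the permutations included some the English language would not allow.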
Colorado Health and Tech Mashup provides opportunities for networking, education, and professional development in Colorado’s vibrant digital health community.