Our topic is predictive analytics for research. Collecting data, preparing it, analyzing it, and protecting it are the realm of the Nemours Children’s Biomedical Research Informatics Center (BRIC). BRIC provides consultation, training, and computational resources to biomedical research investigators across the enterprise and beyond. Our guests are BRIC Director Dr. Timothy Bunnell and Daniel Eckrich, BRIC’s Supervisor for Research Applications.
Carol Vassar, producer
Tim Bunnell, Director, Nemours Children’s Biomedical Research Informatics Center and Director, Center for Pediatric Auditory Sciences
Daniel Eckrich, Supervisor for Research Applications, Nemours Children’s Biomedical Research Informatics Center
Carol Vassar, podcast host/producer (00:00):
Welcome to Well Beyond Medicine: the Nemours Children’s Health Podcast. Each week we’ll explore anything and everything related to the 80% of child health impacts that occur outside the doctor’s office. I’m your host, Carol Vassar, and now that you are here, let’s go. (SINGING) (00:26): Well beyond medicine!
Carol Vassar, podcast host/producer (00:27):
Today we’re talking to two more Nemours associates committed to transforming children’s health well beyond medicine. Our topic is big data applied to healthcare through predictive analytics. I’m joined by Dr. Timothy Bunnell, the recently retired director of the Nemours Biomedical Research Informatics Center, or BRIC, and Daniel Eckrich, BRIC’s supervisor for research applications. BRIC provides consultation, training, and computational resources to biomedical research investigators across the enterprise and beyond. Dr. Bunnell gives us a bit of BRIC’s history.
Dr. Timothy Bunnell, Nemours Children’s Health (01:12):
It started out being called the bioinformatics core, and it was intended to provide the sort of data storage and data analysis capabilities that are really necessary these days for biological and medical data. The data has just exploded into the 21st century, so there are massive amounts of data to be stored, analyzed, and made sense of. I think the name BRIC started sticking with us in the early 2000s as we grew into a larger center that provided larger amounts of service. So we now work with Nemours’ [inaudible 00:01:47] I guess I should say, in developing and maintaining the high-performance computing cores for the research department and the massive data storage for the research department. So we have machines that are now capable of running hundreds or even thousands of arithmetic operations in parallel and storing, I think right now, about 1.5 petabytes of data. And that’s only going to grow as the data continues to grow, particularly with things like genome sequencing and microscope data, which is very massive, so…
Carol Vassar, podcast host/producer (02:23):
Researchers are your main focus and your main customer, if you will.
Dr. Timothy Bunnell, Nemours Children’s Health (02:31):
Very, very heavily research oriented, but not entirely, because we have worked with folks in the value-based service organization within Nemours, trying to do things like look at the likelihood that a patient who’s released from the hospital will return to the hospital again within 30 days in an unexpected manner. And trying to predict that sort of behavior is one of the goals of the sort of work that we do in BRIC, using the data that we’ve collected from the health records.
Carol Vassar, podcast host/producer (03:00):
And that’s one of the thresholds CMS looks at when they’re looking to make certain that there is quality within a healthcare facility, right?
Dr. Timothy Bunnell, Nemours Children’s Health (03:09):
Absolutely. And you want to be able to predict what the cause of that is and then, hopefully, be able to jump in and short-circuit whatever is causing people to return so frequently.
Carol Vassar, podcast host/producer (03:25):
Which leads us very nicely into the question I want to ask. What is predictive analytics? I think you really just gave us a great example of it. What ingredients are needed for predictive analytics and how is it applied in a healthcare setting? I think that was just a great example. Are there others?
Dr. Timothy Bunnell, Nemours Children’s Health (03:44):
Well, there are plenty of them. Pick any disease and the way that disease has been treated historically and whether there are things that we could be doing better to treat that disease or that outcome for children. So you asked what’s needed. Well, the first thing that’s needed above all else is good data and lots of it. The more data we have and the more history we can bring to bear on care in the past, the better we can predict how care will unroll in the future with patients. And one of the big sources of data, as I’ve mentioned, is our electronic health records where we have all of the discrete facts about patients. What conditions they’ve been diagnosed with, what medications they’ve been given, what kind of procedures have been performed on that patient in the past. Things like that are the facts that we can use to characterize each patient and then do predictions on that patient’s likelihood of different disease outcomes or as we mentioned earlier, returning to the hospital after release.
In addition to those hospital facts, though, and this is where Dan’s expertise has really been useful to us, we’re discovering that some of the social determinants of health are really crucial to trying to figure out the likely outcomes for patients. Dan has now managed to do geolocating on a large proportion of our patients. And well, Dan, I should let you talk about this because you understand it much better than I do.
Daniel Eckrich, Nemours Children’s Health (05:14):
Sure. I think it ties in with the whole precision medicine focus of both the clinical record on a patient, the environment, as well as behaviors. So with the GIS coding that we’ve been doing to geolocate all the patients, it’s over, as Tim mentioned before, I believe 3 million patients. And with historical addresses, we have over 9 million addresses and we’ve gone ahead and run all of those through a software program to find the latitude and longitude of those patients.
And with that, we can use the census shapefiles to go ahead and actually place those patients in census tracts, census block groups, some of the geo identifiers that they use. And then taking that to the next step, there’s all of the census data and American Community Survey data about the environments that patients are raised in. And there are the indices that have been developed, like the Child Opportunity Index, the Area Deprivation Index, and the Social Vulnerability Index. There are all of these vetted composite scores related to the area where patients are living that we can also bring into their clinical record. And when we’re performing large-scale big data analytics, we can apply those as well to the research that we’re doing and see how they factor into some of the outcomes that we’re looking at.
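As a rough illustration of the step described above (taking a geocoded latitude and longitude, placing it in a census area, and attaching an area-level score), here is a minimal sketch. It is not BRIC's actual pipeline: the tract polygons, the `AREA_INDEX` scores, and the function names are all hypothetical, and real work would read boundaries from Census shapefiles with a GIS library rather than use hand-written squares.

```python
from typing import Optional

# Hypothetical tract boundaries as (longitude, latitude) vertex lists.
# Real boundaries would come from Census shapefiles.
TRACTS = {
    "tract_A": [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)],
    "tract_B": [(1.0, 0.0), (2.0, 0.0), (2.0, 1.0), (1.0, 1.0)],
}

# Made-up area-level scores standing in for a published composite index.
AREA_INDEX = {"tract_A": 72, "tract_B": 31}


def point_in_polygon(lon: float, lat: float, polygon) -> bool:
    """Ray casting: toggle 'inside' each time a ray heading east crosses an edge."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def assign_tract(lon: float, lat: float) -> Optional[str]:
    """Return the first tract whose polygon contains the point, or None."""
    for tract_id, polygon in TRACTS.items():
        if point_in_polygon(lon, lat, polygon):
            return tract_id
    return None


def enrich(patient: dict) -> dict:
    """Attach the tract and its area-level score to a geocoded patient record."""
    tract = assign_tract(patient["lon"], patient["lat"])
    return {**patient, "tract": tract, "area_index": AREA_INDEX.get(tract)}


record = enrich({"patient_id": 1001, "lon": 1.5, "lat": 0.5})
```

Once each record carries a tract identifier, any table keyed by that identifier can be joined to it in the same way as the score here.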
Carol Vassar, podcast host/producer (06:35):
How does that change the services we provide at the patient level?
Daniel Eckrich, Nemours Children’s Health (06:42):
I think that this is a relatively new area that we’re looking at. I think ideally you’d be able to identify the patients who are at higher risk for, as mentioned before, a readmission or a higher risk of adverse events that may take place. And if you can develop an intervention or help with clinical decision support, it can benefit and have more positive outcomes for the patient. Ideally.
Dr. Timothy Bunnell, Nemours Children’s Health (07:11):
So we can identify patients who live in food deserts, as they’re called, areas where there really isn’t healthy food easily available. We may even identify, say, a child with asthma who lives very close to some sort of chemical plant or something that’s actually producing extra amounts of pollution in the atmosphere. And if we can identify some of these things, like the transportation issues, we can help with getting kids to the hospital when they need to get here, calling and making sure that they have a way to get here when their appointment is occurring. A very simple, completely nonmedical, but absolutely crucial thing is to get to the medical appointments that you’ve made in order to be able to be treated.
Carol Vassar, podcast host/producer (07:53):
And that very much speaks to the benefits of predictive analytics in healthcare and an application that I think most people aren’t even aware is available. What other benefits are coming from predictive analytics?
Daniel Eckrich, Nemours Children’s Health (08:06):
I mean, I think some of the potential as well is being able to allocate resources to identify areas where maybe we could have more resources available, more staff available and all that. If you’re looking at very much the nuts and bolts kind of process in addition to just patient care, also how the hospital itself operates.
Dr. Timothy Bunnell, Nemours Children’s Health (08:26):
And right now we’re having a staffing shortage because of all the respiratory viruses that are going around among kids. Being able to predict that in advance and being able to adjust your staffing levels to anticipate that wave of things would be great. I can’t say that we’ve succeeded all that well in this particular case. We were looking elsewhere, if you will, at the time. But another one of the things we’ve looked at a lot is the use of predictive analytics to assess the comparative effectiveness of different methods of treatment. And we’ve done that, for instance, with COVID in particular. That’s been one of the large emphasis areas for the medical research these days, obviously the pandemic has been a lot of that. For instance, we’ve been able to use machine learning algorithms to analyze massive amounts of data on COVID patients and look at what are the combinations of therapeutic agents that are given to those patients that lead to better outcomes for the patients. Fewer deaths and shorter hospital stays, more kids going home sooner.
Carol Vassar, podcast host/producer (09:36):
Now, Tim, you said something earlier, very early in the interview about having good data is really the main ingredient of predictive analytics. Define what you mean by good data. Maybe that’s a basic marketing term, but looking at data and making sure that it is good versus not good. How does that happen?
Dr. Timothy Bunnell, Nemours Children’s Health (09:57):
Well, unfortunately it does not happen automatically or easily. It’s a major effort, and electronic health record data is especially challenging in that regard, really. If you think about it, all of those data that we have in our electronic health records were collected for the purposes of operating a hospital, doing billing, collecting things that you need to collect. They’re not really intended to be used for research purposes. And so when we pull those data out, there’s a lot of additional analysis that we have to do and care that we have to take to be sure that, if we have decided that a patient has a diagnosis of type 2 diabetes, it really is a diagnosis of type 2 diabetes and not something that just has one or two condition codes that are somehow related to that diagnosis. For instance, a physician might have ordered some analyses that would be related to type 2 diabetes simply because they wanted to eliminate that as a diagnosis, rather than because it is a positive diagnosis.
So this is a whole research area, really, called computable phenotypes, where we’re trying to figure out how to take the electronic health record data and compute from that an accurate phenotype for a patient: an accurate description of what that patient is really like, as opposed to all of the potential noise that comes in there from looking at billing codes rather than carefully vetted diagnoses.
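To make the idea of a computable phenotype concrete, here is a minimal sketch of a rule that separates a likely diagnosis from a rule-out workup. The rule itself (two diagnosis codes on different dates, or one diagnosis code plus a diabetes medication) is an assumption invented for illustration, not a validated clinical definition and not the one BRIC uses. The record layout and function name are also hypothetical. The `E11` prefix is the ICD-10 category for type 2 diabetes mellitus.

```python
from datetime import date


def has_t2d_phenotype(events: list) -> bool:
    """Illustrative rule: a lab order alone never qualifies a patient."""
    # Distinct dates on which a type 2 diabetes diagnosis code was recorded.
    dx_dates = {
        e["date"]
        for e in events
        if e["type"] == "diagnosis" and e["code"].startswith("E11")
    }
    # Whether a (hypothetically chosen) diabetes medication was ever given.
    on_medication = any(
        e["type"] == "medication" and e["code"] == "metformin" for e in events
    )
    return len(dx_dates) >= 2 or (len(dx_dates) >= 1 and on_medication)


# A rule-out workup: a test was ordered and one code was attached for billing.
rule_out = [
    {"type": "lab_order", "code": "hba1c", "date": date(2022, 1, 5)},
    {"type": "diagnosis", "code": "E11.9", "date": date(2022, 1, 5)},
]

# A sustained pattern: the code recurs on a later date and treatment was started.
confirmed = rule_out + [
    {"type": "diagnosis", "code": "E11.9", "date": date(2022, 4, 12)},
    {"type": "medication", "code": "metformin", "date": date(2022, 4, 12)},
]

rule_out_flag = has_t2d_phenotype(rule_out)
confirmed_flag = has_t2d_phenotype(confirmed)
```

The point of the sketch is only that the phenotype is computed from a pattern across several kinds of records rather than from any single code.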
Daniel Eckrich, Nemours Children’s Health (11:35):
I think Tim touched on one thing, and I’m probably just going to reemphasize it because it’s such a major part that I think goes unrecognized and underappreciated. It is just the amount of data quality analysis and validation that is necessary to make this work. Predictive analytics and what comes out of it is very eye-catching and draws a lot of attention, but there is a tremendous amount of work that goes on behind the scenes to just get to the point where we can do some of this work that I’d like to point out and just make sure that people do understand the effort.
Dr. Timothy Bunnell, Nemours Children’s Health (12:10):
I would call out in particular our systems administrator, who just works tirelessly to keep this high-performance computing stuff working and to find the best new ways to do it. All of that support organization and, frankly, the support of Nemours IS as well, because they’re at the very base of things, responsible for maintaining the network, are absolutely crucial to what we do.
Carol Vassar, podcast host/producer (12:37):
On your team, I’m hearing Dan and Tim and a couple of others, I mean, that’s a pretty small team. Did I hear that correctly and interpret that correctly?
Dr. Timothy Bunnell, Nemours Children’s Health (12:46):
It is more than a couple. Just to list the folks, in addition to Dan and I, we have Dr. Suzanne McCann who’s a molecular biologist and [inaudible 00:12:57]. Chris Pennington, who’s a computer scientist and who maintains our systems and databases for us. Michael Peck, who is a web developer and does a lot of the programming that allows us to display data on webs and collect data via web forms and things like that.
Daniel Eckrich, Nemours Children’s Health (13:14):
We have a small footprint, especially relative to some other institutions, but I do think that we operate effectively and efficiently as well with the small team.
Carol Vassar, podcast host/producer (13:25):
When we talk about individuals and bringing this right down to where they live, maybe where they work, where they play, where their parents work, it raises the idea of confidentiality. How do you assure that everything that you are doing in terms of predictive analytics and with the data that you’re working with is kept secure?
Dr. Timothy Bunnell, Nemours Children’s Health (13:51):
So a big part of the process of getting the data into our research database is called extraction, transformation, and loading. We extract the data from the electronic health records, which is all identifiable personal health information in many cases. And we have to transform that data to a format that is less likely to contain any personal information. Our research database does contain accurate dates for patients, and it contains accurate location information on patients, accurate down to a census block group, actually. But it does not contain the patient name; social security numbers, credit card numbers, things like that are all scrubbed out of the database. Our identifiers are all just randomly assigned integers. Every patient has this integer that we use, which, without a fairly elaborate process, can’t be mapped back to who the patient really is.
And then further, when we’re developing analytic data sets to use in statistical analyses or something like that, we take a couple of additional measures, for instance with dates. Rather than use the actual dates, like the patient’s birthdate and the date of a visit, we’ll skew those dates by a random amount of time, always assigning the same random skew within a patient to all of the dates for that patient. But every patient has a different random skew. So you can look at durations like the time from birth to their first hospital visit. That duration measurement will be accurate even with the date skew, but the dates of that birth and first visit are different from the real dates. Statistically, it’s very difficult to reverse engineer the data and actually get back to identifying the patient.
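The two measures described above, a randomly assigned integer identifier and a single random date skew per patient, can be sketched as follows. This is an illustration under stated assumptions, not BRIC's implementation: the `Deidentifier` class, the 180-day maximum shift, and the `MRN-001` identifier are all hypothetical, and the seeded `random.Random` is used only so the example is repeatable. A production system would draw from a cryptographically secure source and keep the mapping in a separately protected store.

```python
import random
from datetime import date, timedelta


class Deidentifier:
    """Assigns each patient a random surrogate ID and one fixed date shift."""

    def __init__(self, rng: random.Random, max_shift_days: int = 180):
        self.rng = rng
        self.max_shift_days = max_shift_days  # assumed bound, for illustration
        self._ids = {}     # real identifier -> surrogate integer
        self._shifts = {}  # real identifier -> timedelta applied to every date

    def _register(self, real_id: str) -> None:
        if real_id in self._ids:
            return
        surrogate = self.rng.randrange(10**9)
        while surrogate in self._ids.values():  # never reuse a surrogate ID
            surrogate = self.rng.randrange(10**9)
        # Nonzero shift in either direction, chosen once per patient.
        days = self.rng.choice([-1, 1]) * self.rng.randint(1, self.max_shift_days)
        self._ids[real_id] = surrogate
        self._shifts[real_id] = timedelta(days=days)

    def surrogate_id(self, real_id: str) -> int:
        self._register(real_id)
        return self._ids[real_id]

    def shift(self, real_id: str, real_date: date) -> date:
        """Apply the patient's own shift, so intervals within a patient survive."""
        self._register(real_id)
        return real_date + self._shifts[real_id]


deid = Deidentifier(random.Random(0))
birth, first_visit = date(2015, 3, 1), date(2015, 9, 17)
shifted_birth = deid.shift("MRN-001", birth)
shifted_visit = deid.shift("MRN-001", first_visit)
```

Because both dates for the patient move by the same amount, `shifted_visit - shifted_birth` equals `first_visit - birth`, while neither shifted date matches the real one.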
Carol Vassar, podcast host/producer (15:50):
Which I know, if parents are listening right now, is a great comfort. They know that it is not identifiable directly back to their child. Dan, tell us about some of the research that you might know of that’s being conducted or maybe already published that’s involved predictive analytics.
Daniel Eckrich, Nemours Children’s Health (16:09):
Some of the work going on at Nemours: one of the biggest projects, I guess, is one where we are collaborating with the University of Delaware. Dr. Bunnell, Dr. Fan, Dr. [inaudible 00:16:19] at University of Delaware, Mehak Gupta. We’re looking at pulling in electronic health record data as well as social determinants of health. We’re trying to predict obesity outcomes and trying to provide interventions for the patients. I know that was one of the biggest projects that we had worked on. Additionally, Tim had mentioned the unexpected readmission work that we were collaborating with the VBSO group on. There is another project that I just started on involving the National COVID Cohort Collaborative, which is another large network that Nemours is participating in through the Delaware CTR.
That project is looking at trying to predict severe mental health outcomes along the lines of bipolar disorder and schizophrenia, to see if there is an increase based on a COVID diagnosis and what factors may be impacting that outcome as well. So this is a relatively new study that’s just getting ramped up right now. Those are the ones that are right off the top of my head; I could probably continue going on for quite a while on this.
Carol Vassar, podcast host/producer (17:23):
Tim, anything you want to highlight?
Dr. Timothy Bunnell, Nemours Children’s Health (17:26):
I’ll mention also some of the work that we’re doing, again related to COVID, but this time related to long COVID, which is the conditions that occur for some people after they’ve had COVID and gotten over it. They have all kinds of additional comorbidities or ailments that seem to have been triggered by having had COVID. And one of the things that my team here at Nemours is particularly working on in the project is the natural language processing components of the electronic health records. For that study, we’ve actually taken a sample of physicians’ notes from the Nemours patients and from other institutions who are collaborating with us on this project. And we first of all run those notes through a process that de-identifies the notes’ data by removing names or replacing them with alternate names so that you can’t tell who the patient is or who the physician is.
It also masks out dates and phone numbers and things like that. So these notes have been de-identified, and then we’re using natural language processing to search for functional outcomes in the notes. And this is things like children missing school or having difficulty concentrating, brain fog being one of the known outcomes of long COVID. And we’re attempting to find things in the notes that wouldn’t necessarily have made it into the discrete data elements in the electronic health records.
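A very simplified sketch of the two steps described above, masking dates and phone numbers and then searching the text for functional outcomes, might look like this. It is far cruder than a real clinical de-identification or natural language processing system and is not the tooling used in the study: the regular expressions cover only a couple of US-style formats, the term list is taken from the examples mentioned in this conversation, and the note text and function names are invented. Name replacement is left out entirely.

```python
import re

# Assumed patterns: numeric dates like 3/14/2022 and US-style phone numbers.
DATE_RE = re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b")
PHONE_RE = re.compile(r"\(?\b\d{3}\)?[-.\s]?\d{3}[-.]\d{4}\b")

# Functional outcomes to look for, as simple lowercase phrases.
OUTCOME_TERMS = ("missing school", "difficulty concentrating", "brain fog")


def mask_identifiers(note: str) -> str:
    """Replace dates and phone numbers with placeholder tokens."""
    note = PHONE_RE.sub("[PHONE]", note)
    return DATE_RE.sub("[DATE]", note)


def find_outcomes(note: str) -> list:
    """Return the outcome phrases that appear anywhere in the note."""
    lowered = note.lower()
    return [term for term in OUTCOME_TERMS if term in lowered]


# Invented note text for illustration only.
raw_note = "Seen 3/14/2022. Mom (call 302-555-0123) reports brain fog and missing school."
masked_note = mask_identifiers(raw_note)
outcomes = find_outcomes(masked_note)
```

Plain phrase matching like this misses paraphrases and negations ("denies brain fog"), which is one reason real systems rely on fuller natural language processing.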
Carol Vassar, podcast host/producer (18:52):
Tim, what’s your vision for BRIC moving forward, over the next one to two years and maybe five years and more?
Dr. Timothy Bunnell, Nemours Children’s Health (19:01):
Well, one of the things that we’re trying to do is grow our data science capabilities here. We were lucky enough to have hired one data scientist in the last year, and we are collaborating now with people in some other departments to hire additional data scientists just simply because the amount of work to do far exceeds the amount of time that Dan and I and the couple of other people who are working on this already have available to spend. So I see this as an area that’s going to expand rapidly. We’ll be talking with folks in the quality improvement areas and the value-based services organization, looking at social determinants. So the more of these different components of machine learning and predictive analytics we bring in the more people we’re going to need to address it.
Carol Vassar, podcast host/producer (19:51):
Dan, I want to ask you, how do you think BRIC fits into the Nemours vision of Well Beyond Medicine?
Daniel Eckrich, Nemours Children’s Health (19:59):
So I think if we’re looking at the mission statement behind Well Beyond Medicine, if I remember it correctly, it’s that 80% of healthcare happens outside of the medical setting. And I think that applies to a lot of the work that we’re doing, particularly with the social determinants of health, socioeconomic status, and geocoding, and, if you want to throw it in, some of the natural language processing that we were discussing for evaluating the comments and notes being made with patients. I think we are working at expanding beyond just the clinical record of a patient; we are trying to look at the whole picture of the patient, their entire environment in addition to their medical and clinical data, and to use that to improve outcomes and improve patient care.
Carol Vassar, podcast host/producer (20:45):
Tim, what do you think? How does BRIC fit into the Well Beyond Medicine vision?
Dr. Timothy Bunnell, Nemours Children’s Health (20:51):
There’s so much that goes on outside of the medical care domain that’s really crucial to the health of children and children are really crucial to the health of the nation as a whole. So we need to focus on what we can do to address some of these things that fall sort of outside the regular domain of healthcare.
Carol Vassar, podcast host/producer (21:13):
Thanks for listening to our episode on Predictive Analytics in Healthcare with me, Carol Vassar, and our guests, Dr. Timothy Bunnell and Daniel Eckrich. So what do you think? Is Big Data the way of the future in medicine? Where do you see opportunity in terms of predictive analytics? Visit nemourswellbeyond.org to submit a comment or leave us a voicemail. While you’re there, check out our other episodes and be sure to subscribe to the podcast. Thanks to Che Parker, Cheryl Munn, Rachel Salis-Silverman, Benjamin Duong, and Savannah Pettit for production assistance on this episode. Next week, join us to find out how Nemours is partnering with the community to empower teens and young adults to navigate the healthcare system. Until then, remember, together we can change children’s health for good, well beyond medicine.