Inside the Data Scientist’s Tool Box: How to Maximize Information
By Sarianne Gruber of RCM Answers.
Paul Bradley, Ph.D. is the Chief Data Scientist at ZirMed, Inc. Having completed both his undergraduate and graduate studies at the University of Wisconsin-Madison in Mathematics and Computer Science, he is skilled in data analysis technology, data mining and strategic business decisions. Prior to ZirMed, he was Chief Data Scientist for Method Care. Paul recently delivered an educational session at the HIMSS 2015 Annual Conference held in Chicago on “Using Data Analytics for Improving Productivity”. His presentation examined how Northwestern Memorial Hospital implemented a combination of technology improvements and best practices to reduce manual tasks, increase productivity and maximize revenue.
Here is my interview with him:
1. Can you highlight some of the unique aspects of working with healthcare data? What kind of data problems do you look into?
As data scientists, we’re often taking data that was collected for one specific purpose (e.g. to produce a claim that is sent to a payer) and using it for another purpose (e.g. to understand the progression of a certain condition or disease). So we really need to understand the subtle effects that can cause issues. For example, there may be operational fields in the data that were not used for a given time period and are now used. Data quality problems and missing values are therefore very common. To address them, we employ a number of techniques to identify quality issues, resolve them and deal with missing values.
Additionally, and this is common in healthcare financial data, we need to group together values in the raw data to make them more actionable for predictive modeling. For example, we might annotate that both ibuprofen and acetaminophen, although separate medications, can be considered “painkillers”.
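As a rough illustration of the two ideas in this answer, the sketch below groups raw medication names into broader categories and measures how often a field is empty in each time period. It is not ZirMed’s method: the mapping, the `referral_code` field, the period labels and the records are all made up for the example.

```python
from collections import defaultdict

# Hypothetical mapping from raw medication names to broader categories.
MEDICATION_GROUPS = {
    "ibuprofen": "painkiller",
    "acetaminophen": "painkiller",
    "amoxicillin": "antibiotic",
}


def group_medication(raw_name):
    """Map a raw medication name to its category, or 'other' if unmapped."""
    return MEDICATION_GROUPS.get(raw_name.strip().lower(), "other")


def missing_rate_by_period(records, field):
    """Return {period: fraction of records in which `field` is empty}."""
    totals = defaultdict(int)
    missing = defaultdict(int)
    for record in records:
        totals[record["period"]] += 1
        if record.get(field) in (None, ""):
            missing[record["period"]] += 1
    return {period: missing[period] / totals[period] for period in totals}


# Made-up records: the field was not populated at all in the first period.
records = [
    {"period": "2014-01", "medication": "Ibuprofen", "referral_code": None},
    {"period": "2014-01", "medication": "acetaminophen ", "referral_code": ""},
    {"period": "2014-02", "medication": "Amoxicillin", "referral_code": "A1"},
    {"period": "2014-02", "medication": "saline", "referral_code": None},
]

grouped = [group_medication(r["medication"]) for r in records]
rates = missing_rate_by_period(records, "referral_code")
print(grouped)
print(rates)
```

A period whose missing rate is close to 1.0 suggests the field was simply not in use then, which is different from values that are missing at random.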
2. Can you explain the term “Charge Integrity Analysis”? How does this benefit revenue management operations?
Sure. “Charge Integrity Analysis” refers to the process that we’ve developed to analyze a hospital’s historic charging practices, specifically extracting the patterns and trends that exist between chargeable entities, procedure and diagnosis codes, and patient information. Then, on a regular (e.g. daily) basis, we leverage those patterns and trends to identify likely situations where a given charge is missing on a claim. This benefits the hospital RCM team since they now know which charges are statistically very likely to be missing. After verifying documentation for the missing item, the charge can then be re-billed, resulting in improved revenue collections.
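One simple way to picture this kind of analysis is a co-occurrence rule: if a charge has historically appeared on nearly every claim with a given procedure, a new claim with that procedure but without the charge is worth a look. The sketch below is an assumption-laden toy, not the process described in the interview. The codes (`PROC_A`, `CHG_IMPLANT`), the thresholds and the claim layout are hypothetical.

```python
from collections import Counter, defaultdict


def learn_patterns(historic_claims, min_support=3, min_confidence=0.9):
    """Find charges that almost always accompany a procedure.

    Returns {procedure: set of charges seen on at least `min_confidence`
    of the historic claims carrying that procedure}. Procedures seen on
    fewer than `min_support` claims are ignored.
    """
    procedure_counts = Counter()
    pair_counts = defaultdict(Counter)
    for claim in historic_claims:
        for procedure in set(claim["procedures"]):
            procedure_counts[procedure] += 1
            for charge in set(claim["charges"]):
                pair_counts[procedure][charge] += 1

    patterns = {}
    for procedure, n in procedure_counts.items():
        if n < min_support:
            continue
        expected = {
            charge
            for charge, k in pair_counts[procedure].items()
            if k / n >= min_confidence
        }
        if expected:
            patterns[procedure] = expected
    return patterns


def likely_missing(claim, patterns):
    """Return charges the patterns expect but the claim does not carry."""
    missing = set()
    for procedure in claim["procedures"]:
        missing |= patterns.get(procedure, set()) - set(claim["charges"])
    return missing


# Made-up history: CHG_IMPLANT appears on every PROC_A claim.
historic = [
    {"procedures": ["PROC_A"], "charges": ["CHG_IMPLANT", "CHG_ROOM"]},
    {"procedures": ["PROC_A"], "charges": ["CHG_IMPLANT"]},
    {"procedures": ["PROC_A"], "charges": ["CHG_IMPLANT", "CHG_ROOM"]},
    {"procedures": ["PROC_A"], "charges": ["CHG_IMPLANT"]},
    {"procedures": ["PROC_B"], "charges": ["CHG_ROOM"]},
]
patterns = learn_patterns(historic)
new_claim = {"procedures": ["PROC_A"], "charges": ["CHG_ROOM"]}
flagged = likely_missing(new_claim, patterns)
print(flagged)
```

A flagged charge is only a candidate. As the answer notes, documentation still has to be verified before anything is re-billed.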
3. Can you explain how an Agile software methodology works for the novice? Can you share a client experience that best exemplifies this process?
When building a piece of software, typically the requirements for the software evolve over time. This is understandable: as stakeholders start to get their hands on early versions of a software package, they start using it, understanding what can be done and refining their notions of how it would best benefit users. The Agile process was developed so that stakeholders get their hands on early versions of a piece of software and can refine their requirements while minimizing the resulting disruption to the engineers who are building or coding the software. So it’s a very iterative process of getting a system to the point where a stakeholder can start to work with it, refining the requirements of what it does, and repeating until the system is signed off by stakeholders and ready to go.
Historically, this process produces functioning software systems that meet stakeholder requirements faster than a number of other development methodologies.
We’ve employed the same practice when working with a client to fully leverage predictive modeling and analytics across their organization. We’ve used the Agile process to have frequent interaction and feedback from stakeholders on data quality, analytic and reporting needs and predictive modeling integration. This has resulted in delivering systems to clients where the functionality of the system matches client expectation.
4. Can you discuss the importance of data governance and validation in the pre-modeling process? How is bridging “disparate” data for an organization a critical stage for predictive modeling?
A common saying in data science is “garbage-in-equals-garbage-out”. Data governance and validation are the primary ways to avoid this situation. And, to optimize the information content in a dataset for predictive modeling, we need to pull together all data elements that may be related to a given outcome or event of interest (e.g. identifying the root cause for a denial on a claim). It is a fact of the current state of data systems in hospital settings that the information possibly related to a patient’s visit resides in different systems, since those systems were originally constructed for different purposes. To maximize the information content of the data, we need to pull data at the patient or patient-visit level together across different systems to get the full view of what was done to the patient, when and where. To ensure that the data accurately captures the who, what, when, and where aspects, we need to be sure that data elements coming from different systems logically make sense and reconcile when they are brought together.
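To make the idea of pulling data together at the patient-visit level concrete, here is a minimal sketch that merges records from two hypothetical systems by visit ID and then checks that the combined dates make sense. The field names (`admit_date`, `discharge_date`, `charge_date`), the visit IDs and the consistency rule are assumptions chosen for illustration. Real validation rules would depend on the hospital’s systems.

```python
from datetime import date


def merge_visits(billing, clinical):
    """Combine per-visit records from two systems, keyed by visit ID."""
    merged = {}
    for visit_id in billing.keys() | clinical.keys():
        merged[visit_id] = {
            **clinical.get(visit_id, {}),
            **billing.get(visit_id, {}),
        }
    return merged


def validate_visit(visit):
    """Return a list of problems found in one merged visit record."""
    problems = [
        f"missing {field}"
        for field in ("admit_date", "discharge_date", "charge_date")
        if field not in visit
    ]
    if problems:
        return problems
    if visit["discharge_date"] < visit["admit_date"]:
        problems.append("discharge before admit")
    # Illustrative rule only: expect the charge to fall within the stay.
    if not visit["admit_date"] <= visit["charge_date"] <= visit["discharge_date"]:
        problems.append("charge_date outside stay")
    return problems


# Made-up records from a hypothetical billing and clinical system.
billing = {
    "V1": {"charge_date": date(2015, 3, 2)},
    "V2": {"charge_date": date(2015, 2, 27)},
    "V3": {"charge_date": date(2015, 3, 5)},
}
clinical = {
    "V1": {"admit_date": date(2015, 3, 1), "discharge_date": date(2015, 3, 4)},
    "V2": {"admit_date": date(2015, 3, 1), "discharge_date": date(2015, 3, 3)},
}

merged = merge_visits(billing, clinical)
report = {visit_id: validate_visit(merged[visit_id]) for visit_id in sorted(merged)}
print(report)
```

Visits that appear in only one system, or whose dates contradict each other, are exactly the records to resolve before any modeling starts.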
5. How do you explain the “black box” of predictive modeling to non-analytic clients? What analogies and examples do you use?
While predictive modeling can be viewed as a black box where data is fed in and, somehow, predictions come out at the end, I don’t have this view. It is true that data is used to extract patterns and trends, and that these patterns and trends are used to make predictions. While these patterns and trends can be complex, we like to provide examples (based on a given client’s data) that can be expressed as rules, e.g. if the patient has insurance company X, has lab charges above a certain threshold and has a given ICD-9 diagnosis code, this results in a high denial rate. We can then go right to the source data to easily verify with the client all of the cases that satisfy the rule and show them their denial rate over these cases.
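A rule like the one in this answer can be checked directly against source data. The sketch below, with made-up claims and placeholder values (payer `X`, a threshold of 1000, and `DX1` standing in for a diagnosis code), selects the cases that satisfy the rule and computes their denial rate.

```python
def denial_rate_for_rule(claims, payer, lab_threshold, diagnosis_code):
    """Return (denial rate, matching claims) for a simple three-part rule.

    A claim matches when it has the given payer, lab charges above the
    threshold, and the given diagnosis code. The rate is None when no
    claim matches.
    """
    matching = [
        claim
        for claim in claims
        if claim["payer"] == payer
        and claim["lab_charges"] > lab_threshold
        and diagnosis_code in claim["diagnosis_codes"]
    ]
    if not matching:
        return None, matching
    denied = sum(1 for claim in matching if claim["denied"])
    return denied / len(matching), matching


# Made-up claims; "DX1" and "DX2" are placeholders, not real codes.
claims = [
    {"payer": "X", "lab_charges": 1200.0, "diagnosis_codes": {"DX1"}, "denied": True},
    {"payer": "X", "lab_charges": 1500.0, "diagnosis_codes": {"DX1", "DX2"}, "denied": True},
    {"payer": "X", "lab_charges": 1100.0, "diagnosis_codes": {"DX1"}, "denied": False},
    {"payer": "X", "lab_charges": 300.0, "diagnosis_codes": {"DX1"}, "denied": False},
    {"payer": "Y", "lab_charges": 2000.0, "diagnosis_codes": {"DX1"}, "denied": False},
]

rule_rate, rule_cases = denial_rate_for_rule(claims, "X", 1000.0, "DX1")
overall_rate = sum(1 for claim in claims if claim["denied"]) / len(claims)
print(len(rule_cases), rule_rate, overall_rate)
```

Returning the matching cases alongside the rate is what lets a client inspect every claim behind the number.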
6. How important is it in creating an analytic culture for your clients to drive substantial and sustainable value? Can you share a success story?
Our clients who have created an analytic culture tend to see very good ROI on their analytic and work driver investments. When a client focuses on data integrity, plans up front, and regularly discusses analytics and reporting needs, they typically see an ROI of 4 to 1 or more.
7. From your experience, how has healthcare reform impacted revenue cycle management? Does increased patient responsibility for healthcare payments help or hinder? Can analytics help this new process? Can you explain why a “self-pay” prediction model can help?
As is common in the industry, more and more of the financial responsibility for healthcare is falling on the shoulders of the patient. This is a great example of an area ripe for a predictive model to help out. In the case of self-pay, knowing the likelihood that a patient will pay the portion of their healthcare bill for which they are responsible allows providers to be smarter about how they handle these patients. If you know that a group of patients is very likely to pay (based on current data and past payment history), then it is better to wait a while longer for their payment than to turn them over to collections and give a slice of it to a collection agency.
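The trade-off described here can be sketched as a simple expected-value comparison. This is a deliberately simplified illustration, not a self-pay model: the payment probability is assumed to come from some predictive model, and the agency recovery rate and fee are hypothetical inputs. It also ignores the time value of money and the option of sending an account to collections later.

```python
def recommend_action(balance, pay_probability, agency_recovery_rate, agency_fee):
    """Compare expected revenue from waiting with using a collection agency.

    balance              -- amount the patient owes
    pay_probability      -- predicted chance the patient pays if we wait
    agency_recovery_rate -- assumed chance the agency recovers the balance
    agency_fee           -- fraction of recovered money kept by the agency
    """
    expected_if_wait = pay_probability * balance
    expected_if_agency = agency_recovery_rate * balance * (1 - agency_fee)
    return "wait" if expected_if_wait >= expected_if_agency else "collections"


# Made-up accounts with the same balance but different predicted behavior.
likely_payer = recommend_action(1000.0, 0.9, 0.6, 0.3)
unlikely_payer = recommend_action(1000.0, 0.2, 0.6, 0.3)
print(likely_payer, unlikely_payer)
```

With these made-up numbers, the patient who is likely to pay is worth waiting for, while the account with a low predicted payment probability is routed to collections.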
8. How do you position analytics from being a cost center to a value center? Can you provide a client success story?
By coupling analytics and predictive modeling with the best actions to take to increase reimbursement, analytic and modeling investments allow the provider to recoup the reimbursement for all of the services they’re providing to a patient. Clients who are seeing ROI of 4 or more to 1 know that their analytic investment is not a “cost center”.
9. Through the increasing use of big data, how do you see the healthcare industry evolving over the next few years?
Absolutely. Healthcare is definitely a “big data” field — personal devices are producing health data every few seconds, EHRs are storing the details of historic interactions that providers are having with patients, etc. Bring in images, etc. and that’s a lot of data. The challenge is to filter through these huge data stores to identify care patterns and treatments that optimize outcomes while controlling costs.
The article was originally posted here.