According to a recent report from Gartner, “U.S. Healthcare Payer CIOs Should Avoid Data Lake Mistakes with Clinical Data Integration,” by Mandi Bishop (October 2019), “Payer CIOs want to know whether data lakes can deliver quick wins for business leaders hungry to derive actionable insights from unstructured and nonstandard clinical data. To avoid drowning in data, CIOs must first specify goals and critically compare internal capabilities to vendor solutions.”

We agree with this assessment and believe the same holds true for healthcare organizations. Below, we outline why the data lake method can no longer hold up in today’s healthcare environment and offer, as an alternative, a four-step, interconnected framework that can help you make the most of your data.

1. Start with the End in Mind
The first challenge many organizations face when considering a data lake is realizing that they lack confidence in the quality and completeness of all possible data for all possible use cases or scenarios. It’s important to decide at the outset which use cases to focus on, based on what matters most to your stakeholders (whether that means the C-suite, care providers, or patients). Determine what data you are likely to need and for what purpose:

  • As a payer, are you focused on use cases that support population health or HEDIS measures?
  • As an ACO, are you focused on information that will help you in risk mitigation and Medicare Shared Savings Programs?
  • As a hospital, are you focused on value-based care programs, such as improving Star Ratings and reducing preventable hospital readmissions?
  • As a healthcare system, are you focused on MIPS/MACRA and use cases to improve quality of care?

The best way to set up your data thoughtfully and strategically is to think about your end goals when you first receive the data and work backwards from there.
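One way to make "working backwards" concrete is to write down each use case together with the data fields it depends on, then measure how complete those fields are as soon as the data arrives. The following is only a minimal sketch under stated assumptions: the `UseCase` class, the field names, the sample records, and the 95% threshold are all hypothetical and are not taken from any particular product or standard.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class UseCase:
    """A business goal and the record fields it depends on (names are hypothetical)."""
    name: str
    required_fields: tuple[str, ...]


def completeness(records: list[dict], use_case: UseCase) -> dict[str, float]:
    """Return, per required field, the fraction of records with a non-empty value."""
    total = len(records)
    report = {}
    for field in use_case.required_fields:
        filled = sum(1 for r in records if r.get(field) not in (None, ""))
        report[field] = filled / total if total else 0.0
    return report


def is_fit_for_use(records: list[dict], use_case: UseCase, threshold: float = 0.95) -> bool:
    """True only if every required field meets the (assumed) completeness threshold."""
    return all(share >= threshold for share in completeness(records, use_case).values())


# Hypothetical use case and sample records, for illustration only.
readmissions = UseCase(
    name="preventable readmissions",
    required_fields=("patient_id", "discharge_date", "primary_diagnosis"),
)

incoming = [
    {"patient_id": "A1", "discharge_date": "2019-10-01", "primary_diagnosis": "I50.9"},
    {"patient_id": "A2", "discharge_date": "", "primary_diagnosis": "J18.9"},
    {"patient_id": "A3", "discharge_date": "2019-10-03", "primary_diagnosis": None},
    {"patient_id": "A4", "discharge_date": "2019-10-04", "primary_diagnosis": "E11.9"},
]

report = completeness(incoming, readmissions)
for field, share in report.items():
    print(f"{field}: {share:.0%} complete")
print("fit for use:", is_fit_for_use(incoming, readmissions))
```

Running a check like this when data is first received surfaces gaps against a specific goal early, rather than leaving them to be discovered during later analysis.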

Data that has been “loaded” or “integrated” into a data lake provides the illusion of an asset that you can use quickly with a high degree of confidence. Many organizations start with a data lake and assume that “someone else” (a data scientist, perhaps) will be the one sifting through the information later to find what they need for any given use case. This type of postprocessing, or late-binding data science, creates a never-ending cycle of data quality remediation that is both costly and potentially insurmountable given organizational resource constraints.

Read the full article on Healthcare IT Today.