Developing a Common Dataset: A Necessary Challenge for Associations

At UDig, we have the pleasure of working closely with an association in the healthcare space. Like many associations, they have a small staff that works hard for a cause they passionately believe in. Perhaps none work harder than their Data QA and Analytics team, who tirelessly and repetitively fix data elements coming from their member programs to form a cohesive view of their world. Day in and day out, the data team works diligently to compare “apples to apples,” despite being handed a gigantic “bowl of fruit.”

They are a visionary organization, however, and set about trying to change their paradigm. Knowing that they needed to reduce the effort of comparing data elements while enriching the dataset to allow for more robust measurements, the team began by working closely with their members, Electronic Health Records vendors, state oversight representatives, and other stakeholders to develop a common dataset: a single language they could all speak. That language would not only have functional impacts (i.e., less time spent crunching data) but also provide crucial insight into their offerings and translate that knowledge into improved outcomes for patients and providers alike.

Anyone who has ever developed a common dataset (or even a data dictionary for a single entity, for that matter) is probably thinking, “That’s way easier said than done!” You would be correct. Our client’s journey began five years ago, and the road ahead remains rocky, covered in fog, and subject to any other traveling metaphor that conveys the difficulty of shifting to a common dataset. Numerous hurdles face the development of the dataset, from “data fatigue” (as one member described the burden of submitting data to numerous regulatory and other bodies) to shifting definitions of data elements set by governing bodies outside of our client’s control.

Still, if a Common Dataset isn’t attempted now, then when? To quote their CEO, “Let’s not let the great be the enemy of the good.” In a perfect world, everyone who touches this dataset would be speaking the same language, and the heroic effort of data analysts could be spent on forward-thinking work: identifying trends and looking for operational efficiencies that can impact the bottom line. In reality, the initial rollout will likely be met with resistance at many different levels until the value is realized (or at least the vision is shared).

How, then, does an organization even set about developing a common dataset? 

  • Begin with consensus building of the need for the dataset. Identify and quantify the value such an effort would produce.
  • Next, identify the set of crucial data required to make the effort viable. Think carefully about what data is “nice to have” versus an absolute requirement, particularly if excess data will increase the complexity of data cleanup and integration.
  • Now that the set of data has been identified, work to clearly articulate how that data should look. This should take the form of both a business glossary (i.e., plain-English definitions of the data*) and a functional data dictionary (i.e., a set of technical requirements clearly indicating acceptable parameters, values, etc.).
  • Finally, socialize the dataset and iterate on what was developed.

*Don’t bite off more than you can chew here! In my experience, this is the most difficult phase of any data project: I’ve seen people nearly come to blows over the definition of a particular data attribute. Compromise is your friend here, and a neutral facilitator can greatly improve the experience for all involved!

Now comes the fun part: the technical implementation of the Common Dataset capture and analytics mechanisms. This could be, of course, the subject of numerous blogs, white papers, late-night phone calls and intense hand-wringing sessions. It’s worth noting that the stronger the previous steps are, the simpler this phase becomes. Well-defined data is much easier to work with. Working with vendors to automate the data acquisition process will save countless hours otherwise spent “hand jamming” the data, while the data definitions mentioned above will inform a robust quality process that keeps junk data out of your analytics.
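To make that last point a little more concrete, here is a minimal sketch of how a functional data dictionary might drive an automated quality check. It is an illustration only, not our client’s actual implementation: the field names (program_id, service_date, patient_age, discharge_status), the allowed values and the helper names are all hypothetical, chosen just to show the shape of the idea.

```python
from dataclasses import dataclass
from datetime import date
from typing import Any, Callable, Optional


@dataclass(frozen=True)
class FieldSpec:
    """One data dictionary entry: a business definition plus a technical rule."""
    name: str
    definition: str                                # business glossary text
    required: bool = True
    check: Optional[Callable[[Any], bool]] = None  # technical rule, if any


def is_iso_date(value: Any) -> bool:
    """True when the value is a string that parses as an ISO calendar date."""
    try:
        date.fromisoformat(value)
        return True
    except (TypeError, ValueError):
        return False


# Hypothetical dictionary: every field and rule below is an assumption.
DATA_DICTIONARY = [
    FieldSpec("program_id", "Identifier of the submitting member program.",
              check=lambda v: isinstance(v, str) and v.strip() != ""),
    FieldSpec("service_date", "Date the service was delivered (YYYY-MM-DD).",
              check=is_iso_date),
    FieldSpec("patient_age", "Patient age in whole years at time of service.",
              check=lambda v: type(v) is int and 0 <= v <= 120),
    FieldSpec("discharge_status", "Outcome recorded at discharge.",
              required=False,
              check=lambda v: v in ("home", "transfer", "deceased")),
]


def validate_record(record: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means clean."""
    errors = []
    for spec in DATA_DICTIONARY:
        value = record.get(spec.name)
        if value is None or value == "":
            if spec.required:
                errors.append(f"{spec.name}: missing required value")
            continue
        if spec.check is not None and not spec.check(value):
            errors.append(f"{spec.name}: value {value!r} breaks the defined rule")
    # Flag fields the dictionary does not define, rather than silently loading them.
    known = {spec.name for spec in DATA_DICTIONARY}
    for key in record:
        if key not in known:
            errors.append(f"{key}: not defined in the data dictionary")
    return errors


good = {"program_id": "P-001", "service_date": "2024-03-15", "patient_age": 42}
bad = {"program_id": "", "service_date": "03/15/2024", "patient_age": 420, "notes": "n/a"}

print(validate_record(good))
for problem in validate_record(bad):
    print(problem)
```

The design choice worth noticing is that the rules live in the dictionary, not in the loading code. When a definition changes (and it will), the team updates one entry instead of hunting through scripts, and the same entries can be shared with vendors as the specification for automated submissions.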

We’ve now spoken with countless associations that are in much the same place. They know they need to compare apples to apples, but their members are all enjoying dramatically different kinds of fruit. Finding common ground will not only benefit all of their members but also provide faster, more meaningful results. Your data isn’t getting any smaller; more and more of it is generated every single day. Standardizing a set of data now can reduce manpower, improve data quality, provide insights and optimize processes that affect the bottom line.

 

 
