On Managing Health Data Programs: Some Thoughts After the Health Datapalooza Conference

June 15, 2016 Dennis D. McDonald


INTRODUCTION

On behalf of Edaptive Systems of Owings Mills, Maryland, I attended the recent Health Datapalooza conference in Washington DC. This is how the conference organizers described it:

Health Datapalooza is a national conference focused on liberating health data, and bringing together the companies, startups, academics, government agencies, and individuals with the newest and most innovative and effective uses of health data to improve patient outcomes.

I had been consulting on data governance for one of Edaptive’s Centers for Medicare and Medicaid Services (CMS) contracts and, as an extension of my own research and consulting on big data project planning and management, I wanted to improve my understanding of how data governance and program management practices impact how medical and health data are used.

In this paper I discuss some of my takeaways from the Health Datapalooza conference, including:

Cultural and institutional barriers exist to making health data more accessible and useful.
Making health data “open and accessible” is not an end in itself.
Effective personal control over health data will be challenging for a long time.
Anti-government sentiments miss the point.
Combining structured and unstructured data offers both challenges and opportunities.

1. Overcoming cultural and institutional barriers

The conference was an effort to bring people from different sectors together to discuss and share information about data-based innovation and its challenges.

An important thing about digital data is the ease with which it can be transferred across borders of all kinds – institutional, geographic, organizational, and physical. Sometimes these borders are pretty porous. At other times, “cultural” resistance to transferring and using data in new or different ways can be quite strong.

This was pointed out by several speakers throughout the conference who lamented that “culture” sometimes acts as an ingrained “business as usual” rationale for resisting change.

This reference to “culture” as a barrier to change is nothing new. I’ve heard “culture” referred to like this before:

Organizations resisting use of social media for engaging with the public.
Managers resisting use of collaborative technologies to share knowledge and expertise across departmental boundaries.
Executives resisting making institutional data more open and accessible to the public.

“It’s the culture,” some technology promoters will say, throwing up their hands in despair when they hit an institutional roadblock.

Hearing culture raised at this health data conference as a reason why things take so long came as no surprise. Folks who are especially tech-oriented, for example, may not always appreciate why institutional resistance occurs; this is why tech adoption debates sometimes get framed as “us versus them” with the “them” portrayed as technological Luddites.

Nevertheless, being innovative with data, especially data within such institutionally segregated sectors as healthcare and medicine, can call into question the reasons for sustaining traditional institutional distinctions. For example, if making medical registry data more standardized and accessible to other institutions makes the data more useful, isn’t that worth promoting, even when it ruffles the feathers of institutions that prefer controlled silos to more open access?

At minimum, when we’re looking at making data more accessible and useful, especially when the data are currently “owned” or managed by different organizations, we may need to consider different project and program management techniques than might have been used in the past. These techniques must take into account the organizational and process change issues that will be raised by bringing data together from different sources to support improved analytical or product development applications.

Sometimes we’ll have to work with existing systems and institutions, sometimes we’ll have to be disruptive, and sometimes we’ll have to do both. I’ve concluded that a “one size fits all” approach to managing innovative health data programs and projects is neither possible nor desirable.

2. Making health data “open” is not enough

Having some familiarity with the “open data” industry, I’m wary of projects and programs that focus too narrowly on technology or that don’t pay enough attention to making sure that how data are made open and accessible actually supports worthy goals and objectives.

I’m not talking about a politically-defined term like “transparency” but rather a very basic requirement of any well-designed information system: benefits flow from how a system is used. If the system is designed to support a program or policy, the system – and its data – should contribute to the accomplishment of that program or policy.

In other words, I’m not a supporter of the “let’s toss this data over the fence and see how it gets used” philosophy. If hard-earned tax dollars have been spent on generating data in connection with a particular program or project, we also need to understand how spending money on making that data accessible actually supports that program or project.

I’m not saying that ancillary or unanticipated uses of data won’t have any benefits; far from it. Serendipity should always be encouraged. I’m just saying, first things first. Data generated by a government program is not just a “byproduct” with unknown potential uses; it should always be analyzed in light of why it was generated in the first place.

Theoretically, data related to healthcare programs should have an advantage in terms of being tied back to the goals of the program that generated the data. For example, if the data are clinical or treatment related, we should ask how analyzing it in creative or innovative ways might improve treatment outcomes. Also, if the data are associated with public, private, or nonprofit insurance systems, we should hope that imaginative use of program data will actually help to make health care delivery more efficient.

3. Effective personal control over health data

Another conference theme was the trend toward making personal health data more accessible to individual patients. People are being encouraged to demand their medical records from their providers, systems are being developed to simplify the packaging and downloading of medical records, and data governance processes are under development to make it easier to share information among, say, the different institutions involved in an individual’s medical care.

I’ve long been a believer that an individual owns his or her own data. I addressed this topic in the very first blog post I published back in 2005. Problem is – and this was noted by several speakers at the conference – medical records are not easy to analyze or understand. Sharing them digitally across institutional boundaries may require the ability to decipher the unique language that might be used by one institution for a particular condition or treatment.

Also, making medical records available for download by an individual patient doesn’t necessarily mean that the individual will actually demand his or her records, or that the patient will know what to do with the data once it’s delivered.

This “understandability deficit” when it comes to personal health records is a real issue and one that needs to be addressed. It is related to what we have found with open data projects where the key deliverable is a portal supporting file downloading. Some people know what to do with raw data, especially those involved in research of some sort. Members of the “general public” may not. As a result, making data useful may also require development of a layer of analytical support services to help with the personalization and interpretation of data.

As discussed at the conference, several entrepreneurs are developing web and mobile device applications to help personalize individual and group based medical information. The “use cases” for such applications can be a personal medical situation, such as the need to compare and select from a variety of Medicare insurance supplements, a desire to monitor and control one’s own health conditions in relationship to a data-defined set of cohorts, or a need to securely exchange information with other disease victims. This concept of personalizing medical information by using a pool of “big data” to provide a context against which to compare one’s own health situation or treatment was a common theme.

4. Anti-government sentiment misses the point

It’s always popular to disparage “big government” and all its bureaucracy. When you work on government contracts as I have been doing and attend conferences like Health Datapalooza, though, you tend to get a different perspective.

This is especially true when discussing health related data and how it supports everything from diagnostics and treatment to insurance management, biomedical research, and pharmaceutical research. Federal and state government agencies are involved in much of the nation’s health care and the volume and variety of data – “big data” in many cases – are vast. The data generated in connection with the myriad government managed health programs constitute a massive resource not only for supporting program and performance management but also for research.

Another thing to remember is that, when we’re talking about medicine and health care, everyone has a personal health or medical story to tell about themselves or their families. This fact was repeated throughout the Health Datapalooza conference. One impact of this is that the core ideas for innovative new health data applications frequently come from a personal interest, perhaps related to an individual illness, to a complicated interaction with a caregiver or insurance company, or to a realization that money is being wasted because of the way things are currently being done.

Another important question is whether “the government” is doing enough to responsibly and securely make its (or rather “our”) data available for new and innovative data analytics and health related innovations. This doesn’t have to mean that government agencies should take responsibility for developing all new health data-based products. It does mean that changes in how health data are managed should take into account how potential users can find out about, analyze, and obtain secure and reliable access to important health related data.

From a program design standpoint this means government health programs that generate useful data need to incorporate systems and processes not only to make sure program data are used internally in an intelligent and secure fashion to support planning and management but also to make sure appropriately anonymized data are discoverable by and accessible to innovators, developers, analysts, researchers, and the public.

There are a variety of ways to ensure this is done. Making public data “open by default” is a good start. It is also something that needs to be considered as health related systems are modernized and upgraded, starting with the realization that current departmentalized and siloed data intensive operations need to be managed in a more collaborative and coordinated way.

5. Combining structured and unstructured data

It’s in this area, managing unstructured health related data, that many challenges were identified. Despite an encouraging “cognitive health” video about IBM’s Watson, the challenges of indexing and updating medical registries in the real (i.e., messy) world were discussed throughout the conference.

Syntax and vocabulary vary from institution to institution and specialty to specialty. Applying even traditional models of structured vocabularies and coding schemes in the generation of metadata can be manually intensive and expensive. Potentially significant – and potentially deadly – implications can arise when key data are incorrectly recorded or where terminology is not standardized.
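To make the vocabulary problem concrete, here is a minimal sketch in Python of what mapping institution-specific terms onto a shared vocabulary might look like. Everything in it is an assumption made for illustration: the `SHARED_VOCABULARY` table, the `normalize_terms` function, and the sample terms are hypothetical and are not drawn from any real coding scheme or from anything presented at the conference. The point it tries to show is that terms without a known mapping should be set aside for human review rather than guessed at.

```python
from dataclasses import dataclass, field

# Hypothetical lookup table: local wording -> shared concept label.
# Illustrative only; a real program would rely on a maintained,
# governed vocabulary rather than a hand-written dictionary.
SHARED_VOCABULARY = {
    "heart attack": "myocardial infarction",
    "myocardial infarction": "myocardial infarction",
    "high blood pressure": "hypertension",
    "htn": "hypertension",
}


@dataclass
class NormalizationResult:
    # Original term -> shared concept label, for terms that were recognized.
    mapped: dict = field(default_factory=dict)
    # Original terms with no known mapping, kept for manual review.
    needs_review: list = field(default_factory=list)


def normalize_terms(local_terms):
    """Map each local term to a shared concept, or flag it for review."""
    result = NormalizationResult()
    for term in local_terms:
        # Lowercase and collapse whitespace so trivial variants still match.
        key = " ".join(term.lower().split())
        concept = SHARED_VOCABULARY.get(key)
        if concept is None:
            result.needs_review.append(term)
        else:
            result.mapped[term] = concept
    return result
```

Calling `normalize_terms(["Heart  Attack", "HTN", "chest cold"])` would map the first two terms and leave "chest cold" in `needs_review`. That review queue is where the manual effort and expense described above tend to accumulate.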

Given the challenges associated with managing unstructured data, combining unstructured and structured health related data may be as much of a management challenge as a technical challenge.

If we hypothesize creation of a clearinghouse or analytical operation charged with supporting the analysis of both structured and unstructured data in support of a particular application area, we might want to start by defining the scope of our application area and the problem we are trying to solve. From that we determine the different institutions that have a stake in this particular problem area and then the different communities or schools of thought that exist.

Each of these communities may be attacking a different portion of the problem. At their borders, they may experience translation and communication issues that occasionally stymie collaboration. Yet, it may be possible to identify key decision models or rules that can be captured in a consistent way that could then be documented publicly and used as the basis for building organized collections of structured and unstructured data and the processes necessary for tracking, replenishing, and analyzing the data.

How would such a data-focused organization “fit” into the existing scheme of government, private sector, academic, nonprofit, and professional organizations with a “stake” in the problem area? How would it be funded? Who would “own” and manage the analytical processes it supports?

These are all good questions that extend beyond technology and analytical methodologies.

ORGANIZATIONAL STRUCTURES AND GOVERNANCE

One organizational model to follow might be the National Cancer Institute’s Genomic Data Commons, as I suggested in What Kind of Management Structure Is Needed to Govern a Data Analytics Program?

The management of data intensive programs, especially those where the intent is to support collaboration and innovation, may itself require innovative and potentially disruptive approaches to governance and administration. Still, the design related questions that need to be addressed are familiar:

There are a variety of models the organization can follow, with an initial design being driven by which users and use cases will be targeted, how much support (and of what kind) will need to be provided, how the data to be analyzed will be gathered and managed, and the extent to which the organization’s processes and deliverables will be open or shared.