A Framework for Defining the Scope of Data Governance Strategy Projects (Part 2)
Thou shalt not "boil the ocean."
This is Part 2 of this series. Part 1 is here.
Introduction
The simple model presented in A Framework for Defining the Scope of Data Governance Strategy Projects (Part 1) is intended as a general framework for thinking about data governance when an organization wants to improve how data are used to inform and solve problems. To do this, the organization may need a practical data governance strategy to guide and support its actions.
One approach to developing a data governance strategy is traditional and straightforward:
1. Decide specifically what problem (“application area”) you want to solve.
2. Identify and understand the data, systems, and processes that need to work together to provide the needed data analytics to solve this problem.
3. Create and implement a project plan based on a rational consideration of possible alternative approaches to solving the problem.
4. Use what is learned from (3) to adapt and expand the strategy to other problem areas.
The circles in the simple model described in A Framework for Defining the Scope of Data Governance Strategy Projects (Part 1) represent the types of information (including metadata, or data about data) that will be needed to effectively plan a data governance strategy. Categories include information about relevant raw Data, Systems, Processes, Users, Costs, and Benefits.
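One way to keep these six categories in view while planning is to record what is known under each and check which ones are still empty. The following is only a minimal sketch: the category names come from the model, but the class name, its methods, and the example entries are hypothetical and not part of the original framework.

```python
from dataclasses import dataclass, field

# The six category names come from the model; everything else here
# (class name, methods, example entries) is a hypothetical illustration.
CATEGORIES = ("data", "systems", "processes", "users", "costs", "benefits")


@dataclass
class ScopeInventory:
    """Planning notes for one problem area, grouped by category."""
    problem: str
    entries: dict = field(default_factory=lambda: {c: [] for c in CATEGORIES})

    def add(self, category, note):
        """Record a note under one of the six categories."""
        if category not in CATEGORIES:
            raise ValueError(f"unknown category: {category}")
        self.entries[category].append(note)

    def gaps(self):
        """Return the categories for which nothing has been recorded yet."""
        return [c for c in CATEGORIES if not self.entries[c]]


inventory = ScopeInventory(problem="Late customer invoices")
inventory.add("data", "billing records")
inventory.add("systems", "accounting package")
print(inventory.gaps())  # prints ['processes', 'users', 'costs', 'benefits']
```

A list of gaps like this is simply a reminder of what still needs to be asked about the chosen problem, not a measure of progress.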
Concrete steps
We want our data governance strategy to define a set of concrete steps that will take us from a “current state” to a desired “future state” that helps address the defined problem or application area. Of course, we’ll need to define and understand both the current and the future state. Sources for this understanding are usually people in the organization as well as system and process documentation including data dictionaries, network diagrams, and data models.
A "traditional" approach
Management will be familiar with this traditional outline of a strategy or planning exercise. That familiarity can be used to our advantage in gaining cooperation -- as long as we don’t attempt too much initially and focus instead on a well-defined and recognized problem. The perception of “paralysis by analysis” must be avoided, hence the need to focus on definable problems and rapid delivery.
Relevant planning information
In our “data governance strategy project” we will want to address the following questions and topics:
1. Describe the problem or application area we’ll be addressing. Be as specific as possible about why this is a problem worth attacking. Also describe how we will measure how well we're doing in solving the problem (“metrics”). Recommendation: this problem area or “pain point” should be an important one where uncertainty exists that, it is felt, might be reduced were better data available to describe or solve the problem.
2. Describe the data that will be needed to quantify and solve the selected problem. Consider data of all types, not just well-controlled data that are currently collected and organized in existing systems and databases. Be sure to think broadly about data in terms of human-human, human-machine, and machine-machine communication. Try at this stage to talk about classes and groups of data, not individual data fields; there will be time enough for that later should the value of an organized and structured metadata repository be recognized.
3. Given the data identified in (2), what systems are currently used to gather, organize, manage, interact with, and analyze the data? Consider both hardware and software, including remote or cloud-based systems. Remember to focus on data relevant to the problem being addressed, not the organization's entire data population; we don’t want to run aground by trying to “boil the ocean.”
4. What are the business processes that touch on the data we've been discussing? We need to understand both the activities associated with technically maintaining the data (for example, as worked on by the IT department) and the work done by operating departments such as Finance, Marketing, HR, Manufacturing, Customer Service, R&D, and/or Sales in creating or using data.
5. Let's also talk about the users of the data, the people who actually interact with the data, manipulate it, manage it, analyze it, and make decisions using it. Who are they? Where are they located? What problems or issues do they have with how the organization’s data resources are currently organized, managed, or communicated? What do THEY think should be done to improve data governance?
6. What are -- or will be -- the costs associated with managing and using the data, systems, and processes we've been discussing? We need to consider a variety of costs since we might be considering a variety of approaches to changing how we manage and use the data under consideration. These can include:
   - current state versus future state
   - personnel, hardware, software
   - data cleanup or conversion
   - business process changes (these are often underestimated)
   - ongoing maintenance and support
   - training
7. When all is said and done, we're improving our data governance processes because we want better outcomes. We have to think about benefits and ROI (Return On Investment). It's going to cost something to change what we're doing now (see number 6 above). We want to make sure the improved outcomes (see number 1 above) at some point exceed the cost, even if we are “starting small” in order to produce something useful quickly. This may sound obvious -- benefits should exceed costs -- but it isn't always that simple in the real world, especially when data need to be changed in some way. In the real world, we have to deal with realities like front-end data cleanup, inflexible software licensing, a lack of staff knowledgeable about “data science,” and resistance by some staff to a more “digital” or data-driven organization. This is why it’s important to be very focused at the start in order to control the scope of data, analytics, and data-governance-related efforts.
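The comparison of costs and benefits described above can be made concrete with a simple break-even calculation. This is a rough sketch only: the function name and all of the figures are invented for illustration, and it assumes that costs and benefits can be expressed as a one-time amount plus steady monthly amounts, which real projects rarely match exactly.

```python
import math


def months_to_break_even(one_time_cost, added_monthly_cost, monthly_benefit):
    """Return the number of months until cumulative benefits cover costs.

    Returns None when the monthly benefit never outpaces the added
    monthly cost, so the one-time cost is never recovered.
    """
    if one_time_cost <= 0:
        return 0
    net_monthly = monthly_benefit - added_monthly_cost
    if net_monthly <= 0:
        return None
    return math.ceil(one_time_cost / net_monthly)


# Hypothetical figures: data cleanup, training, and process changes cost
# 60,000 up front; ongoing support adds 2,000 per month; better data is
# estimated to be worth 7,000 per month.
print(months_to_break_even(60_000, 2_000, 7_000))  # prints 12
```

Even a crude estimate like this shows how quickly front-end cleanup or underestimated process changes can push the break-even point out, which is one more argument for starting with a narrowly scoped problem.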
Importance of communication
Thinking about data and data governance more comprehensively is easier with a framework such as the one described here. Not only does such a framework help to define and control strategy project scope and costs, it also helps when communicating with management and staff.
The value of being able to communicate with management about data governance should go without saying. Just as important is the ability to communicate with “the troops” about what’s being done. You can't expect everyone in an organization to understand the finer details of data management or predictive analytics. Making everyone aware of where he or she stands in relation to the data that courses through the organization will, however, eventually pay off. Improvements in “digital literacy” will be valuable should it become necessary to change the organization's business processes.
Importance of transparency
A data governance strategy project shouldn't be conducted “in secret.” As much transparency as possible should be maintained, provided that corporate data security and personal data privacy are protected. Especially in organizations that rely on legacy systems and legacy processes to get things done, data “ownership” issues may arise as a data governance strategy evolves. Overcoming the resistance such issues can create is another reason to focus on creating solutions to specific problems.
Copyright (c) 2017 by Dennis D. McDonald