Dennis D. McDonald (ddmcd@ddmcd.com) consults from Alexandria Virginia. His services include writing & research, proposal development, and project management.

So What if Spreadsheets are  the “Lowest Common Denominator” of Data Analysis?

So What if Spreadsheets are the “Lowest Common Denominator” of Data Analysis?

By Dennis D. McDonald

Many years ago when I was writing basic SQL statements to select and analyze a client’s Microsoft Access data I was constantly “bitten” by the lack of simple transferability and equivalence between Excel and Access. Having worked with software developers I understood how different products evolve based on differing underlying data models, but repeatedly I found myself stumbling between underlying Access and Excel incompatibilities and the cumbersome nature of data transfer between the two. Getting data into and out of the two applications was easy. Transferring data back and forth seamlessly in those days was more difficult.

I don’t often get involved in such quandaries these days. I’m more likely to be engaged as a consultant in proposal development and data governence consulting rather than hands-on number-crunching. Still, I was pleased to read a recent NATURE article titled Six tips for better spreadsheets by Jeffrey Perkel. His tips include:

  1. Keep data raw.

  2. Make data machine-readable.

  3. Be consistent.

  4. Document your work.

  5. Cross-check your data.

  6. Think ahead.

These guidelines as explained clearly in the article are excellent. As a consistent user of spreadsheets myself for professional contact tracking as well as financial analysis I can attest to their value.

In an organization that relies on spreadsheets for basic data recording, storage, and analysis, adhering to such guidelines is essential. Failure to do so can lead to sometimes disastrous results. For example, once I was involved as a data governance consultant with one organization where failure across different departments to observe such basic guidelines led to eventual embarrassing reporting difficulties — and inconsistent data — with government oversight agencies.

Standardizing on a given spreadsheet tool or template is not a complete solution to the inevitable data incompatibilities that creep in when different departments use spreadsheets differently. Attention is needed on consistent data translation methods, standardization of data definitions, and even implementation of a “data stewardship” program to oversee implementation of appropriate business processes to smooth the movement of data through the organization.

Consistent data governance processes are especially needed when different departments or functions — often justifiably — approach definitions of data and metadata differently.

Which returns us to the inevitable challenges that emerge when we ask a spreadsheet to do the job that could better be done using a dedicated database management system (DBMS). Regardless of whether the organization’s data are on a local machine, shared on a network, or are cloud based, the DBMS will provide data management capabilities that a standard spreadsheet tool will have difficulty supporting. This becomes readily apparent when the importance of managing consistent metadata (data about the data) increases due to the rising importance of consistent across-departmental analysis, reporting, and visualization.

Still, spreadsheets are never going to go away. They are omnipresent, inexpensive, powerful, and initially at least, very easy to use. The trick is to use them consistently and easily. Following the Nature article’s recommendations is a valuable first step.

Copyright (c) 2022 by Dennis D. McDonald

More on “data governance” and “data program management”

Why I gave up Twitter -- again

Why I gave up Twitter -- again

What do you mean by "the news," anyway?

What do you mean by "the news," anyway?