Garbage In, Garbage Out – The Impact of Data Quality on Data Insights

Marketing. Updated Apr 14, 2023. Published Mar 9, 2018. 7 minutes read.

    Peter Caputa

    We have some spectacular new tools for gathering, analyzing and visualizing sales/marketing data these days.

    Most of us assume that all data points are created equal. All we have to do is crunch them through algorithms and extract results into data visualization displays to gain new meaning and to take action.

    Or maybe, on the cutting edge, we feed our data into machine learning engines and marvel at the insights and recommendations that the machine spits out. Big mistake! Let’s take a look at some of the pitfalls involving data quality and rethink the process a bit, shall we?

    Errors and Redundancy

    At a fundamental level, we want to make sure that each data point is unique and accurate. In our contact or lead database, this data is often compromised. For example:

    • Invalid or false entries in data fields, like name, email, addresses, company, etc.
    • Incomplete records without a key index, like collection timestamp or source
    • Duplicate records of the same entity (person, company) recorded multiple times or from different sources
    • Garbled data, especially in text fields where registrants misspelled their answers
    • Imported data that is not structured the same way as our database, so that supposedly equivalent fields are not really equivalent; for example, job titles that are not consistent with our form answers

    Impact: False or misleading data causes us to misidentify and improperly quantify KPIs. For example, you can’t calculate a lead score based on job title from a database full of inconsistent job titles. Errors in the data are particularly hard on machine learning algorithms, which are forced to consider bad data in their pattern-matching routines unless they first learn how to identify bad records and disqualify them.
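
    To make the job title problem concrete, here is a minimal Python sketch that maps free-text titles onto a handful of buckets a lead score could use. The bucket names, the keyword rules and the normalize_title helper are all hypothetical illustrations, not part of any particular CRM or scoring tool.

```python
import re

# Hypothetical keyword rules; order matters, so "vice president" is
# checked before the bare "president" in the executive rule.
TITLE_RULES = [
    (re.compile(r"\b(student|intern)\b"), "student"),
    (re.compile(r"\b(vp|vice president|director|head)\b"), "director"),
    (re.compile(r"\b(ceo|chief executive|founder|owner|president)\b"), "executive"),
    (re.compile(r"\bmanager\b"), "manager"),
]


def normalize_title(raw_title):
    """Map a free-text job title to one bucket, or "other" if no rule matches."""
    # Lowercase, turn punctuation into spaces, collapse repeated whitespace.
    title = re.sub(r"[^a-z ]", " ", raw_title.lower())
    title = re.sub(r"\s+", " ", title).strip()
    for pattern, bucket in TITLE_RULES:
        if pattern.search(title):
            return bucket
    return "other"


for raw in ["Chief Executive Officer", "Vice-President of Sales", "Sr. Marketing Manager"]:
    print(raw, "->", normalize_title(raw))
```

    Titles that land in "other" are the ones worth reviewing by hand before they feed into any score.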

    Solutions: First, take the time to assess data quality using spreadsheets or other tools. If possible, use workflows or spreadsheet filters to categorize and update systematic errors. You may be able to use data validation and enhancement services, like email validation services, to identify and delete duplicate records or correct erroneous records. Manual manipulations, like identifying and deleting bogus records, are often required – unless you can teach your AI app to do that for you!
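
    As a rough illustration of that first cleanup pass, the Python sketch below flags records with malformed email addresses and duplicates of an address already seen. The record layout, the clean_contacts helper and the deliberately loose email pattern are assumptions made for the example; a real validation service checks far more than the shape of the address.

```python
import re

# Loose shape check only: something@something.something, no spaces.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def normalize_email(email):
    """Trim whitespace and lowercase so trivially different copies compare equal."""
    return email.strip().lower()


def clean_contacts(records):
    """Split records into (clean, rejected); the first record per email wins."""
    seen = set()
    clean, rejected = [], []
    for rec in records:
        email = normalize_email(rec.get("email") or "")
        if not EMAIL_RE.match(email) or email in seen:
            rejected.append(rec)  # malformed, missing, or duplicate address
            continue
        seen.add(email)
        clean.append({**rec, "email": email})
    return clean, rejected


contacts = [
    {"name": "Ann Lee", "email": "Ann.Lee@example.com"},
    {"name": "Ann Lee", "email": " ann.lee@example.com "},
    {"name": "Bob", "email": "bob_at_example.com"},
    {"name": "", "email": None},
]
clean, rejected = clean_contacts(contacts)
print(len(clean), "clean,", len(rejected), "rejected")
```

    Keeping the rejected records in a separate list, rather than deleting them outright, leaves room for the manual review described above.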

    Sampling and Statistical Significance

    Many of us are collecting raw sales and marketing data from Google Analytics and social media sites, where we have no control over the collection and processing of the data.

    We do have at least some control over what gets “exposed” to these applications, however. We can control which pages of our website get indexed by Google, and we should be exercising tight control over both company-produced and user-produced content on social networks to make sure that it is in line with company policy and publication guidelines. We get into trouble when those controls are not in place. For example:

    • Pages that we don’t want the public to see are indexed by Google, like draft pages or outdated pages
    • Pages that are indexed by Google no longer exist (404s) or have significant warnings based on outmoded SEO practices
    • We don’t learn how to take full advantage of the power of Google Analytics and Google Search Console – i.e. we don’t connect to the right analytics in our reports and dashboards
    • We don’t have enough traffic, clicks, conversions and other metrics to reach conclusions. For example, a recently launched website attracts twenty visitors one week and twenty-five the next. That’s a pretty big increase in traffic, right?

    Impact: You always want to be on the lookout for errors of omission, like SEO problems or incomplete setup, or sampling errors due to low volume. They can lead to bad assumptions and inaccurate results in your reporting software – which can lead to somebody getting fired!
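
    The twenty versus twenty-five visitors example can be checked with a quick calculation. The Python sketch below is one simple way to do it, under the assumption that weekly visits behave like independent random counts: if the underlying traffic rate had not changed, each of the 45 visits would be equally likely to fall in either week, so the split follows a Binomial(45, 0.5) distribution. The two_sided_p_value helper is a hypothetical name for this example.

```python
from math import comb


def two_sided_p_value(count_a, count_b):
    """Exact binomial test of 'both periods have the same underlying rate'."""
    n = count_a + count_b
    larger = max(count_a, count_b)
    # Number of equally likely splits at least as lopsided as the one observed.
    tail = sum(comb(n, k) for k in range(larger, n + 1))
    # Double the one-sided tail for a two-sided test, capped at 1.
    return min(1.0, 2 * tail / 2**n)


small_site = two_sided_p_value(20, 25)
large_site = two_sided_p_value(2000, 2500)
print(f"20 vs 25 visitors:     p = {small_site:.2f}")
print(f"2000 vs 2500 visitors: p = {large_site:.2g}")
```

    For 20 versus 25 visitors the p-value is roughly 0.55, far above the conventional 0.05 threshold, so the 25% "increase" is indistinguishable from noise. The same 25% increase on 2,000 visitors gives a p-value very close to zero.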

    Solutions: Don’t launch that new website or social media page until it’s completely buttoned up. Here’s an up-to-date checklist to help you through that process. Make sure you understand how to set up Google Analytics and Google Search Console. Pay close attention to what you want published and tracked and what you want to be (or remain) hidden. Do this right, and you can harvest a wealth of information, including visitor engagement and more.

    Categorization and Bias Errors

    At the next level of data analysis, we want to look at metadata, i.e. how individual data points fit into distinct categories, and at how changes in these “buckets” indicate performance. For example:

    • First touch (or later) attribution sources like organic search, social media, paid media campaigns, email or events
    • Lead lifecycle stages like visitor, lead, MQL or SQL
    • Sales lifecycle stages like SQL, SAL, Opportunity, Deal Pending or Customer
    • Sales Qualification criteria like Interested, High Interest, Contacted, Qualified, Unqualified
    • Deal stage criteria like Open, In-Progress, Preliminary Approval, Approved, Closed-Won and Closed-Lost
    • Customer service lifecycle stages like Customer, Renewal Opportunity, Repeat Customer or Partner
    • ABM Criteria like Account Name, Role, Influence Rank, Primary Contact, Engagement Score
    • Regional/National Bias – how much of your data comes from the place(s) you want to attract, and is the rest meaningless?
    • Role/Occupation Bias – how much of your data comes from students when you’re trying to attract CEOs?
    • Temporal Bias – remember when you were really pushing hard on marketing and getting all that new traffic and lead data, five years ago? Now you’re comparing today’s relatively small volume, with a brand new targeting strategy, with that old stuff. Hmmm.

    Impact: If you get these categories wrong, or change them mid-stream, the impact can be devastating. Imagine a database of one million contacts that are mis-categorized or at least inconsistently categorized. How well are your targeted, segmented email and ad campaigns going to work? Your sales team may be reaching out at inappropriate times with irrelevant content because their leads were filed in the wrong “box”. Your reports and dashboards are all off as well. Let’s see… Last year we had an average of 200 SQLs per month, and this year we have 50 SQLs.

    What happened? You’ll know you have a problem when you show off your beautiful new dashboards to Management, and they wonder out loud why they hired you.

    Solutions: This one is difficult to overcome, so by all means, think about this before you implement a new sales process, CRM and marketing automation system. Let’s start with process:

    • Consistent Definitions – make sure everyone’s on the same page about how data gets categorized. What constitutes an SQL, for example, and what are the exceptions, if any? Even more complex, how do we define and set up lead scoring that really helps the sales team rapidly identify qualified leads?
    • Consistent Assignment – develop automation workflows using agreed-upon criteria to assign or update contacts and leads and to update lead scoring
    • Consistent Best Practices – the entire team – Sales, Marketing, and Customer Service – needs to know when it’s appropriate to update a contact, company or deal record and how to do it the right way. We’re looking for 100% equivalency across all records here.
    • Avoiding Bias – well, assuming the goal is transparency and an honest assessment of results, develop workflows and lists to filter out undesired characteristics. If you don’t care about leads from outside the U.S., filter them out to see how you’re really doing on your domestic goals. If you don’t sell to students, filter them out. If the old data from five years ago is irrelevant to what we’re doing today, filter it out.
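
    The filtering described in the last bullet can be sketched in a few lines of Python. The field names (country, role, created), the sample leads and the in_scope helper are all hypothetical; in practice the same criteria would usually be set up as a saved list or workflow in the CRM.

```python
from datetime import date


def in_scope(lead, *, countries, excluded_roles, earliest):
    """Return True if the lead matches the audience we actually target."""
    return (
        lead["country"] in countries
        and lead["role"] not in excluded_roles
        and lead["created"] >= earliest
    )


# Made-up sample data for illustration only.
leads = [
    {"id": 1, "country": "US", "role": "ceo", "created": date(2023, 2, 1)},
    {"id": 2, "country": "US", "role": "student", "created": date(2023, 3, 5)},
    {"id": 3, "country": "FR", "role": "ceo", "created": date(2023, 1, 9)},
    {"id": 4, "country": "US", "role": "ceo", "created": date(2017, 6, 30)},
]

targeted = [
    lead
    for lead in leads
    if in_scope(
        lead,
        countries={"US"},
        excluded_roles={"student"},
        earliest=date(2022, 1, 1),
    )
]
print([lead["id"] for lead in targeted])
```

    Writing the criteria down once, in one place, is also a small step toward the consistent definitions and consistent assignment called for in the earlier bullets.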

    Yes, sometimes our data collection and data management practices need a complete makeover. In that case, you may need to re-imagine the end game.

    What is it that you are really trying to accomplish in sales, marketing and service? It may be necessary to scrap the old database and start over, but hopefully not. In any case, be prepared to roll up your sleeves. Dive into the data and give it an honest appraisal, or outsource the job to someone with the expertise to do it. After you get your data-quality act together, you’ll need to commit to a strategy and processes that maintain and enforce it.

    The impact? Error-free operations lead to more accurate analytics, more predictable revenue streams and increased profitability.

    If you’re wondering how to get started putting together the tools, the data and the process, contact The MarTech Whisperer for a free consultation.

    Article by
    Tamara Omerovic

    Tamara Omerovic is a marketing consultant with 8 years of experience in content marketing and SEO. She specializes in leading high-performing teams and developing effective marketing processes for B2B SaaS brands.
