Garbage In, Garbage Out - The Impact of Data Quality on Data Insights

We have some spectacular new tools for gathering, analyzing and visualizing sales/marketing data these days.

Most of us assume that all data points are created equal. All we have to do is crunch them through algorithms and extract results into data visualization displays to gain new meaning and to take action.

Or maybe, on the cutting edge, we feed our data into machine learning engines and marvel at the insights and recommendations that the machine spits out. Big mistake! Let’s take a look at some of the pitfalls involving data quality and rethink the process a bit, shall we?

Errors and Redundancy

At a fundamental level, we want to make sure that each data point is unique and accurate. In a contact or lead database, that is often not the case. For example:

Invalid or false entries in data fields, like name, email, addresses, company, etc.
Incomplete records without a key index, like collection timestamp or source
Duplicate records of the same entity (person, company) recorded multiple times or from different sources
Garbled data, especially in text fields where registrants misspelled their answers
Imported data that is structured differently from our database, so fields that look equivalent really aren’t (for example, job titles that don’t match the choices on our forms)

Impact: False or misleading data causes us to misidentify and improperly quantify KPIs. For example, you can’t calculate a lead score based on job title from a database full of inconsistent job titles. Errors in the data are particularly hard on machine learning algorithms, which are forced to consider bad data in their pattern-matching routines unless they are also taught how to identify bad records and disqualify them.

Solutions: First, take the time to assess data quality using spreadsheets or other tools. If possible, use workflows or spreadsheet filters to categorize and update systematic errors. You may be able to use data validation and enhancement services, like email validation services, to identify and delete duplicate records or correct erroneous records. Manual manipulations, like identifying and deleting bogus records, are often required – unless you can teach your AI app to do that for you!
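As a rough illustration of that first assessment pass, here is a minimal Python sketch that sorts contact records into clean, invalid and duplicate piles. The field names (email, source, collected_at) and the sample records are assumptions made up for this example, and the email pattern only catches obviously malformed addresses; it is not a substitute for a real validation service.

```python
import re

# Deliberately loose pattern: it flags obviously malformed addresses only.
# It cannot tell whether a mailbox really exists; a validation service can.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

# Hypothetical field names, chosen for illustration.
REQUIRED_FIELDS = ("email", "source", "collected_at")


def normalize_email(value):
    """Trim whitespace and lowercase so trivially different spellings match."""
    return (value or "").strip().lower()


def audit_contacts(records):
    """Split records into clean, invalid, and duplicate lists."""
    clean, invalid, duplicates = [], [], []
    seen = set()
    for record in records:
        email = normalize_email(record.get("email"))
        missing = [field for field in REQUIRED_FIELDS if not record.get(field)]
        if missing or not EMAIL_RE.match(email):
            invalid.append(record)       # incomplete or malformed entry
        elif email in seen:
            duplicates.append(record)    # same person recorded again
        else:
            seen.add(email)
            clean.append({**record, "email": email})
    return clean, invalid, duplicates


contacts = [
    {"email": "Ana@Example.com ", "source": "webinar", "collected_at": "2024-01-05"},
    {"email": "ana@example.com", "source": "import", "collected_at": "2024-02-11"},
    {"email": "not-an-email", "source": "form", "collected_at": "2024-02-12"},
    {"email": "bo@example.com", "source": "", "collected_at": "2024-03-01"},
]

clean, invalid, duplicates = audit_contacts(contacts)
print(len(clean), "clean,", len(invalid), "invalid,", len(duplicates), "duplicate")
```

Matching on a normalized email address is only one possible definition of "the same entity". Records for the same person under two different addresses would still need a manual or fuzzy-matching pass.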

Sampling and Statistical Significance

Many of us are collecting raw sales and marketing data from Google Analytics and social media sites, where we have no control over the collection and processing of the data.

We do have at least some control over what gets “exposed” to these applications, however. We can control what pages of our website get indexed by Google, and we should be exercising tight control over both company-produced and user-produced content on social networks to make sure that they are in line with company policy and publication guidelines. Where we get into trouble is when those controls are not in place, for example:

Pages that we don’t want the public to see are indexed by Google, like draft pages or outdated pages
Pages that are indexed by Google no longer exist (404s) or have significant warnings based on outmoded SEO practices
We don’t learn how to take full advantage of the power of Google Analytics and Google Search Console – i.e. we don’t connect to the right analytics in our reports and dashboards
We don’t have enough traffic, clicks, conversions and other metrics to reach conclusions. For example, a recently launched website attracts twenty visitors one week and twenty-five the next. That’s a pretty big increase in traffic, right?

Impact: You always want to be on the lookout for errors of omission, like SEO problems or incomplete setup, or sampling errors due to low volume. They can lead to bad assumptions and inaccurate results in your reporting software – which can lead to somebody getting fired!
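To put a number on the twenty-versus-twenty-five example, here is a minimal Python sketch of one way to check whether a week-over-week change is more than noise. It assumes visits arrive independently of one another (a Poisson-like process), which is a simplification; under that assumption, if nothing has really changed, each visit is equally likely to land in either week. The function name is made up for this example.

```python
from math import comb


def two_week_p_value(week1, week2):
    """Exact two-sided p-value that two counts share the same underlying rate.

    Assumes visits arrive independently. Given the combined total n, the
    week-2 count then follows Binomial(n, 0.5) when nothing has changed.
    """
    n = week1 + week2
    observed_gap = abs(week2 - week1)
    # Add up the probability of every split at least as lopsided as the
    # one we observed (k visits in week 2, n - k in week 1).
    tail = sum(comb(n, k) for k in range(n + 1) if abs(2 * k - n) >= observed_gap)
    return tail / 2 ** n


p_small = two_week_p_value(20, 25)
p_large = two_week_p_value(2000, 2500)
print(f"20 -> 25 visitors:     p = {p_small:.2f}")
print(f"2000 -> 2500 visitors: p = {p_large:.2g}")
```

Both cases are a 25% increase, but the first p-value comes out well above the conventional 0.05 threshold, so the jump is consistent with random week-to-week variation. The same percentage change on a hundred times the traffic is very unlikely to be chance.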

Solutions: Don’t be in a hurry to launch that new website or social media page; wait until it’s completely buttoned up. Here’s an up-to-date checklist to help you through that process. Make sure you understand how to set up Google Analytics and Google Search Console. Pay close attention to what you want published and tracked and what you want to be (or remain) hidden. Do this right, and you can harvest a wealth of information, including visitor engagement and more.

PRO TIP: How Are Users Engaging on My Site? Which Content Drives the Most Online Activity?

If you want to discover how visitors engage with your website, and which content drives the most engagement and conversions, there are several on-page events and metrics you can track from Google Analytics 4 that will get you started:

Sessions by channel. Which channels are driving the most traffic to your website?
Average session duration. How long do visitors spend on your website on average?
Pageviews and pageviews by page. Which pages on your website are viewed the most?
Total number of users. How many users engaged with your website?
Engagement rate. What percentage of your website visitors have interacted with a piece of content and spent a significant amount of time on the site?
Sessions conversion rate. How many of your website visitors have completed the desired or expected action(s), and what percentage of sessions ended in one of the goals you’ve set in Google Analytics 4?

And more…
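For readers who want to see how two of these rates are derived, here is a small Python sketch that computes them from raw session records. The Session fields and the sample data are hypothetical and are not a Google Analytics 4 export schema. The engaged-session rule follows GA4's documented definition: a session that lasts longer than 10 seconds, includes a conversion (key event), or has at least two page views.

```python
from dataclasses import dataclass


@dataclass
class Session:
    # Hypothetical fields for illustration; not a GA4 export schema.
    duration_seconds: float
    pageviews: int
    conversions: int


def is_engaged(session):
    """Engaged = longer than 10 seconds, or a conversion, or 2+ page views."""
    return (
        session.duration_seconds > 10
        or session.conversions > 0
        or session.pageviews >= 2
    )


def engagement_rate(sessions):
    """Share of sessions that count as engaged."""
    return sum(is_engaged(s) for s in sessions) / len(sessions)


def session_conversion_rate(sessions):
    """Share of sessions with at least one conversion."""
    return sum(s.conversions > 0 for s in sessions) / len(sessions)


sessions = [
    Session(duration_seconds=4, pageviews=1, conversions=0),   # bounced
    Session(duration_seconds=95, pageviews=3, conversions=1),  # converted
    Session(duration_seconds=8, pageviews=2, conversions=0),   # two page views
    Session(duration_seconds=40, pageviews=1, conversions=0),  # long visit
]

print(f"Engagement rate: {engagement_rate(sessions):.0%}")
print(f"Session conversion rate: {session_conversion_rate(sessions):.0%}")
```

With only four sessions these percentages mean very little, which is the low-volume caveat from the previous section in miniature.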

Now you can benefit from the experience of our Google Analytics 4 experts, who have put together a plug-and-play Databox template showing the most important KPIs for monitoring visitor engagement on your website. It’s simple to implement and start using as a standalone dashboard or in marketing reports!

You can easily set it up in just a few clicks – no coding required.

To set up the dashboard, follow these 3 simple steps:

Step 1: Get the template

Step 2: Connect your Google Analytics account with Databox.

Step 3: Watch your dashboard populate in seconds.


Categorization and Bias Errors

At the next level of data analysis, we want to look at metadata, i.e. how individual data points fit into distinct categories, and at how changes in these “buckets” indicate performance. For example:

First touch (or later) attribution sources like organic search, social media, paid media campaigns, email or events
Lead lifecycle stages like visitor, lead, MQL or SQL
Sales lifecycle stages like SQL, SAL, Opportunity, Deal Pending or Customer
Sales Qualification criteria like Interested, High Interest, Contacted, Qualified, Unqualified
Deal stage criteria like Open, In-Progress, Preliminary Approval, Approved, Closed-Won and Closed-Lost
Customer service lifecycle stages like Customer, Renewal Opportunity, Repeat Customer or Partner
ABM Criteria like Account Name, Role, Influence Rank, Primary Contact, Engagement Score
Regional/National Bias – how much of your data comes from the place(s) you want to attract, and is the rest meaningless?
Role/Occupation Bias – how much of your data comes from students when you’re trying to attract CEOs?
Temporal Bias – remember when you were really pushing hard on marketing and getting all that new traffic and lead data, five years ago? Now you’re comparing today’s relatively small volume, with a brand new targeting strategy, with that old stuff. Hmmm.

Impact: If you get these categories wrong, or change them mid-stream, the impact can be devastating. Imagine a database of one million contacts that are mis-categorized or at least inconsistently categorized. How well are your targeted, segmented email and ad campaigns going to work? Your sales team may be reaching out at inappropriate times with irrelevant content because their leads were filed in the wrong “box”. Your reports and dashboards are all off as well. Let’s see… Last year we had an average of 200 SQLs per month, and this year we have 50 SQLs.

What happened? You’ll know you have a problem when you show off your beautiful new dashboards to Management, and they wonder out loud why they hired you.

Solutions: This one is difficult to overcome, so by all means, think about this before you implement a new sales process, CRM and marketing automation system. Let’s start with process:

Consistent Definitions – make sure everyone’s on the same page about how data gets categorized. What constitutes an SQL, for example, and what are the exceptions, if any? Even more complex, how do we define and set up lead scoring that really helps the sales team rapidly identify qualified leads?
Consistent Assignment – develop automation workflows using agreed-upon criteria to assign or update contacts and leads and to update lead scoring
Consistent Best Practices – the entire team – Sales, Marketing, and Customer Service – needs to know when it’s appropriate to update a contact, company or deal record and how to do it the right way. We’re looking for 100% equivalency across all records here.
Avoiding Bias – well, assuming the goal is transparency and an honest assessment of results, develop workflows and lists to filter out undesired characteristics. If you don’t care about leads from outside the U.S., filter them out to see how you’re really doing on your domestic goals. If you don’t sell to students, filter them out. If the old data from five years ago is irrelevant to what we’re doing today, filter it out.
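The "consistent assignment" and "avoiding bias" ideas can be sketched in a few lines of Python. Everything specific here is an assumption for illustration: the score thresholds, the field names, the sales_accepted flag and the cut-off date are placeholders for whatever definitions your own Sales and Marketing teams agree on. The point is that one shared rule set assigns every stage, and one shared filter decides which records belong in a report.

```python
from collections import Counter
from datetime import date

# Hypothetical thresholds; the real ones come from your team's agreed definitions.
MQL_MIN_SCORE = 50
SQL_MIN_SCORE = 80


def lifecycle_stage(contact):
    """Assign exactly one stage per contact from a single, shared rule set."""
    if not contact.get("email"):
        return "visitor"
    score = contact.get("lead_score", 0)
    if score >= SQL_MIN_SCORE and contact.get("sales_accepted"):
        return "SQL"
    if score >= MQL_MIN_SCORE:
        return "MQL"
    return "lead"


def in_target_segment(contact, countries=("US",), excluded_roles=("student",),
                      earliest=date(2023, 1, 1)):
    """Keep only records from the market and time frame being reported on.

    Expects every contact to carry country, role and created fields.
    """
    return (
        contact["country"] in countries
        and contact["role"] not in excluded_roles
        and contact["created"] >= earliest
    )


contacts = [
    {"email": "a@example.com", "lead_score": 85, "sales_accepted": True,
     "country": "US", "role": "ceo", "created": date(2024, 3, 1)},
    {"email": "b@example.com", "lead_score": 60,
     "country": "US", "role": "student", "created": date(2024, 4, 10)},
    {"email": "c@example.com", "lead_score": 90, "sales_accepted": True,
     "country": "DE", "role": "ceo", "created": date(2024, 4, 12)},
    {"email": "d@example.com", "lead_score": 55,
     "country": "US", "role": "director", "created": date(2019, 6, 1)},
    {"email": "e@example.com", "lead_score": 20,
     "country": "US", "role": "manager", "created": date(2024, 5, 5)},
]

all_stages = Counter(lifecycle_stage(c) for c in contacts)
target_stages = Counter(lifecycle_stage(c) for c in contacts if in_target_segment(c))
print("All records:   ", dict(all_stages))
print("Target segment:", dict(target_stages))
```

In this made-up sample the unfiltered data shows two SQLs, but only one comes from the segment the business is actually pursuing. Keeping the rules in one place also means that a changed definition is applied to every record at once instead of drifting between teams.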

Yes, sometimes our data collection and data management practices need a complete makeover. In that case, you may need to re-imagine the end game.

What is it that you are really trying to accomplish in sales, marketing and service? It may be necessary to scrap the old database and start over, but hopefully not. In any case, be prepared to roll up your sleeves. Dive into the data and give it an honest appraisal, or outsource the job to someone with expertise in doing it. After you get your data-quality act together, you’ll then need to commit to a strategy and processes that maintain and enforce it.

The impact? Error-free operations lead to more accurate analytics, more predictable revenue streams and increased profitability.

If you’re wondering how to get started putting together the tools, the data and the process, contact The MarTech Whisperer for a free consultation.
