on September 2, 2020 (last modified on September 26, 2022) • 56 minute read
Data and marketing. The two go together like peanut butter and jelly.
Why? Clean, complete and accurate data is the key to marketing personalization and marketing automation. And with more than 50% of companies doing marketing automation nowadays, we’re all collecting more data about our prospects than ever before, in many different ways.
Unfortunately, in most cases, a sizable percentage of this data we’re collecting will be bad data. This bad data reduces the effectiveness of marketing initiatives, kills sales productivity and maybe even turns off prospects.
In short, bad data wastes time and money.
So now, more than ever, companies are spending time and money to keep their data clean.
To help you eliminate bad CRM data once and for all, we asked 67 marketing and sales operations experts to share how they keep their database clean. Read below to learn more about the types of bad data, the impact bad data can have on your company, how to fix the issues and the software you can use to make and keep a database clean.
To understand the impact that bad data might be having on your business, let’s start by examining exactly what makes data bad.
Source: Sales Hacker
Bad data is incorrect, outdated, duplicate, improperly formatted or just plain missing data in your marketing, sales or customer database, whether that lives in Mailchimp, HubSpot, Salesforce, Intercom, Marketo, Zendesk or a similar tool.
In short, it’s any data in your marketing or sales databases that is wrong in some way. Instead of helping your campaigns or salespeople, it could hurt them if it were used directly in a marketing campaign or to inform sales interactions.
Before we jump into the types of bad data and how to fix them, let’s talk about how big the problem is.
Bad data can have a huge effect on the ROI of your marketing campaigns and initiatives. As data collection has become a bigger component in all marketing and sales campaigns, a number of recent studies have looked into the real-world dollar impact that bad data can have on a company, and the results of those studies are staggering.
So the conclusion is in: bad data has a serious negative effect on your company’s ROI. But how does that actually happen? What are the different types of bad data? What are the ways in which bad data actually impacts your returns, in real-world terms?
To answer these questions, we asked our expert panel a series of questions.
First, we sought to understand how much time is spent cleaning up CRM databases. We asked two questions: “How often do you clean up your CRM database?” and “On average, how long does it take to clean up a CRM/marketing database?” Nearly 40% of our respondents spend 4+ hours doing it and almost 60% do it at least once per month. Nearly 30% do it at least weekly! Clearly, a lot of time is spent doing this.
Next, we asked them what functions are needed when cleaning up data. Clearly, there’s no one function needed to keep a database clean. While de-duplication is most prevalent, it’s only one of many functions needed.
We also asked which cleanup activities they were doing most frequently. As you can see, most companies are performing multiple cleanup activities, with the majority of respondents saying they’re adding missing data, updating wrong data, deduplicating contacts and formatting/standardizing fields.
As you’ll read below in the contributions from our respondents, usually these fixes require lots of different types of data manipulation, meaning that cleaning CRM data is not a straightforward or simple process.
Next, we sought to figure out how often bad data happens because of data uploads. While that’s obviously not the only reason bad data gets added to a system (and it can also be a way of fixing bad data), data uploads are a common need among our respondents, as can be seen from the bar chart below.
As the chart above shows, companies upload contacts into their CRM for a variety of reasons. The top three reasons were leads captured in other software systems, updating existing contacts and leads from events.
Nearly 45% of respondents are performing these types of uploads daily!
And it’s not just contacts. Companies are often uploading Company lists and Deal records too.
Now that we understand the size of the challenge, let’s talk about the specific ways that bad data hurts your marketing and sales effectiveness.
So it’s clear that bad marketing data has a verifiable impact on the marketing ROI of companies. But connecting those issues to the real ways that your campaigns and operation are impacted can be difficult.
Let’s take a look at some of the real-world examples of how bad data directly translates to lower return on your investments in marketing and sales:
Marketing campaigns that use outdated and incorrect data simply won’t be as effective. Customers love it when companies speak directly to them. 78% of customers will only shop with businesses that offer them personalized interactions.
Bad data means missed opportunities. A prospect that otherwise might have been interested in your product could be turned away when you get key details about them or their interactions with your brand wrong. Worse, bad data might cause them to feel like your product or service offerings aren’t. Bad data can also harm your targeting, causing you to waste your ad or email marketing budget on the wrong prospects.
When you get critical information about your prospects and customers wrong in your personalized marketing messaging, it will harm their perception of your brand. 87% of consumers say that personally relevant content positively influences their feelings about a brand. When you get that wrong, they really dislike it. 63% of consumers are highly annoyed by brands that use old-fashioned marketing strategies that involve blasting generic messages repeatedly.
You have a limited number of interactions to grab the attention of your audience. A simple mistake can push them away and kill any chances of developing a positive relationship in the future.
When your campaign fails to meet expectations because of bad data, it can be hard to recognize that bad data is the cause. Many of the effects of bad data remain hidden. When you miss a critical opportunity with a prospect because of erroneous data, they don’t typically email you back to let you know what the problem was.
Bad data can make marketing campaign optimization very difficult. While you’re busy tweaking your messaging, you miss the fact that the campaign wasn’t landing because of data errors. Bad data drags down the conversion rates of campaigns and can cause you to second-guess messaging strategies that otherwise would have connected with your audience.
Marketing data isn’t just used by your marketing team. In HubSpot, Salesforce, Intercom, and other platforms, the same data that your marketing team uses eventually becomes the data that your sales and customer success teams use too. That means that they also inherit all of the bad data stored in those customer records as well.
Your sales team will have a hard time appealing to prospects when they don’t have accurate data on hand. If they do become customers, your customer success team will too.
Ultimately, your marketing data reverberates throughout your organization, hampering the ROI of each department as it makes its way through. Having data quality initiatives in place will help you to limit the effect of bad data on your organization.
When you have a database that is filled with bad marketing data, your sales reps end up having to sift through data before using it in campaigns and messaging. In modern sales, personalization is baked into every interaction that we have with customers. If your sales reps don’t have confidence in that data, they are going to spend more time verifying and checking the data that they use. Leadjen conducted a study and found that reps waste more than 500 hours per year on cleaning data:
Image Source: Neil Patel
That’s hundreds of hours your team could spend focusing on big-picture strategies or executing tactics that generate a more positive return for your company.
There were a few common types of bad data uncovered in our survey. To give you a framework for thinking about cleaning up your database, we categorized data issues in the following ways:
Incorrect Data – Data that’s just wrong like having the wrong phone number or company.
Outdated Data – Data that’s outdated, like having a contact with an email that’s no longer valid or a contact that no longer works at a company.
Duplicate Data – Records that may be duplicate entries (e.g., contacts with the same name and address).
Improperly Formatted Data – Records with formatting errors (e.g., a name that is not capitalized correctly, phone numbers without spaces, etc.).
Missing Data – Records that are missing important information (e.g., contacts without first names or phone numbers).
Low-Quality Data – Contacts you should consider removing from your database, like emails that start with “info@” or contacts who haven’t opened your emails in 6 months.
Invalid Data – Records with errors like names with symbols in them or US zip codes with fewer than 5 digits.
Inconsistent Data – Records with errors like country and/or state names with different formatting or abbreviations.
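Several of the categories above can be checked mechanically. The following is a minimal Python sketch, not a definitive implementation: the field names (`first_name`, `email`, `phone`, `zip`, `country`), the list of generic email prefixes, and the name pattern are all assumptions made for illustration, and a real CRM export will need its own rules.

```python
import re

# Hypothetical field names and rules; adapt them to your own CRM export.
REQUIRED_FIELDS = ("first_name", "email", "phone")
GENERIC_PREFIXES = ("info@", "sales@", "support@")  # assumed role-based inboxes
NAME_RE = re.compile(r"[^\W\d_]+(?:[ '\-][^\W\d_]+)*")  # letters, with spaces, ' or - between parts
US_ZIP_RE = re.compile(r"\d{5}(?:-\d{4})?")  # 5 digits, optional ZIP+4


def find_issues(contact):
    """Return the bad-data categories that apply to one contact record."""
    issues = []

    # Missing data: a required field is absent or blank.
    if any(not (contact.get(f) or "").strip() for f in REQUIRED_FIELDS):
        issues.append("missing")

    # Invalid data: symbols in a name, or a malformed US zip code.
    name = (contact.get("first_name") or "").strip()
    zip_code = (contact.get("zip") or "").strip()
    bad_name = bool(name) and not NAME_RE.fullmatch(name)
    bad_zip = (
        contact.get("country") == "US"
        and bool(zip_code)
        and not US_ZIP_RE.fullmatch(zip_code)
    )
    if bad_name or bad_zip:
        issues.append("invalid")

    # Improperly formatted data: a name typed in all lowercase or all caps.
    if len(name) > 1 and (name.islower() or name.isupper()):
        issues.append("improperly_formatted")

    # Low-quality data: generic, role-based email addresses.
    email = (contact.get("email") or "").strip().lower()
    if email.startswith(GENERIC_PREFIXES):
        issues.append("low_quality")

    return issues
```

Running `find_issues` over an exported contact list gives a rough count per category, which can help decide which cleanup task to tackle first. Incorrect, outdated and duplicate data generally need an outside reference or a comparison across records, so they are not covered by this per-record check.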
Let’s walk through the advice from our respondents on how they fix each of these issues.
Data that is simply wrong. Perhaps information was entered into the wrong field, such as a phone number being entered into the “Address” field. Incorrect data is particularly harmful because sending prospects and customers messages with the wrong information in them will actively harm your brand. Your customers will be painfully aware that you’ve used automation in your messaging and it will break the illusion that there is a real person from your company on the other side of the screen actively communicating with them.
Osiris Parikh of Summit Mindfulness sees this as “One of the most common CRM errors that plague marketing and sales personnel.” Says Parikh, “Oftentimes, salespeople will collect faulty information or not take the initiative to verify contact information, ultimately leaving them with useless and obsolete data… [It] causes an amalgamation of consequences, ranging from poor customer service and customer retention, increased costs in erroneous emails and mailings and poor internal productivity.”
“To counter the rampant consequences of poor CRM data,” says Parikh, “firms should dedicate technical verification resources or personnel to verify and update data as it changes. There are several technical solutions that can be implemented to assist in these efforts, which provide seamless and intuitive ways to correct and replace inaccurate or outdated information.”
Parikh uses Cloudlead’s formatting and verification tool and CRM training to address the issue. “When firms understand the dangers of ineffective CRM data and are equipped with the tools to update and maintain quality content, the risks and consequences associated with poor CRM data are eliminated.”
Sometimes, data is bad by design, created with intentional errors as a way to dissuade contact during the consideration stage. (C’mon… you know you’ve done it at least once!) However, as Jonathan Aufray of Growth Hackers points out, these intentional errors, “bad emails, fake names, wrong numbers, etc.,” must be dealt with. “When your CRM database is messy, you’re tracking the wrong data, you’re sending the wrong information to the wrong people, you don’t know in what part of the funnel your leads are and you will close less leads.”
Though Aufray does use a database cleanup utility to prevent CRM clutter, he says, “The best way to clean up your CRM database is by doing it manually.” However, Aufray recognizes “you don’t want to do it all manually,” and suggests users “create (or find online) scripts that will help you fix your issues automatically.”
Says Brice Germain of Agence Copernic, this poorly formatted data means “You don’t have all the context and information about your prospect. So you’re less efficient. I eliminate all fake emails and bounces.” From there, Germain educates the team to improve their CRM use. To help with cleanup, Germain uses Dropcontact for analysis, segmentation, completion, and double-deletion.
Susan Walsh of The Classification Guru says that issues often appear, such as “names not input correctly, inconsistent formatting such as ‘Full Name’ as well as ‘First Name’ and ‘Last Name’ columns, and typos.” The problem with these types of errors and omissions, says Walsh, is that they are “harder to de-duplicate accurately, creating more work and confusion for the end-user.” To remedy these small (but significant) issues, Walsh uses Excel to split/merge the data, and make sure it’s clean and accurate. Using Excel “makes me more accurate and efficient.”
“Garbage in equals garbage out,” says Joseph Cirillo of The Steadfast Coverage Insurance Blog, who says that data corruption happens when users “do not catalog their data correctly. For example, the user makes a sale but doesn’t record it in the CRM, so the client is mislabeled as a prospect instead of a customer.” As a result, “Your sales data is inaccurate… You cannot diagnose problems or opportunities if your data is incorrect.”
Says Cirillo of the solution, “Automation of as much as possible is essential. We have a lead capture system that integrates with our CRM. The CRM processes the sale, so it forces agents to gather all data before moving forward in the sales process, which reduces duplication or ‘busy work’ and forces agents to record the data required. Additionally, the CRM only asks for critical data inputs… If you make data collection long and tedious, people will cut corners or not do it.” Cirillo tells us to “Keep it short and straightforward, with the bare minimum of cells required,” and recommends using Excel for searching out bad data, as “It helps export and import clean data to the system.”
Bad formatting can also make it hard to segment, says Sarah Mead of SmartBug Media. “When we run analytics and want to see how many leads are coming in within a certain demographic, it’s likely this data will be inaccurate and/or difficult to pull in a scenario when we’re not consistent with the information we’re requesting.”
Says Mead, “With one client, we went through a situation where we were looking to identify more people in the database who aligned with a specific persona. However, there were a lot of missing job titles and personal information so we were unable to identify who fit this role.”
While the problem could be fixed, it took some work. “We created smart lists within HubSpot that included ‘emails that contained xx’ – this allowed us to look for certain roles and organization types that aligned with this persona.” The team then “Created smart lists within HubSpot that included behavior-based roles. On the website, there were a few blog posts very specific to this persona, so we assigned people who spent time on those blog posts/converted.” Then, “We enrolled inactive database members in lead nurturing. During one of the first lead nurture emails we had recipients self select their persona so we were able to assign them to a group.”
Fortunately, this was possible within HubSpot. Mead says, “These tools help to create different queries that are based on certain criteria most relevant to specific form fields/contact data we’re looking to identify.”
One good way to reduce this improper formatting is to limit the choices, says Andrea Moxham of Horseshoe + co. While open text fields are available, they may create “Inability to segment properly and inaccurate personalization tokens used in marketing.” To combat this, Moxham suggests users “Sort through the types of answers your leads are entering into that open text field. Then, categorize those answers and turn it into a dropdown or checkbox field.”
Moxham uses HubSpot, as it “Flags duplicate contacts, makes it easy to see all your properties at a glance, and the type of field it is. You can then edit that field quickly by adding preselected dropdowns or checkboxes.”
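If the open text answers have been exported to a spreadsheet, the sorting and categorizing step Moxham describes can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a HubSpot feature: the `CANONICAL` mapping and its category names are hypothetical, and in practice you would build the mapping by reviewing the tally first.

```python
from collections import Counter

# Hypothetical mapping from free-text variants to dropdown options.
# Build your own after reviewing what leads actually typed.
CANONICAL = {
    "vp marketing": "Marketing",
    "marketing manager": "Marketing",
    "cmo": "Marketing",
    "sales rep": "Sales",
    "account executive": "Sales",
}


def normalize(text):
    """Lowercase and collapse whitespace so near-identical answers compare equal."""
    return " ".join(text.lower().split())


def tally_answers(answers):
    """Count normalized answers so the most common ones can be reviewed first."""
    return Counter(normalize(a) for a in answers if a and a.strip())


def to_dropdown(answer, default="Other"):
    """Map one free-text answer to a dropdown option, falling back to a default."""
    return CANONICAL.get(normalize(answer or ""), default)
```

`tally_answers` shows which variants are worth a dropdown option of their own, and `to_dropdown` converts existing values once the options are agreed. Anything that lands in "Other" is a candidate for manual review.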
Mariana Godsen of Remotish says, “We see a lot of cases where companies don’t have a process to bulk upload new contacts. That means several team members will upload spreadsheets formatted differently, without verifying if there is at least one right email associated with one first and last name.”
It can generate dupes, but “also, property fields that are essential to your business might not be filled, preventing you from filtering the contacts that were imported for a specific reason. You end up with a large database and being able to do very little with these records.”
To deal with this, Godsen recommends users “First of all, create a process for bulk importing, and assign a team member to be in charge. That can include having a sample spreadsheet your team can always refer to make sure they have the appropriate columns in place.”
“If the messy upload was already done, we do a series of exports and re-imports until we get to a list of records that are safe to delete. With the records that stayed, you can always do another round of export and import to update property fields.”
“We are HubSpot users, so we have a great process of cleaning up a spreadsheet following HubSpot’s best practices before bulk importing. When importing in HubSpot, the tool also shows you if there are errors in your spreadsheet, which usually are columns not matching properties, and rows missing information.”
“It’s very important to download the error spreadsheet and address it immediately,” says Godsen, “by fixing these errors and reuploading the remaining records that need to be fixed. A lot of people skip this step.”
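The two error types Godsen mentions, columns that don’t match properties and rows missing information, can also be caught before the file is uploaded. The sketch below is a pre-import check written in plain Python, assuming a CSV file and a hypothetical template (`TEMPLATE_COLUMNS`). It does not call HubSpot or reproduce its import validation.

```python
import csv
import io

# Hypothetical template; use the columns your own sample spreadsheet defines.
TEMPLATE_COLUMNS = ["email", "first_name", "last_name", "company"]


def check_import(csv_text, required=("email", "first_name", "last_name")):
    """Compare a CSV against the template and separate clean rows from error rows."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []

    # Columns that do not line up with the agreed template.
    missing_columns = [c for c in TEMPLATE_COLUMNS if c not in header]
    unexpected_columns = [c for c in header if c not in TEMPLATE_COLUMNS]

    good_rows, error_rows = [], []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        blanks = [f for f in required if not (row.get(f) or "").strip()]
        if blanks:
            error_rows.append({"line": line_no, "missing": blanks})
        else:
            good_rows.append(row)

    return {
        "missing_columns": missing_columns,
        "unexpected_columns": unexpected_columns,
        "good_rows": good_rows,
        "error_rows": error_rows,
    }
```

The `error_rows` list plays the same role as the error spreadsheet: fix those rows and re-run the check before importing anything.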
Daniel Cooper of Lolly Lead Generation has seen a lot in terms of data issues. “As a lead generation company, we specialise in consulting companies on their lead generation efforts,” which includes data cleansing. “One such UK corporate client who will remain anonymous approached us with concerns that their data was polluted within their CRM systems which meant they could not use this data for email marketing. Upon investigation, we found a dataset which consisted of over 1.1 million contacts with words in places of names, incorrect spelling of names, invalid emails, invalid phone numbers.” It even had the issue of profanity in the data.
“It was a serious issue and meant that they could not automate outreach for marketing purposes at the risk of doing more harm than good – no one wants to receive a piece of email marketing with the wrong name— or worse— profanity in place of the customer’s name.”
Said Cooper, “We got to work creating code (Python-based) to crunch the dataset: We collected a ‘profanity list’ and flagged all entries that looked suspect, and using an English dictionary file we flagged any names matching entries in the dictionary. Then, any email or phone number that was invalid became flagged. A common name dataset was used against all names flagging misspellings.”
“Once we had the potentially polluted data (just over 276,000 entries) our manual process began fixing all of the entries by hand. The results of all of this hard work resulted in our client being able to reach out monthly to this large database without the risk of damage, increasing their sales a whopping 18%. A lot of work, but a lot of reward for the client.”
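Cooper’s code is not shown, but the flagging pass he describes can be approximated as follows. Treat this as a hedged sketch: the three word sets are tiny placeholders standing in for the profanity list, dictionary file and common-name dataset, and the email pattern and the 7 to 15 digit phone rule are simplifying assumptions, not the checks his team actually used.

```python
import re

# Placeholder word lists; the real lists are not part of the source.
PROFANITY = {"darn", "heck"}
DICTIONARY_WORDS = {"test", "none", "admin", "table"}
COMMON_NAMES = {"john", "jane", "maria", "david"}

# Deliberately loose syntax check; it does not prove the mailbox exists.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def flag_contact(contact):
    """Return the reasons a record should be sent for manual review."""
    flags = []
    name = (contact.get("first_name") or "").strip().lower()

    if name in PROFANITY:
        flags.append("profanity")
    if name in DICTIONARY_WORDS:
        flags.append("dictionary_word")
    if name and name not in COMMON_NAMES:
        flags.append("uncommon_or_misspelled_name")

    if not EMAIL_RE.match(contact.get("email") or ""):
        flags.append("invalid_email")

    # Assumed rule: keep only digits and expect between 7 and 15 of them.
    digits = re.sub(r"\D", "", contact.get("phone") or "")
    if not 7 <= len(digits) <= 15:
        flags.append("invalid_phone")

    return flags
```

As in Cooper’s account, the flags only narrow down what a person needs to look at. A name missing from the common-name list may be perfectly valid, so flagged records are reviewed by hand rather than deleted automatically.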
The team uses Monday and the NeverBounce API. “We utilize the advanced APIs in order to automate the process of confirming and adjusting data as it comes into our CRM in order to reduce bounce rates and incorrect name formatting.”
How long has your business been collecting prospect and customer information in your database? Probably as long as your company has been around. Most companies have customer records that they use in marketing campaigns that are years old.
Source: Neil Patel
If you have customer or company records in your CRM that are more than two years old, there is a very good chance that some of their information has changed since it was last edited.
Outdated data is one of the most common forms of bad data in CRM and marketing databases. Just take a look at these statistics from Neil Patel:
That’s a lot of data turnover each and every year. Worse, that’s a lot of incorrect data that you are using to fuel your marketing initiatives.
Installing data quality initiatives that help you to identify records within your HubSpot, Salesforce, Intercom or other CRM database that contain outdated information will help you to better serve your customers, save your team time, and ultimately run marketing campaigns that deliver a higher ROI.
Will Cannon of UpLead cites this as a problem leading to “poor outbound email/phone campaigns for your sales team.” Cannon recommends users “Use an email verification and/or contact database tool to clean your CRM data in real-time.” Cannon uses Accurata and BriteVerify to “Keep our CRM data clean and up to date. [It saves] our outbound sales team by having clean leads to work with.”
The cost of this is high, and preventable, according to Moaaz Nagori of Cloudlead. “Messy CRMs cost your sales, marketing, and overall revenue operations a lot of money.” As an example, Nagori offers, “A client of ours was contacting people who had left the organization for quite some time. This resulted in a waste of effort.” Nagori believes starting with a good process is important, “To combat data decay we initially screen and verify data so it’s most updated. Secondly, we always conduct a cleanup every 3-6 months. It always keeps our entire pipeline in check.” To do this, Nagori uses Cloudlead’s own formatting and verification tool, as well as Clearbit. “They improve our efficiency manyfold, but we also resort to manual cleaning which is still the most effective.”
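A periodic cleanup like the one Nagori describes starts with knowing which records are due for re-verification. Here is a minimal Python sketch, assuming each record carries a hypothetical `last_verified` date (or `None` if it was never verified) and using 180 days as a stand-in for the upper end of a 3-6 month cycle.

```python
from datetime import date, timedelta


def stale_contacts(contacts, today, max_age_days=180):
    """Return contacts never verified, or last verified before the cutoff date."""
    cutoff = today - timedelta(days=max_age_days)
    return [
        c for c in contacts
        if c.get("last_verified") is None or c["last_verified"] < cutoff
    ]


# Example data; the IDs and dates are made up for illustration.
contacts = [
    {"id": "a", "last_verified": date(2022, 8, 1)},
    {"id": "b", "last_verified": date(2021, 1, 15)},
    {"id": "c", "last_verified": None},
]
due = stale_contacts(contacts, today=date(2022, 9, 26))
```

The resulting list is what would be handed to a verification tool or a manual review. The sketch only selects records by age and makes no judgment about whether the data in them has actually changed.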
The high rate of bounced emails is the main concern for Gill Walker of Opsis. “I use out of office responses and LinkedIn to update the data where possible. Where updated information is not available, I ensure that the email is removed so that the same bounces do not continue to occur.” Walker also uses Duplicate Detector and Excel to weed out bad data. “Excel has powerful data manipulation functionality.”
Melanie Hartmann of Creo Home Solutions points out the problem of outdated information within your analytics. “It appears that we have many more active leads than we actually do.”
To fix this, “Once per quarter we review leads that do not have any tasks assigned to them. Then they are marked ‘dead.’ Or as sometimes happens, an active lead was not assigned a task. So for those, a task is added to put the lead back on follow-up.”
Hartmann currently uses VLOOKUP for these purposes but explains the company is “still looking for ways to do this more quickly/easily… Hopefully to save time by cutting down on the actions needed to complete the tasks manually.”
Gem Latimer of BabelQuest also recognizes the inflation issues that can come with bad data. When a CRM is “Full of unsubscribes and hard bounces, you just don’t have a realistic view of the size of your audience. This can impact email sends and how you’re segmenting your data.”
Says Latimer, they handle this issue via “Regular clean-ups, by creating lists that update automatically and can be reviewed. Unbouncing any contacts who have been wrongly bounced, and deleting those who should no longer be in the CRM.”
The team likes HubSpot CRM for its “easy visibility and ability to make changes immediately.” The team also uses Google Sheets, which can “allow you to really drill into data, make changes and upload when clean.”
Data management is as much about upkeep as setup, says Jered Martin of OnePitch. “One common mistake we see consistently is not updating contact records with information from phone calls. Without vital information such as feedback or pain points, we aren’t able to provide more personalized service for our customers, or even have to ask them the same or similar questions repeatedly.”
To address it, “We ask each team member to take detailed notes for each call formulated through a standardized template which can then easily be input into our CRM.”
The team utilizes Salesforce Record Updates to help keep data fresh. “This helps us update contact or account records based on new information we collect through our direct conversations such as role changes, title changes, updated email addresses and more.”
Duplicate data is perhaps the most common form of bad data outside of simply outdated data. In some marketing datasets, duplicate data is the single biggest issue. Experts have found that duplication rates as high as 10%-30% are common in many marketing databases.
Duplicate data refers to any data that is contained in multiple records within a marketing database. When most people think of duplicate data, they think of exact duplicate records where all of the information is directly copied to a completely new record. That’s a very common form of duplicate data, often from errors that occurred when the data was imported and exported. Duplicates can also come from mistaken manual entry, where a marketing or sales rep accidentally creates a new record for a prospect or customer that already exists within your database.
But duplicate data is so much more than exact 100% duplicates. The most common form of duplicate data in a marketing database is partial duplicates. Partial duplicates can be created the same way that exact duplicates are — from import, export, or human entry error — but often contain only a portion of the same data.
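The difference between exact and partial duplicates can be made concrete with a short Python sketch. The assumptions here are that records are plain dictionaries with an `id` field and that email is a reasonable matching key. Real tools typically offer several matching rules, so treat this as an illustration of the idea, not a full deduplication method.

```python
from collections import defaultdict


def normalize(value):
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(str(value or "").lower().split())


def find_duplicates(contacts, key_fields=("email",)):
    """Group records sharing the key fields, split into exact and partial duplicates."""
    groups = defaultdict(list)
    for contact in contacts:
        key = tuple(normalize(contact.get(f)) for f in key_fields)
        if all(key):  # skip records where a key field is blank
            groups[key].append(contact)

    exact, partial = [], []
    for records in groups.values():
        if len(records) < 2:
            continue
        ids = [r["id"] for r in records]
        # Fingerprint every field except the record ID.
        fingerprints = {
            tuple(sorted((k, normalize(v)) for k, v in r.items() if k != "id"))
            for r in records
        }
        # One fingerprint means every field matches; more means partial overlap.
        (exact if len(fingerprints) == 1 else partial).append(ids)
    return exact, partial
```

Exact groups can usually be merged without much thought. Partial groups are where the harder question comes up of which record holds the correct value for each field.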
Karoline Kujawa of ClearPivot sums it up best. “Duplicate contact records. Ugh.”
One of the biggest issues with dupes is this: “Duplicate contact records can result in overlap of communication. For example, your sales representatives might reach out to the same person twice. Or your marketing email might be sent to the same person multiple times if they have multiple email addresses in HubSpot. So it’s important that whoever finds the duplicate record update accordingly.”
The good news, as Kujawa explains: “It’s actually very simple to fix it if you are using HubSpot (which we do). When you find the two records that are the same person you want to MERGE the two records into one. HubSpot will record this activity and then sales representatives can also leave notes clarifying for the team if they have reached out already, etc. Everything is managed within HubSpot and it allows us to record all activity in this platform.”
This can have an impact on geographic marketing, according to Maegan Revak from Lynton Web. “Let’s say you are sending an email out to those in a certain geographical region but there are multiple State and Region fields. Which one(s) do you use? Removing duplicate fields helps avoid confusion in list segmentation and ensures you are reaching the right people.”
Revak reminds us that “standardizing how to use certain fields across your marketing and sales teams is important,” and that “regular database audits can keep your data clean and accurate.”
Revak uses BriteVerify for this purpose. “These tools help automate the cleanup of messy databases and catch any data issues.”
Katie Norton of Concora points out a recurring problem with dupes: “No one is sure which record to edit, details will be scattered across dups and sales operations get very disorganized very quickly.” Norton says that a sticky situation led them to Insycle. “Before Insycle, we would bulk export, manually merge, and re-import as well as merge as we came across them. Then we encountered a sync error that started creating hundreds of dups a day. Without Insycle, managing this problem while we worked on resolving the source of the error (which took about a month) would have taken many hours of manual work. Our CRM would have been out of commission for a month if not for [them].”
Knowing the source of dupes and incorrect data is one of the best ways to proactively maintain database health.
It can get pretty deep, according to Chris Handy of ClosedWon.com. “The most common mistake I see is that marketers have created seven different versions of the same form field. In many cases, it has to do with ‘preferred contact method’.”
“The problem with this information living in so many different formats and places is that it can never be used for segmentation across the entire database without accounting for all of the stray fields. When this occurs, there is no easy fix.”
Says Handy, “Unfortunately, you just have to roll up your sleeves and get to work. In platforms like HubSpot, you can create smart lists that help you organize the contacts in a logical way. Organizing via smart lists gets you to the point where you can run workflow automation to properly mark the ‘one true’ field on each one.”
“This is time-consuming,” warns Handy, “but worth it. Creating a fail-safe list that checks for any records without the ‘one true’ field helps you know when to delete the other ones forever. It’s worth it, but you then need to put a strict process in place on creating new fields for forms.”
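Handy does this with smart lists and workflows inside HubSpot. The same logic, applied to an exported contact list, might look like the Python sketch below. The stray field names and the canonical field name are invented for illustration, since the source only says there were several versions of a “preferred contact method” field.

```python
# Hypothetical field names; substitute the stray properties in your own CRM.
STRAY_FIELDS = ["preferred_contact", "contact_pref", "Preferred Contact Method (old)"]
CANONICAL_FIELD = "preferred_contact_method"


def consolidate(contact):
    """Fill the canonical field from the first non-empty stray field, if it is empty."""
    updated = dict(contact)  # leave the original record untouched
    if not (updated.get(CANONICAL_FIELD) or "").strip():
        for field in STRAY_FIELDS:
            value = (updated.get(field) or "").strip()
            if value:
                updated[CANONICAL_FIELD] = value.lower()
                break
    return updated


def fail_safe_list(contacts):
    """IDs of records that hold a stray value but still lack the canonical field."""
    return [
        c["id"] for c in contacts
        if not (c.get(CANONICAL_FIELD) or "").strip()
        and any((c.get(f) or "").strip() for f in STRAY_FIELDS)
    ]
```

Once `fail_safe_list` returns an empty list for the whole database, no value would be lost by deleting the stray fields. That is the signal Handy describes for removing them permanently.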
Melanie Musson of 360quoteLLC makes duplicate data remedies a monthly practice. Says Musson, “The database isn’t streamlined if there is duplicate data, and it takes more time to wade through it. It’s also easy to have inaccuracies.” Musson says the regular practice of database cleansing is “very easy, it just needs to be done.” Musson uses the Duplicate Detector tool Dropcontact to clean things up.
“Sometimes, reps will create a new entry and only have a portion of the data on hand that exists in the other record. A customer might fill out a form a year apart, after much of their information has changed, resulting in two similar but different records.”
“Other times, records start as exact duplicates. Over time, only one of the two records will be updated with new data as it is collected, leading to two similar records with partial matching information. This can be confusing as your reps will have a hard time distinguishing what the right record is. Over time, the two records may receive their own updates and diverge, keeping your marketing and sales reps from having a single customer view.”
Duplicate data makes things harder than they have to be, points out Kimmie Champlin of Clutch.co. “When data is duplicated, your CRM is significantly harder to use. Both your sales team that rely on the CRM to manage their daily productivity and your automation efforts will struggle to find the most appropriate point of contact. Ultimately, emails get sent or calls get made to the wrong person or wrong place, resulting in lost opportunities.”
Like Musson, they use regular maintenance to combat bad data. “We have implemented a weekly process to check for duplicate records according to multiple criteria, not just according to email addresses or company domains. We use an external tool to identify and manage these issues, but it’s also clear that manual data processing is unavoidable. All users in our CRM are responsible for understanding what duplicate data is and reporting it so that it can be addressed by our team.”
Salesforce Record Updates and Slack for communication help keep CRM issues visible for the CRM team. Champlin also lauds Insycle, whose app “allows us to quickly identify duplicates based on criteria we consider strong indicators of problems in the CRM. Using priority match rules, we can merge thousands of records at a time.”
Giselle Bardwell of Kiwi Creative mentions that it’s not always easy to tell which entry should serve as the source of truth. “CRMs sometimes have difficulty figuring out which piece of data is most recent or valuable and should supersede the other (e.g., multiple email addresses).”
Duplicates can cost you money in more than just ROI. Says Bardwell, “Our CRM charges by the number of contacts, so having multiple contact records for the same person can be costly! Outside of that, we’ve had situations where we’ve contacted the same person with the same offer multiple times because they were duplicated in our database. This has resulted in terrible customer experience and loss of a couple of prospective customers. As a result, we go through great lengths to clean up our CRM regularly.”
To address it, “We take the time each quarter to go through our CRM and look for duplicate and irrelevant entries. If the information between two cards is in conflict, we use the date of the last activity to show us which is likely to be the up-to-date email address, phone number, employer, etc. While this task can be time-consuming, breaking it up into smaller chunks helps. But, in the long run, it helps to keep your CRM accurate and efficient for all our lead engagement, marketing, and sales needs.”
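The last-activity rule Bardwell describes can be sketched in a few lines of Python. This is only an illustration, not how HubSpot or Excel resolves conflicts: the record layout and the field names (`email`, `phone`, `employer`, `last_activity`) are hypothetical, and the sketch assumes that blank fields on the most recent record should be backfilled from older ones.

```python
from datetime import date


def merge_duplicates(records):
    """Merge records already identified as the same person.

    The record with the most recent `last_activity` wins any conflict;
    older records only fill in fields the newer ones leave blank.
    """
    # Newest first, so values from recent records take priority.
    ordered = sorted(records, key=lambda r: r["last_activity"], reverse=True)
    merged = {}
    for record in ordered:
        for field, value in record.items():
            if value and not merged.get(field):
                merged[field] = value
    return merged


# Hypothetical duplicate pair, for illustration only.
older = {"email": "pat@oldco.example", "phone": "555-0100",
         "employer": "OldCo", "last_activity": date(2019, 3, 1)}
newer = {"email": "pat@newco.example", "phone": "",
         "employer": "NewCo", "last_activity": date(2020, 6, 15)}

print(merge_duplicates([older, newer]))
```

Here the newer record supplies the email and employer, while the phone number survives from the older record because the newer one left it blank.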
The team does this using Excel and HubSpot in concert. “HubSpot has features that help us identify duplicates. Excel has the same capabilities and we are able to identify duplicates in bulk.”
David LaPlante of PropertyRadar says, “Duplicates wreak havoc on lifecycle and/or event/attribute specific decision-based messaging.”
LaPlante gives an example: A “real” customer entry has a ‘customer’ attribute, and will only receive messages intended for customers; however, Duplicate #1 does not have a customer attribute and “receives messages treating them as if they are not a customer. The customer gets confused, offended, and takes an unintended action because of the erroneous message. Now you have to manually do damage control which is costly.”
Says LaPlante, “Duplicates are typically easy to identify by some data types. Some not as much and entity resolution is not always perfect. And events that cut across duplicates are messy at best to merge and timely messaging opportunities can be lost. Merging can be a real pain across an ecosystem of data-dependent applications, especially to automate and operate at scale. Some applications make this easy. Some do not.” The team uses Insycle, and LaPlante explains that the app allows them to identify duplicates, do the bulk of resolution automatically, and (if needed) resolve manually, though he points out that manual resolution is the least efficient, most expensive remedy.
Paula Skaper of Kinetix Media Communications Ltd makes an excellent point that bad data “can result in breaching spam laws unintentionally or not including warm leads in marketing outreach due to incomplete consent records. The list is pretty long of the things that can go wrong.”
Says Skaper of the remedy, “The approach varies based on the size of the list, the life-stage of the lead and the urgency of the cleanup. I.e. dirty subscriber data often isn’t worth cleaning if the subscriber isn’t engaging with content on a regular basis, and it’s more efficient to simply delete questionable records while on the other end of the spectrum, it’s worth having a human validate the data for high-value customer records.”
Kinetix uses Insycle and list segmentation to find “at-risk” records. Says Skaper, “Excel/CSV files are an affordable way to extract, clean and bulk update CRM data for clients where the errors are repetitive, similar and/or the volume of data is relatively small and manageable; we also encourage the use of tags or some other flagging method to identify individual records that ‘need updating’ so that individual users can submit a record for review if they come across questionable data in the course of everyday use.”
However, “the most valuable is the use of third party specialists who can programmatically clean a file to ensure consistent and complete address data for the bulk of records, reducing the manual labor needed to only the most incomplete/messy data.”
To reduce the occurrence of dupes before they multiply, Teodora Pirciu of Impressa Solutions says “We try to implement smarter ways to gather information with better forms. Also, fix things manually when possible.”
Denis Zhinko of ScienceSoft says of duplicates, “The main consequences are hampered sales efforts and decreased employee productivity, which may result in lost opportunities and wasted time. For instance, if a sales rep calls or emails the same customer multiple times because of duplicate info in CRM, instead of contacting different customers, it affects the rep’s productivity.”
Says Zhinko, “Our in-house Dynamics 365 team enables automated deduplication with duplicate detection rules that help to identify duplicate records. The team can also create custom data validation rules. They prevent the entry of data not compatible with the quality standards of the data in our company. They help to efficiently manage duplicate records and set up standards for data input, which allows preventing duplicates.”
Morgan O’Mara of Altvia Solutions says, “Using bad data can affect your entire organization’s efforts, from marketing to the accounting department. Using bad data not only reduces your outreach efforts, but it can also be bad for your reputation.”
To address the issue, “We have created company-wide processes around naming conventions. This clean data helps every team in our company. We have also created automated task reminders for particular house-keeping. For example, if we learn a user leaves a company, a task reminder is set for 4 weeks later to try and find out what company they work at now.”
The team uses out-of-the-box Microsoft Dynamics 365 functionality (duplicate detection rules) and custom data validation rules along with Salesforce to achieve these clean data goals.
Improperly formatted data is very common and is often the result of manual data entry errors. When a customer fills out your contact form, if you don’t have the right validation requirements in place, you’ll quickly find that prospects and customers often enter information in different ways. The same can be said for marketing and sales reps that manually add data to your database. Over time, your data will be inconsistent, making it difficult to use in your marketing campaigns.
Some of the common ways we see improperly formatted data in marketing databases include:
Phone numbers are a perennial issue in data cleansing. Ben Demers of Atlantis cites it specifically in a response. “It’s impossible to call contacts using our VOIP system since it can’t read phone numbers in certain formats!” That makes for a lot of manual work, or worse, the potential for missed opportunities. “We use InSycle to automatically format our phone numbers to the correct format. We have the formatter run every hour.”
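To make the idea of automatic phone formatting concrete, here is a minimal Python sketch. It is not Insycle's implementation. It assumes US-style 10-digit numbers and a `+1XXXXXXXXXX` target format, and the function name `format_us_phone` is hypothetical.

```python
import re


def format_us_phone(raw):
    """Return a +1XXXXXXXXXX string, or None if the input is not a
    10-digit number (optionally prefixed with the country code 1)."""
    digits = re.sub(r"\D", "", raw)  # keep digits only
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # drop the leading country code
    if len(digits) != 10:
        return None  # leave anything unexpected for manual review
    return "+1" + digits


for raw in ["(555) 010-0199", "555.010.0199", "1-555-010-0199", "0199"]:
    print(repr(raw), "->", format_us_phone(raw))
```

Returning `None` rather than guessing keeps malformed entries visible, so they can be reviewed instead of being silently rewritten.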
One error that can become obvious right off the bat is names and salutations, according to Manpreet Singh Dhody of Ace Data. “Prospects or customers can really take offense if addressed incorrectly. The nature of our business revolves around creating personalized marketing messages. Identifying the right target audience becomes very crucial in communicating these messages. So as a process, we start with choosing our target audience and filtering them out. Verifying the basic data which will be used [in our marketing and sales] has benefitted us. So if we choose, let’s say ‘Suspects’ as a lifecycle stage, filtering them out and then checking if the Salutations are correct becomes important.”
The team uses HubSpot but also relies upon common-sense processes, saying that “Using a tool is simple, but having a plan on how to use it is something we ask ourselves.”
Standardization, therefore, becomes one of the most important ways to combat bad data. Jennifer Lux of LyntonWeb highlights how even small variations in data can wreak havoc on segmentation. “If you are using poor data for personalization and segmentation, you risk emailing the wrong person or using automation that doesn’t make sense to the recipient.”
“For example,” says Lux, “we may want to segment the data by job title and use that personalization token in an outreach email. When we ask contacts to fill in job title, we may get errors in capitalization, and different formats that make segmentation tricky.” To remedy this, says Lux, “we can use logic such as ‘contains VP, Vice President, Vice Pres’ and other variations to ensure we capture the right audience segment for a targeted email to this potential buyer.” To assist in this process, Lux and the team at LyntonWeb have developed a custom API that allows them to update data in HubSpot. They also use CRMfresh. These apps “help us organize large amounts of data and automate changes.”
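The “contains VP, Vice President, Vice Pres” logic can be illustrated with a small Python sketch. This is not LyntonWeb's custom API or a HubSpot filter. The pattern, the function name `is_vp`, and the sample titles are all assumptions made for illustration.

```python
import re

# Assumed spellings of one seniority level; extend as new variants appear.
VP_PATTERN = re.compile(r"\b(?:vp|vice\s+pres(?:ident)?)\b", re.IGNORECASE)


def is_vp(job_title):
    """True if a free-text job title looks like a vice president role."""
    # Drop periods so "V.P." and "Vice Pres." reduce to "VP" and "Vice Pres".
    return bool(VP_PATTERN.search(job_title.replace(".", "")))


titles = ["VP of Sales", "vice president, marketing", "Vice Pres. Ops",
          "V.P. Finance", "Marketing Manager"]
segment = [t for t in titles if is_vp(t)]
print(segment)
```

Because the pattern uses word boundaries, titles such as “SVP” or “AVP” are not matched. Whether those belong in the same segment is a business decision, and the pattern would need to be widened if they do.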
As mentioned above, phone numbers present a number of challenges. This is the case for Sam Orchard of Edge of the Web, who says, “We don’t validate on the front end because of the different possible formats and international codes people might enter – so we simply allow free text.” However, “As a result, some of our telephone numbers contain multiple phone numbers separated with slashes, or with qualifiers such as ‘Call after 6pm’.”
Says Orchard, “It makes it difficult to send any marketing text messages because there are so many telephone numbers in our CRM that wouldn’t be considered valid.” When considering how to resolve the issue, “One option is to develop robust validation for telephone numbers on lead generation forms. This would mean that we can rely on the phone numbers always being valid. Alternatively, the system we use to send our SMS marketing can be improved to be intelligent enough to extract the telephone numbers from the CRM and ignore the additional text or separators.” To do this, Orchard and the team employ a custom API. “As we have in-house developers, we often write our own tools to sanitize data and remove unnecessary whitespace or superfluous characters.”
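The second option Orchard mentions, pulling usable numbers out of a free-text field, might look something like the Python sketch below. It is not Edge of the Web's tool. The function name, the seven-digit minimum, and the sample input are assumptions, and a production system would still need per-country validation.

```python
import re

# A run of digits and common separators, optionally starting with "+".
CANDIDATE = re.compile(r"\+?\d[\d\s().-]*\d")


def extract_phone_numbers(free_text, min_digits=7):
    """Pull phone-number-like strings out of a free-text CRM field."""
    numbers = []
    for part in free_text.split("/"):  # several numbers separated by slashes
        for match in CANDIDATE.finditer(part):
            digits = re.sub(r"\D", "", match.group())
            if len(digits) >= min_digits:  # ignore fragments like "6" in "6pm"
                prefix = "+" if match.group().startswith("+") else ""
                numbers.append(prefix + digits)
    return numbers


print(extract_phone_numbers("01234 567890 / 07700 900123 Call after 6pm"))
```

Qualifiers such as “Call after 6pm” are simply ignored, since they never contain enough digits to look like a phone number.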
These are just a few of the dozens of different data formatting and standardization issues that exist in any marketing database. Data standardization is important for consistency, efficiency, and the perception of your recipients.
Sometimes, data formatting is corrupted by moving it, such as an integration concatenating several fields of data into a single new field. Adam Bockler of ONEFIRE mentions this: “A common challenge with databases is that contact attributes (tags, categories, etc.) are jammed into one column when the data is exported from a system.” This often happens, Bockler says, when moving their clients from manual systems into a “true CRM.” Says Bockler, “The attributes they’d like to segment their contacts by are all included in the exact same column, making it difficult to sort by those properties.”
“We have fixed this issue by creating columns for each specific property. Then, we’ve written a formula that parses the data in the exported column and separates it out into the new columns that we’ve made. We will use that spreadsheet to import the data into the CRM so that the properties will be up to date as soon as the data is imported.” Bockler uses Excel and Google Sheets to preview and amend data in this way. “Google Sheets allows our team to collaborate on the exported/combined data we look to import to the CRM.”
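Bockler does this with spreadsheet formulas. The same parsing step can be sketched in Python for readers who prefer a script. The layout of the combined column (`Name=Value` pairs separated by semicolons) and the column name `attributes` are assumptions, since real exports vary by system.

```python
def split_attributes(row, combined_column="attributes"):
    """Return a copy of `row` with the combined column expanded into
    one key per attribute."""
    expanded = {k: v for k, v in row.items() if k != combined_column}
    for pair in row.get(combined_column, "").split(";"):
        if "=" not in pair:
            continue  # skip empty or malformed fragments
        name, value = pair.split("=", 1)
        expanded[name.strip().lower()] = value.strip()
    return expanded


# Hypothetical exported row, for illustration only.
row = {"email": "pat@example.com",
       "attributes": "Industry=Retail; Region=Midwest; Tier=Gold"}
print(split_attributes(row))
```

Each attribute ends up in its own column, which is the shape a CRM import typically expects.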
Michael Bibla of Atomic Reach points to “Data points being inputted that you cannot report on,” the result being “inaccurate reporting or engagement data.” To reduce the impact of this, Bibla recommends “Ensuring that all data points are standardized and reportable from an insights perspective.” To do this, Bibla uses Google Sheets, which “help automate the process and allow for clean data entry.”
Tom Chanter explains a manual list approach that helps keep things organized. “My biggest issue is having different types of users stuck on one list.” This can make it difficult to send targeted emails. To combat this difficulty in a more tech-light manner, Chanter separated the list into three parts. “The first two are what everyone does–active users I email more frequently and less active users I email occasionally. But the third group was people who were influential. For them, I’d tailor emails individually because I tried it once and saw the impact immediately.” Chanter also employs Zapier and ZeroBounce to keep lists high-quality.
Jeremy Cross of Team Building Seattle struggles with bad formatting when it comes time to send mail. “Our sales team makes creative use of the name field, for example: appending the product that the lead is interested in.” However, the result of this? “Our marketing team occasionally downloads a contact list from our CRM that they will upload to an email platform. When the name or email fields don’t have accurate data it causes issues with email formatting and using variables in the email like ‘Hi NAME’.”
“We’ve moved toward automation of our CRM management, including using a tool called Zapier. When a lead fills out a form on our site, Zapier checks and modifies the formatting, and then adds the lead to our CRM. While this automation hasn’t completely eliminated the issue, it has improved it significantly.”
Get ahead of the data, or it will get ahead of you, says COFORGE’s Eric Melillo, who explains the consequences of not managing incoming data. “When I see… ‘Hey eric’ for an email intro, I know it’s all wrong. I opt-out and send for deletion immediately. It’s also easy to see when you look at your opt-out list and see the correlation.”
The fix? “The trick is to not allow data into your CRM until it’s been sanitized first. We use Zapier to fully sanitize and standardize all pieces of data such as email, splitting of first and last name, capitalization and several others. We use Zapier’s built-in ‘Formatter’ tool that will capitalize any string of text. So, we have Zapier monitoring our HubSpot CRM for new contact entries (manual or automated), then we run it through our Zap to cleanse our data. It’s magic and saves us a ton of time on our semi-annual data check.” Since there’s always room for improvement, Melillo says “We are still looking for ways to do this more quickly/easily” but enjoys that these tools “have built-in functions to help us sort, find and fix bad data.”
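The kind of sanitizing Melillo describes (splitting first and last name, fixing capitalization, tidying email) can be sketched in plain Python. This is not Zapier's Formatter. The function names are hypothetical, and the sketch assumes the first word of the name field is the first name.

```python
def clean_name(full_name):
    """Split a full name into (first, last) and capitalize each part."""
    parts = full_name.split()  # also collapses stray whitespace
    if not parts:
        return "", ""
    first = parts[0].capitalize()
    last = " ".join(p.capitalize() for p in parts[1:])
    return first, last


def clean_email(email):
    """Trim whitespace and lowercase an email address."""
    return email.strip().lower()


print(clean_name("  eric   MELILLO "))
print(clean_email(" Eric@Example.COM "))
```

Simple capitalization has limits: names such as “McDonald” or “O’Mara” would come out as “Mcdonald” and “O’mara”, so a real workflow may want an exceptions list or a manual review step for edge cases.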
These problems crop up for Josh Ames of Impulse Creative as well. “With most companies having some form of email automation in place having poorly-formatted data can, unfortunately, lead to a less than stellar user experience. For example; have you ever gotten an email with your name in all lowercase? You immediately know that it wasn’t sent as a 1:1 email.”
Says Ames, “Using Insycle, solving for this is super quick and easy. Using the Transform Data feature under contacts on a regular basis ensures that we have properly formatted data and that we are continuing to provide our users with a great experience. I’m a huge fan of using templates to make this even easier on our team and our clients so they can easily update data without worrying about messing anything up.”
“We can validate the emails in our contact lists using Neverbounce and then run that same list through Insycle before importing into HubSpot to ensure that we have consistent, standardized data.”
Ellie-Paige Moore of The Bolt Way explains the human take on these careless errors. “It shows the customer that’s receiving the email that you haven’t taken the time and essentially, don’t care about them but are still wanting them to open your email.” Moore’s reaction is the same: “If I ever see a personalized email where my name is spelled incorrectly, I will delete it straight away without opening.”
The cost: “Not only have you lost the ability to market new products/services to them but you’re also increasing your churn rate of customers, which makes it harder for you in the long run to acquire new customers.”
It’s not a hard problem to fix through a good process. “To fix this issue, you simply have to be diligent when entering any personal details into your CRM system. If it’s your customers entering their details and they’ve misspelled, I always think it’s a good idea to send an email after a certain period of time to check that you have their correct details. This not only corrects any mistakes but also is showing you care about them.”
Reporting via Slack and searching with VLOOKUP are both useful for correcting issues, says Moore. “If you need to de-duplicate or remove certain people that have a similar field or characteristic, then any one of these tools will be able to do this for you in a batch rather than you scrolling through your list and doing it manually.”
As Drew Cohen of SmartBug Media says, “What goes in messy, comes out messier. If messy data enters the CRM and isn’t cleaned up, personalization tokens or other forms of segmentation that occur in the marketing automation tool can’t possibly be accurate. This results in poor email marketing and lead nurturing campaigns.”
It doesn’t have to be hard, says Cohen. “Unfortunately, many try to overcomplicate a CRM/marketing database cleanup project. Most often, doing an export of all properties and identifying which properties have contacts with values will be an eye-opening experience. It allows a business to quickly identify unused or under-utilized properties, remove as necessary, and can be an incredibly helpful first step to maximize the efficiency of the cleanup effort.” SmartBug uses Insycle for its HubSpot CRM. “HubSpot properties can ultimately tell us what properties have data in them, and that is the kickstarter that tells us what we should do next, e.g. delete an unused property, combine underutilized properties, etc.”
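The export-and-inspect step Cohen recommends can be approximated with a short Python script run against a CSV export. This is a sketch under assumptions: the column names and sample rows are invented, and it does not reflect any HubSpot or Insycle feature.

```python
import csv
import io


def property_fill_rates(rows):
    """Map each column name to the fraction of rows with a non-blank value."""
    rows = list(rows)
    if not rows:
        return {}
    counts = {}
    for row in rows:
        for name, value in row.items():
            if name is None:
                continue  # extra cells beyond the header row
            filled = 1 if (value or "").strip() else 0
            counts[name] = counts.get(name, 0) + filled
    return {name: filled / len(rows) for name, filled in counts.items()}


# Hypothetical export; in practice this would be open("export.csv").
export = io.StringIO(
    "email,job_title,fax\n"
    "a@example.com,VP Sales,\n"
    "b@example.com,,\n"
)
rates = property_fill_rates(csv.DictReader(export))
unused = [name for name, rate in rates.items() if rate == 0]
print(rates)
print("Never populated:", unused)
```

Properties with a fill rate of zero are candidates for deletion, and those with very low rates are candidates for merging or for a closer look at how the data is collected.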
Jack Moberger of Appcues sees many contacts not associated with companies because of a multitude of errors, resulting in reporting and lead routing issues. Moberger uses the roll-up-the-sleeves approach to “Manually associate contacts to companies based on company information via lookups.” However, Insycle makes this easier, eliminating most of the manual processes, with the exception of Excel.
Records that are missing vital data can be a serious problem. Sometimes this can be due to validation issues. If you allow a prospect to fill out your contact form without including their first name, that field will be missing until you can manually update it.
Most records within HubSpot or Salesforce databases have missing data. Companies simply collect too much information to ensure that every single field contains data for every person or company record. The important question is whether or not a record is missing critical data. Identifying and rectifying those records should be a part of any data quality improvement effort.
One commonly missing data point is owner assignments, according to Adrian Cordiner of Digital Rhinos. “Contacts such as MQLs/SQLs/Opportunities don’t have owners assigned to them when they should,” says Cordiner. As a result, “Contacts can fall through the cracks.”
“For example,” Cordiner explains, “if a contact is listed as an MQL but has no owner assigned, then often it won’t go through any escalation process, a salesperson won’t follow it up, etc. So it sits there without any follow up process, and ends up being a ‘wasted’ MQL.” To fix this, Cordiner “used to use Hubspot and create lists of any MQLs, SQLs or Opportunities with no owner attached, or run workflows to notify me of any contacts that had this issue occur. However, now I use Accurata which does this all for me automatically. Accurata will highlight any issues with the cleanliness of my Hubspot data, including MQLs/SQLs/Opportunities without an owner assigned.”
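The check Cordiner describes, listing sales-stage contacts with no owner, is easy to express in code. The Python sketch below works on plain dictionaries. The field names `lifecycle_stage` and `owner` are hypothetical and are not HubSpot's or Accurata's actual property names.

```python
# Lifecycle stages that should always have an owner (assumed labels).
SALES_STAGES = {"mql", "sql", "opportunity"}


def contacts_missing_owner(contacts):
    """Return contacts in a sales stage that have no owner assigned."""
    return [
        c for c in contacts
        if c.get("lifecycle_stage", "").lower() in SALES_STAGES
        and not c.get("owner")
    ]


# Hypothetical contact records, for illustration only.
contacts = [
    {"email": "a@example.com", "lifecycle_stage": "MQL", "owner": ""},
    {"email": "b@example.com", "lifecycle_stage": "SQL", "owner": "dana"},
    {"email": "c@example.com", "lifecycle_stage": "Subscriber", "owner": ""},
]
for contact in contacts_missing_owner(contacts):
    print("Needs an owner:", contact["email"])
```

Running a check like this on a schedule gives the same early warning as a saved list or notification workflow: an unowned MQL is flagged before it goes stale.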
Missing email is another common issue, according to Rodrigo Rivas of Gray Group International, the result being poorly targeted campaigns. Rivas recommends “Filtering contacts that don’t meet the criteria to be part of the database” and using a CRM Training program like CRMfresh “To maintain a CRM organized and free of useless contacts.”
Sometimes, the evolution of your database creates blank data fields, points out Thomas Bosilevac of MashMetrics, who often sees “Incomplete customer information dating back to when you didn’t collect certain information.” Because of this phenomenon, “When you want to start turning your contacts into dollars, your segmentation will be wrought with errors impacting sales, personalization will look sloppy, and worse your delivery rates may start to decrease for email campaigns.” Jokes Bosilevac, “Worse case, you ask for Mrs. Robinson, and it is Mr. Ronsinson!” Bosilevac uses Clearbit to address these issues. “[It] is one of our favorites because you can easily integrate it into all your site forms to grab data at the source,” allowing users to “quickly and often automatically cleanse and append your CRM database.”
Bill Greene of Bike Lane Business Development reminds us that bad data is a bad look when you’re trying to win engagement and business. Thus catching issues from integrations is important. “When [contacts] get put in via Zapier or another process, the names or the companies do not always get added, so it must be done manually later, and cleaned even later if not actively managed,” to avoid sending users “unsegmented and un-personalized emails” which Greene says “looks amateur.” Greene actively manages it on a daily basis, using Excel to see duplicates more clearly.
D’Ana Guiloff of Stratsmark explains that this inconsistent data collection creates “Hours of time wasted that must be spent cleaning up data and system connections.” This can be avoided, says Guiloff. “Clean up open text properties by converting them to specific types (check box, drop down, multi-select…).” To do this, Guiloff and the team use HubSpot. “They give us a starting point to figure out where the breakdown of data occurred.”
Data management is a team effort, says Ruth Callaghan of Cannings Purple. While it means everyone is responsible, it makes for some competing perspectives. “Multiple users prioritize different things – so sales focus on the market cap of the prospect, the consultants worry about the contacts and relationships and marketing focuses on traffic if the contact and their engagement. It means no one team owns the responsibility of complete record keeping. Even a simple question like ‘what sectors are our clients working in’ becomes hard to answer. We found an enormous number of clients didn’t even have basic data beyond a name and phone number.”
It’s much easier to maintain good data than to play catchup. “It’s detective work,” says Callaghan, “and Insycle makes this easy. Here’s an example: we found lots of corporate records had a postal address but not also accurate data in the ‘state’ field or ‘country’ field. We could set up searches that isolated the right details in the postal address and then rolled these automatically into the other fields.”
Both Insycle and another app, PieSync, work well with their CRM, Hubspot. “It is a quick way of identifying the records we need. We use Insycle to clean and tidy and PieSync to ensure the master records are sent every way they need to go.”
That team effort is a lot about mindset, explains Mila Aldo from Databox, who has seen the effects elsewhere. “In one of my previous jobs, we struggled with old, un-useful data as we had multiple databases and no one fully responsible for managing and updating those. The problem was selecting the right contacts for the events & campaigns, not having control over the Customer’s base, so if the Sales Person left, there were old/not relevant contacts in place, which made it difficult for the existing employees as well for the new ones, to identify the right contact.”
Explains Aldo, “The CRM system itself doesn’t solve the problem. The problem was in the mindset of the salespeople, who wanted to keep control over owning the Contacts and make themselves irreplaceable. Very old-fashioned thinking. And the direction should come from the Management, although it was never completely implemented.”
Says Aldo, “The thing that helped was determining which system will be a single source of truth. The Marketing Manager became responsible for managing and updating the database, although there was a lot of work to be done in changing the behaviour to transform people’s minds and help them understand why transparency is important for the company.”
Gabriel Marguglio of Nextiny agrees, saying that “If you don’t have an established process in place for what fields are important to you and how they will be used throughout your marketing and sales processes, your team may begin creating their own processes with their own relevant properties. This is a nightmare for a CRM system. The problem with this is that data will not be updated on multiple fields and then segmentation and automation will fail.”
To deal with this issue, Nextiny uses Insycle to clean up HubSpot CRM databases, as well as an in-house solution. “Google Sheets helps when trying to export or import things. We can visualize what we have, fix some issues manually and make sure we have the info we need before starting the process.”
“Insycle automates the whole database cleanup process. Using Insycle for bulk updates saves you time from exporting data from a CSV (spreadsheet), using VLOOKUP to change the data, then importing it back. Instead, organize and segment data right in Insycle then change it on the same page. This not only saves you time, but it is also much safer than having to rely on the accuracy of the data in your spreadsheet… you can also automate cleanup processes to run on a weekly/monthly/quarterly basis.”
Organization is only as good as how it’s constructed, according to James Green at Offer To Close. “Poorly labeled data fields lead to misunderstood or misinterpreted data. In one of my last roles, we had a lot of internal confusion caused by poorly labeled data that led to bad business decisions that the data supported under one definition and opposed in the true interpretation of the data’s meaning.”
In the end, they needed to call in help. “A developer or database engineer needs to look at the source of the data and how it is derived and/or where it is entered so the true meaning can be understood. Once the data is properly understood,” explains Green, “the label should either be updated to match the data’s actual meaning or the way the information is collected/calculated should be changed to reflect the desired definition.” Green uses manual methods such as Excel to “quickly identify data anomalies or duplication of data.”
That badly formatted data is unsightly and costly, according to Vinay Amin of Eu Natural. “I like a clean format. Everything should be uniform. It just seems sloppy and unprofessional to me when it isn’t.” What’s more, “I end up spending more time looking for mistakes that might not even be there. If it looks sloppy in format I consider the possibility that it’s sloppy in other ways as well.”
The work is ongoing, says Amin, who sees success only “by closely scrutinizing and paying attention to detail. It’s time-consuming but has to be done.”
Amin uses Microsoft Excel and the Team Management program Monday to keep things in working order, pointing out that, “The variety of deployment is great. It’s available via install, mobile and cloud.”
As the data and insights we’ve collected show, most companies have some degree of bad data in their CRM or business databases. Collecting some bad data can’t be avoided. There are, however, some basic steps that any company can take to improve the quality of their data and limit the effect it has on their marketing campaigns and sales operations. Here’s a checklist we put together based on all of the feedback we gathered from our 67 experts.
You can’t clean up your databases until you identify problematic data to begin with. Start with a comprehensive audit of your existing data to get an idea of just how much bad data currently exists. An audit is important not just for identifying bad data, but for finding problems in your current data collection practices.
Insycle, a very popular tool amongst our respondents, offers a free, automated CRM database assessment.
One of the biggest reasons for bad data in your CRM database is manual data entry, either from your own internal teams or from customers. While some manual entry is unavoidable, you should try to limit it to the best of your ability. When it can’t be avoided, have strict validation rules in place for each field to ensure that the right data is in the right field, with correct formatting.
You’ll find many other tools mentioned throughout this document. If you’d like to know more about any of them, we’ve put together a list of all those mentioned for easy reference: Accurata, BriteVerify, Clearbit, Cloudlead, Dropcontact, Duplicate Detector, HubSpot Deduplication & Workflows, Insycle (most popular by far), Neverbounce, Zerobounce.
In a CRM database with thousands of records, it is extremely difficult and time-consuming to try to clean the data by hand. That just isn’t a viable long-term solution. Downloading to and manipulating data in a spreadsheet is a common solution amongst our respondents, but not the best or easiest one. Instead, look for data cleaning software solutions to help you limit bad data in your CRM database.
Image Source: Data Cleaning Automation in Insycle
Bad data strangles your marketing campaigns before they can get off of the ground. It eats away at the time of your marketing, sales and customer success teams. This lowers your efficiency and impacts your ability to grow. However, with some careful planning and smart investments in data cleaning tools, you can reduce bad data and put yourself in a position to grow.
What are you doing to keep your database clean?