When you track flawed data, you risk informing the wrong decisions. What should you pay attention to during a Google Analytics audit?
Analytics | Jul 27
Tamara Omerovic on March 4, 2021 (last modified on June 1, 2021) • 21 minute read
Out of the box, Google Analytics is capable of capturing a lot of valuable information. Sometimes too much.
Your website is your public face to the world, and all different types of audiences, from prospects to customers to media to robots, will visit. Each arrives with varying levels of intent but may come from the same sources and view the same pages.
The challenge for marketers is all of this activity gets mixed together in Google Analytics reports. If you’re focused on lead generation, then activity from bots, customers, and employees can skew the data that would otherwise yield valuable insights about what’s working (and what’s not).
So, how can you trust your data in order to gain the clarity needed to make the right decisions?
Keith Moehring and his agency, L2 Digital, specialize in cleaning up Google Analytics accounts and creating the reports marketers need to guide strategic decision making. We talked to him about this challenge and how to correct the problem.
According to Moehring, “For Google Analytics to be useful, the data needs to be clean, accurate, and segmented to isolate relevant traffic. Views, system settings, and filters need to be configured to strip away data that conceals marketing-related website activity.”
Moehring says when he evaluates a new Google Analytics account, there are ten things he looks for right away.
Is your Google Analytics data reliable?
One or more of these warning signs may be impacting your data. Learn more about L2 Digital’s approach to identifying and cleaning up your Google Analytics account.
Before you get started cleaning up your data, Moehring recommends creating a new Google Analytics View dedicated to marketing.
“Once data is filtered out or cleaned up in Google Analytics, there is no bringing it back. It’s always a good idea to have one Google Analytics View with the raw data feed. It can serve as a valuable reference point to make sure any other filtered views aren’t missing anything important.”
Erratic and sudden changes to traffic patterns can often be the first sign that analytic data is not reliable.
For most bot-blocking sites, traffic follows a fairly predictable pattern week to week (holidays notwithstanding). And, if anomalies do occur, they are expected because you sent out an email or you’re getting some press coverage.
However, extreme session fluctuations caused by bots or spammy websites will create dramatic spikes randomly.
According to Moehring, “Sometimes these bots will drive the same amount of traffic over and over to the point where your session timeline starts to look like Batman in a police lineup.”
Junk traffic sources will falsely inflate user and session data and negatively affect engagement data, like bounce rate, session duration, and pages per visit.
One test to determine if the traffic is bot or spam related is to go to the Audience > Overview report in Google Analytics. Below the date ranges, there’s an option to view the traffic timeline in “Hourly” increments.
According to Moehring, “When spammy traffic occurs, it typically happens over a short period (less than 1 hour). Real traffic spikes may jump up quickly, but sustain the traffic for at least several hours after.”
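As a rough illustration of the hourly test, the sketch below measures how long a spike lasts in a list of hourly session counts. The function name, the sample numbers, and the "3x the median" threshold are assumptions made for this example; they are not part of Google Analytics.

```python
from statistics import median

def spike_duration_hours(hourly_sessions, multiplier=3.0):
    """Return the longest run of consecutive hours above multiplier x the median."""
    threshold = median(hourly_sessions) * multiplier
    longest = current = 0
    for count in hourly_sessions:
        if count > threshold:
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest

# Hypothetical hourly session counts exported from the "Hourly" view
bot_like = [20, 22, 19, 21, 400, 23, 20, 18, 22, 21, 20, 19]
real_like = [20, 22, 19, 21, 150, 140, 120, 90, 70, 22, 20, 19]

print(spike_duration_hours(bot_like))   # spike confined to a single hour
print(spike_duration_hours(real_like))  # spike sustained over several hours
```

A spike that lasts one hour and vanishes fits the spammy pattern Moehring describes, while a run of several elevated hours looks more like real interest.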
How to Fix: It’s difficult to prevent all bot or spam traffic from showing up in your reports because it comes in many forms. Isolating the cause can help you put a filter in place to block it from happening in the future.
Narrow your report’s timeframe to isolate the day a spike occurred, and then compare it to a typical day the previous week or month. The comparison data will quickly highlight any anomalies. For example, if traffic from Random City, Ohio is the only geo-location up 1300%, then that’s likely the source of your problems.
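The comparison can also be done outside the Google Analytics interface once the two days are exported. The sketch below is a minimal example that ranks locations by percent change between a typical day and the spike day. The city names and session counts are invented, and the helper name is hypothetical.

```python
def percent_change_by_key(typical, spike_day):
    """Rank keys (e.g., cities) by percent change from a typical day to the spike day."""
    changes = {}
    for key, spike_count in spike_day.items():
        base = typical.get(key, 0)
        if base == 0:
            # No baseline traffic at all: treat as an unbounded increase
            changes[key] = float("inf")
        else:
            changes[key] = (spike_count - base) / base * 100
    return sorted(changes.items(), key=lambda item: item[1], reverse=True)

# Hypothetical sessions per city
typical_day = {"Columbus": 40, "Cleveland": 35, "Random City": 5}
spike_day = {"Columbus": 42, "Cleveland": 33, "Random City": 70}

ranked = percent_change_by_key(typical_day, spike_day)
print(ranked[0])  # the location with the largest jump
```

The same function works for any dimension you export, such as source, browser, or landing page.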
A couple of useful reports to try to identify spammy sources include:
A sudden, significant traffic drop or increase that is then sustained for an extended time usually has nothing to do with traffic coming to your site. More often than not, the culprit here is a change to the tracking code or your Google Analytics account.
Moehring says, “If your Analytics timeline looks like the edge of a plateau—relatively flat, sudden change, a new relatively flat—the first thing to check is that the tracking code hasn’t been stripped from a page and that no filters have been added to Google Analytics. In both cases, the change is immediate and won’t recover until corrected. This can leave huge gaps in your data, rendering it useless.”
The Google Tag Assistant browser extension shows which Google tracking scripts are firing on your site, as well as any issues that exist. Make sure to check every section of your site because a change to one template will only affect the pages using it.
“One time, a client’s developer uploaded a new blog template that didn’t include the Google Analytics tracking code. The site saw a sudden and sustained traffic drop of 70% for several days. We were able to identify the issue and get the tracking code back in place, but the data for those days was gone.”
If the tracking code is in place, check that no filters were installed, inadvertently blocking a subset of data from Google Analytics. In the Google Analytics Admin section, there’s a link in the Account column called “Account Change History.” This can help you identify changes made on the day the sudden change occurred.
If you’re an organization focused on global domination, then all the countries that show up in the Audience > Geo > Location report are relevant to you. However, if you’re like many businesses and you target a smaller geographic market than Earth, take a look at how much traffic is coming from places outside your primary locations.
If you’re a company focused on the United States only, traffic coming from Australia, India, and other parts of the world is irrelevant to your marketing efforts. Each user from these other countries muddies up your data.
How to Fix: Create a filter that strips out traffic from every country except the ones in which you’re interested. If you’re a local business, you can do the same for region-specific traffic.
To set up a geo filter, use the following settings:
* Avoid using City as the filter field here. It is too restrictive, and sometimes the location of the IP address associated with a user isn’t the same as the city they are connecting from.
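To make the logic of an include-only geo filter concrete, here is a small sketch applied to exported rows rather than to Google Analytics itself. The row structure and function name are assumptions for illustration. Note that the raw rows are left untouched, which mirrors the advice to keep an unfiltered View.

```python
def include_only_countries(rows, allowed_countries):
    """Keep only rows whose country is in the allowed list (case-insensitive)."""
    allowed = {country.lower() for country in allowed_countries}
    return [row for row in rows if row["country"].lower() in allowed]

# Hypothetical export from the Audience > Geo > Location report
raw_rows = [
    {"country": "United States", "sessions": 120},
    {"country": "Australia", "sessions": 30},
    {"country": "India", "sessions": 45},
]

marketing_rows = include_only_countries(raw_rows, ["United States"])
print(marketing_rows)
```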
According to Moehring, “Traffic sessions are going to fall (sometimes significantly), but the thing to remember is we’re trying to isolate the traffic that matters. The big numbers are nice to see but don’t help you get closer to your goals.”
If you are internationally focused, consider setting up different Views for different business segments or marketing teams. For example, if you have one team focused on North America, set up a Google Analytics view that filters out all traffic outside of North America.
Bounce rate is often a misleading metric. It doesn’t account for those visitors who take the time to read your content thoroughly and interact with your page. This type of engagement can be a sign of qualified visitors that want a reason to take further action.
Most people understand a bounce to be when someone arrives at a site and then leaves without viewing any other pages. More accurately, a bounce is recorded when only one interaction takes place (e.g., a single pageview).
By introducing new interaction tracking, you can get a much better sense of how many visitors arrive and find your content relevant enough to consume or interact.
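The arithmetic behind this is simple enough to sketch. In the example below, each session is represented by the number of interaction hits it recorded, and a bounce is any session with exactly one hit. The session data is invented, and the function is a simplified model of the definition above, not Google's implementation.

```python
def bounce_rate(hits_per_session):
    """Percentage of sessions that recorded exactly one interaction hit."""
    if not hits_per_session:
        return 0.0
    bounces = sum(1 for hits in hits_per_session if hits == 1)
    return 100 * bounces / len(hits_per_session)

# Hypothetical: ten sessions tracked with pageviews only
pageviews_only = [1, 1, 1, 1, 1, 1, 1, 2, 3, 2]

# The same ten sessions after an event (e.g., a scroll) is also tracked:
# four of the single-page visitors engaged with the page, adding a second hit
with_events = [2, 2, 2, 2, 1, 1, 1, 2, 3, 2]

print(bounce_rate(pageviews_only))
print(bounce_rate(with_events))
```

Nothing about visitor behavior changed between the two lists. The second one simply records engagement that was previously invisible.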
How to Fix: The way to track interactions is with events. Events are dimensions and metrics you define that are recorded when someone takes a specific action. Some common events that can help improve bounce rate include:
The easiest way to add these events to your site is to use Google Tag Manager. According to Moehring, Tag Manager isn’t the most intuitive system to start using, but it can monitor and trigger events for scroll tracking, form submissions, video views, PDF downloads, exterior link clicks, and much more.
“Typically, once you install scroll tracking, a 70%+ bounce rate will plummet to below 40%. Also, time on page and session duration become more accurate because there are more timestamps to work with.”
On the other end of the spectrum, if your bounce rate is very low, like less than 4 or 5%, then it’s likely you have a different problem.
Rarely, if ever, will real traffic bounce less than 4% of the time. If you see this, what’s likely happening is your Google Analytics tracking code is installed twice on your site. This can happen when you’re using a tag management system or your CMS offers multiple ways to integrate Google Analytics.
When this issue occurs, the duplicated Google Analytics tracking script eliminates the chance for a bounce. In Google Analytics’ eyes, the visitor viewed a page and then viewed the same page again a split second later. Two interactions occurred, and thus no bounce.
According to Moehring, “Bounce rate isn’t the only metric affected by this. Session data will be close to double what it actually is, and other engagement metrics like session duration and time on page will be skewed. Your data will be unusable.”
How to Fix: If you’ve installed Google Tag Assistant on your browser, navigate to your site and launch Tag Assistant. The Google Analytics tag should appear green (or blue if using Tag Manager). If it detects the same tracking script twice, the Google Analytics tag will appear yellow.
If you click into it, you’ll see a message that reads, “Same web property ID is tracked twice.”
To resolve this issue, figure out where the tracking codes are embedded on your site and remove one. According to Moehring, “The right one to remove will depend on how they are added. For example, if you’re using Google Tag Manager (GTM), keep that one and remove the other because GTM gives you the most flexibility with how and when the tracking code is used.”
When checking for duplicate tracking code, make sure to review each section of your website. Tracking code can be installed on the template level, only affecting the pages using that template.
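If you want to spot-check templates without a browser, a script can count tracking calls in a page's HTML source. The sketch below assumes the tracking code is embedded inline as a Universal Analytics `ga('create', ...)` or `gtag('config', ...)` call. It will not see tags that Google Tag Manager injects at runtime, so treat it as a first pass rather than a replacement for Tag Assistant. The sample HTML and property ID are made up.

```python
import re
from collections import Counter

# Matches inline ga('create', 'UA-...') and gtag('config', 'UA-...') calls
TRACKING_CALL = re.compile(
    r"""(?:ga\(\s*['"]create['"]|gtag\(\s*['"]config['"])\s*,\s*['"](UA-\d+-\d+)['"]"""
)

def count_tracking_calls(html):
    """Count how many times each property ID is initialized in the HTML."""
    return Counter(TRACKING_CALL.findall(html))

def duplicated_ids(html):
    """Return property IDs that are initialized more than once."""
    return sorted(pid for pid, n in count_tracking_calls(html).items() if n > 1)

# Hypothetical template source with the same property set up twice
sample_html = """
<script>ga('create', 'UA-1234567-1', 'auto'); ga('send', 'pageview');</script>
<script>gtag('config', 'UA-1234567-1');</script>
"""

print(duplicated_ids(sample_html))
print(count_tracking_calls("<html><body>No tag here</body></html>"))
```

An empty result from `count_tracking_calls` on a template is also worth a look, since it may point to the missing-tag problem described earlier.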
Bot filters don’t catch all bot traffic. If unimpeded, bot traffic will royally screw with user and session data, not to mention engagement metrics like bounce rate, pages per session, and conversion rates.
While it’s hard to know precisely where bot traffic will originate, two cities are often guilty: Ashburn, Va. and Boardman, Ore. To see where your top locations’ traffic comes from, go to the Audience > Geo > Location report in Google Analytics, drill into the United States, and then switch to City view.
“Ashburn, Va. is widely known as the data center capital of the world (a.k.a. homebase for a lot of bots). I’ve seen situations where 60% of site traffic was coming from Ashburn, and site managers had been using it for years to make marketing decisions. It’s a good way to go down the wrong path unknowingly,” says Moehring.
How to Fix: The best defense against this bot traffic is to implement the following filter in your marketing Google Analytics View.
Also, set up a similar filter for Boardman.
If you’ve been monitoring site traffic for any period, you’ve likely seen a spike or two come in from a site with the URL, freebottraffic.xyz, bot-traffic.icu or some other free traffic wording. This is known as ghost spam.
Ghost spam isn’t actually traffic to your site. What’s happening is spammers are randomly generating Google Analytics UA IDs, dropping them into the standard tracking code, and then blasting them with false data.
This wildly annoying practice tends to show up in Google Analytics as organic and referral traffic. You’ll notice these ghost spam sites often include their website address as the source, keyword, and/or landing page.
Huge spikes of spam traffic throw session data out of whack. And because these hits don’t trigger any other interactions, bounce rate, session duration, pages per visit, and other engagement metrics are all skewed toward the negative.
How to Fix: It is impossible to predict what URL this type of traffic will be coming from. At the same time, these pests tend to use a lot of the same words, such as “bot,” “traffic,” “free,” etc.
To proactively defend your site against these attacks, consider adding a couple of source and page view filters that center on the commonly used terms.
*Please note, the filter pattern above is written in regex to protect against multiple ghost spam sites in one filter. If you’re not comfortable with regex, you may want to create separate filters for each term.
Word of Caution: Moehring stresses that you need to be careful when setting up these types of filters. “Make sure you do not set up filters to block traffic from relevant sources or to real pages on your site. For example, if you regularly post content about robots, a page view filter that removes traffic to URIs that contain “bot” will exclude a lot of your website traffic. Be careful to avoid these issues. Once data is filtered, it is gone and can’t be added back in retroactively.”
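One way to act on this caution is to test a candidate pattern against sources and pages you know are legitimate before saving the filter. The sketch below uses a hypothetical pattern built from the three terms mentioned earlier ("bot," "traffic," "free"); it is an illustration, not a recommended production filter, and the page paths are invented.

```python
import re

# Hypothetical pattern assembled from terms ghost spam tends to reuse
SPAM_PATTERN = re.compile(r"bot|traffic|free", re.IGNORECASE)

def is_spam_source(source):
    """True if the source name contains any of the spam terms."""
    return SPAM_PATTERN.search(source) is not None

def legitimate_matches(pattern, known_good):
    """Return known-good values the pattern would wrongly exclude."""
    return [value for value in known_good if pattern.search(value)]

sources = ["freebottraffic.xyz", "bot-traffic.icu", "google", "newsletter"]
flagged = [s for s in sources if is_spam_source(s)]

# Pages on your own site that must never be filtered out (hypothetical)
known_good_pages = ["/blog/warehouse-robots", "/pricing", "/free-trial"]
collisions = legitimate_matches(SPAM_PATTERN, known_good_pages)

print(flagged)
print(collisions)
```

Here the pattern catches both spam sources, but it would also strip a post about robots and a free-trial page if applied to page URIs, which is exactly the mistake Moehring warns about.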
Pageview data can give you a sense of why visitors are coming to your site and where to feature offers to maximize conversions. The Behavior > Site Content > All Pages report in Google Analytics can also flag irrelevant site activity.
Moehring says a great example of this happens a lot with software-as-a-service sites. “In the upper right-hand corner, these sites often include a ‘Login’ link. Anyone clicking on that link and visiting the login page is likely already a user. As a result, their activity on the site is irrelevant to your lead generation efforts.”
Be on the lookout for pageviews that include any of the following:
Another issue to look for here is dramatic increases in pageviews relative to the rest of your site’s content. In some cases, the dramatic increases may originate from legitimate sources (e.g., email), but if you can’t quickly identify a source that spiked traffic to a page, there’s probably something else going on.
How to Fix: Fixing these issues depends on what’s causing them.
Google Analytics treats every unique URL as an individual page. Unfortunately, if your marketing campaigns use UTM parameters for attribution (e.g., ?utm_source=email campaign), the collective page views one page receives may be split among dozens of unique URLs.
In other words, you’re potentially underreporting marketing’s impact on generating pageviews and goal conversions.
According to Moehring, this mistake isn’t doing marketers any favors. “I’ve seen situations where client reports only showed 40% of the overall traffic we generated. The rest was split up among URLs with unique UTM parameters intended to associate an individual to an email click.”
As an experiment, go to the Behavior > Site Content > All Pages report in Google Analytics. In the search bar, type in a question mark (?) and hit enter. The number of pageviews is how many are not associated with a main version of the page. If you scroll down, the total page count is how many duplicated pages were recorded.
How to Fix: Google Analytics offers a built-in fix for this issue (note there is a different fix for site search parameters – see below).
Moehring says, “Any UTM parameter added to the Exclude URL Query Parameters field will be stripped off the URL, which in turn associates any pageview data with the main page URL. This makes page view data far more accurate and gives you the credit you deserve.”
Word of Caution: Google Analytics will strip out query parameters before it has a chance to attribute the traffic to a source. If you’re using UTM parameters to properly attribute source data in Google Analytics reports, don’t add them to the Exclude URL Query Parameters field.
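To see the effect of stripping parameters, you can reproduce it on an exported list of page URLs. The sketch below removes a set of excluded parameters and consolidates pageviews under the main URL. The parameter list and sample URLs are assumptions chosen for the example.

```python
from collections import Counter
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical list of parameters to strip
EXCLUDED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "utm_term", "utm_content"}

def strip_params(url, excluded=EXCLUDED_PARAMS):
    """Remove excluded query parameters and keep everything else intact."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in excluded
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment))

# Hypothetical rows from the All Pages report, one entry per pageview
pageviews = [
    "/pricing",
    "/pricing?utm_source=email&utm_campaign=july",
    "/pricing?utm_source=newsletter",
    "/blog?page=2&utm_medium=social",
]

consolidated = Counter(strip_params(url) for url in pageviews)
print(consolidated)
```

Three separate "/pricing" rows collapse into one, while the unrelated `page` parameter on the blog URL is preserved.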
If your site or blog offers search functionality, a source of query parameters may be your site search. A query parameter with the visitor’s search query is often attached to the page URL after a search is conducted.
In your View Settings screen, there is a field near the bottom called “Site Search Tracking.” Toggle that on and then enter in the site search query parameter (e.g., “s” or “search”) into the field.
Directly below that field is a check box that enables you to strip out the query parameters. Make sure to check it.
Moehring says, “This feature also offers the added benefit of showing you what information visitors are looking for. If you see common queries, write content to speak to it or more prominently feature the content you already have.”
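If you are not sure which parameter your site uses, or you want a quick look at common queries from a URL export, a sketch like the following can help. It assumes the search term lives in a parameter named `s`, which is only one of the examples given above; the URLs are invented.

```python
from collections import Counter
from urllib.parse import parse_qs, urlsplit

def search_terms(urls, query_param="s"):
    """Count normalized search terms found in the given query parameter."""
    terms = Counter()
    for url in urls:
        values = parse_qs(urlsplit(url).query).get(query_param, [])
        for value in values:
            terms[value.strip().lower()] += 1
    return terms

# Hypothetical page URLs recorded after on-site searches
urls = ["/?s=pricing", "/?s=Pricing", "/blog?s=api+docs", "/about"]

top_terms = search_terms(urls)
print(top_terms.most_common())
```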
Referral traffic from another site is fun to see, but not all referrals are created equal. In fact, some aren’t even referrals at all.
If you navigate to the Acquisition > All Traffic > Referrals report in Google Analytics, the list of referral sites may include several non-marketing-related sources. Below are some of the more common, irrelevant referrals and how to fix each.
As mentioned earlier, ghost spam traffic will often take the form of a referral, and the spammy website URL will appear as one of your traffic sources. To fix this, you’ll need to set up a filter that attempts to catch as many of these sources as possible.
If you see your website or one of your subdomains as a referral source, you likely don’t have cross-domain tracking set up, or it’s not set up correctly.
In this situation, the cookie Google Analytics uses to track session activity is not passed between pages. In other words, Google Analytics doesn’t recognize a visitor as being on the site, so it creates a new session. This inflates session data and can significantly confuse your engagement metrics.
How to Fix: There are several steps to fixing this issue, including configuring your tracking script to allow for cross-domain tracking and updating your referral exclusion list. Check out our blog post, “How to Set Up & Use Google Analytics Across Multiple Websites” for more details.
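Before changing any settings, it can help to confirm how many referrals are really your own properties. The sketch below checks whether a referrer's hostname is your domain or one of its subdomains. The domain `example.com` and the function name are placeholders for illustration.

```python
from urllib.parse import urlsplit

def is_self_referral(referrer_url, own_domain):
    """True if the referrer host is own_domain or one of its subdomains."""
    host = (urlsplit(referrer_url).hostname or "").lower()
    own = own_domain.lower()
    return host == own or host.endswith("." + own)

# Hypothetical referrers from the Referrals report
print(is_self_referral("https://blog.example.com/post", "example.com"))
print(is_self_referral("https://example.com.evil.net/", "example.com"))
```

Matching on the full hostname suffix, rather than a simple substring, avoids treating lookalike domains as your own.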
If you’re using a marketing automation tool or a customer relationship management (CRM) platform, any clicks from the platform to your website result in the platform’s URL showing up as a referral source. For example, if you’re using HubSpot, you’ll often see app.hubspot.com as a referral source. Or, if you’re using Salesforce, you may see a variety of .force.com referral sources.
If you see these referral sources, then it’s most likely internal traffic and needs to be removed from your traffic reports.
How to Fix: The first step to fixing this issue is to set up a filter to block internal IP addresses. This includes your main office and the homes of any employees remotely working who may frequent the site.
To find your IP address, Google “What is my IP address?” The results will be a number like “126.96.36.199” or “2603:6010:9c33:a500:c437:cb9e:49f3:3856.”
Next, we need to set up a filter to block traffic from these IPs.
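The matching logic of an IP exclusion can be sketched with Python's standard `ipaddress` module, which handles both IPv4 and IPv6. The networks below come from address ranges reserved for documentation and stand in for your real office and home addresses; the function name is hypothetical.

```python
import ipaddress

# Placeholder internal networks (documentation-only ranges, not real offices)
INTERNAL_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]

def is_internal(ip_string):
    """True if the address falls inside any internal network."""
    address = ipaddress.ip_address(ip_string)
    return any(address in network for network in INTERNAL_NETWORKS)

print(is_internal("203.0.113.42"))   # inside the placeholder office range
print(is_internal("198.51.100.7"))   # outside it
print(is_internal("2001:db8::1"))    # IPv6 address inside the placeholder range
```

Listing whole networks instead of single addresses is a design choice here: it covers offices where the last part of the address changes from device to device.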
Out of the box, Google Analytics doesn’t always attribute email traffic properly. For example, if you do any email marketing, you may see “mail.google.com” as a referral source. As you can guess, this is misattributed email traffic that came from Gmail.
Without email traffic properly attributed, evaluating the real performance of your email campaigns becomes difficult.
How to Fix: This can be updated in the admin section of Google Analytics.
Make sure to hit “Done,” and then “Save” at the bottom. The “Save” button can be easy to miss because you often need to scroll down to see it.
For years, Moehring has worked with organizations to clean up Google Analytics and he says he has yet to partner with a client who isn’t experiencing at least one of these issues.
“The good news is it’s never too late to put fixes in place. Configuring Google Analytics to filter out the noise is simple and straightforward once you figure out what’s cluttering up your reports. From there it’s all about recalibrating benchmarks and working off your new normal.”
Use this cleaned up Google Analytics data to:
Clean, accurate data that isolates your target audiences can be a powerful asset in the ongoing effort to improve traffic, lead generation, and sales performance.