Posted 13 June, 2016 in SEO

How Accurate Are Website Traffic Estimators?

If you’ve worked at an agency for any significant amount of time, and particularly if you’ve been involved in forecasting, proposals or client pitches, you’ve likely been asked at least one of the following questions (or some combination of them):

  1) How much traffic can I expect to receive?
  2) How long until I see X amount of organic visits?
  3) What traffic will I receive from X investment?
  4) What organic opportunity is available within our industry?
  5) How much traffic do my competitors receive?

Forecasting is notoriously difficult, and done badly it can be misleading or even damaging. A myriad of assumptions, caveats and uncontrollable factors can mean that any prediction is little more than a finger-in-the-air estimate. Organic forecasting is difficult enough that writing about and explaining it would give me sleepless nights (Kirsty Hulse wrote a better post than I ever could on the subject), so I decided to focus on questions 4 and 5 of those listed above.

Imagine this very realistic (or even familiar) scenario: a client wants to know how much traffic their competitor receives, what the potential size and opportunity of their vertical is, and how to fulfil that potential by increasing visibility and acquiring more traffic. Before you can answer that trickier final point accurately and insightfully, it would be useful to know what you’re competing against. Assuming you don’t have access to your client’s competitors’ analytics data, you need another way to get an idea of their organic performance, ideally with confidence that the data you’re looking at is reasonably solid.

There are a number of really great visibility tools we use at Screaming Frog (Searchmetrics and Sistrix to name a couple of favourites), but these tools choose not to speculate on traffic, instead estimating visibility based on ranking position and keyword volume/value (which doesn’t necessarily correlate with traffic). Traffic estimator tools do speculate on traffic, as do components within some well-known SEO suites, which got us wondering a few things:

  • How accurate are traffic estimator tools?
  • Do they generally under or overestimate traffic?
  • Are there types of websites where their estimations are more accurate than others?
  • Are there potential reasons behind this under/overestimation?
  • Are there potential learnings for search marketers?

    The Test

    We wanted to put their accuracy to the test, so here’s what we did:

    1) We took organic visit data from Google Analytics for 25 websites we have access to, covering February, March and April 2016 (January can be an outlier for many websites, so we selected the most recent 3-month period of relative stability). We looked exclusively at UK organic traffic, because some of the traffic estimator tools segment traffic by region and don’t always cover every territory. Most of the 25 sites are primarily UK focused, but some target multiple territories or even worldwide. Similarly, not all the tools we used deal especially well with subdomains, so we used root domains in our analysis.

    2) While we can’t disclose the websites selected, to ensure as even a test as possible within what is a fairly small sample size, we specifically selected sites that covered a range of verticals, target audiences and purposes (more on that breakdown to come). For the same reason, we also selected sites that covered a range of different traffic levels, from those which receive millions of organic visits each month, to those with just hundreds of visits. We hoped this varied selection might show trends where certain tools are more or less accurate at estimating certain types of websites’ traffic levels.

    3) We analysed these 25 websites using 3 tools – SimilarWeb, Ahrefs and SEMrush. We recorded the organic traffic estimate for each of the 25 websites, using UK traffic only to match our GA data.

    4) We measured actual traffic against estimated traffic for each of the 3 tools. We measured this in a number of different ways –

    a. Visits difference for each site using each tool.
    b. Percentage of visits difference for each site using each tool.
    c. Overall visits difference for each tool.
    d. Percentage of visits difference for each tool.
    e. Average percentage difference for each tool.
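
The five measures above can be sketched in a few lines of Python. This is only a minimal illustration, not necessarily how the figures in this post were produced: the site labels and visit numbers are made up, and the function names `percent_diff` and `compare` are hypothetical. It assumes every percentage is expressed relative to actual visits.

```python
def percent_diff(actual, estimated):
    """Signed percentage difference of an estimate relative to actual visits."""
    return (estimated - actual) / actual * 100


def compare(actual_visits, estimated_visits):
    """Return per-site and overall differences for one tool.

    Both arguments map a site label to a visit count.
    """
    per_site = {}
    for site, actual in actual_visits.items():
        estimated = estimated_visits[site]
        per_site[site] = {
            "visits_diff": estimated - actual,                # measure (a)
            "percent_diff": percent_diff(actual, estimated),  # measure (b)
        }

    total_actual = sum(actual_visits.values())
    total_estimated = sum(estimated_visits[site] for site in actual_visits)
    site_percentages = [row["percent_diff"] for row in per_site.values()]
    return {
        "per_site": per_site,
        "total_visits_diff": total_estimated - total_actual,                # (c)
        "total_percent_diff": percent_diff(total_actual, total_estimated),  # (d)
        "average_percent_diff": sum(site_percentages) / len(site_percentages),  # (e)
    }


# Made-up numbers for illustration only, not data from the test.
actual = {"Site A": 1_000_000, "Site B": 1_000}
estimated = {"Site A": 1_100_000, "Site B": 500}

result = compare(actual, estimated)
print(round(result["total_percent_diff"], 1))
print(round(result["average_percent_diff"], 1))
```

With these made-up numbers the total comes out roughly 10% over, while the average of the per-site percentages is 20% under. Measures (d) and (e) can disagree this much because one large site dominates the total, whereas every site counts equally in the average.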

    Predictions

    Before sharing the results I’ll share my one real prediction: the tools would almost certainly underestimate organic traffic. These traffic estimator tools have limited indexes and only track a certain number of keywords, so they can’t be expected to estimate traffic with complete accuracy. Most don’t handle the long tail well, as they simply don’t have the keyword coverage to do so. Furthermore, the tools, much like standard CTR-based forecasting models, derive visit numbers from ranking position and keyword volume – they *don’t* consider keyword intent, brand vs non-brand, Google answer boxes, 9-pack & 7-pack results, the Knowledge Graph etc.
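
To make the reasoning behind that prediction concrete, here is a rough sketch of position-based CTR modelling. It is an assumption about the general approach, not the actual method of any of the tools tested: the click-through rates, the keywords and the function name `estimate_visits` are all hypothetical.

```python
# Hypothetical click-through rates by ranking position (assumed values,
# not taken from any tool or study).
ASSUMED_CTR_BY_POSITION = {1: 0.30, 2: 0.15, 3: 0.10, 4: 0.07, 5: 0.05}


def estimate_visits(rankings, ctr_by_position=ASSUMED_CTR_BY_POSITION):
    """Estimate monthly visits from (keyword, position, monthly_volume) rows.

    Positions missing from the CTR table contribute no visits, which is one
    way low-ranking keywords end up being ignored.
    """
    total = 0.0
    for _keyword, position, monthly_volume in rankings:
        total += monthly_volume * ctr_by_position.get(position, 0.0)
    return total


# Made-up tracked keywords: (keyword, ranking position, monthly searches).
tracked = [
    ("example head term", 1, 10_000),
    ("example mid term", 3, 2_000),
    ("example deep ranking", 12, 5_000),
]

print(estimate_visits(tracked))
```

Any keyword that isn’t in the tracked list adds nothing to the estimate, so a site that earns much of its traffic from untracked long tail phrases would be underestimated by a model like this. Nothing in the calculation accounts for intent, brand or search result features either.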

    The Data

    SEMrush

    [Image: SEMrush data]

    SimilarWeb

    [Image: SimilarWeb data]

    Ahrefs

    [Image: Ahrefs traffic data]

    You can access the data here.

    The Results

  • Overall, the most accurate tool analysed was SimilarWeb, which on average overestimated organic traffic by 1%. It overestimated total visit numbers by 17%, estimating 15.7m visits for the 25 websites compared to the 13.4m actual. SimilarWeb was the only tool to generally overestimate traffic. Ahrefs was the next most accurate, underestimating total traffic for all sites by 17% (11.1m estimated visits compared to 13.4m actual visits), and on average underestimating traffic by 36%. SEMrush underestimated total traffic for all sites by 30% (9.4m estimated visits compared to 13.4m actual visits), and on average underestimated traffic by 42%.
  • The three tools produced very different estimates for the site with the highest actual traffic (‘Charity 1’, 8.8m actual visits): SimilarWeb – 12m estimated visits (+36%), Ahrefs – 8.1m estimated visits (-8%), SEMrush – 6.4m estimated visits (-27%). Generally speaking, Ahrefs was the most accurate tool for estimating traffic of high traffic websites.
  • The most underestimated websites for each tool were: ‘Charity 1’ by SEMrush (2.4m visits difference), ‘Health’ by SimilarWeb (725k visits difference), and ‘Health’ by Ahrefs (957k visits difference). The largest percentage underestimation was ‘Energy’ by Ahrefs, which underestimated traffic by 94%. The most overestimated websites for each tool were: ‘Travel 1’ by SEMrush (132k visits difference), ‘Charity 1’ by SimilarWeb (3m visits difference), and ‘Travel 1’ by Ahrefs (198k visits difference). The largest percentage overestimation was ‘B2B Products’ by SimilarWeb, which overestimated traffic by 128%.
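
As a quick sanity check, the overall percentages can be reproduced from the rounded totals quoted above. This is only a sketch: it uses the rounded figures in millions rather than the underlying data, and the helper name `percent_off` is hypothetical.

```python
# Total visits in millions, as quoted in the results above (rounded figures).
ACTUAL_TOTAL = 13.4
ESTIMATED_TOTALS = {"SimilarWeb": 15.7, "Ahrefs": 11.1, "SEMrush": 9.4}


def percent_off(actual, estimated):
    """Signed percentage by which an estimate differs from the actual total."""
    return (estimated - actual) / actual * 100


# Round to whole percentages, matching the precision used in the post.
rounded = {
    tool: round(percent_off(ACTUAL_TOTAL, estimated))
    for tool, estimated in ESTIMATED_TOTALS.items()
}

for tool, pct in rounded.items():
    print(f"{tool}: {pct:+d}%")
```

This prints +17% for SimilarWeb, -17% for Ahrefs and -30% for SEMrush, in line with the totals reported above.
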

    Further Analysis

  • All three tools significantly underestimated traffic to ‘Health’, all by at least 500k visits. This site receives a lot of traffic from longer tail, location-specific phrases, most of which are unlikely to have been picked up by the tools. Generally, the most accurately estimated sites were e-commerce, but there were many more of this type of site than any other. The most accurately estimated site across all three tools was ‘Ecommerce 7’, on average just 2% under actual traffic levels.
  • ‘Travel 1’ was significantly overestimated by Ahrefs and SEMrush, by 72% and 48% respectively. Anecdotally, the site in question competes well for visibility in very competitive search results that are dominated by big brands, but it is not an established brand itself. This high overestimation suggests that despite good visibility, the site isn’t getting as many clicks from the search results as its rankings imply, due in part to a lack of brand presence.
  • Generally, sites with lower traffic levels were less accurately estimated. There were three sites with under 10k organic visits (‘Insurance 1’, ‘Travel 2’ and ‘B2B Products’), and almost every estimation by each tool was at least 40% incorrect (either over or under). Only SimilarWeb got close to estimating traffic for ‘Travel 2’, underestimating by 6%.

    Final Thoughts

    Our test is by no means a thorough mathematical or scientific experiment, merely a quick test to gauge the accuracy of such tools. There are a number of ways we could improve or extend it, including but not limited to:

  • Increase the number of traffic estimator tools analysed.
  • Increase the number of websites analysed.
  • Use more precise analytical data (GA numbers should always be taken with a pinch of salt).
  • Use an even number of each website type (or analyse just a single vertical and website type).
  • Extend the territories covered to include the US, Europe or worldwide.

    Our rather limited test is just the tip of the iceberg, but it has hopefully shown that all traffic estimators have strengths and weaknesses, and that their level of accuracy can vary quite considerably.