Shai Aharony | 16 May 2017
SEO Guru; prodding Google’s algorithms just for fun
Traditionally, web developers and SEOs have had a fickle relationship. Both are integral to reaching the client's aims, but neither wants to sacrifice their own goals in a compromise with the other.
The web designer has always been concerned with making the site aesthetically pleasing for the user, while the SEO has always given Google's crawlers the higher priority. Necessity bred compromise, and methods that aim to bridge the two have been established and are used on many of the websites we see today.
One issue where compromise was needed was on-page content. On the one hand, text was always seen as a necessity by SEOs, who were tasked with ranking pages for specific key phrases. On the other hand, text was always seen as a hindrance from the web designer's perspective: something that should be reduced to the bare minimum so as not to spoil the visual experience of the site in question.
Six months ago, Reboot Marketing had a conversation with a web developer which many readers here will find familiar. We wanted some of the hidden text on the page to be visible, while they were adamant about keeping it hidden, adding that they didn't believe it would make any difference as “Google reads the code anyway”.
In 2014, Google introduced a fetch and render tool, which gave us an important indication of the direction Google was taking. It seemed Google was no longer content with just reading the code of the page; it actually rendered it so that it could “see” the page as a human would. This led to the obvious question: if Google can “see” that text is hidden, why would it give that text as much weight? The problem was that the argument (just like the one we had 6 months ago) was one of opinions. There were no hard facts or studies to settle it, so we decided to create one.
It's important to understand that running a controlled experiment on an algorithm this complex is nearly, if not completely, impossible. Thousands of variables are at play, and it's our job to minimise those variables to the best of our abilities. In doing so, we have to restrict the questions we can answer to their simplest form. Only by doing that can we get some meaningful answers.
For the purpose of this experiment, we invented a new phrase and ensured that it was not recognised by Google prior to the experiment. The phrase is [andefabribiles], a fictional name given to a new type of bacteria.
Here is the result of that phrase on Google prior to starting the experiment:
We decided that we were most interested in how Google's ranking algorithms behave in these 4 scenarios:
Here is how Google sees the page:
We chose those scenarios as these are the most common ways web designers deal with space/design constraints. It's important to note that we avoided having the key phrase [andefabribiles] in the first paragraph. This was done on purpose so that, on the hidden text pages, the key phrase is not initially visible.
We purchased 20 new domains, ensuring that each domain name returned no results in Google and had no previous registration history. On each domain, we built a simple yet slightly different website to ensure a minimal footprint. We then divided the 20 sites into 4 groups to test each of the cases above in this manner:
Each of the 20 sites contained a description of approx. 400 words. The content was very comparable in structure and length across all of the sites. To reduce variables to a minimum, we ensured that the keyword appeared in very similar positions across all of the content and was mentioned exactly 3 times on each site. In all cases, the keyphrase was first mentioned in the second paragraph. In the 3 hidden text groups, where only the first paragraph is visible, the keyword was therefore not visible by default.
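The content constraints above (a fixed number of keyword mentions, with the first mention no earlier than the second paragraph) are easy to check mechanically. The snippet below is a minimal sketch of such a check, not the tooling used in the experiment. The function name `keyword_report`, the blank-line paragraph convention, and the sample text are all assumptions made for illustration.

```python
def keyword_report(text: str, keyword: str) -> dict:
    """Summarise how often and where `keyword` appears in `text`."""
    # Assumption: paragraphs are separated by blank lines.
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    needle = keyword.lower()
    # Case-insensitive substring count per paragraph.
    counts = [p.lower().count(needle) for p in paragraphs]
    # 1-based index of the first paragraph containing the keyword, or None.
    first = next((i + 1 for i, c in enumerate(counts) if c > 0), None)
    return {
        "word_count": len(text.split()),
        "mentions": sum(counts),
        "first_paragraph_with_keyword": first,
    }


# Made-up sample text, far shorter than the ~400 words used on the test sites.
sample = (
    "Intro paragraph with no mention of the phrase.\n\n"
    "Andefabribiles appears here for the first time.\n\n"
    "More about andefabribiles, and andefabribiles again."
)
report = keyword_report(sample, "andefabribiles")
print(report)
```

Running a check like this over every page before publishing would flag any site where the keyphrase slipped into the first paragraph or was mentioned more or fewer than 3 times.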
1. All domains had all robots blocked until content was ready and published. This ensured that the domains were not crawled by Google at substantially differing times.
2. All sites had non-duplicated but similar title tag structures. Again, care was taken to ensure the keyphrase was mentioned consistently at the end of the title tag across all sites.
4. We searched for the key phrases and recorded results with images.
5. We monitored ranking progress over the following months.
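The article does not say how robots were blocked before launch. Assuming it was done with a standard `robots.txt` file (an assumption, as are the `is_blocked` helper and the `example.com` URL), the following sketch uses Python's standard `urllib.robotparser` to confirm that a given file really does shut out a crawler such as Googlebot.

```python
from urllib.robotparser import RobotFileParser

# A robots.txt that disallows every crawler from every path.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

# An empty Disallow value places no restriction on crawlers.
ALLOW_ALL = """\
User-agent: *
Disallow:
"""


def is_blocked(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if `robots_txt` forbids `user_agent` from fetching `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


# example.com is a placeholder, not one of the test domains.
print(is_blocked(BLOCK_ALL, "https://example.com/"))
print(is_blocked(ALLOW_ALL, "https://example.com/"))
```

Checking every domain this way before and after launch would help confirm that all sites became crawlable at roughly the same time.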
1. We minimised all related search activities and site visits.
2. We ensured all activity was carried out via incognito/private mode.
3. We did not click through Google search results whilst monitoring.
4. All sites were released in a zig-zag fashion to ensure an even spread of any variables.
5. All sites were released from new IP addresses and locations. Each IP had its own Search Console account to fetch from.
6. Apart from myself and one other person within the company, no one knew the actual domains or test phrases. This was done to ensure there was no contamination of data by unprotected clicks and views.
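The exact release order is not documented here. One possible reading of a "zig-zag" release is that sites were interleaved across the test groups, with the direction reversed on each pass, so that no group was consistently released first. The sketch below illustrates that reading only. The function `zigzag_order` and the group and site labels are hypothetical.

```python
def zigzag_order(groups: dict) -> list:
    """Interleave sites across groups, reversing group order on every pass."""
    names = list(groups)
    rounds = max((len(sites) for sites in groups.values()), default=0)
    order = []
    for i in range(rounds):
        # Even passes go first-to-last, odd passes go last-to-first.
        sequence = names if i % 2 == 0 else list(reversed(names))
        for name in sequence:
            if i < len(groups[name]):
                order.append(groups[name][i])
    return order


# Hypothetical labels: two groups of two sites each.
release = zigzag_order({
    "group-1": ["g1-site-a", "g1-site-b"],
    "group-2": ["g2-site-a", "g2-site-b"],
})
print(release)  # ['g1-site-a', 'g2-site-a', 'g2-site-b', 'g1-site-b']
```

The point of an order like this is that any time-dependent effect, such as crawl timing, is spread across the groups rather than concentrated in one.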
As you can imagine, ranking data for 20 sites over a 6-month period, which produced just under 3,300 data points, was quite a challenge to visualise. We wanted to include all of the data while allowing the reader to select differing website groups in one location to easily demonstrate the relationship between those groups.
Having 20 separate lines across 6 months would just be an overload of data, so we decided to average out the rankings for each group and, by doing so, condense 5 lines into one.
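The averaging step described above can be sketched as follows. This is an illustration under stated assumptions, not the processing actually used for the graphs: the row layout `(date, group, site, rank)`, the `average_by_group` name, the optional `excluded_sites` parameter, and the treatment of a missing rank as `None` are all hypothetical, and the numbers are made up rather than taken from the experiment.

```python
from collections import defaultdict
from statistics import mean


def average_by_group(rows, excluded_sites=frozenset()):
    """Average rank per (date, group).

    rows: iterable of (date, group, site, rank) tuples.
    Sites in `excluded_sites` and rows with rank None are skipped.
    """
    buckets = defaultdict(list)
    for date, group, site, rank in rows:
        if site in excluded_sites or rank is None:
            continue
        buckets[(date, group)].append(rank)
    return {key: mean(ranks) for key, ranks in buckets.items()}


# Made-up ranks for illustration only.
rows = [
    ("day-1", "group-1", "site-1", 1),
    ("day-1", "group-1", "site-2", 3),
    ("day-1", "group-2", "site-3", 10),
    ("day-1", "group-2", "site-4", 20),
]
averages = average_by_group(rows)
without_site_4 = average_by_group(rows, excluded_sites={"site-4"})
print(averages)
print(without_site_4)
```

Averaging hides the spread within each group, which is why it helps to keep the per-site data available alongside the group lines.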
We have also included separate graphs for Bing and Yahoo as a reference. We think you will find the contrast between those two search engines and Google very interesting.
Outlier Results: For an unknown reason, one site behaved very strangely throughout the experiment, so we removed it from the graph. The removal of that one site out of 20 did not affect the overall result. You can see the full data set, including the anomaly, in the file download at the end of the experiment and also in the final detailed graph, which includes it.
Interactive Graphs: You can select each group along the top of the graph. You can also zoom in on any period for more resolution by dragging across the relevant date range in the date bar along the bottom.
The below graph is for the same data but from Bing.
The below graph is for the same data but from Yahoo.
It’s very interesting to note the chaos demonstrated by these two engines in comparison to Google, which would indicate that they give no weight preference to visible text and would also hint at the superiority of Google’s algorithms.
For those interested in the full-blown data, I have included the Excel sheet here for you to download.
And for those wanting to see the behaviour of each individual site rather than the average of the group:
What’s more surprising is the way that Google treats text in a textarea as if it were fully visible. As the use of a textarea to display text in web design is almost non-existent these days, this is a bit of a moot point but interesting nonetheless. If you have any theories for this behaviour, please feel free to leave them in the comments below.
Special thanks go to our freelance data visualisation expert Kristian Vybiral who has taken the time to get the graphs exactly how we wanted them.