Optimising Iframes for SEO

We discussed the use of iframes in a blog post two years ago, and our conclusion was that you should avoid them.

We revisited the issue recently, and undertook some tests to see if and how we could make iframes SEO-friendly.

The problem with iframes

Pages that use iframe tags load content from multiple URLs within one single page. You can even have an iframed page inside another iframed page.
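To illustrate the nesting, here is a minimal sketch of three separate files shown one after another. The file names and the example.com URLs are hypothetical and are not taken from our tests. Note that each file carries its own html tag, so the rendered master page ends up containing three of them.

[html]
<!-- File 1: master.html (hypothetical), the master page -->
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Master page</title>
</head>
<body>
<!-- Embeds a second URL inside this page -->
<iframe src="https://example.com/outer-frame.html" height="500" width="750"></iframe>
</body>
</html>

<!-- File 2: outer-frame.html (hypothetical), an iframed page that itself contains an iframe -->
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Outer iframed page</title>
</head>
<body>
<!-- Embeds a third URL inside the iframed page -->
<iframe src="https://example.com/inner-frame.html" height="300" width="500"></iframe>
</body>
</html>

<!-- File 3: inner-frame.html (hypothetical), the innermost iframed page -->
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Inner iframed page</title>
</head>
<body>
<p>Content served from a third URL.</p>
</body>
</html>
[/html]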

The html tag indicates the beginning of the code on a page, and a conventional page should only have one. A page with iframes effectively contains one for every embedded document. In theory, the master page containing the iframe tags should be crawled together with all associated iframed content, so link equity can be passed from a master page to an iframed page or vice versa.

In reality, however, there is still a risk that it will confuse search engine robots and result in the page content not being indexed as intended.

Google admit that whilst they try to associate framed content with the page containing the frames, they can’t guarantee that they always will.

In July 2017 John Mueller, Webmaster Trends Analyst at Google, further explained that it is not possible to control the crawl and indexation of iframed content. He said:

“In particular if a page is embedded within an iframe, within a bigger other page, then it’s possible that we will index that embedding page as well.”

There is also a concern that you are not in control of which content is crawled, and this could penalise your site in the long run.

We found that this lack of clarity was not ideal for our clients, many of whom use iframes. So we thought we would apply our standard scientific approach to the issue, and ran some tests in our Labs to see if we could find a solution that definitely works.

Test – can you control the crawl on an iframe?

In all the tests, the following terms are used:

Master page: a page containing an iframe tag
Iframed page or content: a page or piece of content that is embedded in the master page through an iframe tag.

We identified four different techniques to test (nofollow is not supported on the iframe tag):

Canonical tags
robots.txt
meta robots noindex, nofollow
Using an on-demand iframe.

Test one – using canonical tags

This is the method suggested by John Mueller from Google in the above-mentioned Hangout: using a rel canonical on the page, pointing to the actual content version that you want to index.

This is the step-by-step process for the test:

Step one

On domain one, we created a raw HTML page with a <title>, an <h1>, some <body> content and an image. We also added a self-referencing canonical. This is our iframed page:

[html]
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Loading GIF | Example of an animated GIF (Graphics Interchange Format)</title>
<meta name="description" content="">
<!-- Self-referencing canonical: this page is the version to index -->
<link rel="canonical" href="http://[domain one].com/gif/loading-gif.html"/>
</head>
<body>
<div>
<h1>What is a loading GIF?</h1>
<p>An animated GIF (Graphics Interchange Format) file is a graphic image on a web page that moves.</p>
<p>Below is a picture of a loading GIF.</p>
<img src="http://[domain one]/wp-content/uploads/2017/05/loading.gif">
</div>
</body>
</html>
[/html]

Step two

On domain two, we created a raw HTML page with nothing on it except an iframe tag and a canonical pointing at the page on domain one. This is our master page:
[html]
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<title></title>
<meta name="description" content="">
<!-- Cross-domain canonical pointing at the iframed page on domain one -->
<link rel="canonical" href="http://[domain one].com/gif/loading-gif.html" />
</head>
<body>
<div>
<iframe src="http://[domain one]/gif/loading-gif.html"
height="500" width="750" style="border:0px;"></iframe>
</div>
</body>
</html>
[/html]

Results

With the setup above, in theory, our master page shouldn’t get indexed. Only the iframed page should be indexed because of the canonical tag.

Day one

The results in the SERP were very surprising, as both our master page and iframed page were indexed.

Master page

A title and a meta description are shown in this SERP. Both are taken from the iframed page: the title is its <h1>, whilst the meta description is drawn from its <body> content.

Iframed page

In this SERP, the title tag is honoured. There was no meta description set for this page, and again we can see that Google is using the body content to create it.

Day two

When we went back to the same pages the next day, the results were very different.

Master page

The page is no longer indexed: the canonical has been honoured.

Iframed page

The page is still indexed. The title remains, but there is no meta description left. This is not surprising as it wasn’t specified in the first place.

Conclusion

John Mueller’s recommendation works. However, you need access to the other domain to add a cross-domain canonical. Because both pages appeared in the results before the canonical was honoured, there was a short period of cross-domain duplication. Without canonical tags, that duplication would persist and have a negative effect on both pieces of content in the long run.

The next part of the experiment will now test methods to block crawls on iframed content where you don’t have control of the other site. We have used Google Maps embedded as an iframe for these tests.

Test two – using robots.txt

The robots exclusion protocol is a good way of stopping bots from crawling particular pages and directories.

However, excluding the parent page from the crawl will have the adverse effect of deindexing it completely. So we created a second page containing only the iframe content and then blocked that page from being crawled.
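As a rough sketch of this arrangement, the master page can embed an intermediate page on the same domain, and only that intermediate page holds the third-party iframe and is blocked in robots.txt. The paths /location.html and /embeds/map.html, the Disallow rule and the map URL placeholder are all assumptions made for illustration. They are not the exact files used in the test.

[html]
<!-- File 1: /location.html (hypothetical), the master page that should stay crawlable -->
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Master page</title>
</head>
<body>
<!-- Embeds the intermediate page rather than the third-party content directly -->
<iframe src="/embeds/map.html" height="500" width="750" style="border:0px;"></iframe>
</body>
</html>

<!-- File 2: /embeds/map.html (hypothetical), containing only the third-party iframe -->
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<title></title>
</head>
<body>
<iframe src="[Google Maps embed URL]" height="500" width="750" style="border:0px;"></iframe>
</body>
</html>

<!-- robots.txt on the same domain (assumed rule), blocking only the intermediate page:
User-agent: *
Disallow: /embeds/
-->
[/html]

With this layout the master page itself is never listed in robots.txt, so it is not excluded from the crawl.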