Webflow's webflow.io staging is an excellent tool for your site build and change process. It gives you an easy-to-use link to your published site during the build process, which looks like https://mysite.webflow.io.
But what happens if Google indexes that staging site in search results?
For many designers, the staging site complicates their SEO plan when it is indexed in Google's SERPs, because those webflow.io links compete with the branded domain links they're trying to promote.
This happens more often than you think. Currently, Google indicates that it has indexed 1.46 million links on webflow.io sites.
Many of those links should be indexed: they are templates, or Made in Webflow demo and portfolio sites.
But many should not be indexed, and we want to remove them so that we can promote our custom branded domain properly.
When you don't want your webflow.io staging site to appear in Google SERPs, Webflow has a site setting to help you exclude it, which you can find under the SEO tab.
However, that setting is Off by default. You have to remember to enable it when you first create your client project in order to successfully block Googlebot.
Unfortunately, if you miss that step and discover that your webflow.io site has already been indexed in Google's SERPs, switching on this "disable subdomain indexing" setting won't fix the issue, and the reason has to do with how Googlebot works.
Webflow's "disable subdomain indexing" feature switches on a robots.txt exclusion mechanism, which blocks robots from crawling your site.
It does not remove pages that have already been indexed.
In fact, Google's docs indicate your site will be "stuck" in search results, because your robots.txt prevents Google from crawling it again to obtain new information, like a meta noindex tag.
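To make the "stuck" behavior concrete, here is a minimal sketch of how a crawler decides whether it may fetch a page. It assumes the robots.txt served for the blocked subdomain is a blanket `User-agent: *` / `Disallow: /` rule (an assumption, not taken from Webflow's documentation), and `isPathDisallowed` is a hypothetical helper that ignores wildcards, `Allow` rules, and per-bot groups.

```javascript
// Simplified illustration only: real robots.txt matching has more rules
// (wildcards, Allow precedence, per-bot groups).
function isPathDisallowed(robotsTxt, path) {
  let appliesToAll = false;
  const disallowed = [];
  for (const rawLine of robotsTxt.split("\n")) {
    const line = rawLine.split("#")[0].trim(); // drop comments
    const sep = line.indexOf(":");
    if (sep === -1) continue;
    const field = line.slice(0, sep).trim().toLowerCase();
    const value = line.slice(sep + 1).trim();
    if (field === "user-agent") {
      appliesToAll = value === "*";
    } else if (field === "disallow" && appliesToAll && value !== "") {
      disallowed.push(value);
    }
  }
  // A path is blocked when it starts with any disallowed prefix.
  return disallowed.some((prefix) => path.startsWith(prefix));
}

// Assumed contents of a robots.txt that blocks every crawler from every path.
const blockingRobotsTxt = "User-agent: *\nDisallow: /";
console.log(isPathDisallowed(blockingRobotsTxt, "/about")); // true
```

A crawler that gets `true` here never downloads the page, so it never sees a noindex tag placed in that page's HTML.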
So your new site is live, and you're seeing webflow.io results in Google SERPs alongside your client's official domain.
You may be able to see the specific webflow.io URLs Google has indexed by doing a Google search for, e.g., site:mysite.webflow.io
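If you check several staging sites, the search link can be built in code. This is a small sketch; `siteSearchUrl` is a hypothetical helper name, and the URL pattern simply mirrors an ordinary Google search with a `site:` query.

```javascript
// Build a Google search URL that lists indexed pages for one hostname.
function siteSearchUrl(hostname) {
  return "https://www.google.com/search?q=" + encodeURIComponent("site:" + hostname);
}

console.log(siteSearchUrl("mysite.webflow.io"));
// https://www.google.com/search?q=site%3Amysite.webflow.io
```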
How do you fix this?
GOAL: Remove the webflow.io staging URL from SERPs.
Google has a removals tool, which allows you to request the removal of specific URLs. This tool is part of Google Search Console (GSC); however, to use GSC, you must first prove that you have ownership of a particular domain name.
While you don't own the webflow.io domain, you may still be able to use a META tag verification approach to prove your ownership of your specific mysite.webflow.io subdomain.
Google changes policies often. If this currently works, the process is:
https://mysite.webflow.io
https://mysite.webflow.io/
GOAL: Remove the webflow.io staging URL from SERPs.
Here's another approach:
Let's assume your staged site is at mysite.webflow.io. The mysite part of that is known as the shortname. Make sure to copy the shortname; you'll need it later.
Clone your site in the Webflow dashboard. You'll now have two versions, which I'll call REAL and FAKE.
On your REAL site, the one that has a custom domain;
On your FAKE cloned site, make these changes;
<meta name="robots" content="noindex">
This will take some time, days or weeks, but the next time Google crawls those webflow.io URLs it will find your FAKE site, see the NOINDEX meta tag, and remove your site from Google's index.
Meanwhile, promote your REAL site as normal. Set up GSC and submit your sitemap there to maximize the indexing visibility of your REAL site.
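Before waiting on Google, it can help to confirm that the FAKE site really serves the noindex tag in its raw HTML. The following is a rough sketch: `hasNoindexMeta` is a hypothetical helper that uses a simple regular expression rather than a full HTML parser, and it only inspects the markup you pass in (it will not see tags inserted by JavaScript).

```javascript
// Return true when the HTML contains a robots meta tag that includes "noindex".
function hasNoindexMeta(html) {
  const metaTags = html.match(/<meta\b[^>]*>/gi) || [];
  return metaTags.some(
    (tag) => /name\s*=\s*["']?robots["']?/i.test(tag) && /noindex/i.test(tag)
  );
}

console.log(hasNoindexMeta('<head><meta name="robots" content="noindex"></head>')); // true
console.log(hasNoindexMeta('<head><meta name="description" content="My site"></head>')); // false
```

In an environment with a global `fetch` (an assumption about your runtime), you could download the page from https://mysite.webflow.io and pass the response text to this function.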
GOAL: Remove the webflow.io staging URL from SERPs, while preserving as much of the link juice as possible.
There is no guarantee how long it will take for Google to remove those webflow.io entries. This may create a short-term SEO disadvantage if your webflow.io site is highly ranked, even ranked above your custom domain.
If you want to maximize the value of your SEO in this situation, you may want to consider using redirects instead to transfer the SEO value to your new site more directly.
The approach to setting this up is similar to method #2, in that we're going to have two sites and that your FAKE site is going to have the mysite.webflow.io domain that you're seeing in Google.
Note, the common situation we use this in is when the homepage of your webflow.io site is the page that's showing in Google SERPs. This is easiest to set up in a single-page setup.
Again, let's assume your staged site is at mysite.webflow.io. The mysite part of that is known as the shortname. Make sure to copy the shortname; you'll need it later.
On your REAL site, the one that has a custom domain;
Now create a new, blank site, we'll call this the FAKE site;
<meta http-equiv="refresh" content="0; url=https://www.mysite.com">
Yes, Googlebot does recognize and follow META refresh redirects. Google's documentation says an instant (0-second) refresh like this one is interpreted as a permanent redirect, the same as a server-side HTTP 301.
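The `content` attribute of the refresh tag packs two values together: the delay in seconds and the target URL. As a hedged sketch, the hypothetical `parseMetaRefresh` helper below splits them apart so you can check that the delay is 0 and the target is your custom domain. It handles the common `delay; url=target` form only.

```javascript
// Parse the content attribute of a <meta http-equiv="refresh"> tag.
// Returns null when the value does not match the "delay; url=target" form.
function parseMetaRefresh(content) {
  const match = /^\s*(\d+)\s*[;,]\s*url\s*=\s*(.+?)\s*$/i.exec(content);
  if (!match) return null;
  // Strip optional quotes around the target URL.
  const target = match[2].replace(/^["']|["']$/g, "");
  return { delaySeconds: Number(match[1]), url: target };
}

const parsed = parseMetaRefresh("0; url=https://www.mysite.com");
console.log(parsed.delaySeconds); // 0
console.log(parsed.url); // https://www.mysite.com
```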
GOAL: Remove the webflow.io staging URL from SERPs. Optionally, preserve as much of the link juice as possible.
Google's Search Engine is good at executing JavaScript as part of the page preparation before it indexes the results.
This means that it's possible for you to make variations to the page depending on which domain it is being accessed through.
Your script can detect the hostname, and when it's the mysite.webflow.io staged version, insert the noindex META tag:
Paste into your site-wide before-HEAD custom code area.
<script>
  // Only on the webflow.io staging hostname, tell crawlers not to index the page.
  if (location.hostname.endsWith(".webflow.io")) {
    document.write('<meta name="robots" content="noindex">');
  }
</script>
Optionally, you could add a META redirect as well:
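<script>
  // On the webflow.io staging hostname, block indexing and
  // immediately redirect visitors and crawlers to the custom domain.
  if (location.hostname.endsWith(".webflow.io")) {
    document.write('<meta name="robots" content="noindex">');
    document.write('<meta http-equiv="refresh" content="0; url=https://www.mysite.com">');
  }
</script>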
Note: make sure your robots.txt allows Googlebot to crawl your webflow.io site so that it can see these tags. This includes setting your site's SEO settings so that the Disable Webflow subdomain indexing setting is OFF. Beyond the scope of this article.
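The hostname check in the scripts above can be pulled out into plain functions so the logic is testable outside the browser. This is only a sketch: `isWebflowStagingHost` and `stagingHeadTags` are hypothetical names, and https://www.mysite.com stands in for your custom domain.

```javascript
// True when the hostname is a webflow.io staging subdomain.
function isWebflowStagingHost(hostname) {
  return hostname.toLowerCase().endsWith(".webflow.io");
}

// Return the META tags to inject for a given hostname.
// redirectUrl is optional; when given, an instant refresh tag is added.
function stagingHeadTags(hostname, redirectUrl) {
  if (!isWebflowStagingHost(hostname)) return [];
  const tags = ['<meta name="robots" content="noindex">'];
  if (redirectUrl) {
    tags.push('<meta http-equiv="refresh" content="0; url=' + redirectUrl + '">');
  }
  return tags;
}

console.log(stagingHeadTags("www.mysite.com").length); // 0
console.log(stagingHeadTags("mysite.webflow.io", "https://www.mysite.com").length); // 2
```

In the browser, you would pass `location.hostname` and write each returned tag into the page head, as the original scripts do with `document.write`.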
webflow.io domain
https://support.google.com/websearch/answer/11080680?hl=en
https://www.google.com/search?q=site%3Awebflow.io