Advanced SEO Techniques

Removing Your WEBFLOW.IO Staged Site from Google


Webflow's webflow.io staging is an excellent tool for your site build and change process. It gives you an easy-to-use link to your published site during the build process, which looks like https://mysite.webflow.io.

But what happens if Google indexes that staging site in search results?

For many designers, the staging site complicates their SEO plan when it is indexed in Google's SERPs, because those webflow.io links compete with the branded domain links they're trying to promote.

This happens more often than you might think. At the time of writing, a Google search for site:webflow.io reports roughly 1.46 million indexed links on webflow.io sites.

Many of those links should be indexed: they are templates, or Made in Webflow demo and portfolio sites.

Many others should not be indexed, and we want to remove them so that we can promote our custom branded domain properly.

Preventing your webflow.io site from being indexed

When you don't want your webflow.io staging site to appear in Google SERPs, Webflow has a site setting to help you exclude it, which you can find under the SEO tab.

However, that setting is Off by default. You have to remember to enable it when you first create your client project in order to successfully block Googlebot.

The problem

Unfortunately, if you miss that step and discover that your webflow.io site has already been indexed in Google's SERPs, switching on this "disable subdomain indexing" setting won't fix the issue. The reason has to do with how Googlebot works.

Webflow's "disable subdomain indexing" feature switches on a robots.txt exclusion mechanism, which blocks robots from crawling your site.

It does not remove pages that have already been indexed.

In fact, Google's docs indicate your site will be "stuck" in search results, because your robots.txt prevents Google from crawling it again to obtain new information, like a meta noindex tag.
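
To check whether a staging site is in this "stuck" state, you can look at the robots.txt it serves. The following is a minimal sketch, assuming the blocking file uses the common "User-agent: *" plus "Disallow: /" form; the function name blocksAllCrawlers is hypothetical, and this simplified check is not a full robots.txt parser.

```javascript
// Sketch: does this robots.txt body block all crawling for every bot?
// Assumption: a block looks like "User-agent: *" followed by "Disallow: /".
function blocksAllCrawlers(robotsTxt) {
  let appliesToAll = false;   // does the current group include "User-agent: *"?
  let inGroupHeader = false;  // are we still reading consecutive User-agent lines?

  for (const rawLine of robotsTxt.split(/\r?\n/)) {
    const line = rawLine.split("#")[0].trim(); // strip comments and whitespace
    const sep = line.indexOf(":");
    if (line === "" || sep === -1) continue;

    const field = line.slice(0, sep).trim().toLowerCase();
    const value = line.slice(sep + 1).trim();

    if (field === "user-agent") {
      // Consecutive User-agent lines belong to the same group.
      appliesToAll = inGroupHeader ? appliesToAll || value === "*" : value === "*";
      inGroupHeader = true;
    } else {
      inGroupHeader = false;
      if (field === "disallow" && appliesToAll && value === "/") return true;
    }
  }
  return false;
}
```

If this returns true for the text served at https://mysite.webflow.io/robots.txt, Googlebot cannot recrawl the pages, so it will never see a noindex tag you add later.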

How to remove your webflow.io site from Google's SERPs

So your new site is live, and you're seeing webflow.io results in Google SERPs alongside your client's official domain.

You may be able to see the specific webflow.io URLs Google has indexed by searching Google for, e.g., site:mysite.webflow.io
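
If you check several client sites regularly, a small helper can build that search link for you. This is only a sketch; the function name siteSearchUrl is hypothetical, and the URL shape follows the Google search link listed under Resources.

```javascript
// Sketch: build a Google "site:" search URL for a given hostname.
function siteSearchUrl(hostname) {
  const query = `site:${hostname}`;
  // encodeURIComponent turns the ":" into %3A so the query survives in a URL.
  return `https://www.google.com/search?q=${encodeURIComponent(query)}`;
}

console.log(siteSearchUrl("mysite.webflow.io"));
```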

How do you fix this?

METHOD #1 - Google Search Console

GOAL: Remove the webflow.io staging URL from SERPs.

Google has a removals tool, which allows you to request the removal of specific URLs. This tool is part of Google Search Console (GSC). However, to use GSC, you must first prove that you have ownership of a particular domain name.

While you don't own the webflow.io domain, you may still be able to use a META tag verification approach to prove your ownership of your specific mysite.webflow.io subdomain.

Google changes policies often. If this currently works, the process is:

  1. Set up a new GSC property
  2. Set it up for your specific URL, e.g. https://mysite.webflow.io
  3. Verify it using a META tag
  4. Add that META to your site-wide HEAD custom code area
  5. Republish your site
  6. Complete the verification in GSC
  7. Then use the removals tool to remove those URLs.
    • Use "Remove all URLs with this prefix" and enter your protocol and hostname, e.g. https://mysite.webflow.io/
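
The prefix for that last step is just the protocol and hostname of any indexed staging URL. As a minimal sketch, assuming you have copied one of the indexed URLs from the search results (the function name removalPrefix is hypothetical):

```javascript
// Sketch: derive the "Remove all URLs with this prefix" value from an indexed URL.
function removalPrefix(indexedUrl) {
  const url = new URL(indexedUrl); // throws on a malformed URL
  // Guard against accidentally requesting removal of the custom domain.
  if (!url.hostname.endsWith(".webflow.io")) {
    throw new Error(`Not a webflow.io staging URL: ${url.hostname}`);
  }
  return `${url.origin}/`; // protocol + hostname, with a trailing slash
}

console.log(removalPrefix("https://mysite.webflow.io/about?ref=1"));
```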

METHOD #2 - Clone, Swap & Exclude

GOAL: Remove the webflow.io staging URL from SERPs.

Here's another approach.

Let's assume your staged site is at mysite.webflow.io. The mysite part of that is known as the shortname. Make sure to copy the shortname; you'll need it later.

Clone your site in the Webflow dashboard. You'll now have two versions, which I'll call REAL and FAKE.

On your REAL site, the one that has a custom domain:

  1. Change the shortname mysite of the published site to something different.
  2. In the REAL site's SEO settings, make certain that the Disable Webflow subdomain indexing setting is ON. You DO NOT want this version of your webflow.io site to be crawled, since you'd create the same problem you're fixing now.
  3. Republish your REAL site.

On your FAKE cloned site, make these changes:

  1. Change the shortname TO mysite, so that it matches what you're seeing in Google's searches.
  2. In the FAKE site's SEO settings, make certain that the Disable Webflow subdomain indexing setting is OFF. You want this version to be crawled, so that Google will remove it.
  3. Add the below META tag to your site-wide HEAD custom code.

<meta name="robots" content="noindex">

  4. Then publish your FAKE site to staging.

This will take some time, days or weeks, but the next time Google crawls those webflow.io URLs it will find your FAKE site, see the NOINDEX meta tag, and remove your site from Google's index.
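
Before waiting on Google, it can help to confirm that the published FAKE site really serves the tag. Here is a minimal sketch, assuming you have the page's HTML as a string (for example, copied from View Source); the function name hasRobotsNoindex is hypothetical, and the regex-based scan is a rough check rather than a full HTML parser.

```javascript
// Sketch: does this HTML contain a robots (or googlebot) meta tag with "noindex"?
function hasRobotsNoindex(html) {
  const metaTags = html.match(/<meta\b[^>]*>/gi) || [];
  return metaTags.some((tag) => {
    const name = /\bname\s*=\s*["']?([^"'\s>]+)/i.exec(tag);
    const content = /\bcontent\s*=\s*["']([^"']*)["']/i.exec(tag);
    if (!name || !content) return false;

    const isRobots = ["robots", "googlebot"].includes(name[1].toLowerCase());
    // The content attribute may hold several comma-separated directives.
    const directives = content[1].toLowerCase().split(",").map((d) => d.trim());
    return isRobots && directives.includes("noindex");
  });
}

console.log(hasRobotsNoindex('<head><meta name="robots" content="noindex"></head>'));
```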

Meanwhile, promote your REAL site as normal. Set up GSC and submit your sitemap there to maximize the indexing visibility of your REAL site.

METHOD #3 - Clone, Swap & Redirect

GOAL: Remove the webflow.io staging URL from SERPs, while preserving as much of the link juice as possible.

There is no guarantee of how long it will take for Google to remove those webflow.io entries. This may create a short-term SEO disadvantage if:

  • Your webflow.io site is highly ranked, perhaps even above your custom domain
  • You're losing traffic stats you need

If you want to maximize the value of your SEO in this situation, you may want to consider using redirects instead to transfer the SEO value to your new site more directly.

The approach to setting this up is similar to method #2, in that we're going to have two sites and that your FAKE site is going to have the mysite.webflow.io domain that you're seeing in Google.

Note that we most commonly use this when the homepage of your webflow.io site is the page that's showing in Google SERPs. It is easiest to set up on a single-page site.

Again, let's assume your staged site is at mysite.webflow.io. The mysite part of that is known as the shortname. Make sure to copy the shortname; you'll need it later.

On your REAL site, the one that has a custom domain:

  1. Change the shortname mysite of the published site to something different.
  2. In the REAL site's SEO settings, make certain that the Disable Webflow subdomain indexing setting is ON. You DO NOT want this version of your webflow.io site to be crawled, since you'd create the same problem you're fixing now.
  3. Republish your REAL site.

Now create a new, blank site. We'll call this the FAKE site:

  1. Change the shortname of your new FAKE site TO mysite, so that it matches what you're seeing in Google's searches.
  2. In the FAKE site's SEO settings, make certain that the Disable Webflow subdomain indexing setting is OFF. You want this version to be crawled, so that Google will see and follow the redirect.
  3. On your one page site, drop your logo or a "please wait..." message, something to display while the redirect is happening.
  4. Add the below META tag to your site-wide HEAD custom code.
    Make certain to change the redirect URL to your custom domain.

<meta http-equiv="refresh" content="0; url=https://www.mysite.com">

  5. Publish your FAKE site.

Googlebot does recognize and follow META refresh redirects. Google's documentation says an instant (0-second) refresh is interpreted as a permanent redirect, much like a server-side HTTP 301.
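
A typo in the redirect tag is easy to miss, so you may prefer to generate it. The following is only a sketch; the function name metaRefreshTag is hypothetical, and https://www.mysite.com stands in for your custom domain.

```javascript
// Sketch: build an instant META refresh tag pointing at the custom domain.
function metaRefreshTag(targetUrl, delaySeconds = 0) {
  const url = new URL(targetUrl); // throws if the URL is malformed
  // Escape characters that would break out of the HTML attribute.
  const safeHref = url.href.replace(/&/g, "&amp;").replace(/"/g, "&quot;");
  return `<meta http-equiv="refresh" content="${delaySeconds}; url=${safeHref}">`;
}

console.log(metaRefreshTag("https://www.mysite.com"));
```

Note that the URL parser normalizes a bare domain by adding a trailing slash, so the generated tag points at https://www.mysite.com/.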

METHOD #4 - JavaScript META Hack

GOAL: Remove the webflow.io staging URL from SERPs. Optionally, preserve as much of the link juice as possible.

Google's Search Engine is good at executing JavaScript as part of the page preparation before it indexes the results.

This means that it's possible for you to make variations to the page depending on which domain it is being accessed through.

Your script can detect the hostname, and when it's the mysite.webflow.io staged version, insert the noindex META tag.

Paste this into your site-wide HEAD custom code area.

<script>
    // Only the webflow.io staging hostname gets the noindex tag; the custom domain is untouched.
    if (location.hostname.endsWith(".webflow.io")) {
        document.write('<meta name="robots" content="noindex">');
    }
</script>

Optionally, you could add a META redirect as well:

<script>
    if (location.hostname.endsWith(".webflow.io")) {
        document.write('<meta name="robots" content="noindex">');
        // Change the redirect URL to your custom domain.
        document.write('<meta http-equiv="refresh" content="0; url=https://www.mysite.com">');
    }
</script>
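
If you would rather avoid document.write, the same idea can be sketched with DOM APIs. This is an alternative under stated assumptions, not a replacement the original method calls for: the function name headTagsForHost is hypothetical, and the hostname decision is split out so it can be checked without a browser. Wrap the code in a script tag when pasting it into the HEAD custom code area.

```javascript
// Sketch: decide which META tags a given hostname should receive.
// Returns an empty list for anything that is not a webflow.io staging host.
function headTagsForHost(hostname, redirectUrl) {
  if (!hostname.endsWith(".webflow.io")) return [];
  const tags = [{ name: "robots", content: "noindex" }];
  if (redirectUrl) {
    // Optional: also add an instant META refresh to the custom domain.
    tags.push({ httpEquiv: "refresh", content: `0; url=${redirectUrl}` });
  }
  return tags;
}

// Browser-only part: append the tags to <head>. Skipped when no DOM exists.
if (typeof document !== "undefined") {
  // Pass your custom domain as a second argument to add the redirect as well.
  for (const attrs of headTagsForHost(location.hostname)) {
    const meta = document.createElement("meta");
    Object.assign(meta, attrs); // sets name/content or httpEquiv/content
    document.head.appendChild(meta);
  }
}
```

As with the document.write version, this relies on Google rendering the page's JavaScript before indexing it.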

Notes:

  1. You will need to make certain your robots.txt allows crawling of your webflow.io site so that Googlebot can see these tags. This includes setting your site's SEO settings so that Disable Webflow subdomain indexing is OFF.
  2. Make sure you don't already have another conflicting robots meta instruction in your page HEAD custom code, or Google won't know which to follow.
  3. Although it's very convenient (single site), this JavaScript-based approach is not considered as strong as using literal META tags.

More Methods

Beyond the scope of this article.

  • Reverse proxy
  • Contact Webflow support; they may be able to issue a removal request, since they own the webflow.io domain

Resources

https://support.google.com/websearch/answer/11080680?hl=en

https://www.google.com/search?q=site%3Awebflow.io
