"Robots" is an affectionate name given to the search engine site crawlers that explore and index the Internet to create search engine results.
Sometimes there are pages you don't want Google to index, and there are mechanisms for excluding them.
Search Entry Elements
Site Name
https://developers.google.com/search/docs/appearance/site-names#website
Troubleshooting
Alternate name
https://developers.google.com/search/docs/appearance/site-names#website:~:text=Provide%20an%20alternative,choice%20isn%27t%20selected.
Page Title, a.k.a. Title Links
A title link is the title of a search result on Google Search and other properties (for example, Google News) that links to the web page. Google uses a number of different sources to automatically determine the title link, but you can indicate your preferences by following Google's best practices for influencing title links.
Influencing your title links in search results
https://developers.google.com/search/docs/appearance/title-link
An update to how we generate web page titles
https://developers.google.com/search/blog/2021/08/update-to-generating-page-titles
Sitelinks
Snippets
Hiding Static Pages from Google
Webflow has added a static page setting called Sitemap Indexing. When this is toggled off, Webflow:
- Excludes the page from your sitemap.xml
- Adds a META noindex tag to the page, so robots know to ignore it for indexing
This feature is found just beneath your title and description settings.
Be careful not to confuse this with Site Search settings...
Hiding Collection Pages from Google
Unfortunately, Collection Pages do not have this setting available.
However, you can still hide either a specific Collection Page, or all pages generated from a template, using the approaches below.
Hiding all Collection Pages from a Template
Let's suppose you have a News collection, and you want to hide all of the pages it generates from Google.
To do that, you can place a special META tag in the <head> custom code of that template page.
This is the tag you need:
<meta name="robots" content="noindex">
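Once the template is published, it can be worth confirming that the tag really is in the page source. The following is a minimal Python sketch of such a check, not part of Webflow or Google tooling. The names `RobotsMetaParser` and `has_noindex` are hypothetical, and it assumes you already have the page's HTML as a string (for example, copied from View Source).

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so normalise them.
        attr = {key: (value or "") for key, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        # The content attribute may hold several comma-separated directives.
        for part in attr.get("content", "").split(","):
            part = part.strip().lower()
            if part:
                self.directives.add(part)


def has_noindex(html):
    """Return True if the HTML contains a robots META tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


if __name__ == "__main__":
    sample = '<head><meta name="robots" content="noindex"></head>'
    print(has_noindex(sample))
```

This only inspects the HTML you give it. It says nothing about whether Google has already crawled the page or acted on the tag.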
Hiding an Individual Collection Page
This same idea can be extended to allow you to hide individual collection items.
- Add an option field to your CMS collection
- Give it two values, index and noindex
- Populate your items with the values you want
  - index means it will appear in SERPs
  - noindex means it will be suppressed from SERPs
- Make that field required
Then, in your collection page template's HEAD custom code area, drop in the META tag:
<meta name="robots" content="">
Inside the content attribute, between the double-quotes, insert your new option field.
You can now easily control each page's Google indexing individually.
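To make the result concrete, here is a small Python sketch that mimics the substitution the template performs for each item. It is purely illustrative: Webflow does this for you when it renders the page, and the function name `robots_meta` and the `ALLOWED_VALUES` tuple are assumptions made for this example.

```python
from html import escape

# The two option values described above; anything else is treated as a mistake.
ALLOWED_VALUES = ("index", "noindex")


def robots_meta(option_value):
    """Build the META tag a collection page would carry for one item."""
    if option_value not in ALLOWED_VALUES:
        # An empty or unexpected value would give content="" or worse,
        # which is why the field should be required.
        raise ValueError("expected 'index' or 'noindex', got %r" % option_value)
    return '<meta name="robots" content="{}">'.format(escape(option_value, quote=True))


if __name__ == "__main__":
    # Hypothetical items in a News collection.
    for item_name, value in [("Press release", "index"), ("Internal memo", "noindex")]:
        print(item_name, "->", robots_meta(value))
```

The `ValueError` branch reflects why the option field is made required: an item with no value would otherwise publish a tag with an empty content attribute.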
What about robots.txt?
People often imagine that robots.txt is the answer to their Googlebot-exclusion needs, but it's typically not the right answer. Here are a few reasons why:
- It's easy to mess up robots.txt and break your site's SEO entirely
- robots.txt tells bots what they are allowed to crawl, not what should be indexed. A blocked page can still show up in SERPs, for example when other pages link to it, but with no title or description, just a URL.
- If the page has already been indexed, and you add it to robots.txt, the bot will be prevented from re-visiting it, and will never de-index it even if you add the META noindex to the page. It's not allowed to look at the page, therefore it can't see your META, therefore it doesn't act on it.
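The last point can be illustrated with Python's standard `urllib.robotparser` module. This is a rough sketch under stated assumptions: the robots.txt content and the example.com URLs are made up, the helper name `crawler_may_fetch` is hypothetical, and Python's parser is not guaranteed to match Googlebot's rule handling in every edge case.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks every crawler from the /news/ folder.
ROBOTS_TXT = """\
User-agent: *
Disallow: /news/
"""


def crawler_may_fetch(robots_txt, user_agent, url):
    """Return True if robots.txt allows this user agent to fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    blocked_url = "https://example.com/news/secret-item"
    if not crawler_may_fetch(ROBOTS_TXT, "Googlebot", blocked_url):
        # The crawler never downloads the page, so any
        # <meta name="robots" content="noindex"> inside it goes unseen.
        print("Blocked by robots.txt: a noindex tag on this page cannot be read")
```

In other words, for a noindex tag to take effect the page has to stay crawlable, which is the opposite of what a robots.txt Disallow rule does.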