Masterclass: Optimising Salesforce Commerce Cloud (SFCC) Sites for SEO: Part 2 - Website Crawling & Redirect Tools

In Part 1 of our guide on optimising Salesforce Commerce Cloud (SFCC) sites for SEO, we explored the essential aspects of hostname configuration, location mapping, and URL structure. Now, in Part 2, we’ll dive deeper into another crucial element of SEO: website crawling and redirect tools.

Ensuring your SFCC-based site is crawlable by search engines and utilises properly managed redirects is critical to maintaining and enhancing your site’s search visibility. Without the right setup, you could encounter issues like crawl errors, duplicate content, and lost link equity, all of which can harm your rankings. In this instalment, we’ll walk you through the tools and strategies within SFCC that help you optimise your site’s crawling efficiency and manage redirects effectively, ensuring your site is not only visible to search engines but also delivers a seamless experience.

Website Crawling & Redirect Tools

Implementing efficient crawling and indexing strategies is crucial for all websites, but is particularly essential for eCommerce sites. This is due to the scale these sites can have, with potentially hundreds of thousands of pages generated via hosting products, categories, and filtered pages as a result of the faceted navigation, alongside additional pages for customer service, blogs, brand entity assets and more.

Harnessing all the tools available within SFCC to aid crawling and subsequent indexing will help to maximise crawl budget, reduce index bloat and ultimately provide a better user experience for your customers.

Sitemaps

Sitemaps are key for indicating which URLs search engines should prioritise when crawling a website. They are particularly essential for large sites, like most eCommerce sites, which often have a large number of URLs discoverable via internal links.

One of the many perks of SFCC for SEO is that sitemaps are auto-generated by the system, without the need for custom development. This means no manual work is needed to update any URLs; the system will automatically detect these based on your URL rules and/or manual overrides, and fetch the correct data accordingly. You can also create separate sitemaps for different page types if required, for example splitting out sitemaps for products and categories from the rest of your website pages.

To configure your sitemap settings, navigate to Merchant Tools > SEO > Sitemaps and then the Settings tab.

Last modified <lastmod> fields will automatically be applied to the generated sitemap, and are crucial for search engines when determining which URLs to focus on with each site crawl, as they help prioritise the most recently updated pages in the crawl schedule.

You can additionally set the change frequency <changefreq> and priority <priority> fields to populate against URL types. However, Google has confirmed that it now ignores these fields in sitemaps, so it is no longer necessary to configure them.

You can also set a number of links to populate per sitemap (this is capped at max 50,000) and which types of products to include. We recommend setting this to only available products; this will ensure that any products which are made offline or are out of stock are not prioritised in crawls for indexing. 
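As a quick sanity check on the settings above, a generated sitemap file can be audited offline. The following is a minimal sketch, assuming you have downloaded one sitemap file as text; the function name, the sample URLs and the report keys are hypothetical and not part of SFCC.

```python
import xml.etree.ElementTree as ET

# Namespace used by the standard sitemap protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"
MAX_LINKS_PER_SITEMAP = 50_000  # the cap mentioned above


def audit_sitemap(xml_text: str) -> dict:
    """Count the URLs in one sitemap file and list entries without <lastmod>."""
    root = ET.fromstring(xml_text)
    urls = root.findall(f"{SITEMAP_NS}url")
    missing_lastmod = []
    for url in urls:
        loc = (url.findtext(f"{SITEMAP_NS}loc") or "").strip()
        lastmod = (url.findtext(f"{SITEMAP_NS}lastmod") or "").strip()
        if not lastmod:
            missing_lastmod.append(loc)
    return {
        "url_count": len(urls),
        "over_limit": len(urls) > MAX_LINKS_PER_SITEMAP,
        "missing_lastmod": missing_lastmod,
    }


# Hypothetical sample; real files will contain your own URLs.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.example.com/c/accessories/</loc>
    <lastmod>2024-01-15</lastmod>
  </url>
  <url>
    <loc>https://www.example.com/c/home-accessories/</loc>
  </url>
</urlset>"""

print(audit_sitemap(SAMPLE))
```

Running this against each generated file gives a rough view of whether the link cap is being approached and whether any entries lack a last modified date.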

It’s also possible to implement hreflang tags via ticking the ‘Include Alternate URLs’ checkbox. This will include hreflang tags within the standard sitemaps generated. However, depending on the scale of your site and the number of locales featured, this may put the number of links per sitemap file over the threshold. If this is the case, it might be better to create custom sitemaps to implement hreflang tags. This will need to be done via custom development through a solution architect.

Important to note: You can choose to implement as many custom sitemaps as you like, however custom sitemaps will NEVER replace the standard sitemaps. These will only ever be supplementary, so we’d recommend ensuring any custom creations do not conflict with the standard setups created by SFCC.

Remaining settings for sitemaps to be aware of:

  • Included Locales: you can set this within the main settings tab for Sitemaps. You should ensure all enabled, priority locales are ticked for sitemap inclusion. You may want to consider unticking any locales that do not need to be indexed or frequently crawled, but are purely there for user experience considerations. This may be ideal for sites that feature primary language folders, e.g. EN, on every ccTLD, to prevent geo-cannibalisation.
  • Pipelines: in this tab you can select which pipelines (AKA URLs mapped to pipelines via your URL rules) should be included in the sitemaps. You should utilise SEO best practice when deciding which pipelines to include. For example, we would recommend including Store-Locator and Home-Show pipelines, but not including Checkout-Begin or Account-PasswordReset pipelines. (Always double check pipeline namings on your SFCC setup, as these may be custom and therefore different to those we have referenced here.)
  • Jobs: this is how you can set the frequency of your sitemap regeneration. We’d recommend selecting these to regenerate daily in order to ensure your sitemaps remain consistently up to date. You should also select this to happen at a time when site usage is low; this is to counteract any impact the job may have on site speed for users (if applicable).

Robots.txt

A robots.txt file is used to tell search engine crawlers which URLs they can access on your site, and should be used to prevent crawling of URLs that may overload your website. The ideal robots.txt file for a site will be minimalistic; page-level directives such as canonical tags and noindex tags are preferred for controlling which pages are served in the SERPs. However, it is still best practice to include a robots file, which can also be used to link crawlers to your sitemap(s).

Important to note: Robots.txt files are typically edited on the staging environment, by selecting which environment to apply these to. They are then automatically replicated to production and development environments. If you are unsure which settings are automatically replicated between environments, check your Data Replication settings within the Administration menu on SFCC Business Manager.

To configure your robots file, navigate to Merchant Tools > SEO > Robots.

For staging and development environments, it’s best to use the file from Deployed Cartridge. This is because these environments should typically be set as non-crawlable and non-indexable. While you can do this using a disallow directive within the robots file, the cleanest way to ensure this is to password protect those versions of your site.

To check if the site is password protected, check in the settings via Administration > Sites > Manage Sites. Select your website and navigate to the Site Status tab. If the site is password protected, it will show as Online (Protected), and contain the relevant password details.

For the production instance (or live version) of your website, it’s best to use a custom robots.txt file. This will ensure you can configure the crawl settings to best suit your site needs. You can also incorporate links to your sitemaps as required.

The robots files are generated at domain level, not subfolder level, so if your website utilises multiple subfolders for different locales, you’ll need to ensure the robots file is configured to match the expectations of each locale within that domain.
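Because one robots file covers every locale subfolder on a domain, it can help to test a draft file against a handful of URLs per locale before publishing it. Below is a minimal sketch using Python's standard `urllib.robotparser`; the robots.txt content, the `/fr/` subfolder and the example.com URLs are all hypothetical and should be replaced with your own. Note that this parser uses simple prefix matching, so it will not evaluate wildcard rules the way some search engines do.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical draft robots.txt for a domain with a default locale and an /fr/ subfolder.
ROBOTS_TXT = """\
User-agent: *
Disallow: /checkout
Disallow: /account
Disallow: /fr/checkout
Disallow: /fr/account

Sitemap: https://www.example.com/sitemap_index.xml
"""


def load_rules(text: str) -> RobotFileParser:
    """Parse robots.txt content held in memory (no network request is made)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = load_rules(ROBOTS_TXT)

# URLs we expect to be crawlable or blocked, for each locale on the domain.
for url in (
    "https://www.example.com/c/home-accessories/",
    "https://www.example.com/checkout/begin",
    "https://www.example.com/fr/c/accessoires/",
    "https://www.example.com/fr/checkout/",
):
    print(url, "->", "allowed" if parser.can_fetch("*", url) else "blocked")

# Sitemap links declared in the file (site_maps() requires Python 3.8+).
print(parser.site_maps())
```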

URL Redirects & Dynamic/Static Mappings

It’s inevitable with an eCommerce site that at some point, you’ll need to implement URL redirects. This may be due to changing URLs within your current setup, altering your hierarchy structure, or when migrating domains or platforms.

SFCC has a few different methods which can enable redirect implementation, all of which are fairly straightforward and tailored to certain requirements.

  • URL Redirects: 
    • Ideal for redirecting current (or legacy) URLs on a one-by-one basis, or for redirecting between existing categories / products within the catalogue.
    • Can include more than one wildcard in the URL.
    • Cannot apply redirects at locale level (will apply to all locales with that URL pattern / category ID (CGID) after the domain/subfolder)
  • Static Mappings:
    • Ideal for redirecting URL patterns (legacy only) to pipelines/controllers or fixed categories/URLs. 
    • Legacy URLs can be from either SFCC or a legacy CMS/site.
    • Can include one wildcard in the URL, either at the start or end of the specified path. 
    • URL being redirected from must be unknown to the current system.
    • Can apply redirects at locale level.
  • Dynamic Mappings:
    • Ideal for redirecting URL patterns (legacy only) to pipelines/controllers or fixed categories/URLs. 
    • Legacy URLs can NOT be from SFCC – this only works for URLs that existed on a legacy CMS/site.
    • Can include more than one wildcard in the URL, either at the start or end of the specified path, or both.
    • URL being redirected from must be unknown to the current system.
    • Can apply redirects at locale level.

First, however, it’s important to understand some built-in features within SFCC that may remove the need to manually implement any redirects at all.

Automatic Redirects via Built-in SFCC Functionality

This functionality occurs for a few different scenarios on Business Manager / user inputs:

  • Manually overriding URLs on PLPs will trigger redirects from the legacy, auto-generated URL via the implemented rule, to the new one
    • For example, your PLP rule is set as: [ constant, c ], / , [ attribute, name ]
    • For a category with the Attribute Name set to ‘Accessories’, this will auto-create a URL for this page of domain.com/c/accessories/ (trailing slash assuming this setting is ticked in URLs)
    • This URL can be manually overridden in the PLP settings. In this case, we want to override this to show home-accessories as the page path. We’ll therefore implement a manual URL of c/home-accessories (we do not need the initial slash or the trailing slash, as the system adds these automatically based on our pre-configured URL rules).
    • The system will then trigger a redirect from domain.com/c/accessories/ to the new URL domain.com/c/home-accessories/ automatically.
    • If the manually input URL is changed at any point, however, a URL redirect will be required to redirect the old manual URL to the new one.

  • Manually overriding URLs on PDPs will trigger redirects from the legacy, auto-generated URL via the implemented rule, to the new one
    • This works in the same way as the above. 
  • Misspellings of PDP URLs, providing the product ID portion of the URL is correct, will auto-generate redirects to the canonical URL for that product
    • The PDP will always auto-apply a product ID to the end of the URL, so any URL containing this ID, no matter what precedes it, will redirect to the canonical URL, provided that part of the URL is intact.
    • For example, a product has the URL of domain.com/category-name/p/product-id.html
    • Entering domain.com/anyrandomtext/p/product-id.html into the browser’s address bar will automatically trigger a 301/308 redirect on the system to the correct URL, domain.com/category-name/p/product-id.html
    • We can see working examples of this on sites that utilise SFCC, for example M&S.

Important to note: The way automated redirects apply on SFCC may differ from business to business, depending on the codebase that your setup resides on. We have seen instances where a migration from SFRA to Vercel prevents these redirects from occurring automatically. We’ve also seen instances where custom configuration can interfere with the redirect functionality. It’s therefore essential to test these implementations on the staging or development environments of your website first, to understand how your setup works and whether this is true for you or not.

URL Redirects

To implement or manage a URL redirect via this method, navigate to Merchant Tools > SEO > URL Redirects.

You can implement a redirect from a URI, or a set category/product/folder/content page from within the catalogue, to a set URL or category/product/folder/content page within the catalogue. URIs/URLs do not need to include the site name or the locale.

Selecting a non-URL format to redirect to will help to future-proof created redirects and avoid redirect chains if the URL is to change again.

You can additionally transmit parameters from the source URL to the destination.

Redirect types available are 301, 302 or 307. For permanent redirects, we recommend utilising a 301 redirect (or 308 if altered via custom development), while for temporary we recommend 307 redirects.
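To illustrate why redirect chains are worth avoiding, the sketch below scans a list of source-to-destination redirects and reports any source that takes more than one hop to resolve. It is a minimal example under stated assumptions: the redirect data is a hypothetical hand-made dictionary, not an SFCC export format, and the function name is our own.

```python
def find_redirect_chains(redirects: dict[str, str]) -> dict[str, list[str]]:
    """Return each source URL whose redirect passes through more than one hop.

    The value is the full path followed, so loops show up as a repeated URL.
    """
    chains = {}
    for source in redirects:
        path = [source]
        seen = {source}
        current = source
        while current in redirects:
            current = redirects[current]
            path.append(current)
            if current in seen:  # redirect loop, stop following it
                break
            seen.add(current)
        if len(path) > 2:  # source -> destination is fine; anything longer is a chain
            chains[source] = path
    return chains


# Hypothetical redirects: the first rule now points at a URL that has itself moved.
redirects = {
    "/c/accessories/": "/c/home-accessories/",
    "/c/home-accessories/": "/c/home/accessories/",
    "/old-sale/": "/c/sale/",
}

for source, path in find_redirect_chains(redirects).items():
    print(" -> ".join(path))
```

Any chain reported this way is a candidate for repointing the first rule directly at the final destination, or at a category/product ID rather than a fixed URL.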

Static Mappings

To implement or manage a URL redirect via this method, navigate to Merchant Tools > SEO > Static Mappings.

Rewrites implemented this way have to follow specific formats.

To rewrite to a static resource (images, txt files etc) use this syntax:

<legacy URL> [i] s,[<protocol>],[<host>],[<unit>],[<locale>],<path>

To rewrite to a dynamic pipeline (PLPs, PDPs etc) use this syntax: 

<legacy URL> [i] p,[<protocol>],[<host>],<pipeline>[,<locale>][,<parameter name>,<parameter value>]*

You must observe all parts of the syntax, even if they aren’t relevant to your rule. I.e. even if it is not applicable to include a protocol in your rule, this still needs to be denoted by a ‘comma’ in between this section and the [<host>] section of the syntax.

The [i] syntax is optional and indicates that the match does not need to be case sensitive. The ‘s’ and ‘p’ elements indicate whether the target is a static resource or a dynamic pipeline respectively, and one of them must be included.

For example, directing a legacy URL to a CGID of home-accessories on the current catalogue may look like the following using static mappings. The wildcard on the legacy  URL ensures that any parameters or filters that may have been applied to this URL are also redirected:


Important to note: Static mappings should always be implemented on the staging site and will be replicated to the development and production environments.

See the SFCC resource on static mappings for more info.

Dynamic Mappings

To implement or manage a URL redirect via this method, navigate to Merchant Tools > SEO > Dynamic Mappings.

The mappings follow the same syntax rules as static mappings. The main difference is that you can use multiple wildcards; these can exist at either the start or the end of the redirected URL, or both.

Dynamic mappings cannot be used to redirect static resources to a destination point. To do this, you must use a static mapping or a URL redirect.

See the SFCC resource on dynamic mappings for more info.

Important to note: With both static and dynamic mappings, if there are conflicting rules i.e. two rules created to redirect the same URL, the last one in the list will be the one that will be used (reverse order, last to first).

In Summary

Crawling and indexing considerations are essential for eCommerce sites, and SFCC features a number of tools to help optimise crawl budgets and reduce index bloat. Some key takeaways:

Static & Dynamic Mappings: Use to implement redirects for legacy URLs with specific syntax rules; use static mappings for static resources and dynamic mappings for more complex patterns.

Sitemaps: Auto-generated in SFCC; used to manage URL priorities, set frequency for sitemap regeneration, as well as configure settings for hreflang tags, locales, and pipelines.

Robots.txt: Controls crawler access to site URLs; a minimalistic robots file is recommended, with custom configurations for production sites.

URL Redirects: SFCC offers multiple redirect methods (URL Redirects, Static Mappings, Dynamic Mappings), each of which is tailored to different needs, with some built-in functionality for automatic redirects.

Thanks for reading!

I hope you’ve enjoyed part two of our masterclass series on Optimising Salesforce Commerce Cloud (SFCC) Sites for SEO. Read part 3 to discover how you can optimise key on-page SEO elements using SFCC tools for both manual and automated implementation, as well as a hybrid approach.

If you need help configuring your SFCC platform to meet your SEO needs, please get in touch with us today via our website.

Make sure to join our blog mailing list to get updates when this goes live in a few weeks’ time!