Sitemaps with Sitecore Headless SXA


There is built-in functionality in Sitecore SXA to create sitemaps. It actually works quite well, even though I didn’t find it very clear initially.

Configuration

There are a few places where the sitemap is configured.

Within the Settings section of the SXA site (when the Site Metadata SXA module is enabled) there is a Sitemap settings item, so this one is easy to find.

Here you can configure how language versions should be handled (very important for the discoverability of the site).

This is also the place where you configure how many pages should be within a single sitemap when using a sitemap index. The recommendation from Google is that a single sitemap should be no more than 50MB (uncompressed) or 50,000 URLs [1]. That leaves around 1KB per URL, so I think that should be OK. I prefer a smaller number, as things just seem more reliable that way, also on our own side of the site.
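The arithmetic behind that estimate can be sketched as follows. This is only a small illustration: the function names and the page counts in the usage lines are my own assumptions and have nothing to do with how SXA calculates this internally.

```python
import math

# Limits for a single sitemap file (50MB is counted as 52,428,800 bytes).
MAX_BYTES = 50 * 1024 * 1024
MAX_URLS = 50_000


def bytes_per_url_budget(max_bytes: int = MAX_BYTES, max_urls: int = MAX_URLS) -> float:
    """Average number of bytes available per URL entry when a sitemap is completely full."""
    return max_bytes / max_urls


def sitemap_files_needed(total_pages: int, pages_per_sitemap: int) -> int:
    """Number of sitemap files a sitemap index has to reference."""
    if pages_per_sitemap <= 0:
        raise ValueError("pages_per_sitemap must be positive")
    return math.ceil(total_pages / pages_per_sitemap)


if __name__ == "__main__":
    print(f"{bytes_per_url_budget():.0f} bytes per URL")
    # Hypothetical site with 12,000 pages and 5,000 pages per sitemap.
    print(sitemap_files_needed(12_000, 5_000), "sitemap files")
```

With a lower pages-per-sitemap setting you simply get more, smaller files, which stay far below both limits.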

Screendump of Sitemap settings in Sitecore

Here we can also configure how the sitemap file is stored. When using SXA in an MVC site the sitemap can be stored in memory. For headless sites that is not the case, however. According to the Sitecore documentation for XM Cloud regarding sitemaps:

Storing sitemaps in media items is required to deliver the sitemap fields to the rendering host through the Experience Edge service [2]

There is also similar configuration on the Settings node itself of the SXA site:

Sitemap related settings on Settings node

This is the place where you can activate the sitemap index, which is required if you need additional external sitemaps or if you need to split your content into multiple sitemap files to stay within the limits above.
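For readers who have not seen one before, a sitemap index is just a small XML file that points to the individual sitemap files. The sketch below builds one by hand following the sitemaps.org protocol, only to show the shape of the file. It is not how SXA generates it, and the file names and the example.com host are made up.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(sitemap_urls: list[str]) -> str:
    """Return a sitemap index document that references the given sitemap files."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for url in sitemap_urls:
        entry = ET.SubElement(root, "sitemap")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical file names; the names SXA generates may differ.
    print(build_sitemap_index([
        "https://www.example.com/sitemap-1.xml",
        "https://www.example.com/sitemap-2.xml",
    ]))
```

Each `<loc>` in the index is then fetched by the crawler as a normal sitemap, which is why external sitemaps can be added to the same index.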

Host names

I would also like to point to the configuration of host names in the “Site Grouping” under Settings, where you can configure the Target hostname. To get the correct hostname in the generated sitemap files we needed to have this configured properly. Site Grouping allows multiple items, so you can create separate items for each environment.
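A quick way to verify that the Target hostname ended up in the generated file is to scan the `<loc>` entries of a sitemap. The following is a minimal sketch of such a check, assuming you have already downloaded the sitemap XML as a string. The function name, the sample XML and the host names are purely illustrative.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

# Namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def locs_with_wrong_host(sitemap_xml: str, expected_host: str) -> list[str]:
    """Return every <loc> URL whose hostname differs from expected_host."""
    root = ET.fromstring(sitemap_xml)
    wrong = []
    for loc in root.iterfind(".//sm:loc", NS):
        url = (loc.text or "").strip()
        # urlsplit().hostname is already lower-cased, so lower-case the expectation too.
        if urlsplit(url).hostname != expected_host.lower():
            wrong.append(url)
    return wrong


# Hypothetical sitemap in which one entry still carries an internal hostname.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/</loc></url>
  <url><loc>https://cm.internal.example/about</loc></url>
</urlset>"""

if __name__ == "__main__":
    print(locs_with_wrong_host(SAMPLE, "www.example.com"))
```

Because it looks at every `<loc>` element, the same check works on a sitemap index as well as on a regular sitemap.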

Site grouping configuration

Publish!

It is important to publish the site.

The sitemap items are generated when the site is published, so if you are wondering why you don't get an actual sitemap, a missing publish might be the reason.

Generated sitemap index on a site

With this in place you should be able to get a sitemap according to the configuration. The example here is a headless JSS site hosted in our own Kubernetes cluster. When using XM Cloud, the URLs will point to Experience Edge.