Building my first headless CMS: what I wish I knew at the start

The advantages I didn’t realise and the pain points I learnt the hard way

Published on
Apr 16, 2019

Introduction

I recently finished my first decoupled web app, using WordPress as the backend and React as the frontend. I’m writing the article that I wish I had at the beginning of the process.

In it, I hope to cast light on 12 questions I had when I first set out on this project. I’ll share several lessons I learned along the way, as well as several pain points to look out for and a handful of resources I found useful.

What is a headless CMS?

In short, it’s a content management system like WordPress, Drupal or Contentful that is separated from the frontend. The main advantage of this approach is that it’s technology agnostic, meaning the frontend website can be built using whatever tools or languages you like. The main disadvantage is that you need to manage two web apps instead of one.

Why use a CMS at all?

If non-developers need to manage content on a site, they need some way of doing this. By using a CMS, you’re saving yourself a lot of time versus building your own content management solution.

Because many CMS solutions are free and open source, they’re also inexpensive. Many come with useful features out of the box, which means that often you don’t need to spend much time at all configuring the back-end.

Lastly, for better or worse, people are averse to change. Often, they’d rather use a CMS they already know than spend time learning a new or custom system. That’s one reason why WordPress may continue to thrive, even as the industry moves over to “headless” or “decoupled” CMS systems.

What are the advantages of a headless or decoupled CMS?

To me, the advantages significantly outweigh the downsides. Here are some of the most useful:

  • Headless or decoupled systems promote the agile work methodology, allowing content creators and developers to work simultaneously. This can reduce time to market.
  • They provide a high level of flexibility, with content creators able to use their favourite CMS and developers able to use their preferred front-end technologies. By contrast, all-in-one solutions force everyone to use the same system, and they can often feel cluttered because they attempt to offer everything out-of-the-box.
  • Data can easily be repurposed for any number of channels (mobile apps, smart watches, voice apps, and so on), thus future-proofing content: developers only need to build a new frontend for each new channel. For the same reason, multiple websites can be managed using a single CMS.
  • CMS systems are typically cloud-based, making them highly scalable, while content is usually provided through a high-performance CDN (Content Delivery Network), which lessens the risk of DDoS attacks.

What are the disadvantages?

As far as I’m concerned there are only two main disadvantages:

  • First, there can be a higher barrier to entry in terms of knowledge: developers need an understanding of both the CMS (and the language that underlies it) and their chosen front-end technologies.
  • Second, you end up managing two websites rather than one, so your operational workload goes up — at least initially.

How many options are there?

A lot! At the time of writing, headlesscms.org has 30 open-source and 30 closed-source CMS options listed on its homepage. This list isn’t exhaustive: traditional CMS systems like WordPress, Drupal and Magento now have decoupled options available, but they aren’t included.

What’s the difference between a “headless” and a “decoupled” CMS?

Technically speaking, there’s a slight difference between a “headless” CMS and a “decoupled” CMS:

  • A decoupled system prepares content for presentation and then pushes it into the delivery environment: it is “active”. The data from the backend is published somewhere, regardless of whether calls are being made to the API or not. This is how it works if you use WordPress or Drupal’s REST APIs.
  • A headless system sits idly until a request is sent for content — it is “reactive”. The data from the backend is only accessible via calls made to the API, as it doesn’t come with a frontend or presentation layer built-in. This is how a more modern CMS like Contentful works.

The most important feature of both systems is that they make content available via API endpoints, and so many articles on the web use the terms interchangeably.

By contrast, a traditional monolithic CMS combines the frontend and backend in a single website. This is how CMS providers like WordPress have traditionally operated, and that approach requires the frontend and backend to be written in the same language (in most cases, PHP).
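
Whichever label applies, the frontend ends up requesting content over HTTP and shaping the JSON it gets back. Here is a minimal sketch of that step against the WordPress REST API's posts endpoint (/wp-json/wp/v2/posts). The domain admin.mysite.com and the helper names buildPostsUrl and getPosts are assumptions made for illustration, not code from my project.

```javascript
// Hypothetical backend address; replace with your own CMS domain.
const CMS_BASE_URL = "https://admin.mysite.com";

// Build the URL for the WordPress REST API posts endpoint.
// `per_page` and `_fields` are standard WordPress REST API query parameters.
function buildPostsUrl(baseUrl, perPage = 5) {
  const url = new URL("/wp-json/wp/v2/posts", baseUrl);
  url.searchParams.set("per_page", String(perPage));
  url.searchParams.set("_fields", "id,slug,title,content");
  return url.toString();
}

// Fetch posts and keep only the fields the frontend needs.
// `fetchImpl` is injectable so the function can be exercised without a network.
async function getPosts(baseUrl = CMS_BASE_URL, fetchImpl = fetch) {
  const response = await fetchImpl(buildPostsUrl(baseUrl));
  if (!response.ok) {
    throw new Error(`CMS request failed with status ${response.status}`);
  }
  const posts = await response.json();
  return posts.map((post) => ({
    id: post.id,
    slug: post.slug,
    title: post.title.rendered,
    html: post.content.rendered, // raw HTML from the CMS: sanitise before rendering
  }));
}
```

Because the base URL is a parameter, the same function could point at a different backend later, which is part of the flexibility a decoupled setup is meant to give you.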

Do I need two domains?

That’s the simplest and most flexible solution. If you’re using PHP for the back-end and a JavaScript framework like React for the front-end, you’ll likely need completely separate hosting services.

But even if not, the steps which might allow you to combine the frontend and backend under a single domain (such as using an “alias” on your web server) may also reduce some of the flexibility that made you choose a decoupled architecture in the first place, preventing you from easily switching out either the frontend or backend at a later date.

Subdomains should work fine, so you could have admin.mysite.com to host your CMS alongside mysite.com to host your frontend without too much trouble. But if you’re open to the idea of one backend controlling multiple different frontends, a completely separate domain may be the best bet.

Do both sites need an SSL certificate?

In practice, yes. A site served over HTTPS cannot fetch data from a site without an SSL certificate unless each individual user grants special permission, because browsers block this kind of “mixed content”. When this happens, Google Chrome will block the fetch requests on the page, showing only the not very helpful “an unexpected error occurred”. This caused me a headache in my own project, so I hope you can avoid the same error!
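
One way to catch this early is to check your configured API URL before deploying. The sketch below is only an illustration: isMixedContent is a hypothetical helper name, the URLs are placeholders, and it ignores special cases such as local development on localhost.

```javascript
// Returns true when a page served over HTTPS would request an insecure
// (plain HTTP) API URL, which browsers treat as blocked mixed content.
function isMixedContent(pageUrl, apiUrl) {
  const page = new URL(pageUrl);
  const api = new URL(apiUrl, page); // also resolves relative API paths
  return page.protocol === "https:" && api.protocol === "http:";
}

// Example: a secure frontend calling an insecure backend.
const blocked = isMixedContent("https://mysite.com", "http://admin.mysite.com/wp-json");
console.log(blocked ? "Backend needs an SSL certificate" : "Protocols are compatible");
```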

Should I sanitise my fetch data?

It’s likely that you’ll end up using dangerouslySetInnerHTML to render incoming HTML markup, and so you should make sure to sanitise your incoming fetch data. Using this property without putting sufficient checks in place can expose your site to cross-site scripting (XSS) attacks: that’s why it’s dangerous!

The risk of someone tampering with the input data on your CMS will vary depending on what system you’re using, but for any professional project, data sanitisation is a step worth taking seriously.

How can I sanitise my fetch data?

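If you’re building a JavaScript app, one simple and effective solution is the XSS node module. You can install it by typing npm i xss into the terminal (recent versions of npm save the dependency to package.json by default, and the lowercase -s flag only silences npm’s output), then simply import it into any JavaScript file where you want to sanitise HTML data: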

import xss from "xss";

Let’s say we wanted to sanitise HTML data in the variable content. We can simply use xss itself as a function, and pass in the relevant variable like so:

<div dangerouslySetInnerHTML={{ __html: xss(content) }} />

I like this node module for its simplicity — it’s just one line of code and a three-character method — but there are other options: two popular node modules for data sanitisation are DOMPurify and XSS-Filters.
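
If you render CMS markup in several components, it may be worth wrapping the sanitiser once so that raw HTML never reaches dangerouslySetInnerHTML by accident. The following is a sketch under that assumption: createSafeMarkup and toMarkup are hypothetical names, and the sanitiser is passed in so that xss, DOMPurify or another library could be swapped in.

```javascript
// Wrap a sanitiser function so every piece of CMS HTML passes through it
// before being handed to React's dangerouslySetInnerHTML.
function createSafeMarkup(sanitise) {
  return function toMarkup(html) {
    // Treat missing content as an empty string rather than rendering "undefined".
    const clean = sanitise(typeof html === "string" ? html : "");
    return { __html: clean };
  };
}

// In the app this might be wired up as:
//   const toMarkup = createSafeMarkup(xss);
// and used in JSX as:
//   <div dangerouslySetInnerHTML={toMarkup(content)} />
```

The design choice here is simply to keep the sanitising step in one place, so switching libraries later means changing a single line.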

How should I protect secret data stored in the frontend code?

This problem isn’t specific to headless CMS systems, but rather to any frontend-only website. It arises when sensitive information such as API keys or login details has to be stored on your front-end rather than on some kind of server; that’s because, by default, front-end code is publicly available.

The best solution varies depending on what technologies you are using to deploy your serverless app. For example, Netlify lets you route API calls through its Lambda-based serverless functions, so the secret stays on the server rather than in the browser. If your host doesn’t offer something similar, you may be forced to add server-side code to your app.
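
As a rough sketch of the serverless approach, a function in the AWS Lambda style (which Netlify Functions follow) can attach the secret on the server and forward the request. Everything specific here is an assumption for illustration: the upstream URL, the SECRET_API_KEY environment variable, the Bearer header scheme and the createHandler name.

```javascript
// Hypothetical third-party endpoint that requires a secret key.
const UPSTREAM_URL = "https://api.example.com/v1/data";

// Factory so the handler can be tested with a stand-in fetch and environment.
function createHandler(fetchImpl = fetch, env = process.env) {
  // Lambda-style handler: receives the request event and
  // returns an object with a statusCode and a body.
  return async function handler(event) {
    if (event.httpMethod !== "GET") {
      return { statusCode: 405, body: "Method Not Allowed" };
    }
    const apiKey = env.SECRET_API_KEY; // hypothetical variable name
    if (!apiKey) {
      return { statusCode: 500, body: "Server is missing its API key" };
    }
    // The key is attached on the server, so it never ships in the browser bundle.
    const upstream = await fetchImpl(UPSTREAM_URL, {
      headers: { Authorization: `Bearer ${apiKey}` },
    });
    return {
      statusCode: upstream.status,
      headers: { "Content-Type": "application/json" },
      body: await upstream.text(),
    };
  };
}

// In a function file this would be exposed as:
//   exports.handler = createHandler();
```

The frontend then calls the function's URL instead of the third-party API, and the key lives only in the host's environment settings.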

If I’m using a decoupled CMS, what should I do with the frontend presentation layer?

This is a question I’m still pondering for my WordPress-React project. I’m not relying on the in-built WordPress front-end for anything other than fetching JSON data and accessing the REST API, so I’m considering requiring a login to access all the data on the WordPress front-end.

In my case, there’s no sensitive data I’m concerned about on the WordPress frontend, but restricting access to anything that is there will add another layer of defence, and it will also prevent search engine crawlers accidentally picking up duplicate content.

Conclusion

I hope this article has been helpful for people who are considering using a headless or decoupled CMS, as well as those who have already taken the leap!

If you’re interested in learning more about my specific combination (WordPress and React), then check out my guide on How to create a modern web app using WordPress and React. It’s got a load of PHP and React snippets which should be useful for anyone approaching that combination of technologies for the first time.

© 2024 Bret Cameron