Switching to a New Data Structure with Zero Downtime
A Practical Guide to Schema Migrations using TypeScript and MongoDB
Published on Feb 2, 2025
Read time: 7 min
Introduction
If you’re lucky enough to be working on a project with a large and active user base, one of the biggest mistakes you can make is using the wrong data structure. A database is, of course, one of the fundamental building blocks of most applications, and if the data is stored in a format that makes it difficult or slow to work with, you can easily end up adding complexity to your code and making your app slower.
But, for a live app with real users, the process of switching from one data structure to another can be complex. If you take shortcuts, you risk downtime and unexpected errors. Recently, my team and I have been working on optimising our database structures—moving data between tables, restructuring schemas, and even migrating data across databases. In this article, I’d like to share our approach to handling data structure changes safely and efficiently.
Our typical strategy follows these steps:
- Create the new data structure alongside the old one.
- Write to both data structures simultaneously.
- Migrate historical data from the old structure to the new structure.
- Switch read operations from the old to the new structure (potentially behind a feature toggle).
- Deprecate and remove the old structure once everything is stable.
This method is more time-consuming and complex than would be required if we were working on a new app with zero users or if our changes were only going to affect a particular feature with low usage. If your application can tolerate temporary downtime, it may be possible to make all the changes in one go.
But for an app with a large, active user base, following a process similar to the one above can help ensure zero downtime and a smooth transition. If uptime is critical, this process is a safe and effective choice.
Advantages of a Gradual Approach
A gradual approach to data migrations offers significant advantages, particularly in terms of safety and risk mitigation. By implementing changes incrementally, we gain the ability to easily switch back to a working version if something goes wrong. This is crucial when dealing with large production systems, as unexpected issues can arise despite thorough testing.
But what’s the problem with downtime? We’ve probably all used web software that is unavailable due to some kind of update or maintenance work. (In my experience, the web apps of old, established banks seem particularly likely to rely on scheduled downtime for large updates!)
The problem is, downtime can have a tangible impact on a company’s ability to generate revenue and serve customers. If your users expect your application to be available at all times, even short periods of downtime can lead to frustrated customers, lost transactions, and damage to your reputation. And using the right approach means downtime is often unnecessary.
Instead, a carefully managed schema migration ensures that:
- Customers experience no disruption while the changes are rolled out.
- Errors can be caught and addressed early, without affecting the entire system.
- Rollbacks are simpler, as the old system remains in place until the migration is verified.
- Teams can monitor and validate changes in production before fully committing to the new structure.
Now, let’s consider a real-world scenario inspired by a recent change we made at work.
Example: Merging Related MongoDB Collections
Let's say we have a form builder app using MongoDB with two collections:
- FormUi — contains the UI structure of a given form.
- Form — stores validation rules, metadata, and a reference to the FormUi via the ui field.
These collections have a 1:1 relationship. We originally separated them to avoid unnecessary data loading, assuming we would often need validation data without UI data. We were concerned because, even using $project, MongoDB will load an entire document into memory.
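For example (a minimal sketch using the FormModel defined later in this article, with formId standing in for a real ObjectId), a projection limits what the query returns, not what the server reads:
// Projection limits what is returned, not what is loaded: the server
// still reads the whole Form document into memory before applying it.
const validationOnly = await FormModel.findOne(
  { _id: formId }, // formId is a hypothetical ObjectId
  { validation: 1 }
);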
However, in practice:
- We almost always needed both.
- The UI data wasn’t large enough to justify separate storage.
- MongoDB isn't optimised for $lookup queries, but it performs well with nested objects.
- Debugging became harder, as engineers had to cross-reference two collections.
By analysing our actual usage patterns, we realised we could improve performance and developer experience by embedding FormUi data directly into the Form document.
Current Schema
Here's what we started with, represented via Typegoose classes. To keep things simple, I'll use unknown instead of providing the full types for the validation and metadata objects.
import { prop, getModelForClass, Ref } from "@typegoose/typegoose";
import mongoose from "mongoose";
class FormUi {
@prop({ required: true })
template!: string;
}
class Form {
@prop({ required: true })
validation!: Record<string, unknown>;
@prop({ required: true })
metadata!: Record<string, unknown>;
@prop({ ref: () => FormUi })
ui!: Ref<FormUi>;
}
const FormUiModel = getModelForClass(FormUi);
const FormModel = getModelForClass(Form);
Target Schema
This is what we want to end up with. Even on the schema level, it’s already simpler. But this new schema will eventually give us an app that’s both more performant and easier to maintain.
class Form {
@prop({ required: true })
validation!: Record<string, unknown>;
@prop({ required: true })
metadata!: Record<string, unknown>;
@prop({ required: true })
template!: string;
}
const FormModel = getModelForClass(Form);
Step-by-Step Migration Process
1. Create the New Data Fields
We'll start by modifying the Form schema to include both the old reference and the new template field, ensuring backward compatibility.
class Form {
@prop({ required: true })
validation!: Record<string, unknown>;
@prop({ required: true })
metadata!: Record<string, unknown>;
@prop({ ref: () => FormUi })
ui!: Ref<FormUi>;
@prop()
template?: string;
}
We should also confirm that our indexes will continue to support queries efficiently. In this case, nothing needs to change!
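If you want to double-check, you can list the collection's current indexes via the underlying driver (a quick sketch):
// List the indexes currently defined on the Form collection
const indexes = await FormModel.collection.indexes();
console.log(indexes);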
2. Start Writing to Both Structures
Next, we'll change all write operations to update both FormUi and Form simultaneously.
async function updateFormUi(_id: mongoose.Types.ObjectId, template: string) {
  // Find the Form that references this FormUi document
  // (if none exists, the second update below simply matches nothing)
  const baseForm = await FormModel.findOne({ ui: _id });

  // Write the template to both the old and the new location
  await Promise.all([
    FormUiModel.updateOne({ _id }, { $set: { template } }),
    FormModel.updateOne({ _id: baseForm?._id }, { $set: { template } }),
  ]);
}
Depending on your setup, it may be easier to override base library methods, such as changing FormModel.updateOne and FormUiModel.updateOne so that each (temporarily) writes to both collections. It's more complex, but may save time versus changing every place updateOne is called.
import { FormModel, FormUiModel } from "./models";

// Store the original updateOne method
const originalUpdateOne = FormModel.updateOne.bind(FormModel);

// Note: in strict TypeScript this assignment may need a cast, since the
// built-in method returns a chainable Query rather than a Promise.
FormModel.updateOne = async function (filter, update, options) {
  // Call the original updateOne method
  const result = await originalUpdateOne(filter, update, options);

  // Check if we're updating the template field
  if (update?.$set?.template) {
    // Find the related Form document so we can follow its ui reference
    const form = await FormModel.findOne(filter);
    if (form?.ui) {
      // Perform the dual write to FormUiModel
      await FormUiModel.updateOne(
        { _id: form.ui },
        { $set: { template: update.$set.template } }
      );
    }
  }

  return result;
};
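With this override in place, an existing call such as FormModel.updateOne({ _id }, { $set: { template } }) transparently keeps the legacy FormUi document in sync, without touching any call sites.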
3. Backfill Historical Data
After that, we'll use a migration script to iterate over existing FormUi documents, embedding them into Form. Here's an example which processes 1,000 documents at a time, filling in the new template field on our FormModel:
import { isDocument } from "@typegoose/typegoose";

async function backfillForms(batchSize = 1000) {
  // Paginate by _id so the loop always makes progress, even if a document
  // can't be backfilled (e.g. a missing or dangling ui reference)
  let lastId: mongoose.Types.ObjectId | undefined;

  while (true) {
    const forms = await FormModel.find({
      template: { $exists: false },
      ...(lastId ? { _id: { $gt: lastId } } : {}),
    })
      .sort({ _id: 1 })
      .limit(batchSize)
      .populate("ui");

    if (forms.length === 0) break;

    await Promise.all(
      forms.map((form) =>
        // isDocument narrows the populated ui reference for TypeScript
        isDocument(form.ui)
          ? FormModel.updateOne(
              { _id: form._id },
              { $set: { template: form.ui.template } }
            )
          : Promise.resolve()
      )
    );

    lastId = forms[forms.length - 1]._id;
  }
}
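Once the script has finished, it's worth sanity-checking that no documents were missed:
// Any documents still missing template were not backfilled
const remaining = await FormModel.countDocuments({
  template: { $exists: false },
});
console.log(`${remaining} forms still need backfilling`);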
4. Switch Read Operations to the New Structure
When that’s done, we’ll want to modify read operations to use the new structure. To allow easy rollback, I recommend using a feature toggle. This can add a fair bit of complexity to the code, but it’s only temporary, and it gives us the reassurance that we can easily switch back to the old system if the new one has issues—giving us time to resolve any problems properly in our own time, without the panic of a live production bug!
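The read examples below assume a simple toggles module along these lines (the shape here is my own assumption; in a real app it would usually be backed by a feature-flag service):
// toggles.ts: a minimal, hypothetical feature-toggle module.
// In production, this would typically read from a feature-flag service.
export const toggles = {
  useLegacyForm: process.env.USE_LEGACY_FORM === "true",
};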
Here's an example where we ensure template is always populated:
async function getFormWithUi(_id: mongoose.Types.ObjectId) {
  if (toggles.useLegacyForm) {
    // Legacy path: read the template from the referenced FormUi document
    const form = await FormModel.findOne({ _id }).populate("ui");
    if (!form) {
      return null; // Handle case where no form is found
    }
    // Ensure template exists, falling back to the legacy copy
    if (isDocument(form.ui) && !form.template) {
      form.template = form.ui.template;
    }
    return form;
  }

  // New path: the template is embedded directly on the Form document
  return FormModel.findOne({ _id });
}
Note that Mongoose's populate method does not use MongoDB's $lookup behind the scenes; instead, it makes one or more additional queries to the database. Either way, moving to the new embedded structure removes that extra work.
Similar to the override of FormModel.updateOne above, we could also override FormModel.findOne, using an approach like this:
import { isDocument } from "@typegoose/typegoose";
import { FormModel } from "./models";

// Store the original findOne method
const originalFindOne = FormModel.findOne.bind(FormModel);

// Note: unlike the built-in findOne, this override resolves to a document
// rather than returning a chainable Query, so callers that chain .sort(),
// .lean(), etc. would need updating. In strict TypeScript, the assignment
// may also need a cast.
FormModel.findOne = async function (filter, projection = {}, options) {
  // Call the original findOne method
  let query = originalFindOne(filter, projection, options);

  if (toggles.useLegacyForm) {
    query = query.populate("ui");
  }

  // Ensure template is always included in the result
  const form = await query.exec();
  if (form && isDocument(form.ui) && !form.template) {
    form.template = form.ui.template;
  }

  return form;
};
As with the updateOne override, this approach adds complexity, but it may be preferable to updating every instance of FormModel.findOne.
5. Remove Old Data and Code
Finally, once everything is stable, we can remove the legacy code and collection, ending up with a simpler, cleaner code base!
We should:
- Delete outdated methods (getFormWithUi).
- Remove support for reading or writing to the old collection, including any code that runs when toggles.useLegacyForm is true.
- Remove ui references from Form.
- Finally, drop the FormUi collection.
// Remove the now-redundant ui references from every Form document
await FormModel.updateMany({}, { $unset: { ui: 1 } });

// Drop the legacy collection
await FormUiModel.collection.drop();
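As with the earlier steps, it's safest to run these commands only once the useLegacyForm toggle has been off in production long enough to be confident that nothing still reads from the old collection.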
Conclusion
And that’s a wrap!
Together, we’ve learned a structured approach to modifying data structures without downtime. While this example was relatively simple and abstract, the same core principles also apply to more complex cases (and to other tech stacks).
In summary, here are my key takeaways:
- Analyse actual usage patterns to refine your schema.
- Ensure backwards compatibility before switching over.
- Use feature toggles to mitigate risk.
- Perform staged migrations with minimal impact.
- Finally, clean up legacy structures once stability is confirmed.
I hope you found this helpful — thanks for reading!