Process – rollback with splits

Consider this scenario: your application has a utility field that is used by every form element that takes the form of a dropdown, which amounts to about 200 consumers. This utility field is considered legacy: it does not follow the newer practices or use the frameworks that your development team has recently adopted.

You are assigned the task of migrating this utility field. Now imagine you have completed the migration, your peers have reviewed and approved the implementation, and QA has tested for regressions. None were found, so the change eventually lands in production.

Two weeks later you are informed that one of the platform’s users (a company) has raised a critical issue in production, and investigation points to your recent migration work as the culprit. It is classed as critical because it is costing the user $$$ for every minute it remains unresolved. You decide that the most pragmatic solution is to roll back now, then investigate a fix later in the comfort of the development environment. A rollback normally looks like this:

  1. push rollback commit to dev branch
  2. get an approval from peers
  3. rerun regression testing on development
  4. merge change to release branch
  5. run a hot fix deployment to production
  6. rerun regression testing on production
  7. communicate and document the rollback

This isn’t ideal. There are many steps here, and depending on how long things like deployments take, this might be far from a quick fix. It may also require effort from multiple departments, such as development, QA, DevOps and management.

Let’s consider a different approach: what if we put the migration work behind a split? A split is essentially a real-time opt-in/opt-out wrapper that can be added around a segment of code. So in this case we would create our migrated component and, instead of deleting the old version and replacing it with the new one, keep the old version and put the mechanism that switches to the new version behind a split.
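To make the idea concrete, here is a minimal sketch of what that wrapping could look like. It assumes a small in-memory store standing in for a real split service; SplitStore, renderLegacyDropdown, renderMigratedDropdown and the split name "migrated-dropdown" are all hypothetical names invented for illustration, not the API of any particular platform.

```typescript
// Hypothetical names throughout: this is an illustrative sketch, not the
// API of a real split / feature-flag platform.

type DropdownOption = { value: string; label: string };

// Minimal in-memory stand-in for a split service.
class SplitStore {
  private states: Record<string, boolean> = {};

  // Unknown splits count as off, so the legacy path is the safe default.
  isOn(name: string): boolean {
    return this.states[name] === true;
  }

  set(name: string, on: boolean): void {
    this.states[name] = on;
  }
}

function renderOptions(options: DropdownOption[]): string {
  return options
    .map(o => `<option value="${o.value}">${o.label}</option>`)
    .join("");
}

// The old implementation is kept rather than deleted.
function renderLegacyDropdown(options: DropdownOption[]): string {
  return `<select class="legacy">${renderOptions(options)}</select>`;
}

// The migrated implementation lives alongside it.
function renderMigratedDropdown(options: DropdownOption[]): string {
  return `<select class="migrated">${renderOptions(options)}</select>`;
}

// The single switch point that every consumer calls.
function renderDropdown(splits: SplitStore, options: DropdownOption[]): string {
  return splits.isOn("migrated-dropdown")
    ? renderMigratedDropdown(options)
    : renderLegacyDropdown(options);
}
```

Because all consumers go through renderDropdown, the choice between old and new is made in one place, and nothing else in the application needs to know that a migration is in progress.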

So let’s run through the same scenario and see how many steps we need using this approach.

  1. update the split in real time
  2. rerun regression testing on production
  3. communicate split change
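
As a rough sketch of steps 1 and 3, the snippet below flips a split at runtime and keeps a record of who changed it. It assumes an in-memory control object; SplitControl, dropdownImplementation and the split name "migrated-dropdown" are hypothetical and only stand in for whatever a real split platform provides.

```typescript
// Hypothetical sketch: SplitControl and "migrated-dropdown" are illustrative
// names, not the API of a real split / feature-flag platform.

type SplitChange = { split: string; on: boolean; changedBy: string; at: Date };

class SplitControl {
  private states: Record<string, boolean> = {};
  private history: SplitChange[] = [];

  // Step 1: flip the split at runtime. No commit, merge or deployment.
  update(split: string, on: boolean, changedBy: string): void {
    this.states[split] = on;
    this.history.push({ split, on, changedBy, at: new Date() });
  }

  isOn(split: string): boolean {
    return this.states[split] === true;
  }

  // Step 3: a readable change log to share when communicating the change.
  changeLog(): string[] {
    return this.history.map(
      c => `${c.at.toISOString()} ${c.split} -> ${c.on ? "on" : "off"} (by ${c.changedBy})`
    );
  }
}

// Consumers read the split on every call, so an update applies to the next call.
function dropdownImplementation(control: SplitControl): "legacy" | "migrated" {
  return control.isOn("migrated-dropdown") ? "migrated" : "legacy";
}

const control = new SplitControl();
control.update("migrated-dropdown", true, "developer"); // migration goes live
control.update("migrated-dropdown", false, "engineering-manager"); // rollback
console.log(dropdownImplementation(control)); // prints "legacy"
```

The rollback here is a single call made by someone who is not the original developer, and the old code path is used again on the very next call.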

Not only does this take fewer than half the steps, it is also probably far faster, because it skips the review, merge and deployment steps, and there is much less that can go wrong. The split change also does not require a developer: the split could easily be updated by an engineering manager, or even by a more product- or UX-focused team member.

If you like the sound of this approach, check out platforms like Unleash and Split, which can help you get started.