Introduction

Did you know that our very first mobile app version from 5 years ago still works seamlessly today?
Maintaining backward compatibility in backend systems is crucial for user retention and system longevity.
This article will explore our strategies, challenges, and solutions for achieving this.

Background

History

We launched version 1 of our app in 2019. It was an incredible feat for the team, which at the time, was very lean.
We were just three engineers then: Bosun (co-founder and our then CTO, who worked primarily on the backend but also across web and mobile), Ridwan (that’s me; I also worked primarily on the backend while moonlighting as a web/mobile engineer) and Opeyemi, who worked primarily on web and mobile. Since then, many great individuals have joined and left, and the team has grown surprisingly large, but the work still feels the same (I still sit at my computer every day hacking away with as much passion and excitement, if not more).

Initial Architecture

Our backend was a monolith. It was a rebrand of the Cashestate backend; Cashestate was the precursor to Rise. I joined the Cashestate team to maintain and build out the new Rise backend alongside Bosun.

It was written in TypeScript (Node.js, Express) and followed an MVC architecture. We churned out solutions (and problems) at incredible speed, with several changes being introduced at short intervals. We all know how chaotic things can be at that stage. Before long, we realized how important it was for us to standardize our APIs. Yes, we were moving at great speed, but we were also breaking things. New releases were breaking old ones, or the new ones themselves were not working.

Strategies for Backward Compatibility

Testing

A good friend to any budding startup is speed, and we nailed it. It was a pretty small team, but everyone understood the importance of the value we were providing to our users.

However, as I mentioned earlier about our initial architecture, the system became brittle. This called for a regroup, and the first step was to improve our software testing. Yes, we tested everything we released, but testing was always an afterthought, even though we knew its importance. As long as a change satisfied the happy path, we were good. There was no time to waste.

We became more intentional about this and laid down rules to ensure that every single thing built into our backend system was thoroughly tested. To date, we have maintained that culture. We do a combination of unit and integration testing while prioritizing the latter for obvious reasons: systems have components, and while it’s important to test that each component is solid, it’s more important to test how those components work together, since they won’t run in isolation in a live environment.

Mocha was our testing framework of choice, coupled with chai as the assertion library. We also invested a lot more in CI tooling, moving from Jenkins to TeamCity to GitHub Actions. Every change pushed to our remote version control triggers the test suite, so we can see whether that change breaks anything and arrest it before it makes it out there. We also leveraged code coverage tools, using Codecov to see how much of the codebase our tests cover.
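
To make that concrete, here is a minimal sketch of the kind of integration test this describes, in the Mocha/chai style, driving an Express app with supertest (the app import, the route and the fields are hypothetical examples, not our actual endpoints):

```typescript
import { expect } from 'chai';
import request from 'supertest';
import { app } from '../src/app'; // hypothetical: however the Express app is exported

describe('GET /v1/wallet', () => {
  it('returns the wallet summary for an authenticated user', async () => {
    const res = await request(app)
      .get('/v1/wallet')
      .set('Authorization', 'Bearer test-token'); // hypothetical test credential

    expect(res.status).to.equal(200);
    // Assert on the contract (field presence and types), not exact values,
    // so purely additive changes to the response don't break the test.
    expect(res.body).to.have.property('balance').that.is.a('number');
    expect(res.body).to.have.property('currency').that.is.a('string');
  });
});
```

The useful habit is in the assertions: testing the shape of the contract rather than exact values means older tests keep passing when a response only grows.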

API Contract Stability

While testing did improve the quality of the features we rolled out, it didn’t solve the problem of new features breaking existing flows for users on older versions of our app. This is another thing we knew was important, but given the speed at which we were moving, we didn’t think much about the repercussions. Telling our users to simply update their app is not a solution, because for many reasons some users just can’t: some have space or memory constraints on their devices, some have personal app store issues, and some simply won’t update.

We again regrouped and started to look at our APIs as contracts between the backend systems and the frontend systems. Anything we roll out that breaches this contract is to be rolled back immediately. This led us to building APIs that, in OOD (object-oriented design) terms, are open for extension but closed for modification (read about the Open/Closed Principle among the object-oriented design principles popularized by Robert C. Martin, aka Uncle Bob). This means the APIs are extendable, but any thoughtless modification to the public contract is prohibited. The result is stability, both backwards and forwards.
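
As a small illustration of what “open for extension, closed for modification” looks like at the payload level (the PlanResponse shape here is hypothetical, not our real contract): new capabilities arrive as optional, additive fields, while existing fields keep their names, types and meaning.

```typescript
// Existing, published contract: older app versions depend on these exact fields.
interface PlanResponseV1 {
  id: string;
  name: string;
  balance: number;
  currency: string;
}

// Extension: a newer feature needs maturity information, so the contract grows
// with optional fields instead of renaming or repurposing existing ones.
interface PlanResponse extends PlanResponseV1 {
  maturityDate?: string; // ISO date string; old clients simply ignore it
  autoRenew?: boolean;
}

function toPlanResponse(plan: {
  id: string;
  name: string;
  balance: number;
  currency: string;
  maturityDate?: Date;
}): PlanResponse {
  return {
    id: plan.id,
    name: plan.name,
    balance: plan.balance,
    currency: plan.currency,
    // Only present when the data exists; absence is what old clients already expect.
    ...(plan.maturityDate ? { maturityDate: plan.maturityDate.toISOString() } : {}),
  };
}
```

Old app versions simply ignore the fields they don’t know about, while newer versions can opt in to the extra data.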

API contract stability, for me, just can’t be stressed enough. This is something we wish everyone in our space paid more attention to. We do an insane amount of work to ensure we continue to delight our users and partners, but unfortunately we don’t always get the same level of intentionality from our partners when we are on the consuming end of other people’s APIs. One recent example that comes to mind was in late 2023/24, when a provider we leverage to serve our users changed a synchronous API to an asynchronous one without prior communication. We had built our solution around that instantaneous API, providing value to our users based on a success response from the provider. Unknown to us, they changed the API to be async, which meant that even though we got a success response from them, the transaction could still fail on their end. Because of this, some users with bad intent took advantage of the situation, got value where they shouldn’t have, and started milking money out of thin air. An API is a contract and should be treated as such. If a modification to that contract is truly unavoidable, proper communication has to happen and a new contract (API) agreed on.

Deprecation Policies

As an extension of API contract stability, we also ensured that we deprecated APIs thoughtfully. Realistically, building systems that are open for extension but closed to modification is only feasible if the system was designed with this mindset from the outset. There will come a time when an API simply has to change, but instead of modifying it, we mark the previous one as deprecated (while it continues to work) and create an entirely new API to support the new use case. Doing this allowed us to keep delivering happiness to our users regardless of the version of the app they are running.
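
A minimal sketch of what that can look like in an Express codebase (routes, payloads and dates are hypothetical; the Deprecation and Sunset response headers are one common way to advertise a retirement plan while the old contract keeps working):

```typescript
import express from 'express';

const app = express();

// Old contract: stays live and unchanged, but advertises that it is going away.
app.get('/v1/plans/:id/interest', (req, res) => {
  res.set('Deprecation', 'true');
  res.set('Sunset', 'Wed, 31 Dec 2025 23:59:59 GMT'); // hypothetical retirement date
  res.json({ interest: 0.1 }); // same shape old clients have always received
});

// New contract: the changed behaviour ships as a new API instead of mutating the old one.
app.get('/v2/plans/:id/returns', (req, res) => {
  res.json({ rate: 0.1, compounding: 'daily' });
});

app.listen(3000);
```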

Versioning

In late 2021/22, we arrived at a point where we needed a complete overhaul of the entire Rise product. We needed to build a v2 of our backend system. Several questions came to mind. Do we deprecate the old APIs, build new ones and force everyone to move to the v2 app, which is an entirely different, totally rebranded app? Do we leave the current APIs intact and run them in parallel with the v2 APIs? How do we keep the two in sync?
In the end, we prioritized our users and decided to do the work. We left the v1 APIs intact, rolled out the new v2 APIs and built an API adapter that can safely interpret the old APIs’ inputs, forward them to the new system engine, collect the output and transform it into what the old APIs expect as a response. This is an oversimplification of the work we did, of course. It was a lot of work, but it was rewarding work. Countless sleepless nights, but when we rolled it out safely, we couldn’t have been more proud. I personally wasn’t the same engineer (I had achieved a demigod status 😁).
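
To give a flavour of the adapter idea (heavily simplified, with hypothetical names, routes and fields rather than our actual v1/v2 contracts):

```typescript
import express from 'express';

const app = express();
app.use(express.json());

// The new v2 engine (normally a separate module or service).
async function createPlanV2(input: { title: string; amountKobo: number }) {
  return { planId: 'pln_123', title: input.title, amountKobo: input.amountKobo, status: 'active' };
}

// v1 adapter route: old clients keep calling the old contract, unchanged.
app.post('/v1/plans', async (req, res) => {
  // 1. Interpret the old API's input.
  const v2Input = {
    title: req.body.name,                          // v1 called this field "name"
    amountKobo: Math.round(req.body.amount * 100), // v1 sent major units, v2 wants minor units
  };

  // 2. Forward to the new engine.
  const v2Result = await createPlanV2(v2Input);

  // 3. Transform the output back into the shape v1 clients expect.
  res.json({
    id: v2Result.planId,
    name: v2Result.title,
    amount: v2Result.amountKobo / 100,
    state: v2Result.status === 'active' ? 'ACTIVE' : 'INACTIVE',
  });
});

app.listen(3000);
```

Old clients never learn that a new engine exists; the adapter owns the translation in both directions.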

Challenges

Ensuring backwards compatibility is a lot of work, and if it wasn’t baked into the system from the ground up (as in our case), things can get ugly very fast. You end up with a codebase where different components do similar things. You are responsible for keeping your different systems, or the different components of the same system, in sync. You pile up more and more technical debt, and before you know it, you find yourself dealing with a hot mess.

To address these challenges, we of course did more work, but now with the intent to build a maintainable and scalable system. Below are a few of the things we did:

  1. Refactoring: we regularly refactor the codebase to improve its structure and reduce complexity. We break down the system into smaller modular components that are easier to understand and maintain.
  2. Strangler pattern: the strangler pattern replaces components of a system bit by bit with new systems or services until the whole system has been migrated to a scalable and maintainable architecture (see the sketch after this list). We adopted this, and not only have we gained more confidence in our system, it has also paved the way for the business expansion we needed across the globe, i.e. scale. This pattern has its pros and cons and requires some expertise to get right, but if done right, the benefits far outweigh the costs. When rebuilding or migrating systems, taking smaller incremental risks is always better than taking one giant risk. Read the Red Hat article on the strangler pattern to learn more about its pros and cons.
  3. Code deduplication: we created an internal shared library that houses common functionality across components and systems.
  4. Monitoring and Logging: things will go wrong. You however want to make sure you stay on top of things and to do this, you need visibility into your systems to be able to diagnose and fix issues quickly. We make use of tools like Prometheus and Grafana for this.
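
Here is a minimal sketch of strangler-style routing at the edge, assuming an Express front door and the http-proxy-middleware package (the paths and internal hostnames are hypothetical):

```typescript
import express from 'express';
import { createProxyMiddleware } from 'http-proxy-middleware';

const app = express();

// Slices that have already been carved out of the monolith into the new service.
const toNewService = createProxyMiddleware({
  target: 'http://new-service.internal', // hypothetical internal hostname
  changeOrigin: true,
});
const toMonolith = createProxyMiddleware({
  target: 'http://legacy-monolith.internal',
  changeOrigin: true,
});

// Migrated paths go to the new system...
app.use(['/payments', '/plans'], toNewService);

// ...everything else still hits the legacy monolith until it is strangled too.
app.use(toMonolith);

app.listen(8080);
```

As more slices are migrated, paths move from the catch-all monolith route to the new-service list until there is nothing left for the old system to serve.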

Conclusion

Systems are built for a purpose, and while that purpose might not necessarily change, new constraints and new flows will evolve. However, if other systems depend on yours, it’s your responsibility to keep up your end of the bargain. Your APIs should be treated as contracts, and should a contract ever need to be amended, think long and hard about your users and implement alternatives that preserve backward compatibility; if a breaking change is still unavoidable, proper communication is essential.
We learnt a lot and still have much to learn.