Originally published at Lux Group Tech Blog
Back in Feb 2017 I wrote about our decision to rebuild our entire architecture. It is established lore among the software community that rewriting software is something you should never do. This is not only because of Joel Spolsky’s excessive influence in software-land but because so many software engineers have experience of hating the system they work on, only to find that when they rewrote it, they simply didn’t make it that much better.
Luminaries of the software world have described some well-established ways of avoiding a rewrite, but sometimes you have to conclude that there’s nothing worth saving. At the Lux Group we were faced with a legacy platform with no discernible architecture, no automated tests, an outsourced engineering team and an overwhelming level of tech debt. We saw no real alternative, and offered the following justification for doing a strategic rebuild (and doing it properly):
A successful rebuild is normally undertaken under the following conditions:
* The company has gained enough experience to understand its business domain, the customers it is serving and the product that it wants to offer.
* The company is well-resourced, able to invest in more experienced engineers and to invest those engineers in building a product or platform for the future.
* The company stakeholders understand that building a product with solid engineering principles, built-in checks and balances and a high-performing team to run it takes more time and tends to cost more than Rapid Application Development (RAD), in the same way that an architect-designed home tends to be far more expensive to build and maintain than a kit home you renovate yourself.
* The stakeholders understand that software engineering is expensive and that building things to last makes it even more expensive, so the company and team need to be very selective about which features they opt to build.
It is now over a year later and the product luxuryescapes.com has been released without salvaging a single line of code from the original system (CRM aside). We release code every day, fix bugs before writing new features and the small amount of tech debt we have is under control. In retrospect we can see that the rebuild project could easily have spiralled out of control. The following are some of the principles we applied to ensure it didn’t.
Strong Product Team
We have this policy within our team: “if a feature doesn’t make sense to you, don’t do it”. This might sound blindingly obvious, but engineers and designers are often asked to work on stories and features that don’t make any sense to them. They cannot see why the customers would want that feature or how it benefits the product or the business. A strong Product Team can challenge senior stakeholders and sponsors to get to the root of the problem or design a holistic solution rather than applying band-aid after band-aid. An empowered engineer can expect a good explanation of the strategy underpinning any feature they are asked to deliver.
Architecture without an end state
In “Architecture Without an End State”, Michael Nygard describes how we should build architectures that are designed to be ever-changing along with the personnel and direction of the business. Lux Group is an ambitious company; by deciding to use microservices we embraced the dynamic nature of the business and gained the ability to scale teams and iterate quickly. We accepted the challenges microservices created (reporting and atomic transactions), and avoided reverting to a monolith design when faced with these challenges.
When we sold some of our businesses, bought others and restructured the company and our business model, we had an architecture, team and process that were set up to embrace this change.
Iterative development and Continuous Delivery
From the first month of development we created an MVP that we treated as a production environment even though it was not live to the public. This MVP contained the ‘spine’ of the product: simple implementations of critical features such as search, add to cart, purchase and pay vendor. We only showcased software that had been released to that environment, and we released aggressively, with a Continuous Deployment approach. We treated every feature as an MVP, releasing early and iterating. When we failed, we demanded a root cause analysis (RCA). When we finally launched the product, we already had the processes and discipline in place to continue the same practice.
Every month we showed progress to the sponsors and we were honest about our setbacks and failures. By being honest, the delivery team did not get trapped behind a lie, and the sponsors saw the true rate of progress and had the opportunity to weigh our options: scope, time and resources.
Challenge every requirement
When you do a rebuild you get a fresh start. The conventional wisdom suggests you’re going to end up rebuilding every feature you already had, so stick with what you’ve got, but this was not the case for us. We had many features that had been built over many years and were no longer that relevant. We were keen to highlight that every feature has a cost of ownership: paying an engineering team for build and maintenance, as well as ongoing operational costs. Often the profit that will be made from a feature does not justify its cost of ownership. By costing and challenging every requirement you can help keep your product tight and the business focused on what it really needs. This is beautifully explained in one of my favourite blog posts, “The Tax of New”.
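The costing argument above can be made concrete with a little arithmetic. The following is a minimal sketch, not the model we actually used: the `FeatureEstimate` fields, the three-year horizon and every figure are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FeatureEstimate:
    """Rough costing of one requirement (all fields are illustrative assumptions)."""
    name: str
    build_cost: int          # one-off engineering cost to build the feature
    yearly_maintenance: int  # engineering time spent keeping it working
    yearly_operations: int   # hosting, support and other running costs
    yearly_profit: int       # profit attributed to the feature

    def cost_of_ownership(self, years: int) -> int:
        # Build once, then pay maintenance and operations every year.
        return self.build_cost + years * (self.yearly_maintenance + self.yearly_operations)

    def net_value(self, years: int) -> int:
        # Profit over the horizon minus everything it costs to own the feature.
        return years * self.yearly_profit - self.cost_of_ownership(years)


def worth_building(features: List[FeatureEstimate], years: int = 3) -> List[str]:
    """Names of the features whose profit outweighs their cost of ownership."""
    return [f.name for f in features if f.net_value(years) > 0]


# Made-up numbers: one feature from the product 'spine', one legacy feature.
search = FeatureEstimate("search", 60_000, 15_000, 10_000, 120_000)
legacy = FeatureEstimate("legacy feature", 40_000, 10_000, 5_000, 20_000)

print(search.net_value(3))             # 225000
print(legacy.net_value(3))             # -25000
print(worth_building([search, legacy]))  # ['search']
```

Even a crude estimate like this gives the product team and the sponsors a shared number to argue about, which is the real point of challenging each requirement.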
What could we have done better?
We thought we had an elegant plan for solving the inevitable reporting problem that comes from having the distributed data of a microservice architecture. It turned out this plan didn’t work and we had to scramble to put together a tactical solution while we learnt a lot more about pub/sub event sourcing and reconstituting distributed data. If you do embark on microservices, ensure you solve this problem as part of your early MVP.
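To show the general shape of the pub/sub approach to reporting, here is a minimal in-process sketch. It is not our production design: the `EventBus`, the `SalesReport` read model, the topic names (`order.created`, `payment.captured`) and the payload fields are all hypothetical, and a real system would use a durable message broker rather than a Python list.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, DefaultDict, Dict, List


@dataclass(frozen=True)
class Event:
    topic: str     # hypothetical topic name, e.g. "order.created"
    payload: dict


class EventBus:
    """Toy pub/sub bus that also keeps an append-only log of every event."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callable[[Event], None]]] = defaultdict(list)
        self.log: List[Event] = []

    def subscribe(self, topic: str, handler: Callable[[Event], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, event: Event) -> None:
        self.log.append(event)  # record first, so the log can be replayed later
        for handler in self._subscribers[event.topic]:
            handler(event)


class SalesReport:
    """Read model joining data owned by two separate services (orders, payments)."""

    def __init__(self) -> None:
        self.orders: Dict[str, dict] = {}
        self._handlers = {
            "order.created": self._on_order_created,
            "payment.captured": self._on_payment_captured,
        }

    def attach(self, bus: EventBus) -> None:
        for topic, handler in self._handlers.items():
            bus.subscribe(topic, handler)

    def _on_order_created(self, event: Event) -> None:
        p = event.payload
        self.orders[p["order_id"]] = {"amount": p["amount"], "paid": False}

    def _on_payment_captured(self, event: Event) -> None:
        order = self.orders.get(event.payload["order_id"])
        if order is not None:
            order["paid"] = True

    def paid_revenue(self) -> int:
        return sum(o["amount"] for o in self.orders.values() if o["paid"])

    @classmethod
    def rebuild(cls, log: List[Event]) -> "SalesReport":
        """Reconstitute the report from scratch by replaying the event log."""
        report = cls()
        for event in log:
            handler = report._handlers.get(event.topic)
            if handler is not None:
                handler(event)
        return report


bus = EventBus()
report = SalesReport()
report.attach(bus)

# The order service and the payment service publish independently.
bus.publish(Event("order.created", {"order_id": "o1", "amount": 1200}))
bus.publish(Event("order.created", {"order_id": "o2", "amount": 800}))
bus.publish(Event("payment.captured", {"order_id": "o1"}))

print(report.paid_revenue())                      # 1200
print(SalesReport.rebuild(bus.log).paid_revenue())  # 1200
```

The reporting view never queries another service's database. It is derived entirely from published events, so it can be thrown away and rebuilt by replaying the log whenever the report's shape changes.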
Would I do it again?
We can feel confident that no one at the company wishes we’d tried to incrementally work with what we had. We now have in place a high-performing team with a great product and platform that is being extended to build the next chapter of the Lux Group’s story. Joel Spolsky and Martin Fowler are probably correct in most cases; if you can salvage parts of your system while you rewrite others, you probably should. However, some systems are unsalvageable so don’t be afraid to start again from scratch.