Ask an Engineer: Is It Ever Time for a Code Rewrite?

It’s been blamed for the downfall of companies from Borland to Netscape. Serial entrepreneur Steve Blank calls it “startup suicide.” Stack Overflow CEO Joel Spolsky claims it’s the “single worst strategic mistake that a software company can make.” What is this killer mistake? It’s the code rewrite, and it has a terrible reputation in the tech community.

To be sure, rewriting your app or website’s entire code base is a huge undertaking. However, there are definitely occasions when a code rewrite makes perfect sense. Maybe your existing code is written in an older, less flexible language that simply won’t do the things you need it to do today. If your app is small and just getting started, rewriting all its code from scratch might not be such a large overhaul.

Likewise, very large companies can sometimes absorb the costs of a rewrite without skipping a beat. Rumor has it that Amazon, Apple, and Foursquare have all, at one point or another, rewritten large chunks of their source code. If your company can survive the transition, it is definitely possible to gain huge cost savings over time by rewriting a system to be more efficient and to use fewer computing resources under the same load.

If your company is neither very large nor very small, though, how do you weigh the pros and cons of such a major change? Here are some tips, based on my own experience.

Tip #1: Beware of engineer bias

My company, Carousell, is a mobile classifieds marketplace with millions of unique users who access our app from hundreds of different devices. As with every fast-growing startup, one of our biggest challenges during our earlier years was our ongoing struggle with technical debt as we built and shipped new features for our users. The more changes we made to our mobile apps, the harder it was to develop new features on our backend API services. Our website and app started to run slower and slower. Pretty soon, the scuttlebutt in the engineering department had it that we needed to start over from scratch. We needed a code rewrite… or did we?

The ability to tinker with new technology is why I became a programmer—and a code rewrite is a tinkerer’s dream. It is a big red reset button that sweeps away all past mistakes, and leaves you with a blank slate on which you can test new technologies. And therein lies the problem. Try as we might to be objective, engineers always have a vested interest in rewrites. The fact is, it’s much more fun to write new code than to read and debug code that already exists.

In my role as an engineering manager, however, I also had to think about the rewrite from a management perspective. Most developers (myself included) want to work with new technology, but if the technology is too new and untested, it can be buggy and unstable—which can lead to more user experience problems than it’s worth. New technology ultimately has to benefit users, not just interest my employees. Which brings me to my next tip…

Tip #2: Be mindful of hidden time costs

In exchange for the freedom of rewriting our code, our company would have to give up the most important currency that startups have: time. As Wonolo CTO and Chief Data Scientist Jeremy Burton wrote in a great blog post on the subject, spending time on a code rewrite means spending time not updating your current code base, which could give your competitors time to pull ahead. A code base rewrite can also cause downtime while you’re testing new technology, which is the worst-case scenario for users. Worse still, like any tech project, a code rewrite can end up taking much longer than expected, so any of these downsides can easily mushroom out of control.

Some of the time costs of rewrites are less obvious. For instance, many of the “fine hairs” sprouting from your existing code are bug fixes; they may look messy, but they represent the results of years of testing and fine-tuning based on user feedback. If you rewrite your entire code base, you’ll be starting this fine-tuning process over from scratch. (After all, no code base, no matter how elegantly written, is 100% bug-free.) If you’re trying to estimate how long a code rewrite will take your engineers, include the time spent debugging and writing new documentation, not just the time spent writing new code. Your bottom line will thank you eventually.

For us, the dealbreaker ended up being downtime: we simply couldn’t afford it. If our app didn’t work consistently, our large user base would quickly flock to another, more reliable app. Instead of a code rewrite, we needed a middle-of-the-road solution—and one that could be implemented with minimal testing.

Tip #3: Explore incremental alternatives

Having decided against a code rewrite, we chose to pay down our technical debt by moving towards microservices, a modular and highly scalable approach to software development employed by companies from Netflix to Amazon. Instead of maintaining one monolithic code base, we would reorganize our code into smaller, independent modules that could be updated one at a time. This new architecture would allow us to support new features and resolve existing technical issues without rewriting the entire code base.
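As a rough illustration of what updating one module at a time can look like, here is a minimal Python sketch of a router that sends requests to an extracted service when one exists and falls back to the existing code base otherwise. It is not our production setup: the names `IncrementalRouter`, `legacy_monolith`, `listings_service`, and the `/listings` path are all hypothetical and chosen only for this example.

```python
from typing import Callable, Dict

Handler = Callable[[dict], dict]


def legacy_monolith(request: dict) -> dict:
    """Hypothetical stand-in for the existing code base, which handles everything by default."""
    return {"handled_by": "monolith", "path": request["path"]}


def listings_service(request: dict) -> dict:
    """Hypothetical stand-in for one module that has been extracted into its own service."""
    return {"handled_by": "listings-service", "path": request["path"]}


class IncrementalRouter:
    """Routes each request to an extracted service if one exists, else to the monolith."""

    def __init__(self, fallback: Handler) -> None:
        self._fallback = fallback
        self._migrated: Dict[str, Handler] = {}

    def migrate(self, prefix: str, handler: Handler) -> None:
        # Move a single path prefix to its new service; everything else is untouched.
        self._migrated[prefix] = handler

    def rollback(self, prefix: str) -> None:
        # Undo one migration without affecting any other module.
        self._migrated.pop(prefix, None)

    def handle(self, request: dict) -> dict:
        # Check longer prefixes first so the most specific migration wins.
        for prefix in sorted(self._migrated, key=len, reverse=True):
            if request["path"].startswith(prefix):
                return self._migrated[prefix](request)
        return self._fallback(request)


router = IncrementalRouter(fallback=legacy_monolith)
router.migrate("/listings", listings_service)
```

The point of the design is that each migration is small and reversible: a call to `rollback` sends that one path back to the old code while the rest of the system keeps running.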

To support our plans to move towards microservices, we migrated our existing backend service to the software containerization tool, Docker. At three years old, Docker wasn’t the newest, most exciting technology around—but it had been thoroughly tested by other companies, and most of its bugs identified and resolved. That meant we could deploy it with relatively little risk. (It also helped that it was open-source, and therefore free for us to access.)

After testing, we found that we liked Docker combined with Kubernetes, open-source software designed to manage containers like the ones Docker creates. By putting Docker and Kubernetes together, we built a custom solution that now serves all of our production traffic, with minimal downtime and with most of our existing code preserved. This solution will enable us to incrementally upgrade parts of our code base using microservices in the future, with no visible downtime or impact on users if a rollout is unsuccessful.
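To make the idea of a rollout that fails without users noticing more concrete, here is a simplified Python simulation of one common approach, often called a canary rollout. This is an assumption for illustration, not a description of how Kubernetes or our own system does it: the function `canary_rollout`, its parameters, and its thresholds are all hypothetical. A small share of requests goes to the new version, any failure falls back to the stable version, and the new version is withdrawn if its error rate climbs too high.

```python
from typing import Callable, Iterable

Handler = Callable[[dict], dict]


def canary_rollout(
    stable: Handler,
    candidate: Handler,
    requests: Iterable[dict],
    canary_every: int = 10,        # hypothetical: 1 in 10 requests tries the new version
    max_error_rate: float = 0.05,  # hypothetical: tolerate up to 5% canary failures
    min_samples: int = 20,         # hypothetical: judge only after 20 canary requests
) -> str:
    """Return "promoted", "rolled_back", or "inconclusive" for the candidate version."""
    canary_calls = 0
    canary_errors = 0
    rolled_back = False

    for index, request in enumerate(requests):
        use_canary = not rolled_back and index % canary_every == 0
        if not use_canary:
            stable(request)
            continue

        canary_calls += 1
        try:
            candidate(request)
        except Exception:
            # The new version failed: count it, then serve the user from the stable one.
            canary_errors += 1
            stable(request)

        if canary_calls >= min_samples and canary_errors / canary_calls > max_error_rate:
            # Too many failures: stop sending any traffic to the candidate.
            rolled_back = True

    if rolled_back:
        return "rolled_back"
    return "promoted" if canary_calls >= min_samples else "inconclusive"
```

In this sketch every request still gets an answer from the stable version when the candidate fails, which is the property that keeps an unsuccessful rollout invisible to users.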

How do you decide?

There’s no one-size-fits-all solution to technical debt. If your business is established with a lot of legacy code, even our strategy of co-opting an open source solution might be too risky for you. For a smaller, less established startup, it might be too piecemeal. You have to decide what your company’s priorities are, and then proceed from there. In our case, we decided to value minimal downtime for users over the excitement of rebuilding our software from scratch. Your case may very well be different.

In an ideal world, however, this weighing process doesn’t become a push-and-pull between what’s exciting for developers to work on and what’s good for users. Instead, creating the best possible user experience becomes part of what makes the developer’s job intellectually stimulating. I know that’s how it feels to me.


With contribution from Lauren Orsini of the Hippo Thinks research network.

This article first appeared in AlleyWatch on 6 March 2017.  

  1. pruss one for vixtor ger!

  2. Kent Beck just said something relevant: It’s not more features now vs cleanup now. It’s features now vs features later. Don’t act shocked when it’s later and you chose “fewer”.
