Every CTO with a legacy Rails app eventually asks the same question: do we rewrite this thing or keep patching it? I’ve been on the inside of that decision at scale. Domestika runs on thirteen years of Rails, and when we arrived the codebase mixed react-on-rails in one place, regular Rails views in another, separate React apps on the side, and internal libraries that reinvented frameworks Rails already had. Four architectures pretending to be one app. Delivery had stalled badly enough to bring in outside help.
If a rewrite was ever going to be justified, it was there. We didn’t rewrite it. Eleven months later deploys were 40% faster (measured on the pipeline, mostly from getting react-on-rails out of the build and cutting tests that tested nothing), the abandoned Community module was alive again, and the team that inherited the codebase could read it. The reason isn’t that rewrites are always wrong. It’s that “rewrite or refactor” is the wrong question, asked at the wrong level.
The app isn’t one thing
A thirteen-year codebase is fifty decisions wearing one repository. Parts of it carry the revenue and mostly work, other parts fight every change while earning nothing, and some are simply abandoned, which turns out to be a different problem than broken.
The decision that works is a map, drawn per part: what to kill, what to keep, and what to build. At Domestika that map looked like this. React-on-rails: kill, completely, because it taxed every deploy and every developer, and Stimulus could do the same job with far less machinery. The Rails core serving millions of creative professionals: keep, because it worked, and the risk of touching it exceeded any elegance gain. The Community module, abandoned for years: build, from scratch, because there was nothing worth saving and a business reason to want it back (a contest platform was waiting on it).
Notice that the map contains rewrites. Killing react-on-rails and rebuilding Community from the ground up were both rewrites, scoped to the parts that deserved one, done in conversions small enough to ship alongside normal work. The question was never rewrite-vs-not. The question was the blast radius.
What pushes a part toward “keep”
Revenue runs through it and it works, the people who understand it are still around, and the pain it causes is annoyance rather than stoppage. Most of a mature Rails monolith falls in this bucket, and the honest move is to leave it alone, even when it offends your taste. Especially when it offends your taste. Taste is how companies fund two-year rewrites of things that already worked.
The harder case is when revenue runs through it and nobody left understands it. That’s not a “keep” and it’s not a rewrite trigger either, because rewriting code nobody understands means re-deriving requirements from production behavior, which is the most expensive way to learn your own business rules. It’s a stabilize: characterization tests around the money paths first, then small changes by people who read code for a living, until the part is understood enough to earn a real verdict.
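A characterization test in this sense can be very small. The sketch below is plain Ruby with no gems; `LegacyPricing`, its rules, and the sample cases are hypothetical stand-ins, not Domestika code. It records what the code does today and reports any input whose result later drifts from that recording.

```ruby
# Hypothetical legacy code on a money path. Nobody remembers why the rules
# are what they are; the goal is to pin them down, not to judge them.
module LegacyPricing
  def self.total_cents(subtotal_cents, country, coupon)
    total = subtotal_cents
    total -= 500 if coupon == "WELCOME" && subtotal_cents > 2_000
    total = (total * 1.21).round if country == "ES"
    total
  end
end

# Representative inputs (assumed here; ideally sampled from production traffic).
CASES = [
  [1_000, "ES", nil],
  [3_000, "ES", "WELCOME"],
  [3_000, "US", "WELCOME"],
  [2_000, "US", "WELCOME"],
].freeze

# Step 1: record what the code does today, right or wrong.
def record(cases)
  cases.to_h { |args| [args, LegacyPricing.total_cents(*args)] }
end

# Step 2: after every change, list the inputs whose output no longer matches.
def drift(golden)
  golden.reject { |args, expected| LegacyPricing.total_cents(*args) == expected }
end

GOLDEN = record(CASES)
puts drift(GOLDEN).empty? ? "behavior unchanged" : "behavior drifted"
```

In real use the recording would be written to a file once and committed, so that later runs compare against the original behavior rather than against themselves. An empty drift means a change preserved behavior on the sampled inputs, nothing more.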
What pushes a part toward “kill” or “build”
The part taxes everything around it, not just itself. React-on-rails wasn’t ugly code that worked; it charged rent on every deploy, every test run, and every onboarding. Infrastructure that taxes the whole system is the highest-value kill you can make.
Or the part is abandoned and the business needs it again. Abandoned code with no owners and no tests isn’t a refactoring candidate, because refactoring assumes behavior worth preserving. When we rebuilt the project editor, the old one wasn’t consulted as a spec. The user research was the spec; Katarzyna took it from discovery through production rollout. That’s a build, and pretending it’s a refactor just slows it down.
While you’re deciding, look at the tests, because they’ll lie to you. Domestika had tests that existed for the sake of existing: green, plentiful, asserting nothing that mattered. We deleted those and wrote ones that catch real bugs. A green suite tells you nothing about which bucket a module belongs in until you’ve read what it actually asserts. I wrote more about that trap in “How do you know the software is working?”
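One way to see the difference between a test that exists and a test that asserts something: run it against a deliberately broken version of the code and check that it goes red. The sketch below is a hand-rolled illustration of that idea (the one behind mutation testing) in plain Ruby. The discount helper and both "tests" are hypothetical.

```ruby
# Two hypothetical implementations of the same helper: one right, one broken.
CORRECT = ->(price_cents, percent) { price_cents - price_cents * percent / 100 }
BROKEN  = ->(price_cents, percent) { price_cents + price_cents * percent / 100 }

# The kind of test that exists for the sake of existing:
# it only checks that something came back.
VACUOUS_TEST = ->(impl) { !impl.call(10_000, 20).nil? }

# A test that pins the behavior the business cares about.
REAL_TEST = ->(impl) { impl.call(10_000, 20) == 8_000 }

# A test earns its place only if it passes on correct code
# and fails on broken code.
def catches_bug?(test)
  test.call(CORRECT) && !test.call(BROKEN)
end

puts "vacuous test catches the bug: #{catches_bug?(VACUOUS_TEST)}"
puts "real test catches the bug:    #{catches_bug?(REAL_TEST)}"
```

Both tests are green against the correct code, which is exactly why a green suite alone says so little.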
One thing that does not belong on the kill list: the framework version. Being stuck on an old Rails is a maintenance debt inside “keep”, not a reason to rebuild the product around it.
When the full rewrite actually wins
It happens. If the domain changed so much that the old behavior is no longer the product, you’re building something new that happens to have a predecessor. If there’s no production traffic and no revenue at stake, the strongest argument for rescue disappears. And there’s a hard case I won’t argue away: occasionally the coupling is so dense that incremental replacement needs more shims and dual-maintenance than a clean rewrite behind a well-tested seam would cost. You find out which one you’re holding during the mapping, when you trace what the tangled part actually touches. If every seam you try to cut crosses ten others, the map will tell you.
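Tracing what a part touches can start as a crude count. The sketch below assumes you already have a list of dependency edges between modules (gathered however you like, for example by grepping constant references); the module names and edges are invented for illustration. It counts the edges that cross the boundary of a proposed part, in either direction.

```ruby
# Hypothetical dependency edges: caller => callees.
DEPENDENCIES = {
  "Community" => ["Users"],
  "Checkout"  => ["Users", "Courses", "Payments", "Coupons"],
  "Payments"  => ["Checkout", "Users"],
  "Coupons"   => ["Checkout", "Courses"],
  "Courses"   => ["Users"],
}.freeze

# Edges with exactly one end inside the proposed part. Each one is a place
# where an incremental replacement would need a shim or an adapter.
def crossing_edges(part, deps)
  deps.flat_map { |from, tos| tos.map { |to| [from, to] } }
      .select { |from, to| part.include?(from) != part.include?(to) }
end

puts "Community seam crosses #{crossing_edges(["Community"], DEPENDENCIES).size} edge(s)"
puts "Checkout seam crosses #{crossing_edges(["Checkout"], DEPENDENCIES).size} edge(s)"
```

A count like this is a prompt for reading the code, not a verdict. One crossing edge can hide a deep contract, and six can turn out to be trivial lookups.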
What doesn’t justify a full rewrite: the new framework, the new architecture pattern, or the feeling that you’d do it better this time. Back in 2000, Joel Spolsky called the full rewrite the single worst strategic mistake a software company can make, and the mechanism hasn’t changed: the old system keeps moving while you rebuild, you maintain two systems, and the new one re-learns every lesson the old one already paid for. The per-part version of replacement has a name, the strangler fig pattern, and it exists precisely because it lets you get a rewrite’s benefits without betting the company on one.
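For readers who haven’t seen the strangler fig pattern in code, here is a minimal sketch in plain Ruby. The handlers follow the Rack calling convention (an env hash in, a status/headers/body triple out), but nothing below needs the rack gem, and `StranglerRouter`, the path prefixes, and both apps are hypothetical names for illustration only.

```ruby
# Stand-ins for the old monolith and one rebuilt part.
LEGACY_APP    = ->(env) { [200, { "content-type" => "text/plain" }, ["legacy: #{env["PATH_INFO"]}"]] }
NEW_COMMUNITY = ->(env) { [200, { "content-type" => "text/plain" }, ["new: #{env["PATH_INFO"]}"]] }

# Sends migrated path prefixes to their replacement and everything else
# to the legacy app. Migrating another part means adding one entry.
class StranglerRouter
  def initialize(legacy:, migrated: {})
    @legacy = legacy
    @migrated = migrated # path prefix => replacement app
  end

  def call(env)
    path = env.fetch("PATH_INFO")
    # Match the prefix itself or anything below it, but not "/communityx".
    _prefix, app = @migrated.find do |prefix, _|
      path == prefix || path.start_with?("#{prefix}/")
    end
    (app || @legacy).call(env)
  end
end

ROUTER = StranglerRouter.new(
  legacy: LEGACY_APP,
  migrated: { "/community" => NEW_COMMUNITY }
)

puts ROUTER.call({ "PATH_INFO" => "/community/contests" })[2].first
puts ROUTER.call({ "PATH_INFO" => "/courses/1" })[2].first
```

The design choice that matters is that the legacy app stays the default. Users never see a cutover, and a replacement that misbehaves can be switched off by deleting one entry.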
How to actually decide
Read the codebase before deciding anything. Not a slide deck about the codebase, the codebase. Our version of this is a one-week decision sprint: two engineers working inside the repository, the deploy pipeline, and the test suite, plus calls with the people who lived there. The output is the kill / keep / build map. Every shortcut version (an architecture review based on diagrams, a workshop based on opinions) produces a map of how people feel about the code, which is a different document.
Then make the smallest decision first. You don’t have to choose the app’s fate, just one part’s fate. Act on it and let the result inform the next choice. We killed react-on-rails before rebuilding Community, and what the first move taught us about the deploy pipeline made the second one cheaper. A legacy codebase took a decade to get where it is. You’re allowed to take it apart one decision at a time.