The Architecture Was the Bug: Refactoring Legacy C++ to Modern Value Semantics

4.5.2026

Two bugs. Two full days lost. The answer wasn't another patch; it was cleaning up the architecture. A case study.

By Tim Varelmann

This case study is shared with permission from Lucky Data.

Results at a Glance

- A whole class of recurring bugs is now architecturally impossible. Fewer surprise incidents during development, more predictable delivery.
- Less and simpler code to maintain means cheaper and faster development from here on.
  - Nearly half of the most bug-prone code is gone, with more functionality than before, plus major simplifications in downstream code.
  - A whole auxiliary group of classes became obsolete and was removed entirely. That's ongoing maintenance work that no longer has to happen.
- The simpler code is now ready to support streaming from the backend, as opposed to the current bulk loads that slow users down.

About the Client

Lucky Data is a German IT services company. They also develop logistics dispatch software that modernizes how dispatchers plan logistics in the construction industry and at inland ports. Bluebird Optimization supports this development.

This article describes a refactor of heavily pointer-based legacy C++ to modern value semantics: a change that eliminated a whole category of bugs, cut the size of the most bug-prone code almost in half, and laid the groundwork for a faster, more responsive user experience. For a business whose software makes high-stakes operational decisions every day, that combination (fewer defects, faster to change, faster for the user) shortens the path from new customer requirements to shipped features.

For months, development had been humming along. New features went out in regular releases. Nothing dramatic, nothing on fire. The kind of stretch where you start thinking about what's next rather than what's broken.

Then came early March.

The First Bug

It showed up in code that was still under development: a stubborn bug with a reproducible symptom and no obvious cause. I spent a full day chasing it, and by evening had nothing but a patch: a piece of code that eliminated the symptom without explaining it. With a release on the horizon and other things still undone, I decided to postpone the investigation.

That evening was uneasy. A full day had disappeared with nothing to show except a fix I couldn't fully justify. When you can't explain why a bug happened, you can't really be sure it's gone.

The Second Bug

A week later, another one. Different surface, same feel: hard to pinpoint, symptoms that didn't line up cleanly with the code supposedly producing them. Almost another full day, another patch, another symptom silenced without understanding. The release was imminent. The patch went in, and the release shipped.

Once it was out, there was finally room to breathe, and to actually investigate.

Stepping Back

With the pressure off, the picture came together quickly. The two bugs weren't independent. They shared a family resemblance, and that resemblance pointed at the architecture itself: a design built around pointers, with shared access to data spread across large parts of the code. Each bug had its own immediate trigger, but both were drawing from the same well.

The uncomfortable conclusion: this architecture had no real justification for what the software actually does. It had been picked carelessly as a default at some point in the past and never reconsidered. And it was producing bugs that would keep coming back under different names until the underlying structure changed.

A Short Primer: Stack, Heap, and Why Pointers Are Hard to Reason About

To explain what changed, a quick detour through how programs store data:

Computer programs keep data in memory, and memory has two main regions: the stack and the heap.

Data on the stack belongs to one specific piece of code: the function that created it. Unless that function deliberately hands out a pointer or reference to it, only that function can read or modify it.

Data on the heap lives on its own. Any piece of code with a pointer (essentially an address telling the program where the data lives) can read or modify it. The same piece of heap data can have many pointers aiming at it from many parts of the program.

Why does that matter? At first glance it sounds fine: if two parts of the program both change the same piece of data, surely each has a good reason. And individually, yes, each change usually does. The problem isn't individual intent. The problem is that developers can no longer reason locally.

Here's an example: Imagine code that decides whether truck 42 is available at 3pm. It reads the data: "truck 42, free", and starts assembling a dispatch order. Between the moment it reads and the moment it commits, another part of the program, also for a perfectly valid reason, marks truck 42 as under maintenance. The dispatch code has no way of knowing. The truck gets assigned anyway.

In isolation, both pieces of code are correct. The bug lives in the space between them. And debugging it means tracing every part of the program that might hold a pointer to truck 42, which in a mature codebase can be dozens or hundreds of places.
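
The truck 42 scenario can be reproduced in a few lines, even on a single thread. The following is a minimal sketch under stated assumptions: `Truck`, `DispatchPlanner`, and `MaintenanceScheduler` are hypothetical names invented for illustration and are not taken from the Lucky Data codebase.

```cpp
#include <cassert>
#include <memory>

// Hypothetical types for illustration only.
struct Truck {
    int id;
    bool available;
};

// "Dispatch" code: holds a pointer into shared heap data.
struct DispatchPlanner {
    std::shared_ptr<Truck> truck;
    bool saw_available = false;

    void check() { saw_available = truck->available; }
    // Commits based on what check() saw earlier, not on the current state.
    bool commit() const { return saw_available; }
};

// "Maintenance" code: holds a pointer to the very same object.
struct MaintenanceScheduler {
    std::shared_ptr<Truck> truck;
    void take_out_of_service() { truck->available = false; }
};

// Returns true if the truck was dispatched although it is under maintenance.
bool dispatched_unavailable_truck() {
    auto truck42 = std::make_shared<Truck>(Truck{42, true});
    DispatchPlanner planner{truck42};
    MaintenanceScheduler maintenance{truck42};

    planner.check();                    // reads "truck 42, free"
    maintenance.take_out_of_service();  // a valid change, made elsewhere
    const bool dispatched = planner.commit();

    return dispatched && !truck42->available;
}
```

Neither struct contains a mistake on its own. The wrong outcome only appears in the interleaving, which is why reading either piece of code in isolation does not reveal it.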

That doesn't mean pointers are bad. Anyone who has installed a browser extension, a Microsoft Office add-in, or a game mod has benefited from them: that whole category of "extend the running program with something it didn't originally know about" depends on pointer-like mechanisms. Pointers are the right choice in the right place.

For the central data store in question, though (the piece that holds whatever data is currently on screen or in its temporal neighborhood), there was no such reason. The complexity of pointers was pure cost, no benefit.

What We Found in the Legacy Code

Three things stood out.

Too many pointers, none of them justified

The codebase reached for pointers as the default tool, in a part of the code where there was no case for them. The result was exactly the cost described above: shared access to data with no clear owner, lifetimes that had to be reconstructed piece by piece, and the space between any two accesses as a potential bug.

Mutexes as scar tissue

Mutexes are coordination mechanisms for code that runs in parallel: they prevent two threads of execution from stepping on each other's data. But this part of the software currently runs on a single thread. Every mutex in it had been added, at some point in the past, as a patch for a bug that looked like a race condition. They were leftovers from old firefighting sessions, still slowing the code down.
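
One way to picture why such a mutex does not cure a bug that merely looks like a race: locking protects each individual access, but not the gap between a read and a later decision. The sketch below is an illustration with a hypothetical `GuardedTruck` type, not a reconstruction of the actual legacy code.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical type: every access is dutifully locked.
struct GuardedTruck {
    mutable std::mutex m;
    bool available = true;

    bool is_available() const {
        std::lock_guard<std::mutex> lock(m);
        return available;
    }
    void set_available(bool value) {
        std::lock_guard<std::mutex> lock(m);
        available = value;
    }
};

// Single-threaded: the locks never contend, yet the earlier read goes stale.
bool stale_read_survives_locking() {
    GuardedTruck truck;
    const bool seen = truck.is_available();  // locked read: "free"
    truck.set_available(false);              // locked write by other code, same thread
    return seen && !truck.is_available();    // the decision input is outdated anyway
}
```

Every lock and unlock here costs time, while the stale-read problem stays exactly where it was.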

Inheritance where none was needed

The code used an abstract superclass as the only way to reach the data for the two kinds of things the software tracks: assignments (the scheduled events of a dispatch plan) and entities (the things that participate in those events: trucks, concrete mixers, skilled workers, helpers, products to be delivered). Inheritance is a mechanism that lets different kinds of things share an overarching category. It's useful when you genuinely need that unified treatment, but in C++ it practically forces developers toward pointers: treating different types through one shared category requires them. It also adds runtime cost and makes the code harder to follow.
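
A rough sketch of the general shape such a design takes, and of why it pulls pointers in. All names (`Item`, `ItemStore`, `count_entities`) are assumptions made for illustration; the real class layout is not shown in this article.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical abstract superclass covering both kinds of data.
struct Item {
    virtual ~Item() = default;
    virtual std::string label() const = 0;
};

struct Assignment : Item {
    std::string label() const override { return "assignment"; }
};

struct Entity : Item {
    std::string label() const override { return "entity"; }
};

// Item is abstract, so a std::vector<Item> cannot exist.
// Storing both kinds in one container requires pointers to the base class.
using ItemStore = std::vector<std::shared_ptr<Item>>;

ItemStore make_store() {
    ItemStore store;
    store.push_back(std::make_shared<Assignment>());
    store.push_back(std::make_shared<Entity>());
    store.push_back(std::make_shared<Entity>());
    return store;
}

// Callers that only want entities must walk everything and filter by type.
std::size_t count_entities(const ItemStore& store) {
    std::size_t n = 0;
    for (const auto& item : store) {
        if (dynamic_cast<const Entity*>(item.get()) != nullptr) {
            ++n;
        }
    }
    return n;
}
```

The virtual calls and the `dynamic_cast` are where the runtime cost comes from, and the iterate-then-filter loop is the pattern that callers end up repeating.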

The Refactor

Two decisions did most of the work.

The central data store now holds its data by value

When a part of the application needs an assignment or an entity, it asks by unique ID and receives a copy. The copy belongs to the caller (and typically lives on the caller's stack). Nobody else can reach in and change it. When the caller is done, the copy simply disappears: no bookkeeping, no leaks, no surprises.
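
A minimal sketch of what lookup-by-ID with value semantics can look like. `DataStore`, `upsert`, and the use of `std::optional` for unknown IDs are assumptions chosen for illustration, not the actual interface.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical data type.
struct Assignment {
    int id;
    std::string title;
};

// Hypothetical store that owns its data by value.
class DataStore {
public:
    void upsert(Assignment a) {
        const int id = a.id;  // read the key before moving the object
        assignments_.insert_or_assign(id, std::move(a));
    }

    // Returns a copy, or std::nullopt if the ID is unknown.
    std::optional<Assignment> assignment(int id) const {
        const auto it = assignments_.find(id);
        if (it == assignments_.end()) return std::nullopt;
        return it->second;
    }

private:
    std::unordered_map<int, Assignment> assignments_;
};

// The caller's copy is unaffected by later changes inside the store.
bool caller_copy_is_isolated() {
    DataStore store;
    store.upsert({7, "Deliver concrete"});

    const std::optional<Assignment> mine = store.assignment(7);  // caller-owned copy
    store.upsert({7, "Cancelled"});                               // store changes later

    return mine.has_value() && mine->title == "Deliver concrete"
        && store.assignment(7)->title == "Cancelled";
}
```

The trade-off is that a copy can go out of date, but it does so visibly and locally: code that needs fresh data asks the store again instead of trusting a pointer that someone else may have written through.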

Composition replaced inheritance

There's a well-worn piece of guidance in software design: prefer composition over inheritance. Instead of assignments and entities sharing a common ancestor, they now each contain a small common piece (the properties they genuinely share) and are otherwise independent types.
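
In code, the idea might look roughly like this. Which properties the two types actually share is not stated here, so `CommonProperties` and all field names are placeholders.

```cpp
#include <cassert>
#include <string>

// Hypothetical shared piece; the real shared properties are not named in this article.
struct CommonProperties {
    int id;
    std::string display_name;
};

// Independent types that each contain the shared piece instead of inheriting it.
struct Assignment {
    CommonProperties common;
    int start_minute;
    int end_minute;
};

struct Entity {
    CommonProperties common;
    bool available;
};

// Code that only needs the shared piece takes exactly that, by reference.
std::string describe(const CommonProperties& c) {
    return "#" + std::to_string(c.id) + " " + c.display_name;
}
```

Both types are plain values: they can be copied, stored in a `std::vector<Assignment>` or `std::vector<Entity>`, and passed around without a base-class pointer.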

The overarching category that used to be the only way to access this data is still available for code that genuinely needs to treat both kinds uniformly: it's now implemented with the visitor pattern, built on modern C++17 features, and developers don't even have to know about it in order to use it. But crucially, it's no longer the only door. Parts of the application that know they're dealing only with assignments, or only with entities, can now ask for exactly that. Less iterating over everything and filtering afterwards. More direct, more readable.
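
The article does not name the exact facility used. A common C++17 way to get this kind of uniform treatment over value types is `std::variant` with `std::visit`, and the sketch below assumes that choice. `Item`, `Store`, `all`, and `label` are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <variant>
#include <vector>

struct Assignment { int id; std::string title; };
struct Entity { int id; std::string name; };

// The "overarching category", expressed as a closed set of value types.
using Item = std::variant<Assignment, Entity>;

// Callers use label() without needing to know that std::visit is behind it.
std::string label(const Item& item) {
    return std::visit([](const auto& x) -> std::string {
        using T = std::decay_t<decltype(x)>;
        if constexpr (std::is_same_v<T, Assignment>) {
            return "assignment: " + x.title;
        } else {
            return "entity: " + x.name;
        }
    }, item);
}

struct Store {
    // Typed doors: ask for exactly the kind you need.
    std::vector<Assignment> assignments;
    std::vector<Entity> entities;

    // Uniform door: still available for code that wants both kinds.
    std::vector<Item> all() const {
        std::vector<Item> out;
        for (const auto& a : assignments) out.push_back(a);
        for (const auto& e : entities) out.push_back(e);
        return out;
    }
};
```

Code that only cares about assignments iterates `store.assignments` directly. Only code that genuinely needs both kinds goes through `all()` and `label()`.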

What Changed in Practice

- A class of bugs is gone. Not rarer: impossible. Lifetime problems, silent data-corruption-through-shared-state, symptoms that smell like concurrency bugs in non-concurrent code: all of them required the old architecture to exist. They don't have anywhere to land anymore.
- Almost half the lines of code in the core data store's implementation vanished, even as the feature set grew. What remains is code a new developer can read and understand without first building a mental map of who else might be touching what.
- Downstream code got simpler. The data provider that feeds a central calendar view lost roughly a quarter of its code. Other UI models followed the same pattern at smaller scale.
- UI updates became more targeted. Notifications about changes to assignments are now separate from notifications about changes to entities. Components that only care about assignments no longer need to subscribe to updates about entities and then filter them out. Less noise, faster response.
- An entire auxiliary class hierarchy was deleted. The previous design needed special classes just to carry incremental data updates through the interface. Those are gone: the easiest-to-maintain classes are those that don't exist :)
- Faster launches are now in reach. With the pointer coordination removed, the application is ready to consume data streams from the backend. Once the backend side ships, users will see their first screen of real data almost immediately, instead of waiting for a large initial payload to load.
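
The more targeted UI updates can be pictured with a small sketch. The actual notification mechanism is not described in this article, so `ChangeNotifier` and its plain `std::function` callbacks are stand-ins chosen to keep the example self-contained.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical notifier with one channel per kind of data.
class ChangeNotifier {
public:
    using Callback = std::function<void(int /*changed id*/)>;

    void on_assignment_changed(Callback cb) { assignment_listeners_.push_back(std::move(cb)); }
    void on_entity_changed(Callback cb) { entity_listeners_.push_back(std::move(cb)); }

    void assignment_changed(int id) const {
        for (const auto& cb : assignment_listeners_) cb(id);
    }
    void entity_changed(int id) const {
        for (const auto& cb : entity_listeners_) cb(id);
    }

private:
    std::vector<Callback> assignment_listeners_;
    std::vector<Callback> entity_listeners_;
};

// A component that only cares about assignments subscribes to that channel only.
int count_calendar_refreshes() {
    ChangeNotifier notifier;
    int calendar_refreshes = 0;
    notifier.on_assignment_changed([&](int) { ++calendar_refreshes; });

    notifier.entity_changed(42);     // never reaches the calendar component
    notifier.assignment_changed(7);  // delivered
    return calendar_refreshes;
}
```

With a single combined channel, the component would have been called twice and would have had to discard the entity update itself.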

To put it briefly: bugs are fixed and structurally prevented going forward, the software is easier and cheaper to maintain and extend, and features that looked out of reach are suddenly almost ready. All achieved in less than two weeks of focused work.

Rolf Ruß/Patrick Wolff, CEO of Lucky Data assesses this development as follows:

"Quote from Patrick/Rolf"

Patching Is Borrowed Time

Two lost days while a release was close were expensive. They also turned out to be the cheapest possible price for the lesson. Patching a bug without understanding it feels productive: the symptom goes away, the release ships, the open-items list gets shorter. What it actually does is borrow time against the next incident. The second patch, a week after the first, was the receipt.

The real work was taking the architecture seriously as a source of bugs. Identifying the unsuitable aspects of the architecture and giving their replacement the time it needs: that's where the leverage is.

If you recognize some of this in your own codebase (bugs that keep coming back under different names, parts of the code everyone is a little afraid to touch), I'd be glad to talk. Reach out directly, or subscribe to Bluebird Briefings to stay on top of optimization and engineering topics like this one.
