
Welcome to the problem fractal

Details, my dear person, details…

(Like this article? Read more Wednesday Wisdom! No time to read? No worries! This article is also available as a podcast. You can also ask your questions to our specially trained GPT!)

Welcome to the New Year 2026. If this year is going to be anything like the previous years of my working life, I will spend a lot of time explaining to people that things that sound really simple are actually quite complicated and that two things that look alike are in fact quite different.

Welcome to the wonderful world of the “problem fractal”.

A fractal is of course a structure that has an infinite amount of detail. No matter how much you zoom in, new details keep appearing, ad infinitum.

By the way: Fractals also have an attribute called “self-similarity”, which means that smaller parts of the shape resemble the whole. However, for the sake of the essay, I will elegantly forget about this particular attribute because it does not fit my carefully designed narrative 🙂.

The first time I ran into the “problem fractal” was when a manager at a bank I was doing a contract with told me that they needed a “Java resource”. In the ensuing conversation I was hit by a bolt of lightning when I figured out that, for this manager, two resumes that both contained the word “Java” were the same and that they didn’t really care which candidate we’d take on. Apparently, two “Java resources” were interchangeable.

This is of course not true. Even worse, depending on the shape of your problem, an experienced software engineer who doesn’t know any Java might be a better fit for the team and the problem than an engineer who superficially knows a bit of Java but doesn’t understand how computers work or who could not debug if their life depended on it.

Personally, I blame the word “resource” for this. When I hunt for medicinal plants in Horizon Zero Dawn, they go into the “resource bag” in my inventory and one plant is as good as any other for upping my health. People, on the other hand, are not resources; they are, well, people. Each person is a unique piece of jewelry from a very limited production run and they cannot be separated easily into useful categories on binary attributes such as “knows Java”. It is for this reason that I hate the term “Human Resources” and will therefore at any conceivable opportunity refer to that department as “Human Remains”. This, by the way, also seems to do more justice to the “corporate whitewashing” and “summary execution” mode of operation that many modern HR departments seem to be engaged in these days. A few good people notwithstanding, your HRBP is usually not your friend.

But, I digress.

I come across the “I need a Java resource” style of thinking quite often when two problems in computer science sound the same but are in fact of a completely different nature, usually due to scale (which, as I have argued before, is the only problem left). Recently, I have been given an opportunity to work on vulnerability management and part of vulnerability management is patching applications and operating systems, typically by upgrading to newer versions of the software. In the course of that work, I have had to have multiple conversations with people to explain why this is a hard problem to solve.

When the UK’s then Prime Minister Harold Macmillan was once asked what the most difficult thing about being Prime Minister was, he apparently answered: “Events, my dear boy, events.” In the same vein, when project managers ask me why upgrading the kubelet on a few hundred thousand machines is hard, I answer: “Details, my dear person, details.”

Many people are familiar with the concept of a software upgrade, having pressed the “Upgrade now” button on the Microsoft Updater or on their iPhone. When they do that, some stuff gets downloaded, the machine maybe reboots, and presto, everything is working again. With that experience in mind, how hard can it be to upgrade the kubelet? The answer, of course, is: “Very hard”, but to explain why that is the case, you need a pretty deep understanding of software supply chains, backward and forward compatibility, kubelet configuration, Kubernetes in general, differences between cloud providers, request routing, infrastructure-as-code solutions, and whatnot. It is possible to develop an infrastructure where upgrades like these are easy, but that requirement has to be built in from the start and that is very seldom the case.

I have written before about a popular web site I worked for where the weekly release was a shitshow of the first order. Every Tyr’s day, senior people from the SRE and SWE side of the house would assemble in a war room at 9:30am and spend the day building a new version of the application, rolling it out, and dealing with the many bugs that popped up. The bugs were there because this was the first time that the new code that was merged in the week before the release was subjected to live Internet traffic. Every week, this effort ran into the wee hours with pizza and beer. Why was this such an uphill battle every time? Details, dear person, details!

Eventually, we got a team of release engineers on the case and they spent a year building an all-singing-all-dancing release train with canarying and automation. Once this was in place, the weekly update was reduced to a button push and an SRE keeping an eye on the release dashboard.

Having a person watch a dashboard was still totally the wrong thing to do, obviously. Over time we implemented alerts that paged us when the release was stuck somewhere, which freed up valuable time for playing pool.

The release was much easier now; everybody happy! Though, everybody? No! In the first few months of this new release procedure we had to have regular conversations with very unhappy software engineers who had just found out that they could no longer merge their PR on Monday evening and expect the code to go out with the release on Tuesday, because the deadline for the week’s release was Monday morning 9am, which left us some time for canarying the release candidate on a small percentage of user traffic. It just goes to show: It is very hard to make literally everybody happy.
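The cutoff rule that upset those engineers can be sketched in a few lines. This is a minimal illustration, not the actual release tooling: the function names are hypothetical, the boundary case (a merge at exactly 9:00 still makes the train) is an assumption, and time zones are ignored by using naive datetimes.

```python
from datetime import date, datetime, time, timedelta

CUTOFF_WEEKDAY = 0                  # Monday (datetime.weekday() returns 0 for Monday)
CUTOFF_TIME = time(9, 0)            # 9am cutoff, as described in the text
RELEASE_OFFSET = timedelta(days=1)  # the release goes out the day after the cutoff (Tuesday)


def next_cutoff(merged_at: datetime) -> datetime:
    """Return the first Monday 9am cutoff at or after the merge time."""
    days_ahead = (CUTOFF_WEEKDAY - merged_at.weekday()) % 7
    candidate = datetime.combine(
        merged_at.date() + timedelta(days=days_ahead), CUTOFF_TIME
    )
    # Merged on a Monday but after 9am: this week's train has already left.
    if candidate < merged_at:
        candidate += timedelta(days=7)
    return candidate


def release_date_for(merged_at: datetime) -> date:
    """Return the date of the release that will carry a PR merged at merged_at."""
    return (next_cutoff(merged_at) + RELEASE_OFFSET).date()


if __name__ == "__main__":
    # 2026-01-05 is a Monday.
    print(release_date_for(datetime(2026, 1, 5, 8, 0)))   # merged before the cutoff
    print(release_date_for(datetime(2026, 1, 5, 18, 0)))  # merged Monday evening
```

A PR merged on Monday evening misses the cutoff by a few hours and therefore waits a full week, which is exactly the detail that made people unhappy.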

The fractal nature of problems means that two problems that look the same at a high enough level are completely different at a deeper level, which is, for instance, why we have dozens of different storage solutions that all sit in a different spot in the requirements space. Especially when it comes to data storage, one size does not fit all. Too many people ignore the fractal nature of reality and attempt to fit a problem onto a basic solution that is not a match. Staying in the storage space: I have seen many attempts to put a SQL interface on top of Bigtable and that always ended in tears.

On the other end of the spectrum are people who use the fractal nature of problems as an argument to build a completely bespoke solution for literally everything. “Our problem is so unique”, they say, “that we have to write our own $thing.” I have seen teams use this argument to build their own cross-region replicated cache, database, workflow system, XML parser, or container orchestration system. None of these projects ended well. There are definitely problems that are so unique that you need to write your own whatever it is, but before you do that, please realize that you are setting yourself up for a huge amount of work for the lifetime of your system. Do you really have the time, money, and organizational willpower to do that?

When Google got going, they were solving problems that had never been solved before and so they required some unique and new solutions such as GFS, Chubby, and Bigtable. They could have tried to run Google Search, Maps, and Gmail on Oracle RAC and network attached storage with NFSv4, but they correctly surmised that they needed something better. It was a brilliant decision and it gave them a huge competitive advantage. But, and this is important: They had the people, money, and resolve to put that money where their mouth was! If you are a six-person team solving some business problem, you probably do not need to write a new workflow system with its own domain specific language. Yes, it is cool to write that kind of software, but that is why we have spare time and open source!

The real challenge with the “problem fractal” is deciding at what level to stop descending the fractal and figuring out which details to ignore, which requirements to drop, and which inefficiencies to live with. I am pretty certain that most workloads could do better with a custom operating system, because Linux’s memory management and scheduling strike a balance that more or less works for different workload types and therefore might make assumptions that do not hold for you. That said, few people write their own operating system; most live with whatever Linus gives them, which is probably a very smart thing to do.

Anyway, here’s to 2026! May this year give you everything that you hope for, except when that conflicts directly with whatever I am hoping for, in which case I am afraid we will all have to give precedence to my realistic and very equitable hopes for all mankind.

Happy New Year 2026!

Bag your first new year’s resolution and subscribe today!
