Taming Incidental Complexity
Software development is a complex trade. Layers of abstraction, tooling, patterns, trade-offs, dependencies, and people all contribute to what makes software hard. But there is another big source of complexity: ourselves and our poor choices.
Incidental complexity is the technical term for this kind of complexity. It is defined as anything in software that is hard but does not need to be. It’s when we shoot ourselves in the foot and make our lives harder for no reason.
Almost every project or company out there has some degree of incidental complexity in it. I’ve seen plenty of it myself, and caused a good deal of it too. I’ve analysed it and tried to understand its sources by talking to people, asking questions and digging into the historical developments behind a system or company.
After some study, I have come up with some suggestions to avoid shooting ourselves in the foot and making things needlessly complex.
Avoid Coupling
Almost every mess of a system finds itself in that state due to coupling. Coupling, in my opinion, is the most dangerous thing in software development. It’s like a camouflaged predator waiting to strike its prey: you never see it until it is too late.
Part of the reason we don’t see it is that coupling is not a bad thing in and of itself. Coupling two things together is not a sufficient condition for disaster. It is a necessary one, of course, but not enough by itself. The missing ingredient is change. Coupling only shows its nasty face when one of the things that have been coupled changes.
This makes identifying coupling rather tricky, and you don’t usually realise you have made a mistake until something is hard to change. Even when that happens, you may rationalise it by blaming some external factor (“Oh, the client keeps changing the requirements”). But the truth is that we brought it upon ourselves by making our software hard to change.
When you are building something, you need to keep asking yourself: “Is this coupled to something?”, and if it is, “Is that thing likely to change?”, and if it is, “How costly would that change be?” Those are some of the most important questions you can ask yourself while building something.
Whether you couple your application to a particular database engine (I’m looking at you, Active Record), couple your code by using inheritance instead of composition, or couple your suite of microservices with REST calls instead of an event-driven approach, you are going to have a hard time when things have to change. And when things change, you will pay the price of that change. There is no escape.
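To make that concrete, here is a minimal sketch in Python of keeping business logic decoupled from the database engine by depending on a small abstraction instead of the engine itself. The names (OrderRepository, PostgresOrderRepository) and the connection object are invented for illustration, not taken from any specific library:

```python
from abc import ABC, abstractmethod


class OrderRepository(ABC):
    """Abstraction the business logic depends on; no ORM or SQL details leak through."""

    @abstractmethod
    def find(self, order_id: str) -> dict: ...

    @abstractmethod
    def save(self, order: dict) -> None: ...


class PostgresOrderRepository(OrderRepository):
    """One concrete adapter; the engine-specific code lives only here."""

    def __init__(self, connection):
        self._conn = connection  # e.g. a DB-API connection, assumed available

    def find(self, order_id: str) -> dict:
        cur = self._conn.cursor()
        cur.execute("SELECT id, total FROM orders WHERE id = %s", (order_id,))
        row = cur.fetchone()
        return {"id": row[0], "total": row[1]}

    def save(self, order: dict) -> None:
        cur = self._conn.cursor()
        cur.execute(
            "UPDATE orders SET total = %s WHERE id = %s",
            (order["total"], order["id"]),
        )


def apply_discount(repo: OrderRepository, order_id: str, rate: float) -> dict:
    # The business rule only knows the abstraction, so a schema or engine
    # change stays local to the adapter.
    order = repo.find(order_id)
    order["total"] = round(order["total"] * (1 - rate), 2)
    repo.save(order)
    return order
```

Swapping Postgres for something else now means writing another adapter, while apply_discount and every other caller stay untouched. That is the kind of seam that keeps the cost of change bounded.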
Systems do not need to be coupled. It is mostly our naivety and inexperience that turn a system into a tangled, hard-to-change mess. Keep an eye on what is coupled to what, and have measures in place to remedy or mitigate the impact of a change.
Avoid Over-Engineering
Over-engineering is the habit of solving a problem with more engineering than it needs. In other words, it is when a simpler approach would have solved the problem, but a more complex one is preferred instead.
This is particularly true of code generation tools. There is a lot of value in generating boilerplate code, don’t get me wrong. But this is usually what an IDE is for. In my experience, there have been very few times in which I have felt the burden of writing a class by hand. True, it would have been nice to be able to autogenerate code from a spec or something at that moment, but that feeling quickly evaporates when I remember that autogenerated code is often opinionated, ugly and outdated.
As a side note, I think that abstraction over code generation is a better approach to solving the problem of writing less code. Metaprogramming is an excellent approach for these kinds of problems. Although one can argue that writing code that leverages metaprogramming is inherently complex, the result ends up being a simpler API surface. So yeah, it is hard to do, but easy to use! I would say that is exactly the goal. Code generation tooling is hard to write, and not as easy to use (both the tool to generate the code and the generated code itself).
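As a rough illustration of what I mean (a sketch only; the model helper and the field names are made up for this example), a few lines of Python metaprogramming can replace a whole code-generation step for trivial models:

```python
from dataclasses import make_dataclass


def model(name: str, **fields: type):
    # The metaprogramming lives here once; callers just get a plain class
    # with __init__, __repr__ and __eq__ for free.
    return make_dataclass(name, list(fields.items()))


# Tiny models become one-liners instead of a codegen pipeline:
Customer = model("Customer", id=str, email=str)
Address = model("Address", street=str, city=str)

print(Customer(id="c-42", email="jane@example.com"))
# Customer(id='c-42', email='jane@example.com')
```

Hard to write once, easy to use everywhere, and there is no generated file to regenerate or keep in sync.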
But there is a world of difference between autogenerating code for a massive API and doing it for a small one. I still remember the frustration I felt when I was forced to set up autogeneration for all the models of an integration I was building, even though many of them had two fields at most! It took me more time to set up all the autogeneration fluff than it would have taken to write the models by hand. But well, the policy in that project was to do it that way.
When there is a simpler way of achieving the same result, go that route. Don’t try to get clever just for the sake of it. Yeah, it sounds pretty cool to write your own routing library for this project, but do you really need to?
Testing is another source of over-engineering. People build stateful mock servers because, apparently, they need to test third-party APIs over the HTTP protocol, as if someone had not tested that already. In-memory mocks are much simpler and don’t force you to set up extra dependencies every time you want to test your application.
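For instance, something along these lines (a sketch; PaymentGateway and the test are hypothetical, not taken from a real codebase) gives you everything the stateful mock server gives you, minus the server:

```python
class PaymentGateway:
    """Interface of the real client, which would wrap HTTP calls to the provider."""

    def charge(self, amount_cents: int, token: str) -> str:
        raise NotImplementedError


class InMemoryPaymentGateway(PaymentGateway):
    """Test double: no mock HTTP server, no ports, no fixtures to keep running."""

    def __init__(self):
        self.charges = []

    def charge(self, amount_cents: int, token: str) -> str:
        self.charges.append((amount_cents, token))
        return f"charge-{len(self.charges)}"


def test_checkout_records_a_charge():
    gateway = InMemoryPaymentGateway()
    charge_id = gateway.charge(1999, "tok_visa")
    assert charge_id == "charge-1"
    assert gateway.charges == [(1999, "tok_visa")]
```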
Simplicity is a rare jewel these days, not just because it is hard to find, but because it is easily mistaken for its shallow cousin: convenience. Simplicity tends to be found where pragmatism trumps dogmatism. Make sure you stay focused on practical matters when solving a problem, instead of over-theorising and justifying complex approaches with remote possibilities.
Avoid Centralising
Organisations and systems evolve and grow more and more complex. This is natural and expected. What is not natural or expected is that we sometimes try to tame that growing complexity (that is, make it easier to manage) with some form of centralised solution.
The problem with centralised solutions, especially in distributed systems or companies, is that in claiming a global benefit they cause specific harm. Centralised solutions remove autonomy and create friction. They work by moving control of something away from the subsystem or department where its function is defined, takes place and evolves, toward some central place that blurs the particulars. This dramatically impacts the further development of the thing whose control has been surrendered.
Take software documentation, for instance. Companies that choose to centralise documentation in a single place (a company wiki or something similar) usually justify it with a narrative about the benefit of having all knowledge in one conveniently accessible place, but at the cost of the writing of the documentation itself. Often, because of the distance between the owner of the documentation (the code repository) and the central wiki, the documentation ends up completely outdated and seldom used.
There are other ways in which centralising is extremely dangerous. The central part becomes critical, a potential single point of failure or trouble. This happens with a central database shared by many microservices, or with a central service that every business activity has to check with before anything else can happen (think of centralised access control platforms).
I think that as long as this central thing is disposable or invisible, then it is a good central thing. In other words, the central thing should not be the source of truth, because the truth originates somewhere else. Think about Git, for instance. It has a central repository that everyone uses to coordinate, but the changes and the work happen on your local machine and are then pushed, so at least one person always has an up-to-date copy of what is in the central repository. Git is a distributed system by nature and design, not a centralised one.
Same thing with documentation. If we need a place to solve the problem of information visibility, then let us solve only that problem. What prevents us from having an automated process push a repository’s documentation to this central wiki, instead of writing it in a separate place? The problem is not the writing of the documentation, so why does that have to change? The real problem is its visibility: let’s make it visible without removing ownership (more about this in the next topic).
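As a sketch of what that automated process could look like (the wiki endpoint and repository name below are invented for illustration; any wiki with an HTTP API would do), a small script run by CI on every merge is usually enough:

```python
import pathlib

import requests  # assumes the requests package is installed in the CI image

WIKI_URL = "https://wiki.example.com/api/pages"  # hypothetical wiki API
REPO_NAME = "orders-service"                     # hypothetical repository name


def publish_docs() -> None:
    # Docs are written and reviewed next to the code; the wiki is only a
    # read-only mirror that gets overwritten on every merge.
    for doc in pathlib.Path("docs").glob("*.md"):
        requests.put(
            f"{WIKI_URL}/{REPO_NAME}/{doc.stem}",
            data=doc.read_text(encoding="utf-8"),
            headers={"Content-Type": "text/markdown"},
            timeout=10,
        )


if __name__ == "__main__":
    publish_docs()
```

The wiki gains visibility, the repository keeps ownership, and nobody has to remember to keep two copies in sync by hand.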
When centralising something, make it work as a repository. Not as the source of truth (that is, the place where change happens), but as the place where all the changes made elsewhere end up. Every technology that has any sort of centralised repository works this way: docker, composer, npm, git. Changes happen next to their source, and then they are published for the rest of the world.
If you make the repository the place where things change, you are going to have a hard time keeping it in sync with the thing that actually changes. Syncing is an added pain that you don’t need to bear. It is incidental complexity.
Avoid Solving Pseudo-Problems
I call pseudo-problems those problems that are not clearly stated. You see, most of what people call problems are really just a preferred approach to some small friction, one that usually touches the real problem but does not address it completely.
Take this centralised wiki as an example. If the problem is access to the information, then solve just that problem. If you force everyone to write documentation far from the system it documents, you are going to cause another problem.
Problems need to be clearly stated before they are solved. Otherwise, you might be solving the wrong problem, or even worse, apparently solving the problem while causing others in the process.
This is too familiar a story, and it is such a painful one. An organisation has a monolith that is coupled, hard to change, buggy, and hard to make sense of: all the possible bad things that can happen with a monolithic system. Then someone comes along and says “We need to split this into microservices”, and proceeds to break the components of that monolith into different REST APIs. Say it was an e-commerce application. Now we have an orders service, a customer service, a fulfilment service, a payment service, and so on.
Nice! Every microservice’s responsibility is now clearly delineated. I would say that is a win, right?
Well, if you think about it, maybe not. What was the initial problem? Were the responsibilities of the monolith not clearly defined? Maybe. Was it the main problem? I don’t think so.
When software is hard to change, it is usually due to coupling. Back to point one here. Did they solve the coupling? I don’t think so. Now the services are coupled via a network protocol (rather than an in-memory routine), changes are less transparent (different teams work on different services now), and changing one service impacts the others in ways that are harder to spot (now we rely more on logging). I would argue this left them in a worse place, since managing changes across teams and distributed deployable units is harder than in a traditional in-memory application.
The solution should have been refactoring the initial application slowly to take care of the technical debt that made it hard to change and reason about.
This happens way too often and with many things. People throw tech at problems, not engineering. As a result, they think they solved a problem when all they did was go around it and create some more in the process.
The only way I’ve found to fight this insanity is to refuse to solve a problem until it has been clearly stated and demonstrated to be a problem, and why. Only then can a reasonable discussion about potential solutions and their benefits and drawbacks happen. Solving the wrong problem is one of the worst sources of incidental complexity.