The Chilean Nerd

On Scaling Software Companies

Matías Navarro-Carter — Wed, 15 Nov 2023 18:02:31 GMT

There are many definitions of scaling around there, but I like this very simple one:

Scaling a business means setting the stage to enable and support growth.

The goal is growth (who doesn't want their business to grow?), however, the key idea here is that growth is something that needs enablement and support. The process of creating those supporting and enabling structures is called scaling.

In most traditional businesses, scaling is a linear equation: to get more output (growth) we need more input (people, raw materials, machines, refined processes, etc). In other words, growth is achieved through having more people and assets for production, and this relationship is directly proportional: more growth, means you need more people and assets.

This is obvious, for a business like McDonald's to grow, they need more restaurants in crowded areas, with proper machinery and the proper staff. That costs money, but makes more money than it costs: revenue is what matters.

The problem is that software companies are not your traditional company. They are a weird mix between a service and a product company. Software is, in some extraordinary ways, both tangible and intangible. The source code is tangible and it is produced once, but the result of it (the built software), makes it more like a service since it can be replicated and sold multiple times in a SaaS manner, for instance.

For these kinds of companies, scaling in linear terms is usually the reason for their ruin. If you are a CEO, a CFO, or a CTO of a Software company and think that to double your revenue you need to double your engineering team size, you are making the biggest mistake of your life, and it will cost you that company. This is because the engineering departments and the costs associated with it are probably the biggest hit to your revenue. Engineers are expensive (especially if you want good ones), and computing is expensive (especially if you have poorly designed software). You can't scale engineering linearly.

The best way of scaling a software company is exponentially: with an extra one or two engineers you can deliver 2, 5, or 10 times as much. This is not because you are suddenly hiring 10x developers, but because you have mastered the art of making what seemed complicated, simple; by means of continuous improvement.

A story of two platforms

I used to work for a company that was in the business of making integrations between different lenders and merchants. Version one of the platform was a monolithic application that had grown super complex over time due to poor coding standards, lack of testing, coupling, and other things. Now, the core idea of that platform was really good. It served as a hub where all the providers were implemented and setting up a connection between them usually involved copying a few classes and implementing a few interfaces, plus configuring some parameters like API keys and others. The tricky bit happened when something custom needed to be made for a particular integration and that wasn't supported by the abstractions that powered the model. But I would say that to overcome those difficulties it just needed a big refactor or probably a rewrite taking into account the new domain knowledge gained, but following the same underlying idea. The platform took you 90% there, you just needed that extra 10%.

Version two of the product was a suite of REST-powered microservices coordinated by a workflow engine. This was built by a team that was brought after some big round of investment. Now, every integration had a unique workflow, with unique steps and a unique UI. Technically, it solved the missing 10% problem of customization, but at a big price. Because everything was their own service, we spent ages writing glue code, API clients, and investing in expensive and complex acceptance test suites to make sure we had wired all the parts of an integration correctly, and that errors were handled properly and propagated accordingly. If that is hard enough to do in a monolith, imagine doing it in a distributed one. Distributing our services might have solved some issues, but it made delivery 10 times slower. To integrate something wasn't copying some classes anymore, but defining a workflow (it was complex enough!), its suite of services, deploying it, writing acceptance tests, and setting up observability, all this across 4 environments with multiple codebases.

Looking back, we should have gone the route that allowed us to do more with less effort. Sometimes that route covers 90% of the use case, and clever engineering is needed for the rest 10%. But I felt we gave up the 90% for a tenner.

The commercial results spoke for themselves. With its defects, the first platform still yielded most of the company revenue. The new platform had only one successful project that took almost two years and didn't provide any substantial increase in revenue, and that could have been done in the old platform anyway. It failed to deliver a big project due to overcomplexity caused by overcustomization. All this with an engineering department twice as big. Duplicated head count didn't yield the expected revenue growth, because someone got the vision and the strategy wrong.

The most important metric

This is why, as a CTO, your most important metric is delivery speed. How fast I can go from idea to production, is the most important thing.

Now, some CTOs know this, but they misread most important for only important. What I mean is that they focus only on delivery speed, but they never achieve it, because delivery speed is affected by a myriad of things that they are not worrying about.

Poor Developer Experience is probably the biggest factor. If you have a system that is hard to understand, hard to test, and hard to change, where to do something simple you need to go and figure out where Lt. Bello was lost, then your delivery will be slow. Or if developers spend a lot of time writing boilerplate code that could be automated, that's another time waste. In short, the easier it is for developers to do their job, the better. Investing in tools to aid with setting up environments, automating trivial things, and generating boilerplate is the best investment of time you can make apart from building actual features.

If your codebase is frustrating to work with, you'll have high rates of engineer turnover, which in turn will affect your delivery speed. Reduce incidental complexity. Strive to make things as simple as they can be, and don't let yourself be fooled by shiny trendy stuff. I've seen the SPA + Microservices revolution, and all the complexity it brought, and now we are going the way back. Just make sure that what you are doing works for you and your product and that you are always solving the right problems.

Testing is another area. You need to make sure your test suite runs as fast and reliable as possible. Slow and flaky tests are extremely bad for your delivery speed. And when something breaks, the why should be extremely obvious, and not some weird stack trace full of noise that no one can make sense of. You don't need a massive QA team if you develop a product that is easier to test. Extend your testing frameworks with custom stuff so you can write more tests with less code. For instance, this test in one of my projects uses a custom PHPUnit extension that resets the database when the SetupDatabase attribute is present. It also uses custom assertions for testing APIs.

💡

Note how we never assert the contents of the response, but its structure via JSON schema. This is a more robust test.

Sometimes, complexity makes its way in and there is no way back. But then investment in onboarding and training should follow suit. Make sure you take proper time to explain how your system works to new joiners, why things work the way they do, and who knows, maybe they will have one idea or two to improve things. And give them the time to do so!

Code integration, review, and deployment should be as automated as possible through a CI/CD pipeline. Make sure it runs fast and makes extensive use of caching. If it needs to be super custom and vendor-dependent, so be it. Is better that it runs fast on one vendor than slow on any vendor.

But above everything else, be a problem seeker and solver. Always ask yourself the question: "What is in the way of my team delivering faster, with less effort?". That should be the question in every refinement too: "How can we do more with less?". That's the key.

A theatrical analogy

There is probably nothing more irritating to CEOs than spending time on things that do not deliver product features. The words migration or refactoring terrorize them to no end. And to some degree, I understand. After all, you are not delivering any features while your competitors are. But sometimes, to keep on moving, you have to take a few moments to catch your breath.

Let's go back to the definition of scaling given at the beginning.

Scaling a business means setting the stage to enable and support growth.

I have the privilege of going to London often for work and every time I go I try to squeeze in a visit to the theatre to watch a musical. I've watched Hamilton, Come From Away, and a few others (thank you lottery tickets!).

Can you imagine a musical that does not have a stage? Would it be the same? Would it cause the same impact? Would it convey the same meaning and emotions? I don't think so. Every musical needs a stage. Setting the stage is an integral part of the musical's success, although that does not happen during the musical show itself. While they are setting the stage, they are not getting paid. Is when people go to see the result of that effort that the revenue comes.

In the same way, investing in refining to the maximum your delivery is integral to the success of your software company and product. It will take time away from product work, but it will pay off. A good, smooth, delivery process will always give you the upper hand against any competition you might have.

If you can do more than your competition with less (less costs, less time, less effort) then you probably have 50% of the battle won. The other half is just good ideas (Product), customer relations (Sales), putting yourself out there (Marketing), and good talent to bring into those (HR). But if your engineering cannot do more than your competition with less, then it does not matter how much you pour into the other areas: your company will stagnate sooner or later because you won't be able to deliver.

The Industrial Revolution

Scaling is about solving problems that enable growth, not just growing. If you solve the problems, the growth will come. Let me illustrate this with another example.

When I was at University, one of my favorite topics was the Industrial Revolution and its causes. I was so interested in it that I wrote a few articles about it for the Uni's history magazine, most of them expounding the thesis of an author whose name and book I've long forgotten - I tried to google a bit to see if I could find the book, but I couldn't.

if memory serves me, in this book, this author noticed that all the prime materials and knowledge available for the revolution were in Great Britain's possession at least 400 years before the Industrial Revolution took place. They already produced iron and had people who knew how to work it, they had coal and they could burn it in a controlled fashion, and most certainly had some experience with pumping systems and mechanical concepts like cranks, con rods, rotors other things (although all these systems were human or animal powered).

The author's thesis is that the revolution didn't start earlier not for a lack of materials, or knowledge or skill, but rather because demand wasn't high enough. Once demand outpassed supply, highly technical people started to press on to the objective of how to produce at a higher rate without employing twice as many people so they could meet the demand. They were trying to figure out how to deliver more, with less. That's why, once the main idea was developed, they improved upon and applied the steam machine concept to every possible thing under the sun: mills, water pumps, transport, and textiles.

This makes the Industrial Revolution a technical victory. It wasn't triggered by scientific people, but by skilled people who were tradesmen by profession. Thomas Newcomen, who improved Thomas Savery's steam machine, was a metal worker, and an itinerant preacher.

No one can deny that the Industrial Revolution has been the greatest scaling phenomenon the world has ever seen. At the core of it was the need to develop a production system that would enable the production of more things with fewer people, which would allow us to move faster, with less effort. That concept is key to innovation, and innovation is key to scaling and productivity.

https://youtu.be/xuCn8ux2gbs?feature=shared&t=890

Bottom line

The bottom line of all of this is surprisingly simple and repetitive. It all comes back to Agile. Move forward, look back, improve, repeat. If you keep on refining and improving, eliminating what makes you slow and unproductive, then future iterations will be more efficient and you will be able to accomplish more with less. If you see something that slows you down or blocks your team, even if that thing is the very product you are selling, stop and improve it. Spend some time setting the stage for what is to come. Build your steam machine so you can place it in your factory. It will pay off. It has always paid off.

To Array is Human

Matías Navarro-Carter — Wed, 27 Sep 2023 11:43:35 GMT

I was just browsing LinkedIn this morning (because yeah, that's what I do on holidays apparently) and I got to this post, for which the author got a lot of undeserved criticism.

I tried to jump in defense of the author, mainly because he was trying to explain a pattern that is valid in a variety of circumstances, and people were trying to criticize it by pointing at all the circumstances where the pattern is not valid. Following the metaphor of tooling in software, this is analogous to say "Don't use a hammer, because you can't turn screws with it".

One user in particular went on to write a blog post about this "anti-pattern". So well, because I also like to write and I'm in the mood to rant a little (I had too much relaxation time these holidays already) I decided to compose a response to that blog post. If you want full context, you might want to read both the LinkedIn post and the blog in response.

Identifying the Real Issue

The blog post is extensive, but the main ideas are that, quote "using array as a type hint is (usually) an anti-pattern". Says the main problem with using arrays as type hints, quote "is that arrays are a very broad type in PHP. They represent multiple, disparate data structures", and because of this, when you type hint array to a return function, quote "you never know what you are going to get".

Before anything else, a Senior Developer must clarify concepts and point at exactly the root of disagreement. Otherwise, we are doing a disservice to our profession and the people who heed our advice. So in light of that, I must ask. Is anything of what he said about arrays wrong? Not at all. All those things I quoted there from him are facts of life. They are not wrong.

The problem though, is that the issue discussed was not about type-hinting as arrays in general but type-hinting arrays in a very specific context and set of circumstances.

I can almost see the reasoning behind the critic's mind: "Arrays are bad, we shouldn't use them. This uses arrays, therefore is wrong". But an experienced person would say "Okay, I know arrays are bad for these reasons. Given those reasons, is this particular use of array as type-hint bad? Why? What are the pros and cons?" But there was none of that. We'll talk about why that happens in the second section.

The author explains, correctly I must say, that arrays are bad because they are a bag of states that can hold pretty much anything. Now, that something can hold anything is not an issue per se, but rather could become an issue (and most certainly will) when you pass-around that array to multiple functions and calls. This is the real problem with arrays. That's when is hard to keep track of the keys that may or may not be in your array, and the mutations that could happen along the call stack. This is the reason why this becomes a debugging nightmare when something fails. DTOs are much more appropriate structures to be passed-around application layers.

However, utility functions like the one demonstrated in the LinkedIn post don't fall into this category. Let's look at one of the examples I like the most. This beautiful, simple, and elegant function to cut a string in two parts.

/** * Cuts a string in two by the first occurrence of substring. * * The substring is not included in the result * * @return array{0: string, 1: string, 2: bool} * * @psalm-pure */function str_cut(string $string, string $substring): array{    $len = \strlen($substring);    $i = \strpos($string, $substring);    if (!\is_int($i)) {        return [$string, '', false];    }    return [        \substr($string, 0, $i),        \substr($string, $i + $len),        true,    ];}$a = 'foo=bar';$b = 'foo';[$left, $right, $ok] = str_cut($a, '=');// [0: 'foo', 1: 'bar', 2: true][$left, $right, $ok] = str_cut($b, '=');// [0: 'foo', 1: '', 2: false]

The function's return argument is type-hinted as an array. But, does this use of array as a type-hint present the problems previously mentioned with arrays?

No, for two simple reasons:

The function is pure. This means it has no side effects and given the same input, it will return the same output. In other words, the array is created inside the function, consistently. There is no possibility that this array can be in a state that we don't mean it to be.
The array is destructured immediately (it's the purpose of this API), so it cannot be passed-around. So the array type hint (which is not a problem in this case because it is stable) does not propagate to the rest of the codebase. It is meant to be used destructured. If you have ever used React (const [state, setState] = useState()) then you understand this concept very well.

Some of the pros of this approach:

You can name your variables however you like. This immediately tells other developers in the code what this value contains. So there is no ambiguity about the contents of the value.
You can benefit from type checkers like Psalm or PHPStan. Even PHP Storm will help you with this syntax. So it's pretty hard to get wrong.

Some of the cons:

If you don't have a modern IDE, you might need to look at the function documentation to be able to use it and see what you will get in return, otherwise, there is a chance you'll get it wrong. But if this is your case, I think you have bigger problems anyway.

Now, I'm not saying this is The Only Right Way a function like this could be implemented. You could use an associative array with named keys or even a DTO. If you want to do that, go ahead. If I were reviewing a PR with code like that, I would certainly not fail it. But I'd probably recommend against both of those approaches because they result in more verbose client code and have slightly worse (so small that it doesn't matter much) performance.

The Real Issue: Dogmatism

The real issue here is dogmatism. I define dogmatism as the failure to consider the context when applying a rule, a principle, or our own experience. In this case, there is a well-known rule in the PHP world, that comes from experience, that arrays are tricky data structures to work with, and if you litter your methods and functions with them and use them all around, you are most certainly going to have issues. I find it hard to find people nowadays who wouldn't prefer a good old typed DTO instead of an array as a data transfer mechanism between application layers.

The issue is when we take this knowledge, and make it an absolute rule with no context: "You shouldn't use arrays in your return methods". I believe this is the "tweet" mentality in action. In every programming debate, context is key. 99% of the answers to all engineering questions are "It depends". Most of our programming wars would be much more educational for the people watching them if we just consider the context instead of assuming the other is just plain wrong.

In this case, the context was completely ignored. People jumped in masses to attack someone for doing something perfectly fine, with little to zero drawbacks whatsoever. You wouldn't do it the way it was shown on the LinkedIn post? Good for you, but preference doesn't invalidate other approaches. If you want to say something is wrong or an "anti-pattern", you must provide evidence that applies to the particular use case: something that was completely missing from this discussion.

When you hear someone saying, "You should/shouldn't do this" the most important two questions you should ask are, first: "Why?" and once you have understood the reasons behind it, the second question is even more important: "Do the reasons apply in this particular case?".

I mean, that's pretty much all the wisdom you need to navigate the turbid waters of engineering discussions.

A CRUD Reality

Matías Navarro-Carter — Thu, 03 Aug 2023 11:07:07 GMT

The Nasty U in CRUD

Before anything else, let's just do a quick recap on what CRUD is. CRUD stands for "Create-Read-Update-Delete" which are, in turn, the four basic operations of data storage. It's usually used as an adjective to describe a system. For instance, I often apply this acronym to describe JSON APIs which are built with frameworks that enable all four operations with little to zero boilerplate code. These usually take your input as JSON, validate it, transform it, and apply it to the database with little effort. All you have to do is to define your models, extend a particular controller, write a bit of validation and you are good to go.

The fundamental issue with this approach is that is just merely providing a JSON over HTTP interface on top of a system of record. Granted, it has some authorization and validation logic added on top, but you are still just serializing JSON into the database and back.

This is usually enough for quick prototyping or in the early stages of a system. But for a mature application, with a complex and always-evolvable domain, this approach falls short pretty quickly, and all the speed you seemed to gain by using tools that do most of the heavy lifting for you, vanishes after you try to work around the architectural limitations of such tools.

In my own opinion, a domain starts to be complex when just having an interface to store and retrieve data from the database is not enough to fulfill customers' needs. Sometimes other things apart from updating the resource in question need to happen to fulfill a requirement. In other words, the moment we start shipping features whose wording is along the lines of "when x happens in the system, x, y, and z should happen too". The more you have of these kinds of requirements, the more complex.

This is the bottom line of the issue with CRUD. Creating, reading, and deleting resources are quite straightforward operations (actually, deleting is not so much but that could be another blog post!). However, updating a resource is much more than just replacing some fields in a table. "Account updated" is not too descriptive. We need to know what exactly has changed in the account, so just saying "something was updated" is not enough. We need to be able to say "a user was deactivated", "a ticket has been archived", "an order has been shipped" or "a refund has been processed". We need to start thinking in terms of what is called "business actions" and their effects. An update is not a business action, is just something that says "Hey! Some info about this entity was changed. I don't know exactly what so go figure yourself!" We need to build this log of business actions and their effects along with the operation. We need what is often referred to as Domain Events.

Substandard Solutions

When an application reaches this point and has not been architected from the beginning to support Domain Events as a first-class citizen of the domain model, two solutions are usually implemented that in my opinion are substandard, because they don't attack the root of the problem.

Change Data Capture

The first common solution is implemented at the database level and is called Change Data Capture (CDC). The strategy consists in using the replication protocols of the data store in question to receive state changes in real time to some other process (think Meroxa or Debezium here). That process then serializes the binary information received into a more universal format (JSON) and puts it in an event stream powered by Kafka or similar technologies. The result looks a bit like this (this is taken from Meroxa PGSQL connector)

{    "schema": {        "type": "struct",        "fields": [            {         "type": "struct",         "fields": [             {                 "type": "int32",                 "optional": false,                 "field": "id"             },            ...         ],         "optional": true,         "field": "before"     }        ],        "optional": false,        "name": "resource_217"    },    "payload": {        "before": {            "id": 11,            "email": "ec@example.com",            "name": "Nell Abbott",            "birthday": "12/21/1959",            "createdAt": 1618255874536,            "updatedAt": 1618255874537        },        "after": {            "id": 11,            "email": "nell-abbott@example.com",            "name": "Nell Abbott",            "birthday": "12/21/1959",            "createdAt": 1618255874536,            "updatedAt": 1618255874537        },        "source": {            "version": "1.2.5.Final",            "connector": "postgresql",            "name": "resource-217",            "ts_ms": 1618255875129,            "snapshot": "false",            "db": "my_database",            "schema": "public",            "table": "User",            "txId": 8355,            "lsn": 478419097256        },        "op": "u",        "ts_ms": 1618255875392    }}

The only good thing about the approach is that it doesn't need any modifications to the application source code, but that's as far as the benefits go. You still need to process the diff between the changes in your resources to figure out what was the actual domain event that occurred and extract some meaning out of it. For instance, in the example above, we can say that the email of this user was updated. Yes, you can glance at that and figure it out, but building an algorithm to do that and other changes for every entity in your system it's no easy feat.

I'm getting ahead of myself here but just imagine for a second what would be the ideal JSON you would like to work with here. I don't know about you, but I would love something along these lines (this is the actual structure from the events of one of my projects):

{    "event_id": 232,    "event_name": "user_email_changed",    "payload": {        "user_id": 11        "email": "nell-abbott@example.com"    },    "ocurred_at": 1618255874537,    "metadata": {        "performer": {            "type": "user",            "id": 11        }    }}

Auxiliary History / Log Tables

This is another approach that is a bit suboptimal. Sometimes, in domains where historical information and audibility are essential, the need arises for some history log features. For instance, in a project management system, we want to be able to see a history tab with all the actions that have occurred on a certain ticket, when they happened, and who performed them.

What ends up happening is that some evaluator is built on top of the update operation. It analyses what fields have changed and creates a record of the change in a history table. This overcomplicated code is usually inserted in some random places after some operation has been performed. Because of this, the operation is oftentimes not inside the transactional boundary (if any!), meaning that there might be a possibility that the update is performed but the log is not. Lastly, the saddest thing is that this is a missed opportunity: is something built ad-hoc just for that entity rather than a feature of the system architecture as a whole.

Enter the Transactional Outbox Pattern

First of all, I don't like the name of Transactional Outbox Pattern. I very much prefer to call it a Transactional Event Log. Yes, the purpose of it is messaging, but also auditing.

The first requirement of the pattern is that your application stops thinking about state updates as just mere updates, and starts moving to the notion of commands. A command is an action (in imperative) that is being performed in a system. As a result of that action, the state of the system will change. Examples of commands include:

Deactivate Customer
Archive Ticket
Publish Reply
Process Shipment
Issue Refund

So whatever interface you have in the front of your application (JSON API, CLI, etc). You need to make sure you are not receiving CRUD resource representations from your clients, but rather commands. If you take a look at specifications like Smithy from AWS, you know what I'm talking about.

Once you have the command, then your corresponding service will handle it. The service will perform the necessary state updates into the relevant table, but it will also add a record in the events table about the action that has been performed. This ensures both the record and the event are persisted atomically. This is an important detail of the pattern and hence the name "Transactional": consistency is key.

💡

Something I like to do along with persisting the state update and the events is persisting all the listeners that should run for that event (each listener in a row of its own), so another process can run them asynchronously. This effectively is like job processing done in a very particular way, but it's super powerful.

You end up with a table that contains the log of events that have occurred in your system, and then the interested parties can consume those events via a REST endpoint, directly from the database or from a message queue. The consuming mechanism is pretty much your choice.

One important aspect of the events is that they carry the state relevant to the event that has occurred. As I showed in the example ideal JSON for an event a few paragraphs above, the email has changed: we include the name of the email and the user id for the email that changed. This is what Martin Fowler calls Event-Carried State Transfer: the relevant state is included in the event.

The benefit of this pattern is that now you have a log of all (and not just some) the events that have happened in your system. The log is persistent and immutable: you can read it anytime you want and share it with other systems using whatever mechanism you prefer, and other systems just need to keep track of the last event they have processed. The log is consistent: you'll never have an event that does not have its corresponding state update. And lastly, the log is really useful as a means of auditing tool or to build history-like auditing features.

A PHP Implementation

I usually implement this in PHP using Doctrine ORM and Symfony Serializer. First, let's define an interface that your application services will use:

interface EventNotifier{    public function notify(Context $ctx, object $event): void;}

Notice how the events can only be objects. Also, using a context is very useful to not pollute the method with contextual information that may or may not be there.

Then, let's suppose our application service uses it this way:

/** * This is the event class */ class EmailChangedEvent{    public function __construct(        public readonly int $userId,        public readonly string $email        ) { }}/** * This is the command class. This class holds the data that we need * to fullfil this business action. */ class ChangeEmailCommand{    public function __construct(        public readonly int $userId,        public readonly string $email        ) { }}/** * This is the command handler class.  * This class deals with the business action and notifies the events. */ class ChangeEmailHandler{    public function __construct(        private readonly UserRepository $users,        private readonly EventNotifier $events,        ) { }    public function __invoke(Context $ctx, ChangeEmailCommand $cmd): void    {        // Some data transformation and validation        $canonicalizedEmail = Canonical::email($cmd->email);        Assert::email($canonicalizedEmail);        $user = $this->users->ofId($ctx, $cmd->userId);        // The model produces the event based on the state change        $evt = $user->changeEmail($canonicalizedEmail);        // Here the event is notified        $this->events->notify($ctx, $evt);        // The new state of user is saved        $this->users->add($ctx, $user);    }}

Both UserRepository and EventNotifier are abstractions behind an interface. Their implementation uses Doctrine under the hood, and a command bus middleware wraps this operation inside a transaction, so both the state change and the event get persisted atomically. UserRepository ofId and add methods are wrappers of findOneById and persist respectively. But let's look at the more interesting implementation of the EventNotifier with its corresponding Entity.

#[ORM\Entity]class Event{    #[ORM\Id, ORM\Column(type: "integer")]     public int $id = 0    public function __construct(        #[ORM\Column(type: "string")]        public string $name,        #[ORM\Column(type: "json")]        public array $payload,        #[ORM\Column(type: "json")]        public array $meta,        #[ORM\Column(type: "datetime_immutable")]        public DateTimeImmutable $ocurredAt = new DateTimeImmutable(),    ) { }}class PersistentEventNotifier implements EventNotifier{    public function __construct(        private readonly EntityManager $manager,        private readonly NormalizerInterface $normalizer,        private readonly EventNamer $eventNamer,    ) { }    public function notify(Context $ctx, object $event): void    {        $ormEvent = new Event(            $this->eventNamer->getName($event),            $this->normalizer->normalize($event),            [                // We store the class as metadata for denormalizing the event                'php.class' => get_class($event),                // We store other metadata we are interested in                'performer' => $ctx->value('performer');            ]        );        // Event is persisted        $this->manager->persist($ormEvent);    }}

The end result of this is a database table along these lines:

It's trivial to expose this table under a REST endpoint. In fact, I do this in one of my personal projects. Other systems that are interested in my system events can call the endpoint at their convenience to get everything that has occurred in the system. They just need to remember the last event id they have processed, and then keep on consuming from there using ?from= .

There is also a published_at column for publishing the event in a message queue or a streaming platform, which is something you can do too if you wish to push these events in real-time to consumers. You can fire these events on an SSE endpoint too. I mean, there are many, many possibilities.

A Warning on Eventual Consistency

Of course, any operations you perform after the event persisted are bound to be eventually consistent. This means, because they will be asynchronous in nature they will happen, eventually, but not atomically. So if some client or consumer is waiting for that state change somewhere else, that email or that notification, the worst-case scenario may take a while to be processed. But the good this is that it will because you have an immutable log of events.

However, this is when having an event-driven system pays off because clients who are waiting for something to happen can listen to these events and react to them when they do. UI notifications using SSE are a really good pattern to mitigate the effects of eventual consistency.

A Warning on Table Size

Another drawback is the potential table size and its impact on application performance. Let me offer three pieces of advice to deal with these issues.

First, your primary key type must be an 8-byte integer, this is bigint in MySQL and bigserial in Postgres. If unsigned, you have exactly 18,446,744,073,709,551,615 max capacity for records. You are never going to use that. Even if your primary key is just 4 bytes, you have 4,294,967,295 records until your max which is quite a lot. You'll probably run out of machine storage before hitting that max.

Secondly, we all know that as a table grows, full scans are very costly. The key here is that you should never do a full scan of this table, nor correct old records with update operations. Any derived state you need from this table should be computed by doing a sequential id scan and tracking the last id processed. This is a table that is meant to be processed, not queried. If you need to query, using a process that scans the table you can denormalize its information to provide a queryable representation somewhere else (this is a process known as projection in the Event Sourcing jargon). Also, it helps if you want to index the event name and the occurred_at fields, to provide you with some leeway for querying.

Third, if for some reason you do need to query and update your records (although I advise against this), you can always resort to archiving old records, especially if you have a system that is historical in nature, with "closing" cycles like accounting or legal. In those systems, some information is never touched after a while but still needs to be kept. So you can grab a bunch of it and archive it in a secondary form of persistent storage.

Conclusion

When an application is born it usually persists state changes directly to the database and the previous state is completely discarded. However, as the application grows in complexity is convenient to have a mechanism to react to those state changes in other parts of the system or in other systems. This will avoid too much coupling in the application, since in an event-driven architecture, systems need not know who sent a particular event: they just need the event.

Implementing the Transactional Outbox Pattern solves this problem very elegantly. It integrates well with the application code and doesn't suffer from the consistency problems of sending events or jobs directly to a queue. Plus, this event log you get can be used for a multitude of purposes like auditing, event streaming, and UI notifications.

The Repository Pattern Done Right

Matías Navarro-Carter — Fri, 07 Jul 2023 13:27:47 GMT

💡

This is an old article that I wrote a few years ago in one of my many blogs and got lost somewhere until someone mentioned it on Github. Since it has been useful to others, I decided to republish it here and give it a small update. Enjoy!

The repository pattern is one of the most well-established patterns in Domain Driven Design. Its origins can be traced as early as when Object Oriented Programming was born.

Of course, as it happens with almost every pattern or tool, you can use it terribly the first time (or even the second, or the third one). The only way to improve upon that is good literature and seeing other, more appropriate, uses of the pattern/tool. Refining your use of tools and patterns this way is, with almost all certainty, the only way to grow as a developer. Years of experience dont count much if you have been doing the same thing, the same way, over and over again.

This is why I implement and use repositories very differently now than the first time I started. This is probably because of the experience (both good and bad) that Ive accumulated over the years. Ive also read quite a lot on the topic and certainly, Im not the only one that has experienced issues implementing repositories in my applications.

So, over the years, Ive come to a definition of repositories, and is this one:

Repositories are a specific and immutable abstraction over a collection of domain objects.
~ Yours Truly

Let me tell you what I mean by that.

Warning: Active Record Users

Repositories tend to work with ORMs even though is not a requirement, its very common practice. However, not any kind of ORM can be used for working with repositories. I think a word of warning is necessary for users of Active Record ORMs (Im talking about you, Yii and Laravel users). Ive read several blog posts (like this one, or this other one) that promise an implementation of repositories the Laravel Way, which is not the repository pattern, but a poorly abstracted interface over Eloquent. Dont get me wrong: Active Record ORMs are great at what they do (which is, in my opinion, to provide an nice API over records on a database) but unfortunately, they just dont fit the requirements for the repository pattern. Dont try to use Active Record ORMs for repositories, they just dont fit the use case. If you are using Active Record, embrace the fact that you already coupled your data model to your persistence layer. If you wont take my word for it, take Jeffrey Ways.

Repositories are Abstractions

Just to continue with the thread, the main reason why Active Record ORMs dont fit the repository pattern is because repositories are abstractions, and Active Record Data Models are not. When you create a data model in Laravel, for example, you are not fetching a pure data class, but a whole lot of other stuff related to persistence, like your database connections, mutators and all sorts of stuff. All that lives in your data model, and that renders it unusable for the level of abstraction required for the repository pattern.

To be fair with the Eloquent guys, this is true of Doctrine repositories also. If you are using doctrine repositories as they are, you are not abstracting anything away. You are coupled to Doctrine, which is in turn coupled to a relational database engine. That leaves you in the same place as using Eloquent (a bit better though, because your data model is a pure data class).

In the Symfony world, its common to see something like this:

class SomeController{    public function someMethod(Request $request): Response    {        // This repository is the doctrine's library one        $repo = $this->getRepository(User::class);        $users = $repo->findAll();        return $this->json($users);    }}

If you do this, stop. You are not using a proper abstraction here. Its true: the Doctrine repository is an abstraction over the EntityManager, QueryBuilder, Connection and a bunch of other stuff, but is a doctrine-specific abstraction. You need a domain-specific abstraction. One abstraction that is only yours, your own contract.

So what we should do then? We just define an interface:

class User{    // This is your data class}interface UserRepository{    /**     * @return iterable|User[]     */    public function all(): iterable;    public function add(User $user): void;    public function remove(User $user): void;    public function ofId(string $userId): ?User; }

This is a proper abstraction. Your User class is a class that just contains data. Your UserRepository interface is your contract. You can use the Doctrine repository behind it, but it wont matter this time, because you will type hint the interface to all other classes using it. This way you effectively decouple yourself of any persistence library/engine and get an abstraction you can use all around your codebase.

Repositories are Specific

Note how the UserRepository we defined is model specific. A lot of people like to save work by creating a generic repository that becomes no more than a query abstraction over the persistence library used. Just dont do this:

interface Repository{    /**     * @return iterable|object[]     */    public function all(string $repositoryClass): iterable;}

Remember one of the principles of DDD: clear language intent. One repository interface for each model conveys more meaning to that specific repository/model than a generic one. For example, only users can be filtered by email, not buildings.

Besides with one generic repository for everything, you wont be able to type your concrete model classes to the return or argument types. Its the longer route but is the most convenient and flexible.

UPDATE: I still maintain this is not a great idea, but the typing argument is probably not as valid as it used to be since the PHP tooling ecosystem has embraced generics annotations so well that this solution is workable from a type-system point of view. Just a friendly reminder that because you can do something, it does not mean that you should.

Repositories are Collections

I would say that the Aha! moment in repositories for me is when I realized that they are just an abstraction over a collection of objects. This blew my mind and gave me a new challenge; the challenge of implementing repositories as if they were an in-memory collection.

For starters, I dumped all methods like all(), allActiveUsers() or allActiveUsersOfThisMonth(). If you have read the two famous posts about taming repositories, first the one of Anne at Easybib and then the one of Benjamin Eberlei in response, you should know that methods like that in a repository can grow wild. Also, the specification pattern is great, but it is quite complex to implement well for this particular use case: we can do better and simpler than that.

Collections APIs have many distinctive features. You can slice collections, filter them, add or remove elements from them, as well as finding specific elements. But we dont want a general collection API, remember? We want to implement a specific API for every model, so it conveys meaning.

So, our UserRepository interface could look this way:

interface UserRepository extends Countable, IteratorAggregate{    public function add(User $user): void;    public function remove(User $user): void;    public function ofId(string $userId): ?User;    public function ofEmail(string $email): ?User;    public function withActiveStatus(): self;    public function registeredAfter(DateTimeInterface $date): self;    public function registeredBefore(DateTimeInterface $date): self;    public function getIterator(): Iterator;    public function slice(int $start, int $size = 20): self;    public function count(): int;}

Pay special attention to the last three methods. These are the only methods that could potentially be in a Repository base interface, because all of them will be sliceable, countable and iterable.

interface Repository extends IteratorAggregate, Countable{    public function getIterator(): Iterator;    public function slice(int $start, int $size = 20): self;    public function count(): int;}

So by doing this, all of your repositories will be sliceable (think pagination there), iterable and countable. The idea is that you apply the filtering methods (all the methods that return self) and then iterate to execute the internal query, just like an in-memory collection. You wouldnt note the difference at all if an implementation is switched to another one.

This is good OOP. All the persistence details are completely hidden from us, the API is composable and fits our needs for a repository. It looks neat and using it is simple and easy to understand:

class SomeService{    public function __construct(UserRepository $users)    {        $this->users = $users;    }    public function someMethod()    {        $users = $this->users            ->withActiveStatus()            ->registeredBefore(new DateTime('now'))            ->registeredAfter(new DateTime('-30days'));        $count = $users->count();        return $users;    }}

But heres the question: how do we go about implementing an API like this? If you are a good observer, you might have realized that the filters return an instance of themselves, modifying the internal state of the repository. So in the next query, we will have the filters of the previous query applied, right?

Repositories are Immutable

Well, that could be right, if we really are modifying the internal state. But in reality, we are cloning the repository reference, preserving the original one not to affect subsequent queries accidentally. This is an implementation detail, but a very important one. If we change, lets say, the state of the repository reference that lives inside our DI Container, then we are done: we cannot use that reference again. So the idea is to make it immutable.

Let me show you the final API, implemented in Doctrine ORM. Im going to write some comments and doc blocks in the code explaining some things.

declare(strict_types=1);namespace RepositoryExample\Common;use Doctrine\ORM\EntityManagerInterface;use Doctrine\ORM\QueryBuilder;use Doctrine\ORM\Tools\Pagination\Paginator;use Iterator;/** * Class DoctrineORMRepository *  * This is a custom abstract Doctrine ORM repository. It is meant to be extended by * every Doctrine ORM repository existing in your project. *  * The main features and differences with the EntityRepository provided by Doctrine is * that this one implements our repository contract in an immutable way. *  */abstract class DoctrineORMRepository implements Repository{    /**     * This is Doctrine's Entity Manager. It's fine to expose it to the child class.     *      * @var EntityManagerInterface     */    protected $manager;    /**     * We don't want to expose the query builder to child classes.     * This is so we are sure the original reference is not modified.     *      * We control the query builder state by providing clones with the `query`     * method and by cloning it with the `filter` method.     *     * @var QueryBuilder     */    private $queryBuilder;    /**     * DoctrineORMRepository constructor.     * @param EntityManagerInterface $manager     * @param string $entityClass     * @param string $alias     */    public function __construct(EntityManagerInterface $manager, string $entityClass, string $alias)    {        $this->manager = $manager;        $this->queryBuilder = $this->manager->createQueryBuilder()            ->select($alias)            ->from($entityClass, $alias);    }    /**     * @inheritDoc     */    public function getIterator(): Iterator    {        yield from new Paginator($this->queryBuilder->getQuery());    }    /**     * @inheritDoc     */    public function slice(int $start, int $size = 20): Repository    {        return $this->filter(static function (QueryBuilder $qb) use ($start, $size) {            $qb->setFirstResult($start)->setMaxResults($size);        });    }    /**     * @inheritDoc     */    public function count(): int    {        $paginator = new Paginator($this->queryBuilder->getQuery());        return $paginator->count();    }    /**     * Filters the repository using the query builder     *     * It clones it and returns a new instance with the modified     * query builder, so the original reference is preserved.     *     * @param callable $filter     * @return $this     */    protected function filter(callable $filter): self    {        $cloned = clone $this;        $filter($cloned->queryBuilder);        return $cloned;    }    /**     * Returns a cloned instance of the query builder     *     * Use this to perform single result queries.     *     * @return QueryBuilder     */    protected function query(): QueryBuilder    {        return clone $this->queryBuilder;    }    /**     * We allow cloning only from this scope.     * Also, we clone the query builder always.     */    protected function __clone()    {        $this->queryBuilder = clone $this->queryBuilder;    }}

This API can be improved of course, but the main principle is the immutability of it. Note how we dont expose the QueryBuilder. This is because its dangerous: an inexperienced developer could apply filters to it and mutate the original reference, causing a massive bug. Instead, we provide two convenience methods for child classes, filter and query. The first one takes a callable which in turn takes a cloned instance of the QueryBuilder as an argument. The second one just returns a cloned QueryBuilder so the child class can query anything.

Then, we use that API in our UserRepository and implement the remaining methods.

declare(strict_types=1);namespace RepositoryExample\User;use DateTime;use Doctrine\DBAL\Types\Types;use Doctrine\ORM\EntityManagerInterface;use Doctrine\ORM\NonUniqueResultException;use Doctrine\ORM\NoResultException;use Doctrine\ORM\QueryBuilder;use DomainException;use RepositoryExample\Common\DoctrineORMRepository;/** * Class DoctrineORMUserRepository * @package RepositoryExample\User */final class DoctrineORMUserRepository extends DoctrineORMRepository implements UserRepository{    protected const ENTITY_CLASS = User::class;    protected const ALIAS = 'user';    /**     * DoctrineORMUserRepository constructor.     * @param EntityManagerInterface $manager     */    public function __construct(EntityManagerInterface $manager)    {        parent::__construct($manager, self::ENTITY_CLASS, self::ALIAS);    }    public function add(User $user): void    {        $this->manager->persist($user);        // I usually implement flushing in a Command Bus middleware.        // But you can flush immediately if you like.    }    public function remove(User $user): void    {        $this->manager->remove($user);        // I usually implement flushing in a Command Bus middleware.        // But you can flush immediately if you like.    }    public function ofId(string $id): ?User    {        $object = $this->manager->find(self::ENTITY_CLASS, $id);        if ($object instanceof User) {            return $object;        }        return null;    }    /**     * @param string $email     * @return User|null     */    public function ofEmail(string $email): ?User    {        try {            $object = $this->query()                ->where('user.email = :email')                ->setParameter('email', $email)                ->getQuery()->getSingleResult();        } catch (NoResultException $e) {            return null;        } catch (NonUniqueResultException $e) {            throw new DomainException('More than one result found');        }        return $object;    }    public function withActiveStatus(): UserRepository    {        return $this->filter(static function (QueryBuilder $qb) {            $qb->where('user.active = true');        });    }    public function registeredBefore(DateTime $time): UserRepository    {        return $this->filter(static function (QueryBuilder $qb) use ($time) {            $qb->where('user.registeredAt < :before')                ->setParameter(':before', $time, Types::DATETIME_MUTABLE);        });    }    public function registeredAfter(DateTime $time): UserRepository    {        return $this->filter(static function (QueryBuilder $qb) use ($time) {            $qb->where('user.registeredAt > :after')                ->setParameter(':after', $time, Types::DATETIME_MUTABLE);        });    }}

The result is nice to work with. Ive taken this approach in several projects so far and it feels great. The method names convey meaning and work well. Creating different implementations like a Doctrine Mongo ODM, Filesystem or In-Memory its trivial. Implementors just need to take into account the immutability aspect of it, but thats all really.

A Posteriori Advice and Warnings

If I'm writing this section now after three years, that is a good thing. First, it means that I still use this pattern all the time. Secondly, it means that I use it better than before. So, here are some nuances that I've come to consider over time.

Limit these APIs to what your Command Handlers need

I've come to exercise more reluctance to use my repository interfaces in every place in the codebase that I need to query for something. I think they are great for using them in my command handlers, because of the expressiveness and the richness of the API of my models. The whole point of them is to make state mutations as clear and close to the domain language as they can be.

However, for most read operations that will end up sending some data over the wire as JSON, I think using the repository pattern as presented here is overkill. The REST API is not a domain concern. It seems pointless to fetch primitives from a database, hydrate them into those rich domain objects (and pay the price for that in performance) to just discard all that and pass them through a serializer that will convert them into primitives again.

My preferred approach now for sending data over the wire as JSON representations is to have a single service that knows about the Entity Manager and does all that. Usually, that service turns off hydration and knows certain things like properties that should be not exposed, etc.

Maybe I do this because I've realized that querying is not a Domain concern if the results of that query are not going to be used in a business action. Again, returning records for a JSON API is not a business action, so I don't pass that through domain objects or constructs like my models and my repositories. This might be the reason why Matthias Noback considers query buses unnecessary, and I very much agree with his reasoning.

However, when you need to filter and query specific things for implementing a state mutation, is when you need all the power and expressiveness that a rich domain and repositories implemented this way give you.

Naming Interfaces in PHP, Java, et al.

Matías Navarro-Carter — Mon, 03 Apr 2023 11:00:40 GMT

I've written about this already in a previous post, but I think it deserves a post of its own. I think it is time we stop appending the word Interface to our interfaces name.

It is just completely unnecessary.

The same goes for anything similar: prepending Abstract to abstract classes or suffixing traits with the word Trait, or appending Exception to exception classes.

It makes sense for three main reasons.

First, it makes sense from an object-oriented naming point of view. It makes sense that the most abstract thing has the most "pure" name in a cluster of implementors. For instance, Symfony Serializer has SerializerInterface and the default serializer implementation is called Serializer. Doesn't make more sense to have Serializer as the interface (because that is the thing being abstracted) and then call the implementation DefaultSerializer? Notice how instead we prepend the implementation name to say something specific about the implementation instead. It makes sense, because the implementation is specific, while the interface is generic.

Second, it makes sense from a screen real-estate point of view. Again back to Symfony serializer, which implements five interfaces that have the word Interface suffixed on their name. So we are using 45 extra characters we don't need, over half of the recommended maximum characters per line (80). And again, we don't need those characters. They are just unnecessary pollution.

Third, it makes sense from a modern tooling point of view. The only possibility that makes sense for keeping this practice, is that we do it to be able to easily identify what is an interface when reading our code. The problem with the argument though is that code assistance tools are pretty good at telling us what the stuff in our code is. They even put different icons on our IDE for interfaces, abstract classes and normal classes. They also autocomplete only interfaces when we type implements and they show what a type is when we hover over the name. I understand this could have been a good reason back in the days when we wrote software with simple text editors. But not today.

I hope I have convinced you. Please help me stop this madness that has completely taken the PHP ecosystem (and others!) by storm. We don't need to do this.

Interfacing is Decoupling

Matías Navarro-Carter — Thu, 30 Mar 2023 06:15:25 GMT

The Coder's Proverbs is a series where I summarize some lessons and principles I've learned over my career by using a memorable and simple saying of wisdom.

I think this is one of the most incredible inventions on earth.

I'm not talking about the UK electrical outlet in specific, but of the electrical outlet in general. Think about how our lives would be without electrical outlets, and just by having the cables there, ready to be used. Just like this:

We would need to wire up all the appliances in our homes manually, from cable to cable. Apart from being something extremely dangerous, it would be a slow and cumbersome process. Imagine you just need to temporarily unwire a lamp to connect a vacuum cleaner. It would take loads of time! Also, you could get the wires all mixed up: connect the ground to the live and the live to the neutral, and so on. All of our devices would be coupled together to the point of being too hard to change.

The convenience of the electrical outlet is that it removes all that complexity. My appliances only need to implement an interface that conforms to the outlet (the plug) to be able to be correctly connected and be easily swappable to any of the outlets in my house. It removes the need for me to have low-level knowledge about electricity and wiring. It's just so much simpler.

Outlet Mentality

When building software, we should be thinking like the inventor of the electrical outlet at all times. But many times we don't think like that. For instance, if our program logic requires us to write a report, we immediately write it to the filesystem. We do something like this:

class ReportWriter{    public function writeReport(array $records): void    {        $resource = fopen('/some/file.path', 'wb');        foreach ($records as $i => $record) {            $line = sprintf('Record %d: %s', $i, $record['contents']);            fwrite($resource, $line);        }        fwrite($resource, PHP_EOL);        $end = sprintf('Number of records: %d', count($records));        fwrite($resouce, $end);    }}

The code above is like connecting the appliances in your house (your business logic) directly to the electrical wires (the filesystem). Here, we have high-level business logic (the writing and structuring of a report), depending on something low-level (the place where it is stored, the filesystem).

It's much better if we make an abstraction for writing (we don't care where we write), and make both our business logic and our filesystem depend on it.

// This is the abstractioninterface Writer{    public function write(string $data): int;}// The filesystem implementing the abstractionclass PhpResource implements Writer{    public static function open(string $filename): PhpResource    {        return new self(fopen($filename, 'wb'));    }    private function __construct(        private $resource    ) { }    public function write(string $data): int    {        return fwrite($this->resource, $data);    }}// The business logic using the abstractionclass ReportWriter{    public function __construct(        private readonly Writer $output,    ) { }    public function writeReport(array $records): void    {        foreach ($records as $i => $record) {            $line = sprintf('Record %d: %s', $i, $record['contents']);            $this->writer->write($line.PHP_EOL);        }        $this->writer->write(PHP_EOL);        $end = sprintf('Number of records: %d', count($records));        $this->writer->write($end.PHP_EOL);    }}// This is how you bootstrap it$resource = PhpResource::open('/some/file.path');$reportWriter = new ReportWriter($resource);

Now the high-level business logic does not need to know about the filesystem. And the low-level stuff (writing to a file in the filesystem) has also been simplified under the Writer interface. The Writer interface is our electrical outlet: is what makes it possible to connect our business logic to the filesystem without them knowing anything about each other.

This decoupling is powerful. This is what makes programs to be resilient and also easy to test. Because we don't depend on the filesystem now, while testing, we can have an in-memory Writer in which we can assert that the contents were written as intended.

This is how you decouple software components: by putting an interface in between them. Interfaces, like the electrical outlet, are one of the best inventions since Object Oriented Programming itself.

Conclusion

This is just a summary of the Dependency Inversion Principle in SOLID, which states that "High-level modules should not depend on low-level modules, but rather both should depend on abstractions". In our example above, high-level business logic (the writing of a report) was depending on the low-level filesystem operations (for writing a file). Introducing an abstraction that both parties rely on, making it possible to decouple them. Now our business logic can be freely used with any type that implements Writer.

Generally speaking, making your business code rely on abstractions (interfaces) is the best way to decouple it from other things. This is how I write my programs nowadays: I don't even start with the database or the HTTP framework, but rather, I design commands and handlers that rely only on interfaces to do the business actions I need, and then I implement them later. This ensures I focus on the things my business actions need. This approach has the nice benefit that it completely decouples your code from a framework or database library, making it more robust, easier to test and easier to change.

Just remember: interfacing is decoupling.

The Bigger the Interface, The Weaker the Abstraction

Matías Navarro-Carter — Wed, 22 Mar 2023 16:00:39 GMT

The Coder's Proverbs is a series where I summarize some lessons and principles I've learned over my career by using a memorable and simple saying of wisdom.

In previous articles, we have been talking a bit about abstractions: specifically, how to design good ones. The last article was about keeping an eye on leaking implementation details from the public API of an abstraction (the interface definition). Today, we'll talk about keeping the size of our abstractions small, so they can be used effectively.

Before I continue with this article, I must give credit where credit is due. I heard this proverb actually from Rob Pike in his Go Proverbs talk. I don't know if he heard it from someone else or not, but I heard it from him. His explanation is rather short and he claims it to be "a very go specific idea" but I think this principle is quite applicable to other languages as well.

When I'm explaining this to other developers I always point to the StreamInterface in the PSR-7 standards. It's a massive interface that breaks this idea from top to bottom. Here is a list of what this interface does:

It writes (write)
It checks it can write (isWritable)
It reads (read, getContents, __toString)
It checks it can read (isReadable)
It closes (close)
It seeks (eof, rewind, tell, seek)
It checks it can seek (isSeekable)
It exposes implementation details
1. You can get the underlying PHP resource (detach)
2. You can get the resource metadata (getMetadata)

There are several problems with this interface. First, it does too much. Sometimes we just need a thing where to write data or a thing from where to read. But now any I/O operation with this interface requires that we implement all 14 methods, even if they are not relevant to our use case. This makes composition harder because the API surface to decorate is bigger.

Also, the interface "lies" in a certain way. It says it reads (it has a method for it) but first you need to check if the interface supports reading by invoking isReadable. Same for writing and seeking. This capability checking by using methods is an anti-pattern. Types like interfaces are made to express the idea of capabilities using the type system.

And, because this interface leaks implementation details (it's pretty obvious that it is using a PHP resource) it can only be implemented in that way. We have covered this in the previous article by the way.

Breaking Down the Monster

This interface could be broken down into multiple interfaces:

interface Reader{    /**      * Reads some bytes from a source.      *      * @throws IOError if there is an error while reading      * @throws EOFError if the end of file has been reached      */    public function read(int $bytes): string;}interface Writer{    /**      * Writes some bytes to a target.      *      * @return int The number of bytes written      *      * @throws IOError if there is an error while writing      */    public function write(string $contents): int;}interface Closer{    /**      * Closes the underlying source      *      * @return void      *      * @throws IOError if there is an error while closing      */    public function close(): void;}interface Seeker{    /**      * Seeks to the specified position.      *      * @return int The new position of the pointer      *      * @throws IOError if there is an error while seeking      */    public function seek(int $offset, int $pos = SEEK_CURRENT): int;}

This is only more powerful now thanks to PHP's 8.1 Intersection Types:

class Request{    public string $method;    public Uri $uri;    // Notice how we compose the two types becasue the body of a request can only be read and be closed.    public Reader&Closer $body;    public Headers $headers;}

Type Checking Instead of Method Calling

Instead of calling a method like isSeekable, now we can ensure the type hints for the required behaviour. But also, we could optionally be lenient and take a Reader and on runtime check if we need to seek to a particular point just to be safe:

function readAll(Reader $reader): string{    // Note how this is more robust and clear    // because it uses the type system    if ($reader instanceof Seeker) {        $reader->seek(0, SEEK_START);    }    $contents = '';    while (true) {        try {            $contents .= $reader->read(2046);         catch (EOFError $e) {            break;        }         }    return $contents;}

Granted: the conditional logic here would be no different than in the StreamInterface by using isSeekable instead of instanceof. But still, it's more powerful because of the type-system: you can enforce the Seeker by type hints and ensure your program will behave correctly.

Think About Capabilities, Not Mere Wrappers

Sometimes, when designing abstractions that may have many methods, it's more useful to think about the abstraction as a collection of capabilities rather than just a mere wrapper around an implementation.

I had this problem when I created a custom abstraction over certain payment gateways in a project I was working on. In the beginning, I started with a big interface called PaymentGateway. It seemed logical at the time:

interface PaymentGateway{    public function authorize(Payment $payment): Authorization;    public function capture(Money $amount, Authorization $auth): Capture;    public function refund(Money $amount, Authorization $auth): Refund;}

But then not all the Payment Gateways I was implementing supported deferred capture of funds, or refunds, so I ended up adding the following methods:

interface PaymentGateway{    public function authorize(Payment $payment): Authorization;    public function capture(Money $amount, Authorization $auth): Capture;    public function refund(Money $amount, Authorization $auth): Refund;    public function supportsCapture(): bool;    public function supportsRefunds(): bool;}

That was a bad idea. A Payment Gateway basic mission is to authorize the transfer of funds on behalf of a customer. Capturing and refunding are secondary capabilities that not all gateways possess and support. It would have been better to separate those capabilities into their own interfaces and model them as separate types.

interface PaymentGateway{    public function authorize(Payment $payment): Authorization;}interface Capturer{    public function capture(Money $amount, Authorization $auth): Capture;}interface Refunder{    public function refund(Money $amount, Authorization $auth): Refund;}

This way, a WorldpayPaymentGateway could implement all three of these methods, but others won't.

Conclusion

Big abstractions are weaker. They are harder to implement and often lie about the methods they support, and if you are not careful, they might even end up exposing implementation details.

What I've explained here is really the Interface Segregation Principle. This principle states that a client should not rely on methods they don't use. The original reason is that in certain languages this undesired dependency might trigger a chain of recompilation of other modules. But also from a maintenance perspective, big interfaces make implementation and composition much harder to do than they should be.

Remember then, the bigger the interface, the weaker the abstraction.

What You Really Need To Know About Testing

Matías Navarro-Carter — Tue, 21 Mar 2023 09:29:32 GMT

Sometimes discussions on testing frustrate me beyond measure. They do because everyone is assuming so much on either side. They take for granted that the world shares their notions of the differences between a stub, a mock and a spy. They believe that the definition of a unit, functional, integration, acceptance or end-to-end test is obvious to everyone. They render common knowledge the fact that you should have X, Y or Z proportion of each type of test in your codebase. They battle about which testing framework to use based on their features, or what patterns to implement in their tests. Go developers still are battling between using test tables or just simple functions.

All this frustrates me. It does because testing discussions become focused on the details and not on the bigger picture and goal of testing.

The bottom line, tests are little programs that assert things about your code. There are only two main requirements for them: (1) they must run quickly and, (2) they must fail clearly and for useful reasons.

You might cover all of your applications with a Cypress suite, but then good luck running that on a CI: it will become your delivery bottleneck. Or try to hit the database in every single one of your tests that use PHPUnit: that's going to take a while to run. Tests are meant to give us feedback quickly. When working with a codebase, we need to know if we broke something with our changes. We can't wait 5 minutes for that.

Is of little use to have everything covered with integration or acceptance tests if every valid change to the codebase is going to break half of your tests. You need to think about what you are asserting on those tests. Assert structure and not just compare strings. Write validators if necessary for creating complex assertions more easily. Also, assertions need to be constructed to give the developer an accurate error message, so they can identify the source of the problem. Tests should fail clearly.

Again, make sure your tests run quickly and fail for useful reasons. Unit tests that overuse mocks tend to run very fast, but they fail often with valid code changes. Integration tests are a bit slower and rely on heavy dependencies sometimes, but tend to fail for correct reasons. End-to-end tests are expensive to run and very slow, but if you assert correctly about them, they can detect valid failures.

Find the test strategy that better adjusts to your team's knowledge and the kind of project you are working on. Once you have it, keep improving your testing suite by eliminating slow and flaky tests. That's all that matters really.

The Goal of Software Engineering

Matías Navarro-Carter — Mon, 20 Feb 2023 16:00:39 GMT

For almost the whole extent of my career in Engineering, I have thought about the answer to the following question:

What is the goal of Software Engineering?

When I started -- and for a good while -- I believed the goal of Software Engineering was to produce working software that was useful to their users. After all, what is the point of writing Software if is not going to be used by anyone? This is what I call the pragmatic goal.

But then, later down the line, I came to believe that the true goal of Software Engineering was to produce software that was easy to maintain and evolve by their developers. After all, what is the point of writing Software that will have to be re-written every couple of years? This is what I call the dogmatic goal.

During my career, I've met people leaning toward one side more than the other. I hate to generalise, but followers of the pragmatic goal tend to be very effective at accomplishing tasks and delivering value, but pay very little attention to design and evolvability, and make architectural decisions that happen to be very costly later down the line. On the other hand, followers of the dogmatic approach tend to be more focused on theory, patterns, methodologies, abstraction and processes rather than thinking about how to effectively solve a business problem and deliver value to users. They oftentimes end up coming up with over-engineered solutions that are unnecessarily complex or that solve the wrong problem because they lack user perspective.

This reality is engineering's own Apollonian and Dionysian sort of tension. I would say you can classify every engineer on earth in one of these two opposing camps, and that they usually are accompanied by other traits and characteristics opposed to the ones of the other side, in a truly remarkable way.

Pragmatism	Dogmatism
User-centred	Engineer-centred
Focus on Usefulness	Focus on Correctness
Present Looking	Future Looking
Results Driven	Process Driven
Simplicity	Complexity
Naiveness	Pessimism
Emotive	Analytical
Delivers Value	Exhibits Knowledge
Seeks Freedom	Seeks Order
Flexibility	Uniformity
Concretions	Abstractions

You might read this list and immediately identify with one side of the spectrum. Well, I'm here to tell you that if you easily identify yourself with one side, then you might want to start pushing to the other.

Neither of these camps is correct by itself: both put the focus on things that are true and valuable. I think the true goal of software engineering is to produce software that is both easy to use by their users and easy to maintain by their developers. And the biggest challenge is to successfully navigate the tension that exists between both.

For instance, let's take the pragmatic approach. You could build the most useful piece of software, but if it is poorly designed, then you can know for sure it will cease to be useful because a rewrite will be needed. I've seen systems that are awful to work with and that quickly burn out their engineers because the last thing someone thought about was how to make it maintainable. Sooner rather than later, your engineering turnover rate cost will be much more than what your business can handle. You will have a really good idea or product, but it will be impossible to scale.

On the other hand, you can build the most maintainable software. And you might have no issues finding engineers that want to work in a well-designed software or system. But can get so focused on correctness and maintainability, that you might be late to the market or worse even, you discover that the market has invalidated many of the hypotheses in which you built your product.

There is no escaping from the fact you need both. A good engineer is capable of being both pragmatic and dogmatic at the same time. And this is no easy feat: you will over-steer to one side or the other quite often, so we must be constantly reminded of this.

The goal of Software Engineering is then to produce software that is both easy to maintain by their developers and that is useful to their users. Is part of your job to have both things in mind when exercising your trade, and you must not compromise on any of them.

The Power of Aesthetics

Matías Navarro-Carter — Sun, 19 Feb 2023 10:00:39 GMT

Back in the day, I used to be a pretty naive fella (I would like to believe I'm a bit more shrewd now). I used to think the world and the people in it were, at their core, very simple and straightforward, and that human relations and understanding were not far for anyone to comprehend and master. I used to think in very black-and-white terms, in many areas of life. And although I still have deep convictions and beliefs about the world, human nature and relationships, I often time tread more carefully and with more nuance than in the past concerning certain topics.

As an example, I am a firm believer that no-fault divorce is a sin. In my days as a bible school student, I used to wonder why people would think otherwise being a topic that is so clearly and unequivocally expounded in the bible. So I used to think that all I needed to convince people that divorce was wrong, was to memorize a few verses here and there and compellingly present them, asking whether they believed them or not. It was a purely logical approach that one could call "classical apologetics". The premise is that by merely presenting solid evidence, you can convince people of a certain idea or point of view.

I used to think being a minister or a teacher was something simple: you present people the truth, lie down the evidence, and people will believe it. I became an argument machine of some sort: I was interested to know which argument I could present to a given occasion or conflict so I could convince people of what I believed. I would like to believe that my intentions were not founded in just mere argument-winning, but in a genuine interest of leading people to a path I considered to be true and good, despite using a flawed method born of flawed assumptions.

The lethal assumption that classical apologetics makes (or of any other logical approach for that matter) is that humans act primarily in a rational way and that they make decisions and inform their beliefs by the use of reason. Nothing, though, could be farther from the truth.

It is a fact of life that you and I believe and do things sometimes based on intuition, on based on how nice, good or well something feels or looks. It is not just a matter of reason and logic, it is much more a matter of taste and aesthetics.

For example, what logic or reason powers the repulsion we would feel to use the same toothbrush as the person we have no problem kissing many times? Where is the logic in that? Isn't just kissing someone passionately a more risky activity in terms of germs exchange than sharing a tool used to clean? I'm inclined to believe this is a purely aesthetical construct, but it is however very much embedded in our behaviour.

In the same way, people that believe that divorce is not a sin, do so for more aesthetic reasons than they would like to admit. We could say the same about gay marriage or transgenderism. It is not that these issues have a logical cause. People did not believe them because they were convinced by a logical argument, but rather because in their brains, they came to be plausible, nice and even desirable things. They started to look good, in comparison to the stigma they carried in the past. They became worthy of defence and pursuit. They are believed in and accepted because of aesthetics.

Nowadays, it does not look okay to oppose gay marriage, divorce or transgenderism. J.K. Rowling is a perfect example of the clash between logic and aesthetics. She speaks a logical language to a population that is not interested in logical arguments because something beyond logic is invested into creating the conviction: to be against transgenderism would be ugly, even if that opposition comes from logical concerns for woman's rights.

The bottom of the matter is that society's recognition frameworks do not function logically: they are very much aesthetical. And the massive shape in traditional western cultures and values we have experienced in the last 6 decades has not been caused by a shift in arguments and logic. Quite the opposite: it is the product of a society in which how something looks or feels has profound value.

Aesthetics have tremendous power in shaping a culture. If you manage to present a good story, a sugar-coated narrative that appeals to people's emotions, you can shape the culture in unimaginable ways. It does not matter anymore if God said "you shall not eat of the fruit of the tree", the serpent will always find a way to show you that "the fruit of the tree [is ...] pleasing to the eye", so then it can convince you to eat it.

Leaks Can Become Floods

Matías Navarro-Carter — Wed, 15 Feb 2023 16:00:39 GMT

The Coder's Proverbs is a series where I summarize some lessons and principles I've learned over my career by using a memorable and simple saying of wisdom.

So far we have talked about how we need to design having change in mind, and also established that the best changes are those that don't touch what already exists but add on top of it, and that we should design our applications so they can change that way. Today I would like to talk about being careful with our designs, particularly when we are designing abstractions.

What is an abstraction anyway?

Abstraction's only purpose is to help you simplify a problem or an API behind a stable contract. Some people see abstractions as something to be avoided like a plague and I understand where they are coming from. People who warn about hasty abstractions usually use the term abstraction meaning the extraction of a duplicated routine to keep the code DRY. This is not the definition of abstraction I have in mind.

The definition of abstraction I will use here relates more to contract definition than avoiding code duplication. In other words, this is an abstraction:

interface TemplateRenderer{    public function render(string $template, array $data = []): string;}

TemplateRenderer is an interface, and as such, it is an abstraction over rendering a template. It does not specify how a template is rendered though (that's why is an abstraction), it just cares about the methods an implementation should have to be able to render a template.

Of course, abstractions are not only interfaces. Abstractions can be concrete classes too that assume an implicit "contract". But I'm more interested in interfaces this time.

Leaky Abstractions

The golden rule of designing interfaces as abstractions is that you should be able to swap implementations without needing to modify client code. So, if I have a TwigTemplateRenderer and a NativePhpTemplateRenderer implementing the TemplateRenderer above, I should be able to swap one for the other without changing anything in the code that uses the interface. If I need to change something, then I'm dealing with what is called a leaky abstraction.

We call leaky abstractions those abstractions that, intentionally or unintentionally, expose implementation details to client code. The fact that they expose implementation details forces you to change those details when you change the implementation.

There are two main ways in which this happens, at least in PHP, and that's through method arguments or exceptions. Let's suppose you have the following code using our above defined TemplateRenderer:

use Twig\Error\LoaderError;class SendWelcomeEmail{    private TemplateRenderer $renderer;    private Mailer $mailer;    public function __construct(        TemplateRenderer $renderer,        Mailer $mailer    )    {        $this->renderer = $renderer;        $this->mailer = $mailer;    }    public function handle(UserRegistered $evt): void    {        try {            $body = $this->renderer->render('emails/welcome.twig', [                'username' => $evt->username,                'name' => $evt->name,            ]);        } catch (LoaderException $e) {            throw new LogicException('Template does not exist', previous: $e);        }        $email = (new Email())            ->withBody($body)            ->withSubject('Welcome')            ->withFrom('My Awesome App ')            ->withTo(sprintf('%s <%s>', $evt->name, $evt->email))        ;        $this->mailer->send($email);    }}

So this code is a classical event listener for sending an email upon registration. You have a Mailer abstraction and a TemplateRenderer one. Can you spot the two problems with the TemplateRenderer one? I'll give you a clue. It has to do with the parts of the code that reference Twig.

The underlying problem here is that, although we are using an abstraction over rendering templates, the fact that we are using Twig as the internal engine leaks in this piece of code. It's pretty obvious to the code that this is using Twig. It has the Twig way of defining a template name (with slashes and with the .twig extension at the end) and it catches possible Twig exceptions. If I create another implementation of TemplateRenderer that does not use Twig but a native PHP renderer, I would need to change this code too.

This might sound like nitpicking, but it does have a big impact when your abstraction is widely used throughout the system. Although modern IDE tools would aid in changing all those references, not all developers have them. Plus, not all developers might know this is widely used and might just change the implementation anyway.

Fixing Leaks

You need to make sure that when you design an abstraction like a TemplateRenderer one, you think carefully about your use case. So let's fix this. First, it seems we need an error of our own to indicate that an irrecoverable failure happened when attempting to render a template. So let's define our own RenderError exception. And secondly, we need to make the way of identifying a template a bit more universal. Let's use dots and remove the extension.

use My\Namespace\Template\RenderError;class SendWelcomeEmail{    private TemplateRenderer $renderer;    private Mailer $mailer;    public function __construct(        TemplateRenderer $renderer,        Mailer $mailer    )    {        $this->renderer = $renderer;        $this->mailer = $mailer;    }    public function handle(UserRegistered $evt): void    {        try {            $body = $this->renderer->render('emails.welcome', [                'username' => $evt->username,                'name' => $evt->name,            ]);        } catch (RenderError $e) {            throw new LogicException('Welcome email template could not be rendered', previous: $e);        }        $email = (new Email())            ->withBody($body)            ->withSubject('Welcome')            ->withFrom('My Awesome App ')            ->withTo(sprintf('%s <%s>', $evt->name, $evt->email))        ;        $this->mailer->send($email);    }}

That's it. No trace of Twig anymore. Is up to the implementations now to catch the errors and transform them into RenderError and to modify the template name to change dots for slashes and add the file extension at the end.

So, when designing abstractions, take special care with the kind of arguments you are taking and ask yourself if they expose any of the internal details. And also, define exceptions that you are interested in communicating and make them part of the abstraction contract, so your consumers know what to catch and implementers know what to wrap.

Conclusions

I have done it again. This is the Liskov Substitution Principle applied to interfaces. Although this principle was first thought of in the context of subtyping with inheritance (the typical Square-Rectangle example) and modern OOP does not focus too much on inheritance anymore, it is still useful when considering interfaces, as they are a particular form of subtyping.

Abstractions that leak implementation details are one of those things few people pay attention to. Because it seems something so small and so trivial, is often dismissed or ignored. This might be the right response if the abstraction is not widely used. But if the abstraction turns out to be one of the centrepieces of your application, that leak can flood the rest of the application triggering a chain of change that could be hard to deal with.

So, keep an eye on your leaks, because they might become floods.

Dealing With Imposter Syndrome

Matías Navarro-Carter — Mon, 13 Feb 2023 16:00:39 GMT

Have you ever thought that your "success" is mostly because of luck or good timing? Or that you got that promotion because maybe no one else wanted to take it? Do you live in fear, constantly trying to show your peers and superiors that you are not underqualified? Do you think the praise you receive at work is because you are a nice gal or fella and not so much because it is a job well done? Let me tell you, you are not alone.

It's the secret best kept in the professional world: the vast majority of adults in the UK (85%) experience intrusive thoughts that make them feel inadequate or underqualified for their position, despite their experience. This is called Imposter Syndrome. Broken down by gender, 80% of men experience it versus 90% of women. These numbers come from an independent survey and based on my own personal, biased and, of course, limited experience, they are probably not too far from reality.

I have experienced it myself too and still do, although probably in a different way than you do. The way it manifests in me is that sometimes I think I've had some degree of success in life because I'm a pretty chilled, easy-going white fella with an education who can speak with enough confidence and think with enough substance. I do hope that I have some reasonable degree of self-awareness that those things I think of myself are true (I'll let other people judge that). However, true or not, the subtle spurious thought slides in one way or another and whispers: "that's the reason why you are where you are".

Have you ever felt the same? Are you feeling like that right now? Do you ever have thoughts suggesting that you are not good enough? Are you having them right now but suppressing them with affirmations? Well, I want to share with you what helps me when I feel this way. I'm no psychologist, and this is certainly not therapy, but I know a thing or two about the way people think, and about human nature. What I will say comes from that experience.

Imposter Syndrome can manifest in different ways, but it always starts with those spurious throughs about your inadequacy. Some people have the luck of having good mental coping mechanisms, so they can dismiss those thoughts promptly as they appear (but notice that the thought is always there). For other people, those thoughts can trigger subsequent thoughts that can quickly lead to anxiety. If that is your case, I would recommend you seek counselling from a mental health professional focused on helping you develop those coping mechanisms.

Is About Quality, Not Quantity

The sort of standard way I've heard people recommend to deal with this is to (a) seek words of affirmation from others and/or (b), affirm yourself. But affirmation is like treating a headache with morphine: you will feel good immediately, then you will crave it, and soon you would not be able to live without it. I don't know about you but I would rather have my sense of value lie in a more stable foundation than the amount of affirmation I receive. There is gotta be something better, right?

What helps me is to recognize the truth that I have never been and will never be adequate for any of the things I've done or will do. I cried my eyes out on my first day of school. I was terrified of giving my first kiss and making a fool of myself. When I switched schools, I wondered if I was going to fit in or be bullied again. When it was time to go to Uni and choose a career I had no idea how the world worked, and I thought I was not ready to choose something for life. When I had my first formal job I didn't know if I was going to live up to the challenge. When I left that job in my mid-twenties crisis I had no idea what I was going to do (it seemed I had returned to my seventeens in a very real way!). When my brother asked me if I could help him with a programming project I told him he was crazy and that it was too complex for me. When I got married I was not sure if I was prepared to selfishly love and care for someone else (I could barely do that for myself!). Now that we are seriously thinking about children, I do not feel capable at all to guide, nurture, care and be an example for a little human being, but somehow I still want to be a father.

In all these things I was never "adequate", "ready" or "prepared". Nope. No one prepares you for what is to come, because no one knows what might come and in what form. We are all learners, facing first-time challenges.

You might say "Yes, but some people have more experience because they have been in more jobs and roles, and they are educated, and they know what they are talking about". Indeed, but as every role, situation or organisation is different and every context unique, their experience might prove not adequate to that particular context. People might have different amounts of experience, but no one ever has the right one. No one. Is not a question of quantity, but quality. And on that, we all fall short.

One of the companies I worked for brought this high-profile executive role person. There was quite the hype and expectation before she joined us because of her credentials and the places where she had contributed previously (mostly fortune 500 companies). Everyone was looking forward to working with her. But after three months of her joining, everyone was pretty much hoping she would leave. And she did. Was she extremely qualified and experienced? Yes, very much so. Was her experience the right one in our context? Not at all.

So remember: you are not adequate for your role. That might seem bad news, but the good news is that no one is. Freed now of that weight, we can move on to the next step.

Break It Until You Make It

Some have a "Fake it until you make It" mentality. They seem to know very well that everyone is inadequate but no one seems to openly recognize it. They conclude, therefore, that everyone is faking to some degree and pretending to be something they are not. It follows that those who can fake it the best are the ones who will succeed. It is all about marketing and how you present that fake image of yourself.

You might have struggled with Imposter Syndrome before and maybe have received that recommendation. They seem to start under the right premise we have discussed so far: everyone is inadequate. However, the proposed approach is not just ethically questionable, but mentally taxable. Are you seriously telling me that my life needs to be about pretending to be something I'm not if I want to get somewhere? Who has the mental fortitude to do that?

Mark Zuckerberg once said, "move fast and break things". I'm not elevating Mark as a hero or a figure I look up to, but I think it is fair to say he gets a few things right here.

I think this quote is brilliant. In my view, it means that, since we were not born with experience and knowledge, but rather we acquire it along the way, we should try multiple things in short intervals of time and not be afraid of getting it wrong. After an iteration, you can adjust the course and try again. This is extremely liberating. You may be agreeing with the approach now but I assure you you still haven't got the depths of it.

He's saying that you are free to experiment. That means actively doing something trying to find out what works best, evaluating results, and correcting your course based on results. This is the classical methodology of investigation. And although some of us can apply this to our jobs somewhat well, when we apply this to our personal experiences, our emotions get massively on our way.

This is because experimenting often leads to one's admission that our original hypothesis was wrong, and that we have some stuff we need to correct. While most of us might be willing to acknowledge that an approach we tried might be wrong, fewer of us could admit that something we did might be wrong, and even fewer of us, something we believe in or we think about or ourselves.

You need to believe this: a "mistake" is just a verified hypothesis. A failure is just discarding a wrong path. There is way more value in recognizing that we need to change the erroneous course than remaining in the same course because we don't want to admit we went the wrong way. If you live pretending that you are capable, smart, well-prepared and experienced, you will have a hard time admitting you have made a mistake; for the simple reason that your inner self draws its value from the adequacy of the image you present.

If you embrace the learner's path, it frees you, it liberates you. You don't need to know everything. You don't need to be always right. You are constantly learning and discovering better ways. You place more value in other people's approaches and experiences and demonstrate a genuine interest in learning how they came to acquire it. You become more focused on refining your trade than pretending you are a master at it. And that, my dear friend, has way more value in the workplace than all the fancy titles, certifications, skills and recommendations you might amass. I'd rather have one learner, than a thousand pretenders. So move fast and break things. Break it until you make it.

Closing Thoughts

This turned out to move in a different direction than I originally intended, and way too long too. But sometimes is good to write words that come straight from the heart. I guess that, in a nutshell, what I intended to convey was that you must not be afraid to acknowledge your inadequacy. First, because no one is adequate; but secondly, because doing so will free you to use the power that honestly embracing one's limitations can exert on your development and of those around you.

Adding is Better Than Changing

Matías Navarro-Carter — Fri, 03 Feb 2023 16:00:39 GMT

The Coder's Proverbs is a series where I summarize some lessons and principles I've learned over my career by using a memorable and simple saying of wisdom.

In the previous article of this series, I touched on the subject of change. I mentioned change as the number one enemy of any software project, and I showed how we can defeat change by isolating the things that could change so that when change shows its ugly face, it does not affect the whole of our program. This is the Single Responsibility Principle slightly rephrased.

In this article, I will go beyond that idea and say that well-designed and robust code does not need changing. A well-designed system allows for improvements to be made and new features to be coded without modifying existing code but creating new code. Let me explain what I mean by that.

Let's come back to our Pipeline example from the previous article. In it, we defined an interesting interface we didn't touch much upon.

interface Step{     public function process(array $record): void;}

We made the Pipeline class use this interface to process a record, but we didn't define such a process; that's the point behind an abstraction. Anyhow, let's pretend we implement a step that writes records somewhere.

class WriterStep implements Step{    private Writer $writer;    public function __construct(Writer $writer)    {        $this->writer = $writer;    }    public function process(array $record): void    {        $this->writer->write($record);    }}

For the sake of the exercise, let's pretend that Writer just writes records to a database or some other form of persistent storage. The writer is not the important part here.

So, so far so good. This WriterStep is pretty neat. It does what it says it does. Until someone says: "We need to log the id of every record we are going to write". So, you might be tempted to do something like this:

class WriterStep implements Step{    private Writer $writer;    private Logger $logger;    public function __construct(Writer $writer, Logger $logger)    {        $this->writer = $writer;        $this->logger = $logger;    }    public function process(array $record): void    {        $id = (string) ($record['id'] ?? 'unknown');        $this->logger->log("Writing record of id '$id'");        $this->writer->write($record);    }}

The problem with this approach is that it has "touched" existing code and code that was working perfectly fine without the Logger being in there. Is not that we are afraid of bugs: we have tests for that. But we have added another responsibility to the WriterStep. Now, it not only writes, but it also logs stuff. This class has two reasons to change now. What happens when we need to log the id in some of the pipelines and not others? Both writing and logging are coupled now, and they are impossible to separate.

When you modify a piece of code that was doing a perfectly fine job as it was, is usually an alert. There is almost always an alternative to this. The best changes in a software project are those changes that don't touch what is already there but build on top of it. Adding new stuff is much better than changing existing stuff.

A system must be well-designed to allow for that to happen. If the system you are working on is poorly designed you might not be able to follow this principle, and it might be a good idea to consider refactoring. However, when you are the one designing the system, you must make sure to provide for this kind of change: it usually involves designing and using good interfaces or abstractions.

In this case, the design we defined in the previous article allows for this without a problem. It is perfectly possible to add logging without even touching the existing WriterStep class. For this, we use a technique called composition. This is how it looks:

// The WriterStep stays exactly the sameclass WriterStep implements Step{    private Writer $writer;    public function __construct(Writer $writer)    {        $this->writer = $writer;    }    public function process(array $record): void    {        $this->writer->write($record);    }}// The new LoggerStep wraps a Step and it is also// a step itself. This is composition.class LoggerStep implements Step{    private Step $next;    private Logger $logger;    public function __construct(Step $next, Logger $logger)    {        $this->next = $next;        $this->logger = $logger;    }    public function process(array $record): void    {        $id = (string) ($record['id'] ?? 'unknown');        $this->logger->log("Writing record of id '$id'");        // We pass to the next step, that could be the WriterStep        $this->next->process($record);    }}// Instead of passing the WriterStep as is to the pipeline, you pass a compisition of both the WriterStep and the LoggerStep.$pipeline = new Pipeline(    new LoggerStep(        new WriterStep($writer),        $logger,    ),);

This achieves the requirement of logging every id that is going to be written, but it keeps logging and writing as completely separate steps. Going back to the first principle, changes in one class should not affect the other. But now we have gone even further: we have delivered a new feature or capability without touching existing code.

I need to make a disclaimer here. It stands to reason that this principle does not mean that a codebase should not have any changes at all; that would be silly. As we have seen, the bootstrapping code in our above example (the code that created the pipeline) changed. And this is fine because this code does not contain our business logic, it's just glueing things together.

Summary

I think I got you again! What I have explained here is just the Open-Closed Principle rephrased. Many people think that this principle has exclusively to do with inheritance, but it can also be applied using composition as we have seen here. And it applies not just to classes, but to systems as well (i.e. pluggable architectures). Any sort of technique that lets you add extra behaviour to a program without modifying it, would be following this principle. This means the system would let you add behaviour (here lies the openness of it) but without modifying it (here lies the closeness).

So remember to design classes and systems that allow their functionality to be augmented, not by modifying what already is there, but by adding new things that are not there. Adding is much better than changing.

Isolating Your Enemy Wins You The War

Matías Navarro-Carter — Fri, 27 Jan 2023 14:25:56 GMT

The Coder's Proverbs is a series where I summarize some lessons and principles I've learned over my career by using a memorable and simple saying of wisdom.

Is kind of von Clausewitz's wisdom that to win a war you must weaken your enemy. The prime way of doing it is by isolating them: cutting supply lines, so they cannot receive ammo, food, clothing and other assets; or cutting their communication channels, so they cannot receive instructions or situational updates. Isolating your enemy is key to winning a war.

We can apply this principle to Software Engineering so we can win the war that (sometimes) is writing maintainable and robust code. But, who is our enemy?

I can say with full confidence that the number one enemy of Software Engineering is change. Not the project people, not the product people, not the less inexperienced members of your team, not your language of choice, none of that. The vast majority of your challenges come from change.

Think about it. You would not be working on that feature if a client had not requested it. You wouldn't be struggling with that integration if your users had not demanded it. You would not be fixing that bug if it had not been reported. You would not be refactoring the module you wrote last week if the requirements had not been refined. The sad reality for us (and also the blessing) is that Software is never static: it evolves and is constantly changing. If the software is not changing, then there is no one writing code, and then no suffering developer. Change is an unavoidable reality in Software and the number one cause of rot. Entropy, I think, is the correct term to define this reality.

So, to win in the Software Development war -- Oh dear, I sound like one of those LinkedIn influencers! -- you must isolate your enemy. In other words, you must isolate the things that can, might or will change in your code.

Take a look at the following piece of PHP code.

function importCountries(){    $url = 'https://restcountries.com/v3.1/all';    $contents = file_get_contents($url);    $data = json_decode($contents, true);    foreach ($data as $country) {        // Real code will do something more meaningful here        echo $country['name'].PHP_EOL;    }}

Now, there is a bunch of stuff that could be wrong with that code, depending on how possible is for it to change. Remember, code is not the enemy: it is change! If this is code for some automated routine for your toy scrapping project, then it is fine. If this is part of a data-importing application pipeline, then this is severely wrong.

Let's enumerate the possible things that might change in that context.

What happens if I want to make the data source dynamic and not just get country data from Rest Countries?
What happens if the format then, changes? Can I consume YAML, CSV or XML?
What happens if the action I want to perform on the data changes according to the data?
What happens if I want to take note of the records whose action has failed?
What happens if I want to apply transformations and actions that are repeatable?
What happens if I need to import 500,000 records? Will the system run out of memory?

It is clear that this code is far from ready to handle all those possible scenarios or requests. It needs to isolate the things that might change so it can support the actual change. And the number one way of isolating something in programming is by creating an abstraction for it. This is how it would look like:

interface Step{    public function process(array $records): void;}class Result{    public readonly array $record;    private ?Throwable $error = null;    public function __construct(array $record)    {        $this->record = $record;    }    public function withError(Throwable $e): void    {        $this->error = $e;    }    public function isSuccess(): bool    {        $this->error === null;    }    public function getError(): Throwable    {        if ($this->error === null) {            throw new RuntimeException('No error');        }        return $this->error;    }}class Pipeline implements Step{    private Step $step;    public function __construct(Step $step)    {        $this->step = $step;    }    /**     * @param Traversable $source     * @return Generator     */    public function run(Traversable $source): Generator    {        foreach ($source as $record) {            $result = new Result($record);            try {                $this->process($record);            } catch (Throwable $e) {                $result->withError($e);            }            yield $result;        }    }    public function process(array $record): void    {        $this->step->process($record);    }}

So, let's enumerate what we have done here:

By accepting any Traversable that returns array on every iteration, we have abstracted away the source. Now, we don't care if the source is JSON, XML, or any other. We are going to iterate over a collection of arrays. It is up to the source class to fetch the corresponding records.
By creating an interface called Step we can now abstract away the action we want to perform in every record, and we can create implementations that will do common tasks.
By creating the Result object we can report whether every record was successfully processed or not, and what was the error.
By using Generator, we can process any data-set of any size without worrying about memory consumption.

If you are thinking "Great, this is five times more lines than the previous approach and it does not even contain the code for reading JSON" you are looking at the problem from the wrong angle. The problem in the first place was not that the code was hard to grasp. In fact, anyone would understand what the previous code did. The problem was that it was not resistant to change. Making code more robust, code that isolates the things that change, by definition is going to take more lines of code. The price you pay is more lines of code, but the benefit you get is immensely superior. I would say that is a pretty good tradeoff.

Moreover, is not always true that abstraction requires more code. Designing abstractions is very similar, in a way, to search algorithms. In searching algorithms, if your list is small, you can use O(n) approaches and don't even worry. However, as your list grows, using a O(log n) approach will make a difference. In abstractions, you don't feel the benefit of it if your codebase is small, but as your use cases grow and the abstraction proves itself, it requires exponentially less code to do more things with that abstraction.

Conclusion

What I've given you here is really the Single Responsibility Principle. Many people believe the principle is "A class should have only one responsibility" or "A module should do only one thing". That's technically incorrect. The original wording of the principle given by Uncle Bob was that "a class or module should have only one reason to change". By isolating the things that change, our pipeline has just one reason to change: the change in requirements. The Pipeline class does not change its logic if the data source changes, or if the steps change: it will always remain the same.

Isolate the things that change behind abstractions, so those things can change freely and leave the rest of your code untouched. That will make your code more robust, and resistant to change.

My Top 3 PHP Naming Pet Peeves

Matías Navarro-Carter — Thu, 26 Jan 2023 09:20:53 GMT

I've been writing PHP for quite a while now, but I do not write PHP the same way as when I started. In the beginning, like every developer, I just started by mimicking what other developers did: no surprise there. But along the way, I started to question why I was doing certain things that way and realised that some of the things I did when writing PHP were completely unnecessary.

One of those is naming things. There are a whole lot of assumptions I made when naming stuff. Now I do things differently and have become a bit vocal about it, to the point that I'm actively pushing people to not do these things. It is not that it annoys me -- the title is just clickbait, people! -- but rather that we all should be quicker at honing the most important skill of a software engineer: to question ourselves constantly about why we do things a certain way, and to find better ways of doing them.

And just in case you are wondering, I have way more pet peeves than these three, but these are the ones I could think of related to naming, because:

There are only two hard things in Computer Science: cache invalidation and naming things.
~ Phil Karlton

Interface Suffix

This one is probably one of the ones I don't really understand, but it is so embedded into the PHP community that getting rid of it is going to be almost impossible. I'm talking about the practice of suffixing every interface definition with the word Interface.

First of all, why? It is not in any PSR or official PHP standard, nor it is explicitly endorsed by anyone I know. It is not that we suffix our classes with the word Class either. It is just simply done, by everyone.

Some people say that is a practice taken from Java, and I can see that but that does not really explain why Java people started doing it in the first place. Some argue that this is so you can easily distinguish what types are an interface, and I ask, in what context? Your IDE puts different icons on the files accordingly in the file browser. If you are looking at a class and you see implements next to the type, you know is an interface. And if you are looking at the FQCN in the code, a simple hover tells you what it is. I can understand the practice as valuable in days where IDEs where not a thing. But in this day an age, we have a million ways to know what types are an interface.

But maybe some of you don't like change, and are asking, why not? Well, a strong reason is that those are 9 extra characters that you don't need to type and that don't need to be polluting your source code. Take a look at the definition of the Serializer class of Symfony Serializer (@1e69e2f):

class Serializer implements SerializerInterface, ContextAwareNormalizerInterface, ContextAwareDenormalizerInterface, ContextAwareEncoderInterface, ContextAwareDecoderInterface

Now, take a look at the simplified definition, without suffixing:

class Serializer implements Serializer, ContextAwareNormalizer, ContextAwareDenormalizer, ContextAwareEncoder, ContextAwareDecoder

Much shorter huh? We have removed 45 characters from that line, which is a ton considering that is more than half of the recommended characters per line (80) -- not that I agree with that recommendation though. And not only in this source file but on every single file that references the interface -- and interfaces are referenced a lot! -- But the bottom line is that less unnecessary pollution in your source code is always going to be a good thing!

If you are a good observer, you'll realise that my suggestion leaves the code in an incorrect state. We have a name clash. We are defining a class called Serializer, that implements an interface called exactly the same. You might think this is a drawback, but it is not. You see, one of the benefits of having an interface without a suffix is that now the implementations can describe better what they do. I would rename the Serializer class to DefaultSerializer or MainSerializer to indicate that this is sort of the class that glues the whole library together. This would be the final result:

class DefaultSerializer implements Serializer, ContextAwareNormalizer, ContextAwareDenormalizer, ContextAwareEncoder, ContextAwareDecoder

Think about this: the interface is an abstract thing, and therefore it makes sense it has the purest, unsuffixed name of what it is representing or abstracting. Concrete implementations should be the ones prefixed with what kind of implementation are they providing and its usually tied to their dependencies. Here are some examples:

Interface	Implementations
`MessageQueue`	`RabbitMessageQueue`, `SqsMessageQueue`, `RedisMessageQueue`, `FilesystemMessageQueue`
`KeyValueStore`	`RedisKeyValueStore`, `NatsJetstreamKeyValueStore`, `EtcdKeyValueStore`
`IdentityProvider`	`PdoIdentityProvider`, `DoctrineIdentityProvider`, `OauthIdentityProvider`

Examples of interfaces names and implementations

I hope this is reason enough for you to start dumping Interface as a suffix for your interface names. Same goes for Abstract prefixing abstract classes or Trait suffixing traits.

Naming after Patterns

Another naming related. Similar to the one before, this refers to the bad practice of suffixing or prefixing class names with the name of the pattern you are implementing in that type. It fills your codebase with class names called LoggerHttpTransportDecorator, CompositeIdentityProvider, IdentityProviderAdapter or EventDispatcherSingleton. I mean, I know you know your design patterns -- as you should! -- but come on, you don't have to shove it on everyone's face! Do I need to know if that specific class is a decorator, an adapter or a singleton? Not really!

Don't get me wrong: you must know how to identify whether a particular class is implementing a certain pattern just by glancing at it. But you don't need to name them like that. It's just extra unnecessary noise to the source code.

This happens not with just the classical design patterns, but with objects describing their role, like MoneyValueObject, RestRequestTransfer, CustomerDTO or UserEntity. Just don't do it. You don't need it. The fact that a class is a DTO should have nothing to do with its name. Naming should be short and clear. If the name is ambiguous, then the namespace is there to provide context. Remember that a class name is its full name, not just the last bit after the \. This brings me to my next point.

Superfluous Namespacing

I've written about this before. Namespaces' role is to prevent name collisions, not create taxonomies. Of course, because of code organisation, you will still have some degree of taxonomisation, but you should still strive to keep things short. There are two common mistakes here. One is using namespaces to describe what things inside them are, and another is abusing them by repeating words or concepts inside them.

The first manifests when you have a namespace called Exceptions, Interfaces, Components, etc. This is a very common way of organising namespaces that people use when working on a project in a framework. I'm more of the opinion that you should use namespaces to group things under the same concern, and not to group things under the same role in a codebase.

For instance, compare these two directory structures:

App/ Controllers/    UserController.php    RoleController.php    TokenController.php    PaymentController.php    OrderController.php Entities/    User.php      Role.php    Token.php    Payment.php    Order.php Exceptions/     UserNotFound.php     TokenNotFound.php     ControllerError.php     PaymentError.php

App/ Security/    UserController.php    RoleController.php    TokenController.php    User.php    Role.php    Token.php    UserNotFound.php    TokenNotFound.php Checkout/    PaymentController.php    OrderController.php    Payment.php    Order.php    PaymentError.php ControllerError.php

The first iteration uses namespaces as a way of categorising by role. In other words, we put all the controllers in a single place, all the entities in a single place, etc. This makes it very hard to determine dependencies and is a very framework-based way of organising namespaces.

The second iteration is categorised by concern. We put all the stuff related to checkout together, and then all the things related to security. Things that are particular to everything, go a level under -- like the controller error. It is much easier to track dependencies here.

Another way of misusing namespaces is to make them too long by repeating stuff in them. Take, for instance, one of my favourite examples from the Laravel codebase: Illuminate\Broadcasting\Broadcasters\Broadcaster. That is a lot of repetition! This interface could very well have been named Illuminate\Broadcaster. You don't need all the taxonomy fluff. If a namespace repeats something, then it is superfluous. Once again you must pay attention to how you are breaking down your code.

Messaging Guarantees

Matías Navarro-Carter — Tue, 22 Nov 2022 09:53:00 GMT

The Two Generals Problem

Lets suppose you are a general in command of a medieval army, and you find yourself in the following situation.

You must siege and capture an enemy settlement. The only way to do it successfully is to coordinate a simultaneous attack along with the forces of another general on the opposite side of the settlement, so you can attack from both flanks. You need to devise a way of coordinating such attack.

You both agree that you will send a message to your fellow general at the time of the attack. And to be sure that the messenger has not been intercepted on its way and the message has made it safe to your fellow general, he will send another message back acknowledging receipt of the first one.

You are about to sign off on the plan, but then your fellow general asks: How would I know that the confirmation has arrived to you?. Hes worried, and with good reason. This is because if the messenger carrying the confirmation is intercepted, he would find himself attacking the settlement alone with the prospect of a terrible defeat. That is not a good position to be in!

He proposes that you send an acknowledgement of the acknowledgement, to prevent him from attacking by himself. You are about to agree with his reasonable request, but then you realise something: Wait a minute! What would happen if my acknowledgement is intercepted? Then you wont do anything and Ill be the one attacking by myself!

The generals have a problem. It should be fairly obvious by now that stacking acknowledgements is not going to help anyone. This problem is impossible to solve. There is no way they can devise a mechanism by which they can coordinate reliably because the medium of their message is fallible.

The Impossibility of Exactly-Once Delivery

This story is fundamental to understanding distributed systems and their challenges. The moral of the story is that is impossible to guarantee the delivery of a message via unreliable transport, no matter the technique we use (like acknowledgements). The internet protocol is an unreliable transport, since connections can be broken, faulty or temporarily lost. Packet loss is an unavoidable consequence.

This has tremendous implications for our applications built on top of TCP or HTTP. For instance, sending a message to a queue can fail. How we handle this failure is crucial to our system architecture.

One option is to retry sending the message. The problem with retrying though is that the effects of the operation being retried could be applied twice or more. For example, if I publish a message to a queue but the publish fails because I didnt receive any ACK from the message queue, what was the problem? Was my message lost trying to reach the server or was the acknowledgement of the server lost on its way to me?

Remember the two generals' problem? If the acknowledgement was lost on its way to me, then the queue has already processed my message. If I retry the operation, the queue could receive the same message again and the effects of that action could run twice and that better not be charging a customer! There is no way for me, the producer of messages, to know for sure if the server has effectively processed my requirement or not.

Message Delivery Guarantees

Most people dont try to solve this problem and are happy with offering their customers the easiest and weakest of message delivery guarantees: at-most-once delivery. This means, that you will send the request only once, but if it fails, you wont retry. It may or may not be sent once, but you can be sure it wont be more than once.

When message loss is unacceptable, exponential backoffs with retry logic are implemented. But as soon as you retry, you are embracing an at-least-once delivery guarantee. This means that you will keep trying to send this message until you receive an acknowledgement, but that might send the message more than once if an acknowledgement is lost on its way back to you, the producer.

Bottom line, you can either choose between at-most-once or at-least-once message delivery guarantees, but exactly-once is a technical impossibility.

If you have chosen to support an at-most-once guarantee, you can relax and sit in your nice home office setup in peace. If loss of messages is unacceptable for you and you need to support an at-least-once model, but dont mind receiving duplicates of messages, then you can get away with exponential retry logic with no problem and still sit pretty. But if duplicates are not an acceptable trade-off either (ejem! payments) then you are in for some fun problem to solve.

Looking at TCP: Exactly-Once Processing

So, is there anything we can do to deal with these unavoidable duplicates? We should look a lower level protocols to rescue some ancient ideas to deal with this.

If you know about internet protocols a bit you will know that TCP is an acknowledgement-based protocol. Every packet sent by the client needs to be acknowledged by the server. This brings us back again to The Two Generals problem. What happens if an acknowledgement is not received by the client? In the case of TCP, then the client sends the same packet again.

But wait!? What if that acknowledgement was lost on its way back to the client and the server did receive the message? Well, simple: the server will receive a duplicate message. It will receive the same packet again.

So, how does TCP deals with this? We could say that simplifying the actual inner workings of TCP this problem is solved by using state: the server stores the ids of the packets it has seen for that connection and simply ignores and skips any duplicates.

This is actually an oversimplification of the way TCP works and deals with deduplication and ordering. For an exact explanation, you might want to google two key concepts: the Transmission Control Block and Sliding Window on TCP.

Looking at TCP we can actually find a solution to the exactly-once delivery impossibility. We need a way to identify the same message in the context of a retry, keep a list of the seen ids and discard the ones that have been seen already. This technique is commonly known in web services as idempotency. Some other people call it deduplication. I like to call it exactly-once processing.

It requires the client to assign a unique identifier sometimes called the Idempotency key to every message. But it also requires the server to track which ids have been processed so they can be ignored if sent twice. Using this technique is possible to process a message once and only once and avoid the effects of an operation being accidentally executed twice.

Of course, there must be a limited window of time the server has to store the tracked ids (since it would be wasteful and inefficient to store every possible seen id) and if you have multiple servers processing the message, it also has to be a centralised form of storage.

Conclusion

Distributed systems are composed of a group of processes that need coordination and whose sole way of communication is via the network. But as the network is fallible, is impossible to guarantee that a message will be delivered successfully. Understanding this problem, its tradeoffs and how to mitigate it is essential when designing and developing distributed systems.

Codebases are Pets, Not Cattle

Matías Navarro-Carter — Fri, 11 Nov 2022 10:31:45 GMT

If you have been around in Software Engineering for the last 5 years, then you have been part of the wider DevOps revolution and all the things that came with it. It is also likely you have heard one of the most famous aphorisms in the DevOps culture: that you should treat your services like cattle, not pets.

I live in Northern Ireland, so I know a few farmers here and there. In fact, I helped John (a family friend) milk his cows at his farm in mid-Ulster a few times back in 2015 and 2017. I had so much fun doing it and I learned a lot about the process, and it helped me understand the DevOps cattle and pets metaphor in very real terms.

For all the purposes of maintaining the cattle on his farm, milk farmers like John treat their cows in batches, with no special relationship to any single one of them. They have tooling and processes in place for cleaning them, vaccinating them, milking them and breeding them, and those involve a big chunk of them at the same time. No cow gets special attention or treatment -- unless they are sick or pregnant -- but even then there is some sort of emotional detachment.

Think about pets now. We name them, play with them, take them on walks or holidays, feature them in our insta and, if you are like me, you probably speak to them as if they were a 5-year-old who speaks your language. My wife and I have such a strong bond with our cat Pua, that we brought her from Chile in the middle of a global pandemic when we moved to the UK.

Back to DevOps world. The metaphor means that in scaled software organizations with a bunch of microservices running there is no time to give each process the special "care" and "attention" they need. You don't deploy them manually, and if they fail you certainly do not restart them manually, and if they need maintenance you do not run the scripts and tasks needed manually. There is no time for that when you have too many services. All those things should be done automatically. Those running processes should be disposable, easily restartable and stateless, among many other things.

But, I'm afraid that this very much-needed shift in managing services has just somehow transpired to managing codebases when it should not be that way. You see, a service is a running process or workload in a server. A codebase is source code in the form of readable plain text. They are very different things.

Codebases need care. They are pets, not cattle. They need all the attention and care you can give them. You have to name them properly, care for them individually, fix them if they are broken, cure them if they are rotting, and upgrade them if they are outdated. You need to design them properly, document them well, write their lines clearly, and structure them simply. Codebases are like that high-maintenance girlfriend, and you should be that loving and caring partner to them. The compiled runtime product that comes from a codebase -- the running process -- is a whole different story: it should be disposable.

Of course, some of the tasks related to a codebase's maintenance can still be automated, but the love and tender care they need is there nonetheless. I still love my cat even though I automate her feeding twice a day with the exact amount of food she needs with this amazing automatic cat feeder that I recommend everyone who has a cat to get. Codebases are like that too. Although we automate their build, testing and deployment, and maybe some routines in development, every single one of them needs a slightly different CI pipeline, test suite, dependable configuration or Makefile targets.

I think the industry has missed this important distinction, and good things and practices like incremental refactoring, good application of software design principles, robust testing suites, flawless development experience and strategic upgrading paths are at risk of being disregarded even more than they are today. They are slowly being replaced by inflexible templated approaches and blueprints or all-invasive frameworks that try to abstract away the tedious parts or make up for the lack of design. V2 or v3 rollouts of the same service or thing are frequent in this kind of environment, because now the codebase is seen as cattle, and something quite disposable. Sooner or later an explosion of codebases is added, on top of an explosion of services. Is easy to automate managing services and workloads -- thank you K8S -- but you cannot really manage or grow codebases automatically -- at least not today. So, a growing number of codebases is a real scaling problem you don't want to have.

Source code is the biggest asset of a software company -- no matter how much low-code advocates rant about the contrary. We should treat our code how it deserves: with the special care of a pet, and not with the disregard for cattle. Clean code scales; you just need to invest the time and care to produce it.

Globals are bad, but!

Matías Navarro-Carter — Sat, 05 Nov 2022 10:44:53 GMT

When I started writing PHP and learning my first OOP concepts and patterns, I remember being dazzled by the singleton pattern. I used it so much! I think I fell in love with its convenience. Just make something a singleton, and you can grab it and use it from anywhere you want.

I'm sure you know how they work already. The classical implementation is to make a static property and method called instance, and if the property is not null, return it. Otherwise, create the instance and store it in the property, so then you can return it.

class Singleton{    private static Singleton $instance = null;    public static function instance(): Singleton    {        if (self::$instance === null) {             self::$instance = new Singleton();        }        return self::$instance;    }}

You can optionally make __clone and __construct private, to prevent more than one instance can exist.

Because You Can Doesn't Mean You Should

Sometimes we developers refer to code like that as globals. Is a short way of saying globally accessible. By globally we mean, from anywhere in the source code. The singleton above can be accessed from anywhere because of its static method. Convenient, isn't it?

My story with singletons is like the story when you find a really cool album or band, and put it on repeat forever, every day -- yes, I do that. It is good for a while, but that sentiment quickly evaporates and suddenly you don't find it as good anymore.

For me, that sentiment came when I started to learn about testing, and writing my first tests. I still remember how hard it was to write my first tests because I had made everything globally accessible. It was impossible to test the controller without a real database connection, third-party API calls and a queue. I gave up pretty quickly.

Eventually, you realise that when something is convenient in programming, it is so at the cost of making other things harder. Singletons make access to other routines in code very convenient, at the cost of an extreme coupling that makes it unbearably hard to test your code units in isolation.

Be careful. Some people like to promote things like these by proclaiming them as simple. Don't be deceived! In engineering, is very common to see simplicity disguised as her shallow cousin: convenience.

After a while, I gave up using globally accessible stuff in my code and moved on to better patterns, like Dependency Injection. Just because you can globally access something from anywhere does not mean you should.

Because You Shouldn't Doesn't Mean You Won't

But, here is the grain of salt. I do believe singletons and other kinds of globals have their place in a codebase. I think that place is the bootstrapping code.

You can call all singletons you like, as long as you pass them as arguments in the constructor for another class, for instance. In the code that bootstraps your application, you can access all kinds of globals and that is fine because the bootstrapping code is the code that is coupled to all the dependencies of your program. As long as those globals don't leak anything to the rest of the application code, you'll be perfectly fine.

I've found myself writing more Singletons than before because they are more convenient in bootstrapping code. As a library author, you can provide Singletons with sensible defaults for your users, while still giving them the possibility of bootstrapping their instance with custom values.

Overall, I think we need to get better at testing. Promoting testing everywhere will only do good to the PHP community because developers will quickly realise when they are writing hard-to-test code.

HELM vs Kustomize

Matías Navarro-Carter — Sun, 21 Aug 2022 09:51:38 GMT

Not too long after Kubernetes became popular, it became evident rather quickly that maintaining all the manifests involved in the deployment of a single application was going to be a cumbersome process. Having a different set of manifests for every environment you want to deploy to is simply unacceptable. People quickly realised that applying some DRY to the problem was very much needed.

Thats how HELM was born. The idea of HELM was to create highly-configurable packages that would contain all the dependencies of a particular application. So, if you wanted to deploy your own self-managed instance of Gitlab, the HELM chart for it would contain everything you needed: the database, the main application, the runners, the exporter, Gitaly, etc. You just needed to tweak some configuration values and that was it. Configurable and DRY Kubernetes was born.

HELM achieved this by making Kubernetes manifests a template (specifically, a Go template) and interpolating variables to render the actual manifests to be used by Kubernetes.

Quickly, we realized that there are several issues with this approach.

Increased learning curve

The first evident problem is that you are introducing a layer of abstraction on top of another layer of abstraction -- this is how Software works anyway. Now, you need to understand not only the something you are abstracting (in this case, k8s and manifests, which are quite full of learning already) but also the abstraction itself and its inner workings.

New concepts are introduced like Charts and Repositories. Now you need to learn how a chart is structured so you can build your own and then learn Golang template syntax if you dont know it already. All these things, although not a major burden, hinder adoption by steeping the learning curve a bit.

Templating limits customisation

It should be fairly obvious that, in any solution to this problem that involves templating, only the values that are parametrised are the values that you can configure. Unfortunately, you cant change the rest. Since every organisation or team has different needs, everyone needs to customise different aspects of the Chart. But again, the chart has to be parameterised for that specific thing you need to configure. In other words, if you want full control, a chart should parametrise every single possible yaml value, which is almost impossible and also very impractical. Suddenly, HELM charts that could have been simple K8s manifests become something unbearable to read.

This problem above is why you have like 20 different HELM charts per application. Everyone needs to customise different things or do things slightly differently.

Charts make choices for you

Because every chart has to contain all the units to make an application work, and because you cannot parametrise everything, then unavoidably some choices are made for you. Some charts deploy a Postgres instance using a StatefulSet with the official Postgres image, but your setup needs probably a PostgresCluster resource coming from the great Crunchydata Postgres Operator. Or maybe you have a database already available in Aurora and dont need the chart to create one for you, but you cannot disable it.

Again, the impossibility to control these and other aspects of a chart is what leads to a chart explosion for the same applications. This for me is the strongest statement of the failure of HELM to solve the DRY problem Kubernetes Manifests have. It has created more complexity.

Kustomize: A better alternative

The Kubernetes team understood this quickly and moved to embrace a better approach natively inside kubectl. The kustomize tool proved to be simpler, more transparent and more effective at solving this problem than HELM.

The paradigm behind kustomize is that, instead of having a manifest template, we would have a normal Kubernetes manifest with some sensible defaults that we would use as a base, and then, in another manifest called overlay, we would apply operations to the base manifest to add, remove or change its values, or whole nodes of information if we so desire.

Kustomize does this by the use of common standards like JSON Path and JSON Patch.

This change of paradigm (towards composition) has several advantages over a templated approach.

First, base manifests do not need to contain every possible configuration and its parametrised form. In fact, the shorter they are, the better. Second, overlays only change and touch what is needed, which also keeps them thin. And third, overlays can be stacked one on top of the other to increase reusability.

Another benefit is that you dont need another binary to use Kustomize. It is supported natively by using the -k flag on kubectl apply or the kustomize subcommand in kubectl.

Kustomize drawbacks

Not all that shines is gold. There are still some drawbacks to this approach, as with everything in engineering.

For starters, you need to learn how Kustomize reads your kustomization.yml files placed in your manifests, but also learn how JSON-Path and JSON-Patch work. So there is a cognitive load that increases the learning curve.

Having said that, this is much easier to make sense of than HELM. JSON Path is a very simple spec, as Json Patch is too -- you can learn both in less than a day. Also, only engineers involved in customisation need to know the advanced stuff; the application engineers can write their Kubernetes manifests as they see fit aided by the Kubernetes documentation. This will be overridden by the CD tool using Kustomize. This closes the gap between Development & Operations (SRE), furthering the DevOps culture across the organisation.

Final Thoughts

I greatly encourage you to give it a go, especially if you have a HomeLab Kubernetes cluster. I recently ported all stuff in my cluster from Helm to Kustomize and Im enjoying how much simpler my cluster git repository is.

Considerations before jumping into Microservices

Matías Navarro-Carter — Thu, 05 May 2022 10:06:54 GMT

Disclaimer: A few years ago I wrote this piece. It came out very passionately then due to frustrations experienced in a microservices project. I have matured and grown since then, and in many ways, this article is a more nuanced refinement of those ideas and an attempt to correct my misconceptions at the time.

Knowing the Monolith

I think before I say anything about this topic, a proper definition of the concept of a monolith is needed. A monolith is usually defined as a piece of software (usually in a single codebase) that has grown so complex and has accumulated so much technical debt over the years that now is extremely hard to change, hard to understand and therefore easy to break. Developers dread the day they have to make a change or implement a new feature, and when they do, unexpected things happen all over the place. The system is so big, so complex and so glued together, that no one knows where to look to fix the issues. This in turn leads to poor development times, a lot of Im sorry that cant be done and tons of burned-out engineers due to the pressure for delivering and the frustrations and fear of unexpected issues. In terms of the business, the clients trust is harmed because of the many issues and poor response times.

There are mainly three major things characteristic of monoliths, and their causes are usually related to violations of best practices in OOP.

Monoliths are rigid, which means, they are hard to change. The reason why they are hard to change is because of coupling. Usually, this manifests itself in not abstracting away the details that change (a violation of the Dependency Inversion principle), sticking all the logic in a single place (a violation of the Single Responsibility principle), or because almost every modification requires changing existing code instead of adding a new one (a violation of the Open-Closed principle).

You can get a measure of the rigidness of the system using automated tools that perform code analysis, like cyclomatic complexity, code duplication, test coverage, number of lines per class or module, file churning, ABC, etc. But these metrics, even when you have them, need proper interpretation by a human, preferably a consultant that can analyze the codebase for things that cannot be picked up by code analysis tools, like SOLID violations or lack of use/misuse of design patterns. Some of these are also measures of our next point.

Monoliths are complex, which means, they are hard to understand. It might be due to their size, or maybe because information about the system has been lost (lack of documentation), or maybe because the code is written in a very dirty way. Sometimes, in the life of a system, so-called shortcuts are done to overcome a design limitation, and the people that implemented them are long gone, and their knowledge of the system shortcut is gone with them. Maybe the system is complex because it does things unconventionally. For instance, I have seen many systems myself that, when you change something (say, add a field to a payload) you need to do something else somewhere else on the system so that the change can behave properly. The problem is you need to know that.

There are many, many reasons for complexity. One of the best ways to measure complexity is how long would take a new engineer to get the system up and running in their development machine, get a grasp of it, understand where the critical parts are and be able to make meaningful changes without much aid from anyone else. Of course, that is also a measure impacted by the engineers experience. There is also, as I mentioned before, automated tool analysis.

Monoliths are fragile, which means, they are easy to break. Due to all the reasons mentioned above, the system breaks pretty easily, and for the weirdest of reasons. Even skilled engineers fail into their traps and introduce unexpected and accidental side-effects or bugs. Again, this is probably due to the inherent coupling of the components of the system, plus its complexity of it.

You can measure fragility by getting a sense of how afraid are developers, even experienced ones, of making any change to the system. Also, by how many regressions the release of a new functionality causes. By using static analysis tools you can pick potential sources of breakage due to typing too.

Monoliths are a sad tragedy to be in. We wanted a flexible, simple and robust system, but somehow we ended up with a rigid, complex and fragile one.

The Principal Architect

So, we have enumerated the issues with monoliths. They are rigid, complex and fragile. Our organisation is struggling to deliver and customer confidence in our capabilities of delivering a quality product is decreasing dramatically.

But, fear not! We have identified our problems to solve! So the Principal Architect at the company says:

Lets come up with a strategy to incrementally refactor our monolith to make it more flexible, simple and robust.
~ The Wise Principal Architect 🦫

A good approach. Not so fast and requires skilled engineers in the art of refactoring plus the ones who know the system well. But it can be done for sure. Or maybe it cant; maybe you are past the point of refactoring (in terms of cost) or you dont have surgeon engineers at your disposal, so maybe they say:

Lets take all the knowledge we have so far and build a second version of the system that is more flexible, simple and robust.
~ The Bold Principal Architect 🦅

A bolder approach, but faster than the former. You can do this with mid-skilled engineers under the supervision of the ones who know the system well. You need to put extra effort into not repeating the mistakes of the first system though, otherwise, your effort is pure waste.

However, you know they dont say that, dont you? Thats why you are reading this blog post! What they say is:

It is time for us to split our monolith into microservices like Netflix or others are doing!
~ The Foolish Principal Architect 🐥

If you are there when that happens...

On a more serious note, I think this comes to pass because we have a misconception of what microservices are and therefore, we are unable to see the real problem.

Misconceptions on Microservices

There are a lot of misconceptions about microservices. They are usually hard to spot because there are some elements of truth in them (hence the name, misconceptions).

One of the most famous ones is that they are a way of splitting the monolith. This is quite misleading because suggests the main problem with a monolith is its size. If you add that to the fact of how the pattern is named (micro-services) then you understand the cause of confusion. People bring the reasonings of SOA and apply them to this problem. Thus, the following way of thinking becomes the common argument for implementing this pattern:

If we split this big thing into smaller chunks, then it would be easier to manage.
~ An Engineer. Famous Last Words. Circa 2018.

This looks reasonable, and it is indeed (in the appropriate contexts). This is the principle of modularity: separating things into smaller components that then can be combined to form a larger whole. The idea is that at any time you can pull out one component and replace it with another without breaking the whole system. And it is also easier to make sense of smaller pieces than a big ones, right? So far, so good.

In pre-microservices SOA, this separation usually fell into the runtime space. This means these modules interacted between them in the safe realm of random access memory. Changes would be made in a single place (the application code) and then shipped in a single executable without major hassle. Adapting to changes in the public API of one module usually involved correcting the consuming services in the same PR. This is one of the victories of object-oriented programming.

However, as we will see further, there is a whole range of complexity introduced when we apply this principle to services running as separate applications communicating over TCP networks. Ill get there in the next section. I want to stress something else first.

Our long introduction on what a monolith is had a very special purpose that will come to light now. If you look again at the reasoning justifying the splitting of the monolith into smaller services, hopefully, you will realise that fundamental problems somehow escaped the reasoning backing the proposed solution. This is extremely important.

True, size might be a factor in the complexity of a system, but it is not the most important factor, and certainly not the only one. It is not just complexity due to size, but rather any complexity, anything that causes rigidness, and anything that makes things fragile. If you have complexity, fragility and rigidness, it does not matter if you have one big large service or many small ones. You will suffer anyways.

Estimate the Cost. Sit down and consider.

Actually, no, I lied. It does matter. Is better to have a monolith than a distributed mess. Here is the big takeaway of this article: if you implement microservices for the wrong reason and if you are not ready to tackle the challenges that come with them, then you will be in a worse place than when you started.

As usual, the scriptures bring untimely wisdom in this matter (and in any other matter regarding life really). Jesus is trying to explain to people that following him is very demanding and that they should consider whether they can do it or not before blindly jumping on the bandwagon (lets remember these are the popular times of Jesus ministry). He uses two wonderful metaphors. Here is what he says:

Suppose one of you wants to build a tower. Wont you first sit down and estimate the cost to see if you have enough money to complete it? For if you lay the foundation and are not able to finish it, everyone who sees it will ridicule you, saying, This person began to build and wasnt able to finish. Or suppose a king is about to go to war against another king. Wont he first sit down and consider whether he is able with ten thousand men to oppose the one coming against him with twenty thousand? If he is not able, he will send a delegation while the other is still a long way off and will ask for terms of peace. In the same way, those of you who do not give up everything you have cannot be my disciples.
~ The Holy Bible. Luke 14:28-33 (emphasis mine)

We are not builders nor kings, and maybe you do not follow Jesus, but surely we both can relate to this. Prudence and consideration is a scarce gift in a world that encourages you to take any opportunity you have in front of you.

The first thing you should think is, do you have the resources to maintain all those microservices? Is hard enough to maintain one service, let alone fifty. You need to have the workforce in place for that effort. Otherwise, dont do it.

Of course, maintaining involves many things. For instance, each part of the development lifecycle needs to be properly automated. You cannot test all of your microservices manually. You need to rely on automated testing and quality checking (Continuous Integration) and try to also automate the release process (Continuous Deployment). If you dont do that, better not to jump in the effort. You will be crushed under the weight of manual deployment and testing work. You cant treat your services like pets anymore, giving them all the care and attention they need individually: now they should be treated as cattle.

Some companies develop their own semi-automated release process. It is automated up until the last mile after which there is some manual work to be done for release. This is another challenge: making sure you can come up with a process that is simple to learn and properly documented, so it can be quickly grasped by any engineer in the company. If you fail to do that, there will be trouble.

If keeping one application up to date with the latest framework, language version, or standard is hard enough, imagine doing that for thirty, or forty. It is not fun. True, there are things like dependable or language-specific tooling to upgrade from one version to another, but yet is another automated process you have to implement, maintain, document and ensure that runs smoothly.

Things get tricky if you implement custom, in-house tooling, frameworks or libraries to accelerate the development effort. The reason why you have to accelerate it though is because you brought that extra work when you decided to bring microservices in. Every new project implies a setup cost that you dont have to pay if the project is already there and the tooling is in place. No mention, you have to keep that tooling nice, sharp and well documented and make sure every project is using an updated version. And dont dare to break BC!

Dealing with the data dichotomy is also something a lot of teams dont consider. When you had the monolith, it was all there in the same database, so it was easy to share data between services. But now (hopefully) every service has its own database and a very narrow REST interface. But they still have to share data. How do we do that? If you just create a network of interconnected REST calls, you will end up with a very complex system, and changes in one place could and will break stuff in other places. Services will be coupled together and you will end up with a distributed monolith. Sure, you can implement observability systems to detect when things break, but that is half of the problem. The problem is that now the other affected service is in another project, in another repo (maybe in another language!), that you need to clone, set up and understand before you can make a fix there. In the monolith, at least, it was in the same source code.

Maybe you did your homework and decided you will communicate your microservices using an event-driven approach, but that has also challenges and pitfalls of its own. Tracing where an event goes and what services reacted to it is hard, and it requires a particular set of skills and engineers with the right experience to develop event-driven systems. They are not easy to find, and if you do find them, they are not cheap to hire.

The ability now of going polyglot is another factor to consider. Now you can have 2, 3 or even 4 languages with a presence in your company. There is a constant tooling and language war over which one should we use. Tooling has to be built in 2, 3 or 4 different languages. More sooner than later you will end up feeling the huge maintenance burden they cause.

If the technical challenges are big, the people's challenges are monumental. Suddenly your company has hired twice the people to be able to cope with the extra amount of work you have brought into yourself. New people, with new experiences and new ideas, come to the table. Sometimes they could bring change just for the sake of changing something. For instance, someone had a bad experience with GraphQL on their previous job, or someone likes Postgres over MySQL or Golang over PHP (imagine that! 😜). More people means really, more diversity of experiences and opinions. And while that is a good thing, you still need to be able to filter what makes sense from what is just pure noise.

To keep this under control, you implement a process heavy-culture that focuses a lot on uniformity and consistency, so your company does not become the wild west. Although you gain in that area, process overloading brings problems of its own: new engineers have to learn your processes, and they might be not the best choice sometimes or fall outdated pretty quickly. Also, processes need to be properly documented and simple. Otherwise, they will not be followed. More extra work!

Maybe your monolith was maintained by one or two teams, but now you have 7 teams working on your suite of microservices. Communication is suddenly more important than ever, and it is even more if your services are coupled with direct request-response calls. One teams work can break the others, or one team can be blocked waiting for the implementation of another feature by another team. Sitting time is great for engineers: they can read Reddit, or Twitter, or maybe write long boring articles like this one. It is not great for your company though: you are losing money.

Honestly, I could go on and on with more and more things you need to consider before you make this move, and I still will have the stuff to say. There is abundant literature and media out there that deals with the pitfalls of microservices, so make sure to take a look at it! I leave you with my personal favourite here:

%[INVALID_URL]

Final words

Maybe you read all this and say: but all those things are solvable! And yes, you are correct, they are! Every single one of those problems has a solution and can be addressed effectively. But if you are asking that I think you might be missing the point. The question is not whether they are solvable. The question is, do you need to solve them? Do you need to bring all that complexity in to have a successful business? Do you need to add another suite of technical problems to solve on top of your business problems? Do you really need microservices, or do you only want them?

Dont get me wrong. The benefits of microservices are great. But you only reap the benefits if they are correctly implemented and you have what it takes to do so. If not, again, you could end up worse than before you started.

The bottom line is to estimate the cost, sit down and consider, and formulate a plan for your transition (if you really need to transition!). If you fail to plan, you plan to fail.

The Problems with Soft Deletes

Matías Navarro-Carter — Mon, 11 Apr 2022 10:42:55 GMT

If you have been in web development for a while you might have stumbled upon the concept of soft deletes. It is a common pattern (or more exactly, anti-pattern) in database management that allows deleting items from a database while keeping the records there.

The implementation of this pattern usually involves a nullable timestamp column called deleted_at. When we delete a record, instead of a DELETE query we just simply run an UPDATE one and set that column to the current timestamp. Then, deleted records can be hidden from the exposed public API by adding a where clause like the following to every query:

SELECT FROM table WHERE deleted_at IS NULL

A challenge while implementing this is that this clause must be added globally. If you are using raw SQL queries this is hard to maintain and error-prone. But most frameworks have robust SQL abstractions that eliminate that issue by using some plugin or library.

Benefits & Drawbacks

This technique has some benefits, like the ability to restore records that have been accidentally removed, keep historical data for reporting purposes and avoid dealing with referential constraints when deleting records in relational databases. However, in engineering, not all that is shiny is gold. Although there are some benefits to this technique, there are also some important drawbacks.

The first drawback is that there is a performance penalty your database has to pay for this. Even if the deleted_at column is indexed, still, a full table scan is needed to fetch non-deleted records. This of course harms performance exponentially as your database table grows. Some people say this is the most fundamental problem with this approach. I think I tend to agree, although the impact of that drawback varies from application to application.

But, I think there are at least two more troubling issues here. One has to do with referential integrity and another one with domain modelling.

Soft Deletes obscure referential integrity

When you soft delete a record in your database, none of the foreign key constraints you have in place will be checked. This is seen as a feature for many. No foreign key constraint violation errors on my queries. Yaaay!

The problem is that foreign key constraints are there for a very important reason. Bypassing them is often unwise. Lets see a concrete example.

Lets think of a articles table and a tags table. Articles represent blog posts, and you can tag them with any tag you would like. The relationship between tags and articles is a many-to-many one, so you will model this in a article_tags pivot table. Simple.

Now lets say you go and delete a tag from the system. You get a massive Constraint Violation Error because that tag is being used in the article_tags table to tag some articles. You need to do something with those references first. Ignoring that kind of problem is never good because then you could have inconsistencies in your data model. You should first correct the references in article_tags and then delete the actual tag.

How to correct the references will depend on your use case or domain. For instance, when does it make sense to remove a tag? Maybe there is a tag called DDD and another called Domain Driven Design in your blog. You want to normalize this, and you decide that DDD will be removed and replaced with Domain Driven Design. Your delete then should look for the DDD tag in article_tags and replace them with the Domain Driven Design one, and then you proceed to delete the DDD tag from the tags table.

Or maybe you just dont want to normalize anything and just delete a tag and all of its references. In that case, you should delete the references first, and then the actual tags.

Your program must have this knowledge baked in itself because these are the concepts that power your domain, which takes us to the next topic.

Be careful to use Foreign Keys with CASCADE DELETE. They are very convenient, but they can massively obscure things and also delete records that you might not want to delete. Again, better to have the knowledge baked into your program.

Soft Deletes obscure business/domain logic

There are some things in your domain or problem space that will require careful consideration when it comes to deletion. For instance, in my current company, we offer finance to customers based on something we call a Finance Plan. A finance plan is just a set of constraints that define the details of a loan (duration, rate, minimal payment, etc). Finance plans are then referenced by our applications (these represent credit applications from customers).

So, natural question. Should we be able to delete finance plans? Not if we have created an application with it. Because the finance plan tied to an application now needs to be kept for historical reasons. We can go even further. Should we be able to modify finance plans? No, because any modification to a finance plan would be a new finance plan. We dont want to be changing the number of months of an already approved application, right? If it was approved for 12 months, we cant change that to 24.

Now, what happens when we want to decommission a finance plan? This means we want to prevent it from being displayed so applications cannot use it. We just simply take that concept and convert it into a business action, and we add a column on the database to track that (decommissioned_at). This would work very similarly to the soft delete column, but now the business action is transparent. It is not a delete, it is a decommissioned. And it is not in every entity, just in those that need it.

When we just blindly make everything soft-deletable we obscure these distinctions and encourage developers to not think about these and other similar issues. This hinders domain modelling, as we are thinking more about a CRUD approach rather that a complex domain.

Conclusion

Soft deletes might appear very convenient, but they can be quite deceitful. Not only do they hinder performance, but also obscure your domain logic and your business constraints. I would advise you to stay away from them.

Taming Incidental Complexity

Matías Navarro-Carter — Sun, 10 Apr 2022 10:19:04 GMT

Software development is a complex trade. Layers of abstraction, tooling, patterns, trade-offs, dependencies, and people, are some of the reasons behind hard things in Software development. But there is also a big source of complexity: ourselves and our poor choices.

Incidental complexity is the technical term for this kind of complexity. It is defined as anything in software that is hard but does not need to be. Its when we shoot ourselves in the foot and make our lives harder for no reason.

Almost every single project or company out there has some degree of incidental complexity in it. Ive seen plenty of it myself and also caused a good deal of it too. Ive analysed it and tried to understand its sources by talking to people, asking questions and seeking to understand the historical developments behind a system or company.

After some study, I have come up with some suggestions to avoid shooting ourselves in the foot regarding making things complex.

Avoid Coupling

Almost every single mess of a system finds itself in that state of messiness due to coupling. Coupling, in my opinion, is the most dangerous thing in Software Development. Its like a camouflaged predator waiting to strike his prey: you never see it until is too late.

Part of the reason why we dont see it is because coupling is not a bad thing in and of itself. What I mean by this is that coupling two things together is not a sufficient requirement for disaster. It is a necessary one, of course, but not enough by itself. The missing ingredient in the mix is change. Coupling only shows his nasty face when one of the things that have been coupled changes.

This makes identifying coupling something rather tricky, and you dont usually realise you are making a mistake until something is hard to change. And it may be that even when that happens, you rationalise it by blaming it on some other external factor (Oh the client keeps changing the requirements). But truth is that we have brought that upon ourselves by making our software hard to change.

When you are building something, at all times you need to ask yourself the question: Is this coupled to something? and if it is, Is this thing likely to change?", and it if is, How costly would it be to change it? Those are some of the most important questions you can ask yourself when you are building something.

Whether you couple your application to a particular database engine (Im talking of you, Active Record), couple your code by using inheritance instead of composition or couple your suite of microservices using REST instead of an event-driven approach, you are going to have a hard time when things have to change. And if things change, you will pay the price of that change. There is no escape.

Systems do not need to be coupled. It is mostly our naiveness and inexperience that causes a system to be a tangled, hard-to-change mess. Keep an eye on what is coupled to what and have measures in place to remedy or mitigate the impact of a change.

Avoid Over-Engineering

Over-engineering is the habit of solving a problem with more engineering than necessary to solve it. In other words, when a simpler approach would have been possible to solve a problem but instead a more complex way is preferred.

This is particularly true of code generation tools. There is a lot of value in generating boilerplate code, dont get me wrong. But this is usually what an IDE is for. In my experience, there have been very few times in which I have felt the burden of writing a class by hand. True, it would have been nice to be able to autogenerate code from a spec or something at that moment, but that feeling quickly evaporates when I remember that autogenerated code is often opinionated, ugly and outdated.

As a side note, I think that abstraction over code generation is a better approach to solving the problem of writing less code. Metaprogramming is an excellent approach for these kinds of problems. Although one can argue that writing code that leverages metaprogramming is inherently complex, the result ends up being a simpler API surface. So yeah, it is hard to do, but easy to use! I would say that is exactly the goal. Code generation tooling is hard to write, and not as easy to use (both the tool to generate the code and the generated code itself).

But, there is a world of difference in considering to autogenerate code for a massive API than when you have to do it for a small one. I still remember the frustration I experienced when I was forced to set up autogeneration for all the models of an integration I was building, even when many of them had two fields at most! It took me more time to set up all the autogeneration fluff than it would have taken me to write the models by hand. But well, the policy in that project was to do it that way.

When there is a simpler way of achieving the same result, go that route. Dont try to get clever just for the sake of it. Yeah, sounds pretty cool to write your routing library for this project, but, do you need to do that?

Testing is another source of over-engineering. People build these stateful mock servers because apparently, they need to test third-party APIs using the HTTP protocol as if someone would not have tested that already. In memory, mocks are much simpler and dont cause you the pain of setting up extra dependencies every time you want to test your application.

Simplicity is a rare jewel these days, not just because it is hard to find, but because sometimes it disguises itself as its shallow cousin: convenience. But simplicity tends to be found there where pragmatism trumps dogmatism. Make sure you are focused on practical stuff when solving a problem, and no over theorising it and trying to justify complex approaches based on remote possibilities.

Avoid Centralising

Organisations and systems evolve and grow more and more complex. This is natural and expected. What is not natural or expected is that, sometimes, we want to tame that growing complexity (that is, make it easier to manage) by using some form of centralised solution.

The problem with centralised solutions, especially in distributed systems or companies, is that by claiming a global benefit, they cause specific harm. Centralised solutions remove autonomy and create friction. They operate by removing the control of something from the subsystem or department where its function is defined, takes place and evolves, toward some central place that blurs the particulars. This dramatically impacts the further development of the thing to which control has been surrendered.

Take software documentation, for instance. Companies that choose to centralise documentation in a single place (a company Wiki or something else) usually do so by building a narrative of the benefit of having all knowledge in a central place of convenient access but at the cost of affecting the writing of the documentation itself. Oftentimes, due to the distance between the owner of the documentation (the code repository) and the central Wiki, these former ends up being completely outdated and seldom used.

There are other ways in which centralising is extremely dangerous. The central part becomes critical and a potential single point of failure or trouble. This happens with a central database used by many microservices services, or a central service where every business activity has to go check something before anything else happens (think of centralised access control platforms).

I think that as long as this central thing is disposable or invisible, then it is a good central thing. In other words, when the central thing is not the source of truth because the truth originated somewhere else. Think about Git, for instance. It has a central repository that everyone uses to coordinate, but the changes and the work happen in your local machine and then you push them, causing at least one person to always have an up-to-date copy of what is in the central repository. Git is a distributed system by nature and design, not a centralized one.

Same thing with documentation. If we need a place to solve the problem of information visibility, then let us solve only that problem. What prevents us from having an automated process push the documentation of a repository to this central wiki, instead of writing it in a separate place? The problem is not the writing of the documentation, so why that has to change? The real problem is its visibility of it: lets make it visible, without removing ownership (More about this in the next topic).

When centralising something, make it work as a repository. Not as the source of truth (this is, change), but as the place where all the changes made elsewhere go. Every technology that has any sort of centralised repository works this way: docker, composer, npm, git. Changes happen next to their source, and then they are published for the rest of the world.

If you make the repository the place where things change, then you are going to have a hard time keeping it in sync with the actual thing that changes. Synching is an added pain that you dont need to bear. It is incidental complexity.

Avoid Solving Pseudo-Problems

I call pseudo-problems problems that are not clearly stated. You see, most of what people call problems are just a preferred approach over small friction that usually touches the real problem but does not address it completely.

Take this centralised wiki as an example. If the problem is access to the information, then solve just that problem. If you force everyone to write documentation far from the system that documents, its going to cause another problem.

Problems need to be clearly stated before solved. Otherwise, you might be solving the wrong problem, or even worse, apparently solving the problem but causing others in the process.

This is too familiar a story, and it is such a painful one. An organisation has a monolith that is coupled, hard to change, buggy, and hard to make sense of, all the possible bad things that can happen with a monolithic system. Then someone comes and says We need to split this into microservices, and then they proceed to break the components of such monolith into different REST APIs. So, say that it was an e-commerce application. Now we have an orders service, a customer service, a fulfilment service, a payment service, and so on.

Nice! Every microservice responsibility is now clearly delineated. I would say that is a win, right?

Well, if you think about it, maybe not. What was the initial problem? Were the responsibilities of the monolith not clearly defined? Maybe. Was it the main problem? I dont think so.

When software is hard to change, it is usually due to coupling. Back to point one here. Did they solve the coupling? I dont think so. Now services are coupled via a network protocol (and not an in-memory routine), changes are less transparent (different teams work on different services now) and changing one service would impact the other in ways that are harder to spot (now we rely more upon logging). I would argue that this left them in a worse place, as managing changes across teams and distributed deployable units is harder than in a traditional in-memory application.

The solution should have been refactoring the initial application slowly to take care of the technical debt that made it hard to change and reason about.

This happens way too often and with many things. People throw tech at problems, not engineering. As a result, they think they solved a problem when what they did is just go around it and create some more in the process.

The only way Ive found to fight this insanity is to refuse to solve a problem until it is clearly stated and demonstrated it is a problem and why. Only then a reasonable discussion about potential solutions and their benefits and drawbacks can happen. Solving the wrong problem is one of the worst sources of incidental complexity.

Focusing on the Outcome

Matías Navarro-Carter — Sat, 12 Mar 2022 11:30:49 GMT

When I was in school I struggled a lot with math (I still do, to be honest!). My results at school were so worrying that my parents hired a private tutor to teach me math. He had an awesome talent to explain complex things in a very clear way, and also taught me different methods to solve equations or other mathematical problems. I was able to see the effect of his teaching pretty quickly, as I improved a lot in the weekly take-home exercises. But when the day of the test came, all my hard work seemed to suddenly be meaningless. The professor had told the class that we needed to use the methods he taught us to solve the problems in the final exam. Otherwise, we would be discounted a lot of points even if we reached the correct result. I thought that was so not right.

Probably you had the same experience at school or uni. Or maybe you were blessed and had a teacher who focused more on the outcome rather than on the method. Maybe you were privileged to have a teacher that would show you many different ways of doing things, their pros and cons, and would empower you to choose what would fit better for you.

You would be surprised to know (or maybe not so much) that stories like this repeat a lot in the workplace today, especially in tech companies. Maybe you have seen it for yourself. Maybe the task is to build a REST API, but you need to use this particular database engine, or this particular framework or library. Or maybe you need to build a payment microservice, but you must not use any form of async communication. Or you should only use docker for development. Or you need to test your code, but using this tool that the company built specifically for that purpose.

Method Fixation

Why this fixation with the method? Isnt that as unjust that this math teacher forced you to use his particular approach? Why do we constrain people that have an extremely developed capacity to learn and assimilate stuff (Software Developers) to a repeated formula or method? Why not, instead, focus on the desired outcome that we want, instead of telling people how to do stuff?

Some people say that it is for the sake of maintainability. By having One Right Way of Doing Something we ensure that we can move people around and replace them if we need to. But I would say that is an illusion. Sounds plausible on paper, but it does not work in real life. There is always a cost of moving people from one project to another, even when those projects use the same stack or underlying framework. They need to learn the ropes anyway, and they might find it hard to become productive quickly.

When I talk about this with some of my colleagues, they freak out. They think Im advocating for some sort of wild-west IT company where everyone does whatever they want. That could not be more wrong, let me explain.

Im advocating for a place where people do what are they told to do and the expected outcomes are but have the freedom to choose how they want to get there. Of course, that freedom is limited by some general constraints. For instance, I wont have a guy writing an API on Scala, because we have no one at our company that knows it. But, we use PHP, Go and Node quite a lot. I would let them choose and have them explain to me why she would choose that language over the other, just to see how they reason. Framework? You pick!

Instead of putting the focus on the process, we should focus on the outcome. I wont tell her how to do it, but what I expect the result to be. So, if I ask a developer to write me a REST API, this is my personal desired outcome:

Test suite with over 90% coverage
Stubbed or mocked unmanaged dependencies directly in code
Proper CI Pipeline, testing with even future versions of the language runtime
A readme that explains what the API does and how
Consistent error handling
Excellent Developer Experience (preferably, I should be able to clone the project, run one command and be ready to work and run the test suite)
Open API specification that is reliable

Note how these things dont say how you have to build the REST API, but what it should have. The schema is my favourite example. So, should you start with the schema and generate the code? Great, do it that way. Should you write the code and then generate the schema out of it? Works too! As long as the schema accurately represents how the endpoints work, you have my full support!

We can bring this to testing too. Would you write unit tests or full E2E tests? Again, whatever you like, as long as you know the tradeoffs. You can achieve high coverage quickly with E2E tests, especially in large codebases, but tests tend to be more brittle. Unit tests are slower to write but more resistant to change.

Again, the most important thing is that the Engineer chooses her path to a solution. Of course, inexperienced Engineers would need an overview of the possible paths, but even there I would encourage them to explore extra solutions.

The Technification of Tech

This topic is super important to me and a subject I can get very passionate about. And the reason why is that I love Software Development so much, that I refuse to make it a factory profession. I refuse to ask engineers to copy and paste a solution or a template, fill in the blanks and move on to the next thing. How am I doing them a favour? How am I helping them to grow, develop, try new things, and be challenged? How am I fostering innovation, and thinking outside the box? Did we put candidates through an interview process focused on evaluating their problem-solving skills and analytical capacity to have them only repeat a method? How am I empowering them to make their own mistakes and gain experience from them?

Imagine chefs cooking different dishes using the same recipe, artists painting with the same technique, or musicians using the same chord progression and instruments. Restricting any activity that requires creativity is almost a crime. It asphyxiates development in whatever area or field we are working on.

The problem is that in todays highly producer-consumer-based society, there is no time to be creative. There is no time for the appreciation of work well done in engineering. Templated, quick solutions are the norm. We are battling the technification of technology itself. The latest developments and pushes for low-code, no-code platforms speak of such a reality.

It is true. These tools provide massive gains in productivity (if they are flexible enough to support your use case). The problem is that we are raising a generation of Engineers that have just the required high-level notions to use some tool. In such a world, the best Engineers are the ones who go deeper in understanding the low-level notions that power the tools they use. And I want to both further the development of that kind of Engineer and work alongside them.

Empowering People

When people, especially creative people, are coerced into an automatic way of doing their job, we are doing them a massive damage. We are effectively lowering their value. We are telling them that we believe them incapable of coming up with something better and more efficient. Highly creative people will soon leave roles like that. You will be stuck with people who value comfort more than the creative development of their trade.

But then we treat our companies as mere assembly lines of a factory and equate our workers to machine controllers. It is no secret that factories play a massive role in alienating the value of individual work. Everyone can move a handle with some degree of coordination. Not everyone can make a machine that does amazing things just by moving a handle with some degree of coordination.

Engineers are not resources. They are people in a highly creative trade. They are not easily replaceable or interchangeable. They will stay in a place that fosters their creativity and leaves them room to do so. This goes further than just getting them mere access to a learning platform. They need an outlet for their creativity, they need a blank canvas on which they can try new things and approaches. They need autonomy, and people in higher places trust them to do a good job. They need to know the whys, not the hows.

This freedom comes with a great responsibility of course. Now, all of a sudden, an Engineers decision can have a bigger impact on a project or company. And depending on the quality of that decision, the impact can be good or bad. There are two answers to this. (1) If you empower your engineers, they will feel the higher burden of the responsibility they have, and therefore look to do a better job researching well and providing the best implementation they can. (2) They will make mistakes of course, and bad decisions. We all do, so a good review process whose focus is on guidance is also a good alternative.

If after those two things, still, some bad outcomes slip through, a call to remember reality is necessary. We work with people, not gods. We will make mistakes. Make sure to keep empowering the people that, after they do, review their mistakes and are determined to not make them again.

The Golden Rule of Writing Code

Matías Navarro-Carter — Fri, 26 Nov 2021 11:34:27 GMT

Every now and then I read a piece about DRY vs WET, explaining their benefits over its counterpart. Most of the time, these pieces are quite unhelpful, as they are way too vague and general. They lack a crucial thing, and that is context.

I believe this is the most common aspect on disagreements about any topic in software development. Someone writes a piece about X or Y topic arguing a point, but then another piece comes up arguing a different case (or sometimes totally the opposite one). Almost always both authors are presenting valuable and correct ideas, but their conclusions are different. Why?

I believe this is due to a lack of context, and this happens a lot in the DRY and WET discussions.

For instance, I read this article that poses yet-another paradigm for dealing with the problem of when to abstract.

You should absolutely read it. Basically, the author summarized DRY, its benefits and problems. Then it summarizes WETs benefits and problems. And then goes on to present its own acronymic alternative: AHA.

So, this is the state of the art now in the debate (Im intentionally reducing the authors ideas to a few words here):

Someone proposed DRY: We should avoid repeated or duplicated code by abstracting it out.
Someone proposed WET in reaction: We should wait until the duplication is real. People abstract too early.
Someone proposed AHA in response: Even when there is real duplication, you dont know the future. We should wait for the right abstraction.

Thats it, in a nutshell. But all this is not helpful at all. There are a lot of questions left unanswered. Does DRY means I can have no duplicated code at all? Is all duplication bad? When is it too early to abstract? How can I know when is the right time to abstract something away?

Articles dont answer that question, because they cant. And they cant because the answer is it depends. Depends of your own particular codebase and problem, and the context surrounding it. Your problem cannot be reduced to an easy-to-remember acronym created by a guy who is not dealing with your particular issue.

Proof of this is this wild declaration made by the author of the AHA article. It says this when it summarizes his approach:

I think the big takeaway about AHA Programming is that you shouldnt be dogmatic about when you start writing abstractions but instead write the abstraction when it feels right and dont be afraid to duplicate code until you get there.

That is some terrible advice. The points he raised in his article were all valid, but the conclusion is just plain terrible. In an attempt to escape from dogmatism (which is always good), the solution to the problem ends up being reduced to a mere feeling.

When does it feel right to code an abstraction? I dont know, even though in some of my projects I could have an idea. But most certainly, someone will feel different.

I think we need a bit more objective advice on how to do this.

Going Back to the Roots

DRY as an acronym was coined with a purpose. A purpose most people seem to ignore. Why DRY is good? Because if you have to correct or fix duplicated code in the future, it will be harder to do, because you will have to change it in the multiple places it was copied. Why is WET or AHA good? Because if you choose the wrong abstraction and then need to change it, it will be harder to do so in the future.

For me, there is just only one rule to follow when writing code. Is not an easy one: most of the time youll have to figure out the answer and sometimes youll get it wrong. The application of this rule can take many forms and faces, and so it will require you to be wise. Here it is:

Should requirements change in way X, could this code Y be easily changed?

Thats it. There you have it. That is all you need to know for starters. From that rule, every software principle, every design pattern, every acronym flows.

Of course, like every summarized thing, this rule needs explaining. Let me break it down in pieces.

Should requirements change

This is kinda obvious. Code changes all the time because of requirements. Code is never finished because of requirements. If you have ever done client work you should know this very well.

In every software project where changes happen, you must be prepared for them. This requires you to anticipate. Anticipation is one of the most important qualities of a seasoned developer.

Im currently learning how to drive and my instructor keeps telling me Im missing a very important skill: anticipation. I need to look ahead and be prepared for possible things that may occur. I need to look at the signs that could cause me to stop or do another manuever, all the time. He keeps telling me what he has come to call the golden rule of driving: we drive not for what is happening, but for what could happen. Im just borrowing his wisdom here.

Coding is not much different in that aspect. When coding, we are not coding just for what we are asked, but what we could be asked. We need to be aware and read the signs, and that just takes experience.

If you are in a project in which requirements dont change much or where you are in full control of them, then none of this applies to you.

in ways X

If you read the rule well, youll notice there are two variables: X and Y. This is the first one of them.

I represented the change using a variable because we do not know in which ways the requirements will change. This seems to contradict my previous point, but it does not. Let me explain.

One thing is to try to anticipate, another very different one is to know in advance. Some people confuse the two of them and they say: Since I cannot know in advance, it is pointless to anticipate. They couldnt be more wrong.

Only God knows in advance, we agree on that. The purpose of anticipation is not knowing, is preparing. We read signs of things that might happen so we can be prepared if they happen. We dont know if they will, but the key verb here is to be prepared.

You dont know if you are going to lose your job, but nonetheless you try to be prepared for that by saving up every month. Not knowing in advance does not rule out being prepared for something unexpected.

Now, here is when you are pretty much on your own, and when context is key. Only you know in which ways requirements may change. It could be that they ask you to use another database engine, or support multiple methods of authentication, or be able to configure certain parts on the system on demand. It could be anything, but you must always be looking out for the signs.

Usually, when requirements are confusing since the beginning of a project, thats a very good indication for me when things might change.

could this code Y

So, the second variable subject here is not only the changing requirements, but the code you are working on that could possibly be impacted by that change of requirements.

For instance, if a requirement is to convert files to PDF, that is a very good indication that whatever code you are using to do that, should be properly isolated so it is easy to change. PDF conversion tools abound out there, and it might happen that youll need to change one for another.

For most if these kinds of problems you can get away by coding to an interface. Define an interface in your application and implement it. The interface should be small and generic: pass just the enough arguments required to do the conversion.

Then, have your code use the interface. You can be sure it will be easy to change in the future.

Again, this is a very particular example, but it has a very wide application. Interfaces are the best way to protect your code from changes. If you design them well, you will have an easy time swapping the implementation for something else.

Bottom line, you and you only know your code. Be wise.

be easily changed

Now, if the rule were only Should requirements change in way X, could this code Y be changed that would be a terrible rule. The word easily is key.

Maybe you come to the conclusion that a potential requirement X could make your code Y change, so you want to refactor. But the impact is not that big so leaving it unchanged would still make it easy to refactor should that requirement X become a reality in the future. In that case, dont change it.

You should only refactor code when the potential of a requirement change will make it hard to change. Sometimes, changes are not that hard. If a piece of code is duplicated in two places, that is not that hard to change.

Remember, the goal is that we want code that is not hard to change.

On Picking the Wrong Abstraction

Sometimes, people talk about picking the wrong abstraction early as the worst mistake you can possibly do.

Ive done this so many times, and it is not that terrible at all and quite simple to correct. I have many stories about this.

I once coded an interface for money conversion in a project I was doing. It was very simple. It was called MoneyConverter and has one method Convert. It took a Money instance and a currency as arguments and returned a new Money instance with the converted amount.

I implemented that using an external api, but that does not matter. This is how the interface looked:

interface MoneyConverter{    public function convert(Money $money, Currency $currency): Money;}

My client code just used that. Never knew anything about an api or any other implementation detail.

It was a good idea until someone said We need to display to the user the conversion rate in which the conversion was made. Too bad for me, my interface did not returned that information: it just returned the converted amount. So, I modified the interfaces return argument to be another object called Conversion that was holding two things inside: the Money class and the rate.

interface MoneyConverter{    public function convert(Money $money, Currency $currency): Conversion;}class Conversion{    // Original properties were not public. This is just to save space.    public Money $money;    public float $rate;}

Now, I had to update every part of the code where the MoneyConverter interface was used (around two or three places) so they were handling correctly the returned value and then I had to correct the implementation.

Now, two questions.

First one is, could I have anticipated to the requirement? Absolutely! It seemed something very reasonable to ask and expect. If I ever write a conversion service again it will always expose this information by default.

Second is, was it a hard change? For me in this project, it wasnt. And I think here it is where all the fears of coming up with the wrong abstraction come from. It was very easy for me to spot all the places where I had to change this because I had all the code using the interface covered with tests, as well as the implementation using the API. As soon as I changed the interface and the implementation, my tests exploded, showing me exactly where and why were failing. Either myself or any other developer could have made the change, because of the robust test suite.

The reason why maybe coming up with the wrong abstraction is such a fear, is because people dont write tests. And tests are essential into making a piece of software easy to change. Tests are not so much for ensuring correctness of a program: they exist to help you confidently change your code.

You will make wrong decisions when writing code, and sometimes that will lead to disruptive changes. Be prepared for that with a good test suite. Again, its all about being prepared.

Conclusion

Much of the coding best practices and design patterns out there have the sole goal to make your code easy to maintain by ensuring it is easy to change when it needs changing. This is the most important aspect you must keep in mind when writing code, more than how many times you copy code or if you feel it is right to abstract something or not.

Will this be easy to change is the most important question you can ask yourself about your code, and only wisdom, experience and good judgement can help you answer it. Making sure it is easy to change by coding well (DRY, WET, AHA, Design Patterns) and writing tests is crucial.

Testing HTTP Code in Golang

Matías Navarro-Carter — Mon, 09 Aug 2021 10:37:10 GMT

The Philosophical Answer

If you think about it, it is not an easy question to answer. Maybe you have already some strong opinions formed about it. But, in my experience, answers to this question differ greatly among developers, even among seasoned ones.

I believe those differences are due to some preconceived ideas or different definitions of what testing is. For example, some people believe that testing is making sure your code works. The problem with that definition is that it is too vague; it works can mean anything.

Take, for instance, the following (too familiar) situation: you are told to code a service (or SDK) that integrates with a third party API or HTTP service. As is usual with integrations, confusing or incomplete specs are passed around. Nonetheless, that is sufficient to do your job. You decide to test creating a mock server based on the spec and build your suite to a very good coverage. So far so good.

But, when the day comes to do some acceptance testing against the QA environment of the service you are integrating to, you realized nothing worked. Turns out services needed an extra header that was not included in any spec. No problem though; you add it to the code, update your tests and move forward.

Ill come back to this story to explore other relevant topics later. For now, I want to use it to ask you a question. Would you say that the initial version released to test against QA worked? Well, its a tricky one, isnt it? It did not worked in the sense that it did not integrate correctly because of the missing header. It worked in the sense that the program did what it was coded to do with the available knowledge at the time.

Based on this, I would like to make the main point of this article, from which every other point flows. Testing is not making sure your program does something correctly; testing is making sure your program does what the code says it does. In our previously mentioned story, we cannot ensure a correct integration until we have hit a real service (and not a mock), but we can have good tests that ensure the program is doing what we have coded it to do.

I think the distinction of these two is greatly accentuated in integrations with third party HTTP services. Unless you have an spec that is automatically generated from the service code, until you start hitting endpoints, you can never know for sure if you have integrated correctly or not.

In my opinion, the sooner we embrace this reality, the better. Once we do, we will be able to go and ask ourselves the next question.

What then do we test?

So, if we cannot test that our SDK integrates correctly with the service. What do we test then? The answer is: we test that our code follows what we understand of the specification we were given.

For instance, if the specification says that we should send an Authorization header with some sort of token, we test that (1) A request is created containing the Authorization header and (2) that the passed token value is indeed the same that is injected in the header. Similar principles follow for URL, method and body.

The following of a specification does not have only to do with the expectations about a request, but also with the correct handling of a response. This means we should also test that our code follows the specification when handling responses.

We should map status codes to certain errors, or react to different content types, or deserialise certain payloads to some types without data loss, etc. We should test that our code does this based on the spec.

Okay, Im sure you are getting very impatient and want to get to the how of testing a third party http integration. Just allow me to say one more thing.

What are we not testing?

Many developers understand all that Ive written. So, they take their keyboards and decide that the best way to test the aforementioned things is just by spinning up a temporary web server process, listening in a random port, that is pre-configured to respond to requests mapping certain methods and URLS to certain responses. Most people call this a mock server.

No blame on them! Ive seen this approach being endorsed by really prominent Go developers. And I think it is specially prominent in Go due to the fact that it is indeed very easy to spin up a server in a separate Goroutine.

However, this approach is often unnecessary and overly complex. Let me explain why.

We are not testing TCP/TLS/HTTP!

First, there is no need to send our *http.Request over a TCP socket to a server, have the server parse the request and end up with a *http.Request again in a completely different process, that then will be passed to a handler that will match our request and return a response.

http.Request --> Http Client --> TCP Socket --> Server --> Http Parser --> http.Request --> handler -> http.Response

We can simplify this massively, bypassing all the TCP, server stuff and just doing things in memory, for instance, in a function.

http.Request -> function -> http.Response

And this is fine, because we are not testing TCP, nor TLS, nor the HTTP protocol. The Go standard library already has tests for all those packages and functions. We want to test our code.

In order for us to test that our code complies to a spec, there is no need then to spin up a web server.

Plus, if something happens with that server, it will be really hard to debug.

We are not testing routing!

Even when not using server-over-TCP mocking techniques, but in-memory ones, some people still go with building some kind of in-memory testing server that returns responses based on some matching logic. Usually this takes the form of matching the method and the url.

Again, this is completely unnecessary, and it could lead to undesirable side-effects in testing, plus a couple of more issues.

It is unnecessary because, remember, we are testing that our code conforms to a spec. In other words, we are testing that we send a request with the correct contents and that we are capable to handle certain responses. We are not testing routing (that a request with a certain method and URL with gives us a certain response).

This approach usually leads to side effects. Since this massive, respond-to-everything, in-memory mock of a server needs to be configured somewhere, it usually is outside the tested code. If someone changes an id, or accidentally creates another request with the same url there is potential breakage.

Also, there is no clear contract regarding to what should be the response when a request of this mock cannot be matched. This usually weakens error handling code.

Finally, a mock like this ignores the fact that some HTTP operations are not idempotent: the same method and url combination can and will give different answers based on the internal state of the server at the time of the call. It is really hard to mock that using this approach.

Its better not to try to play any matching games and do something deterministic and straightforward.

The Practical Answer

Now that I have ranted enough about these things, is time I explain my proposed approach.

Lets suppose that we have an third party service with an endpoint POST /input. This endpoint takes a application/json payload that only contains one key message, and can be an string of any length.

The service returns an application/json payload with the same structure: again, the object with a message key.

This is how I would implement it in Go. Read the comments so you get a better understanding.

package fakesdk// A main client struct to hold everything togethertype FakeApiClient struct {    client  HTTPClient    baseUrl string}// A constructor to make that client with good defaultsfunc NewFakeApiClient(baseUrl string) *FakeApiClient {    return &FakeApiClient{        client: http.DefaultClient,        baseUrl: baseUrl    }}// Some people like to make an interface with the same signature// as http.Client.Do function so they can swap implementations for// testing. http.RoundTripper can do this already, but well,// everyone has their own preference.type HTTPClient interface {    Do(req *http.Request) (*http.Response, error)}// This creates the request. Pretty standard stuff here.// The only detail is that we need to serialize from json and make// sure we put the right content type.// Oh, and that we pass the context to the request!func (cl *FakeApiClient) mustMakeRequest(ctx context.Context, method, path string, input interface{}) *http.Request {    var body io.Reader    if input != nil {        b, err := json.Marshal(input)        if err != nil {            panic(err) // Developer error        }        body = bytes.NewBuffer(b)    }    url := cl.baseUrl + path    req, err := http.NewRequest(method, url, body)    if err != nil {        panic(err) // Developer error    }    req.Header.Add("Content-Type", "application/json")    return req.WithContext(ctx)}// The fake input structtype FakeInput struct {    Message string `json:"message"`}// The fake output structtype FakeOutput struct {    Message string `json:"message"`}// This is the actual method that will be used in client code.// Pretty standard stuff too. Sends the request and handles any error.// Also decodes the payload.func (cl *FakeApiClient) PostInput(ctx context.Context, input *FakeInput) (*FakeOutput, error) {    req := cl.mustMakeRequest(ctx, "POST", "/input", input)    res, err := cl.client.Do(req)    if err != nil {        return nil, err    }    defer res.Body.Close()    if res.StatusCode >= 400 {        return nil, errors.New("server responded with code %d", res.StatusCode)    }    out := &FakeOutput{}    err = json.NewDecoder(res.Body).Decode(out)    if err != nil {        return nil, err    }    return out, nil}

Now, the only thing I need to test is that I send the correct request and Im capable to handle all possible responses or eventual socket errors. Thats it. Nothing else.

Sending the the correct request in this case means that the method is correct, the URL too, that the body gets serialized to JSON correctly and that the compulsory headers are present and with the correct values.

Being capable to handle all possible responses means that I should code expectations for when my code fails. For instance, if I get a response with a status code 400, then my code should return an error saying server responded with code 400.

Now, doing all these checks on the request and building all the responses for every test case would be very verbose. Luckily, Ive created a package just for that. It is called httpclientmock. It is extremely simple and straight forward, and you are meant to use it in your test suites like this:

package fakesdk_testvar postInputTests = []struct {    name       string    input      *FakeInput    mock       *httpclientmock.Mock    assertions func(t *testing.T, output *FakeOutput, err error)}{    {        name:  "test one",        input: &FakeInput{"This is a message sent"},        mock: &httpclientmock.Mock{            Expect: &httpclientmock.Request{                Method: "POST",                Url:    "",                Headers: map[string]string{                    "Content-Type": "application/json",                },                Body: []byte(`{"message":"This is a message sent"}`),            },            Return: &httpclientmock.Response{                StatusCode: 200,                Headers: map[string]string{                    "Content-Type": "application/json",                },                Body: []byte(`{"message":"This is a message received"}`),            },        },        assertions: func(t *testing.T, output *FakeOutput, err error) {            if output == nil {                t.Error("no output")            }            if err != nil {                t.Error("an error has happened")            }        },    },}func TestPostInput(t *testing.T) {    client := &FakeApiClient{http.DefaultClient, ""}    for _, test := range postInputTests {        t.Run(test.name, func(t *testing.T) {            // Inject in client mutates http.DefaultClient transport.            // The restore function restores the previous transport.            restore := test.mock.InjectInClient(t, nil)            // We defer the restoring of the previous transport when the test finishes            defer restore()            // Pass the input            out, err := client.PostInput(context.Background(), test.input)            // Assert about the output            test.assertions(t, out, err)        })    }}

The benefits of using this library are huge. First, its ability to modify http.DefaultClient responsibly means you dont need to worry about dependency injection too much when setting up tests that send requests very deep in the call stack. So, you could use it for E2E tests without a problem.

If you wish to use better practices like dependency injection, no problem, we got you covered. httptestmock.Mock has a method called BuildNewClient that will give you a *http.Client. You can also call GetTestFunc and this will give you a TestHttpFunc, which is a type that implements http.RoundTripper and another function that has the same signature than Do in http.Client. You can integrate this library into your code in all these ways.

You are probably thinking Oh this thing modifies the global http.DefaultClient. That could cause massive side effects And yes, you are correct. This is why InjectInClient returns a function. Calling it will restore the state of the client to what it was before the test. And you must make sure to defer that, so no other tests can potentially be affected by the mutation.

You can keep on adding more tests in the block, with different payloads and different responses, writing expectations for every case. All the information of the test is in the test itself. No need to chase other files or look in logs from another process.

Also, no side effects. All the state of the world lives there in your test run. Your response will be what the Return property indicates will be. No surprises. Thats how a test should be.

Learn By Looking

If you need a more comprehensive example. You can take a look at this library Im building. Its an SDK for a third party HTTP service from Chile called Transbank. One of its services, Webpay, allows you to integrate with their payment gateway. Im using httpclientmock to test the integration.

Going Solo

Of course, you dont need to use a custom library for testing. As long as you can create your own http.RoundTripper (with some assertions about a request and the building of a response) and pass that to the http.Client you are using, youll be fine.

The Business Answer

Now, all good so far? Well, not so fast! As you know, we are in the real world trying to bring value to our business partners. And they dont care as much about in-memory deterministic testing or testing theory or stuff like that. They want to know the answer to one single and simple question: will this integration work for the end user.

User Acceptance Testing (or UAT) tries to answer that question, and it is a freaking art. This is one of the most difficult tests to automate. We cannot use the tests suites we wrote because we are testing conformity to a spec, remember? Those tests cannot tell us if the integration will work, which is the purpose of UAT.

UAT tests deserve a test suite of their own, usually excluded from running in the normal CI process. They should be run prior to a deployment (usually against a release candidate version) against a real testing/staging environment. Is in this test scenario when it makes sense to hit a real server.

Now, it is not as simple as write some logic and hit the server. It is more complex than that.

You see, now we are dealing with all kind of side effects and statefulness. If we want to test an endpoint called GET /accounts/123 we need to make sure that the account with id 123 exists in the system before doing that, so we probably need to call POST /accounts first with some payload. In almost every UAT test there are some calls that depend on state obtained by other calls. For this reason, UAT test suites usually involve a journey. They test a complete flow, from start to finish. They are not isolated tests like your normal Unit Tests. Youll find yourself doing calls, retrieving state and then using that state to do some more calls that will continue to change the state of a server.

Of course, I dont have to mention that this could fail at any point of the journey for some weird reason.

Another thing Ive found to be really complex is that environment statefulness makes it really hard to reuse any kind of identifier, because you want to make sure that the test starts with a clean slate every time and you dont know if the third party service testing environment you are using cleans up every day.

Due to their messiness and complexity of implementing them automatically, UAT testing is usually done manually, but this is very prone to errors. At the end of the day, all depends on how complex is the integration you are building. You are in a lucky position if most of the calls of your third party service are idempotent, for example.

Conclusion

These are just some of my ideas, beliefs and past experiences with testing. I hope they can make some sense for you and that will guide you to take the best approach you can to test your integrations in Golang.