It's not microservice or monolith; it's cognitive load you need to understand first
How understanding cognitive load and team capacity can help you decide on an architectural style
TLDR
“Instead of choosing between a monolithic architecture or a microservices architecture, design the software to fit the maximum team cognitive load”
If you have only one team, consider adjusting your architecture to match the team’s capacity. Favour monolithic, cohesive and modular architectures.
If you have multiple teams, consider microservices or a similar type of architecture so they can work independently.
Not all cognitive loads are created equal. Different types of cognitive load affect how many quality outcomes teams can deliver. Organizations should endeavour to mitigate or eliminate intrinsic and extraneous cognitive load so that teams have mostly germane cognitive load.
The types of communication boundaries change significantly between single and multiple team architectures. Single teams are optimized to communicate via the codebase, documentation, discussions and design meetings. Multiple teams are better optimized to communicate via well-designed APIs (or libraries) that abstract the complexities of their domains.
Don’t bite off more than you can chew
This post is inspired by this quote in the book Team Topologies which I am now reading for the third time:
“Instead of choosing between a monolithic architecture or a microservices architecture, design the software to fit the maximum team cognitive load”
I really love this quote because it provides an excellent rule of thumb for deciding which architecture to choose, and it also comes with the wise caveat of not biting off more than you can chew.
So before you pick any type of architecture, what you need to ask yourself is how much cognitive load capacity your organization has available to meet the demands of the product you want to deliver and then adjust accordingly.
But how do you determine cognitive load capacity? And is all cognitive load created equal? Let’s try and answer that…
How to determine cognitive load capacity
Let’s look at the three types of cognitive load as defined by psychologist John Sweller, and at how understanding them can help us design a software architecture that matches a team’s cognitive capacity and assign the right amount of work:
1. Intrinsic cognitive load
This is how much expertise a team has in the domain they are tackling. If your team is tasked with creating an ML vision project but no one in it has any experience with AI, then their intrinsic cognitive load will initially be very high as they learn this domain. Ideally, teams should work with domains that mostly align with their expertise but also leave some room for growth to keep it exciting. A bit of intrinsic cognitive load is good because engineers love skilling up, but too much of it can cause significant delays and anxiety about delivering quality outcomes soon.
Single team
Even though a team can be cross-functional and have expertise across various disciplines, there is an upper limit of how much this can be stretched. If there is a significant gap in skills for the team to achieve what they need, consider enabling them by providing them with the necessary training and time to catch up.
If your project requires too much expertise across too many domains, you run the risk of turning the team into a group of individuals who each specialise in a separate domain. This will likely diminish the quality of their outcomes by making them work independently rather than as a team.
Multiple teams
Having multiple teams lets you split the domains across a larger number of people and gives you more choice in the expertise available to you. As an organization you may want to consider strategies to reduce intrinsic cognitive load by assigning no more than one complex domain per team, and by providing training or having enabling teams that can teach other teams to get better in a particular area. For example, an SRE team can help other teams understand SLOs and observability in general.
Recommendations
If you have a single team available to work on your vision and their expertise does not align with the problem you are trying to solve, consider enhancing the team’s abilities by providing them with training.
Don’t assign too many domains to a single team, or you risk them becoming a collection of individuals who do not work together and learn as a group.
Having multiple teams allows you to increase the scope of your software architecture with additional expertise. Align each domain with the right team for the task and enhance skillsets where needed with training or with enabling teams that can train and assist them.
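The "no more than one complex domain per team" guideline can be checked mechanically once you have written down which team owns which domain. The following is only a minimal sketch: the `Domain` and `Team` types, the `is_complex` flag and the example teams are all made up for illustration, and deciding whether a domain counts as complex remains a judgment call for the teams themselves.

```python
from dataclasses import dataclass, field


@dataclass
class Domain:
    name: str
    is_complex: bool  # hypothetical flag, set by the team's own assessment


@dataclass
class Team:
    name: str
    domains: list[Domain] = field(default_factory=list)


def overloaded_teams(teams: list[Team], max_complex: int = 1) -> list[str]:
    """Return the names of teams that own more than `max_complex` complex domains."""
    return [
        team.name
        for team in teams
        if sum(d.is_complex for d in team.domains) > max_complex
    ]


# Illustrative data only, not taken from any real organization.
teams = [
    Team("payments", [Domain("billing", True), Domain("invoice rendering", False)]),
    Team("platform", [Domain("observability", True), Domain("ml vision", True)]),
]

print(overloaded_teams(teams))  # ['platform']
```

A report like this does not fix anything by itself, but it can make an overloaded team visible before the work is assigned.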
2. Extraneous cognitive load
These are all the things that a team needs to do that do not contribute directly to the outcome of their work. Extraneous cognitive load can happen for multiple reasons, many of which may be inescapable or time consuming to fix, affecting the entire company. For example, a team may be bogged down with tedious infrastructure tasks, tedious company processes, yak shaving, waiting for tickets to be fulfilled by another team, or technical debt accrued company wide. All of these are extraneous cognitive load that doesn’t directly contribute to outcomes. Hence extraneous cognitive load is the least desirable type of cognitive load because it can create too much friction on the way to delivery.
In addition to extraneous cognitive load affecting the organization, the type of architecture can also, in some cases, add extraneous cognitive load:
Single team
Splitting a single team’s work into multiple microservices has the potential to add additional extraneous cognitive load in the form of costly communication boundaries that can fragment the team’s work.
Communicating via work is the best form of engineering communication, but it takes different forms depending on whether you are communicating across teams or within the same team. The ideal communication boundaries for collaboration across teams are well-designed APIs (or libraries!). But this is not the case for communication within a team, because all members will be familiar with the domain they are working on. Abstractions within the team aren’t as much of a requirement and can create extraneous cognitive load. In this case it will be the clarity of your code, your documentation, your design and discussions that drive your communication.
Splitting the work of a single team into multiple microservices, each with its own API, can add significant overhead to the work a team has to do to keep these APIs well-designed, user friendly and documented, and it can fragment the team’s work across multiple repositories and unnecessary boundaries. In this case a modular approach in a single code base may be preferable to keep consistency and focus within the team.
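To make the modular, single-codebase option more concrete, here is a minimal sketch. The `Inventory` and `Orders` modules are hypothetical, and in a real project each would probably live in its own package. The modules stay separate, but they talk through plain in-process method calls, so there is no network API to version, document and operate.

```python
class Inventory:
    """Inventory module: owns stock levels and nothing else."""

    def __init__(self) -> None:
        self._stock = {"widget": 3}

    def reserve(self, sku: str, qty: int) -> bool:
        """Reserve stock if enough is available; report whether it worked."""
        if self._stock.get(sku, 0) < qty:
            return False
        self._stock[sku] -= qty
        return True


class Orders:
    """Orders module: uses Inventory through an ordinary method call."""

    def __init__(self, inventory: Inventory) -> None:
        self._inventory = inventory
        self.placed: list[tuple[str, int]] = []

    def place(self, sku: str, qty: int) -> str:
        if not self._inventory.reserve(sku, qty):
            return "rejected"
        self.placed.append((sku, qty))
        return "accepted"


orders = Orders(Inventory())
print(orders.place("widget", 2))  # accepted
print(orders.place("widget", 2))  # rejected: only one widget is left
```

Because `Orders` depends only on the public `reserve` method, `Inventory` could later be moved behind a service boundary without rewriting the ordering logic, which is what makes this kind of monolith easier to split if the need arises.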
Multiple teams
When you have multiple teams, splitting the architecture into microservices gives each team the time to implement well-designed communication boundaries in the form of APIs, and that work ceases to be extraneous cognitive load. Because these APIs will be used by other teams, they have two main advantages:
It reduces the need for different teams to have to constantly talk to each other to get work done.
It enables other teams to self-service the abstractions of another team’s work. It mitigates or eliminates their extraneous cognitive load because they won’t need to be overly familiar with every domain, as long as they know how to use the API - which must be well-designed and intuitive.
In this case the API keeps the organization well-oiled as it reduces the cognitive load of other teams and unnecessary communication, helping every team get into the flow.
However if you split the domains across multiple teams and communication boundaries are not properly defined with elegant APIs or other self-service tooling, then you will quickly find yourself in a situation where every team is constantly communicating with everyone, trying to understand each other’s domain to do their job. This is far from ideal and will create a significant amount of extraneous cognitive load.
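As an illustration of an API that abstracts a team's domain, the sketch below has a hypothetical pricing team publish a small contract (`PricingAPI`) that a hypothetical checkout team codes against. All names, prices and the discount rule are invented for the example, and the call is kept in-process so the snippet runs on its own. Between real microservices the same contract would typically sit behind a network call.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Quote:
    amount_cents: int
    currency: str


class PricingAPI(Protocol):
    """The contract the (hypothetical) pricing team publishes to other teams."""

    def quote(self, sku: str, quantity: int) -> Quote: ...


class PricingService:
    """Pricing team's implementation; discount rules stay internal."""

    _BASE_CENTS = {"widget": 250}

    def quote(self, sku: str, quantity: int) -> Quote:
        subtotal = self._BASE_CENTS[sku] * quantity
        return Quote(self._apply_bulk_discount(subtotal, quantity), "USD")

    def _apply_bulk_discount(self, subtotal: int, quantity: int) -> int:
        # Internal detail that consuming teams never need to learn.
        return subtotal * 90 // 100 if quantity >= 10 else subtotal


def checkout_total(pricing: PricingAPI, cart: dict[str, int]) -> int:
    """Checkout team's code: knows only the PricingAPI contract."""
    return sum(pricing.quote(sku, qty).amount_cents for sku, qty in cart.items())


print(checkout_total(PricingService(), {"widget": 10}))  # 2250
```

The checkout code never sees the discount logic, so the pricing team can change it without a meeting, and the checkout team does not have to learn the pricing domain to get its work done.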
Recommendations
It is a joint engineering and leadership effort to ensure that all extraneous cognitive load is mitigated or ideally eliminated. This can be done by simplifying processes, reducing yak shaving, abstracting domains in the form of APIs for other teams to use, designing architecture in a way that does not create redundant communication channels, etc. If reducing or eliminating extraneous cognitive load is not possible at the moment, it must be taken into consideration when assigning work to a team. Too much extraneous cognitive load limits the amount (and potentially the quality) of outcomes a team can deliver and can lead to burnout.
A single team working on multiple microservices can introduce additional extraneous work in the form of communication boundaries and fragmenting the work within the team. It may be preferable instead to work on a modular monolith that can later be split if necessary.
Your communication across teams should be done predominantly with self-service tooling, be it APIs, libraries, CLIs or whatever works. Too much communication causes friction and hurts flow, because teams end up talking to each other when they shouldn’t have to.
3. Germane cognitive load
Germane cognitive load comes from solving problems that relate to the domain the team is working on. It is what leads to flow state, general developer happiness and quality of outcomes. This cognitive load is mostly positive, provided that teams are not overloaded with too many objectives in too short a time. The more intrinsic and extraneous work you can mitigate, the more germane work your team can focus on.
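One way to picture that trade-off is as a simple budget. The sketch below is an assumption made purely for illustration, not a model from Team Topologies or from Sweller's work: it treats cognitive load as if it could be expressed in arbitrary units, with a hypothetical `germane_capacity` function subtracting the other two loads from a fixed total.

```python
def germane_capacity(total: float, intrinsic: float, extraneous: float) -> float:
    """Capacity left for domain problem-solving, in the same arbitrary units."""
    if min(total, intrinsic, extraneous) < 0:
        raise ValueError("loads must be non-negative")
    # Never report negative capacity: an overloaded team simply has none left.
    return max(0.0, float(total) - intrinsic - extraneous)


# Illustrative numbers only: trimming extraneous load from 5 to 2 units.
before = germane_capacity(total=10, intrinsic=3, extraneous=5)
after = germane_capacity(total=10, intrinsic=3, extraneous=2)
print(before, after)  # 2.0 5.0
```

The numbers are invented, but the shape of the argument is the same as in the text: every unit of extraneous load removed is a unit the team can spend on the domain itself.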
Single team
If you are a one team company, consider narrowing down the ambitions of your design to match the cognitive load of the team.
Always take into consideration the amount of extraneous cognitive load they will need to deal with. Remember that assigning a complex microservice architecture to a single team will in itself add additional extraneous cognitive load, even if there is none imposed on the team by bad tooling, technical debt, or friction from corporate processes or silos.
It is better to narrow down your ambitions to what the team is able to deliver safely and adequately than to overstretch them until they burn out or aren’t able to deliver features with quality. You can always grow your capacity as the need arises, and teams can always make their monoliths modular so it’s easier to split them later, if you ever have to.
Multiple teams
If, on the other hand, you have multiple teams at your disposal, then you can consider splitting your architecture into microservices or something similar so that teams can work on their own domain independently. This is better than everyone working together in a monolith or a gigantic shared codebase where everyone is overloaded with continuous communication.
When multiple teams work together to deliver software, make sure your team topologies reflect your software architecture and favour well-designed APIs as a form of communication for teams. This makes it easier for a team to abstract the complexities of their domain and it reduces the amount of talking needed to deliver work by the organization as a whole.
Recommendations
Favour modular, monolithic architecture if you only have a single team. Design the software to fit the team’s capacity so they can deliver quality outcomes. Always take intrinsic and extraneous cognitive load into consideration, as they will reduce the amount of germane cognitive load your team can handle.
If you have multiple teams and complex requirements, favour microservices or architectures that make it easy for teams to communicate and abstract their domain via well-designed APIs. Never assign more than a single complex domain to each team.
Conclusion
Just like an organization chart should never be designed without consulting technical teams to match the architecture designs, architectural styles should never be decided without understanding capacity and the types of cognitive load affecting engineering.
Once these are understood, leadership and engineering can decide on more effective architectural designs and can take steps to mitigate or eliminate extraneous cognitive load and reduce intrinsic load.
There is room in this topic to explore cross team communication in more detail, but that deserves its own post some time later.
About me
Fernando Villalba has over a decade of miscellaneous IT experience. He started in IT support ("Have you tried turning it on and off?"), veered to become a SysAdmin ("Don't you dare turn it off") and later segued into DevOps type of roles ("Destroy and replace!"). He has been a consultant for various multi-billion dollar organizations helping them achieve their highest potential with their DevOps processes.
This seems related to Conway's law, where software structure tends to model the organizational structure of the people who constructed it.
Nice article! It reminds me of the Team Topologies talk at the DevOps Enterprise Summit in 2019: https://www.youtube.com/watch?v=haejb5rzKsM