Of course monolith infrastructure is cheaper than serverless

In recent years, teams have been buzzing about microservices, with many organisations jumping on the bandwagon. Even the US Air Force now runs its latest fighter jets on k8s. However, just like Agile, Scrum, or the ‘latest’ software development methodology, success isn't guaranteed. What we are seeing is a realisation that the complexity of managing these microservices has a cost, one that does not always pay off unless you are operating at a larger, more complex scale or team topology. This is why some teams are now reversing course and returning to the monolithic architecture they once left behind.

The problem…

Splitting applications up with APIs gives us a defined separation of responsibility. It's hard for 100+ people to cooperate on building a single monolithic application. But if you have 10+ teams of 10 people, each deploying their own microservices, it's easier to decouple and to deliver at the pace each team needs.

Getting everyone to agree on what these individual services should look like is where problems arise. Do you assign a team to a single function, or is it based on business unit requirements? For APIs, who decides the definitions, and are they being documented? Conway's Law states that the design of a system mirrors the structure of the organisation responsible for creating it. While microservices can offer better separation between teams, this advantage may not always be realised due to the inherent team structure or even culture of an organisation. In such a situation, a monolithic architecture may start to look more attractive.

So it's not surprising that articles like Amazon Prime Video’s "Micro services to monoliths" emerge from time to time. In this example, however, they needed to handle multiple state transitions per second as part of processing video streaming data. That's really not a great match for serverless, and moving away from it led to some impressive cost savings for Amazon. However, some questions remain:

Isn’t this something that should have been made apparent in the upfront design?
Was this an example of jumping in headfirst and developing without thinking through the problem? Analysis paralysis is a real thing organisations face, but could they have over-corrected on that a bit?
Or would you argue that employing serverless technology for rapid product testing was a smart initial move? There’s value in just getting things out the door and iterating, but it seems this is happening with less and less foresight, leading to bigger issues and larger refactors/iterations.

In this example, the issue arose because they failed to anticipate the cost of step transitions, which led to the subsequent optimisation of moving from a step function to a single EC2 component.
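To make the cost of step transitions concrete, here is a rough back-of-the-envelope sketch. Every number in it is an assumption for illustration, not a figure from the Prime Video post: the per-transition price is an assumed list price for a per-transition-billed workflow service (check current pricing for your region), and the stream counts and transitions per second are made up.

```python
# Rough estimate of orchestration cost when billing is per state transition.
# All numbers are illustrative assumptions, not figures from Prime Video.

PRICE_PER_TRANSITION_USD = 0.025 / 1000  # assumed price per transition; verify for your region


def monthly_transition_cost(streams, transitions_per_stream_second,
                            hours_per_day=24, days=30):
    """Return the estimated monthly cost in USD of state transitions alone."""
    seconds = hours_per_day * 3600 * days
    transitions = streams * transitions_per_stream_second * seconds
    return transitions * PRICE_PER_TRANSITION_USD


if __name__ == "__main__":
    # Hypothetical workloads: 3 transitions for every second of each stream.
    for streams in (1, 100, 1000):
        cost = monthly_transition_cost(streams, transitions_per_stream_second=3)
        print(f"{streams:>5} streams: ~${cost:,.0f} per month")
```

The point of the sketch is that the cost grows linearly with both the number of streams and the transitions per second, which is easy to miss when a prototype only ever handles a handful of streams.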

That said, the original post from Prime Video Tech contains numerous gaps, leading to confusion and a seemingly inaccurate title of "From distributed microservices to a monolith application". The process appears to be more of a refactoring than a complete transformation.

Deciphering the Monolith Puzzle

So where does that leave us? Choosing the right architecture for your organisation is a balancing act. It's possible to maintain the separation of concerns and scale different APIs using a monolithic architecture while still enjoying the benefits of microservices.
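As a minimal sketch of what separation of concerns inside a monolith can look like, the example below keeps two modules behind an explicit contract while running them in a single process. The `Billing` and `Orders` names, the spending limit, and the module boundaries are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Protocol


class Billing(Protocol):
    """Contract the orders module depends on (hypothetical for this sketch)."""

    def charge(self, customer_id: str, amount_cents: int) -> bool: ...


@dataclass
class InProcessBilling:
    """Billing module owned by one team, called in-process instead of over HTTP."""

    limit_cents: int = 100_000  # assumed per-charge limit, for illustration only

    def charge(self, customer_id: str, amount_cents: int) -> bool:
        # Accept any positive amount up to the configured limit.
        return 0 < amount_cents <= self.limit_cents


@dataclass
class Orders:
    """Orders module; it only knows the Billing contract, not its implementation."""

    billing: Billing

    def place(self, customer_id: str, amount_cents: int) -> str:
        charged = self.billing.charge(customer_id, amount_cents)
        return "confirmed" if charged else "rejected"


if __name__ == "__main__":
    orders = Orders(billing=InProcessBilling())
    print(orders.place("c-1", 2_500))
    print(orders.place("c-1", 250_000))
```

Because `Orders` depends only on the contract, the billing implementation could later be swapped for a remote client if that team ever needs to deploy or scale independently, without changing the calling code.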

To decide whether you need to move back to a monolithic architecture or fix issues in a distributed monolith, consider:

The trade-offs and timeframe.
Analyse the pain and productivity loss from microservices over time and weigh it against the cost of migrating to a monolith, taking into account factors like Conway’s Law, team size and topology, experience, and expertise.
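One simple way to frame that trade-off is a break-even estimate: how many months of reduced overhead it takes to pay back the migration. The sketch below is a deliberate simplification, and the inputs are hypothetical; any consistent unit (money or engineer-hours) works.

```python
import math


def breakeven_months(migration_cost, monthly_overhead_now, monthly_overhead_after):
    """Months until a migration pays for itself, or None if it never does."""
    monthly_saving = monthly_overhead_now - monthly_overhead_after
    if monthly_saving <= 0:
        # The new architecture is no cheaper to run, so there is no payback.
        return None
    return math.ceil(migration_cost / monthly_saving)


if __name__ == "__main__":
    # Hypothetical inputs: a migration costing 120,000 that cuts
    # monthly overhead from 30,000 to 10,000.
    print(breakeven_months(120_000, 30_000, 10_000))
```

If the break-even point falls beyond the timeframe you care about, fixing the pain points in the existing distributed system may be the better option.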

Team Topologies, by Matthew Skelton and Manuel Pais, does an excellent job of providing a framework (grounded in Conway's Law) for structuring teams to meet the needs of users and align with the architecture of the systems you're building.

Overmind is a SaaS Terraform impact analysis tool. It discovers your AWS infrastructure so that it can calculate the blast radius of an application change, including resources managed outside of Terraform. This helps you identify the causes of outages by showing which changes caused which problems. It also helps you deploy changes faster by calculating a change's blast radius and providing a list of human-readable risks, within the app or as part of your CI/CD pipeline. From this report you can tell whether a change can be made confidently or should be held back because it’s too risky, preventing outages in the first place.

Check out the example Terraform repo here.
Get started with Overmind for free here.
Or join our Discord to take part in the next wave of DevOps tools.
