There are many modern design patterns for microservices-based applications. Two that interest me right now, because of how they help the people who support an application in production, are the bulkhead and sidecar design patterns. Let’s take a look.
How you build your application absolutely impacts the lives of those in charge of supporting it. This isn’t a connection we often make, but thinking about what happens when things break while you are still building your application helps everyone.
Developers should be thinking about ways they can improve incident management and response through code, especially because more and more developers are on-call. When considering ways in which your application assists in incident response and remediation, here are the attributes you’re looking for:
Now let’s turn to two patterns and show how they can support incident response and remediation.
The bulkhead pattern seeks to isolate applications and services into pools of resources. Such isolation allows some amount of failure to exist without bringing down the entire application or creating cascading issues — particularly useful in incident management.
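To make the idea concrete, here is a minimal Python sketch of a bulkhead, assuming each downstream service gets its own cap on concurrent calls. The `Bulkhead` class, the pool names, and the limits are all hypothetical; real systems often get the same isolation from separate thread pools, connection pools, or containers.

```python
import threading


class BulkheadFullError(Exception):
    """Raised when a pool has no free slots left."""


class Bulkhead:
    """Caps concurrent calls for one dependency (hypothetical helper)."""

    def __init__(self, name, max_concurrent):
        self.name = name
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def call(self, func, *args, **kwargs):
        # Fail fast instead of queueing, so a saturated pool cannot
        # tie up work that belongs to other pools.
        if not self._slots.acquire(blocking=False):
            raise BulkheadFullError(f"pool '{self.name}' is saturated")
        try:
            return func(*args, **kwargs)
        finally:
            self._slots.release()


# Hypothetical pools: each service gets its own budget.
payments = Bulkhead("payments", max_concurrent=2)
search = Bulkhead("search", max_concurrent=2)
```

If every slot in `payments` is busy, further calls to it raise `BulkheadFullError` right away, while calls through `search` keep working. The error also carries the pool name, which is the kind of context an alert payload can pass along.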
Bulkhead design patterns have the inherent benefit of making services easier to understand and decompose team-wide. This pattern also comes with additional benefits related to supporting the application in production:
The isolation provides more context in the alert payload to better pinpoint the issue. It also allows responders to address issues in a way that does not impact the reliability and uptime of other functionality in the application.
If a human needs to be in the loop, pools can equate to teams, and teams generally serve as buckets for subject matter experts and alert destinations. So, the pools can assist in determining:
This is especially true when developers are expected to be on-call for their code. When there’s a failure, whoever is paged can use the pool to identify which of the developers currently on-call are relevant to the issue, rather than reaching out to anyone tagged as a backend or frontend developer.
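As a rough illustration of using the pool as a routing key, the sketch below maps a pool name from an alert to the developers on-call for it. The `Alert` type, the roster, and the fallback group are invented for this example and do not reflect any particular paging tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    pool: str      # bulkhead pool where the failure occurred
    message: str


# Hypothetical on-call roster keyed by pool name.
ON_CALL = {
    "payments": ["dana"],
    "search": ["lee", "sam"],
}

# Hypothetical catch-all for pools without a dedicated rotation.
FALLBACK = ["platform-on-call"]


def responders_for(alert: Alert) -> list[str]:
    """Return the on-call developers for the pool named in the alert."""
    return ON_CALL.get(alert.pool, FALLBACK)
```

With this roster, an alert from the `payments` pool pages only the developer on-call for payments instead of every backend developer.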
When I imagine the sidecar pattern, I think of something a little more parasitic. This pattern is a great way to keep complementary components attached logically — but technically separate. This offers a variety of advantages for application support.
The sidecar pattern prevents service-related code from taking down the primary function of the service itself. It also allows the service and the service-related code to be rolled back independently, so that a rollback of one does not affect the other.
At the same time, it gives the service and its companion app a direct connection, which makes it easier for them to consume I/O from one another. For example, perhaps a service has a platform abstraction layer as an entry point. This layer could be separated such that, if it becomes unresponsive under high load, users of the service aren’t directly impacted.
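Here is one way that separation might look from the service’s side, sketched in Python. It assumes the companion runs as its own process and that the service can fall back to a default when the companion crashes or hangs. The one-line scripts and the `platform_info` function are hypothetical stand-ins; a real sidecar is usually a long-running container or daemon reached over a local socket, not a process started per call.

```python
import subprocess
import sys

# Stand-ins for a companion process (hypothetical, for illustration only).
HEALTHY = "print('linux-x86_64')"
CRASHING = "raise SystemExit(3)"
HANGING = "import time; time.sleep(30)"


def platform_info(companion_source, timeout_s=5.0, default="unknown"):
    """Ask the companion process for data, falling back if it misbehaves."""
    try:
        result = subprocess.run(
            [sys.executable, "-c", companion_source],
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # Companion is unresponsive; the service keeps serving users.
        return default
    if result.returncode != 0:
        # Companion crashed in its own process; the service is unaffected.
        return default
    return result.stdout.strip()
```

Because the companion lives in a separate process, a crash or hang there surfaces in the service only as a fallback value, and either side can be redeployed without touching the other.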
Beyond abstractions, the sidecar pattern is often used to attach monitoring tooling to a service. One benefit is that issues with the monitoring tool cannot impact the functionality of the service it’s attached to. Another is that incident responders don’t lose access to data from the monitoring tool if the service does go down.
This is a large shift from most architectures, where even monitoring for microservices applications is done in a monolithic way. Monitoring tools often have agents that benefit from being tightly coupled with the application, but you don’t want those agents to bring down the service if they have an issue. You do want data to keep coming off the service and stay accessible for as long as the service can produce it, even if the service is not functioning properly.
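A small Python sketch of that idea follows, assuming the sidecar polls the service through some probe and keeps its own history. The `MonitoringSidecar` class and the shape of the metrics are hypothetical, and everything runs in one process here purely to keep the example short; in practice the sidecar would be a separate process or container.

```python
import time


class MonitoringSidecar:
    """Keeps telemetry outside the service it observes (illustrative only)."""

    def __init__(self, probe):
        self._probe = probe      # callable returning a dict of metrics
        self.samples = []        # history is retained across service failures
        self.service_up = None   # unknown until the first poll

    def poll(self):
        try:
            metrics = self._probe()
        except Exception as exc:
            # Record the outage rather than losing visibility.
            self.service_up = False
            self.samples.append({"ts": time.time(), "error": repr(exc)})
        else:
            self.service_up = True
            self.samples.append({"ts": time.time(), **metrics})

    def last_good(self):
        """Return the most recent sample taken while the service was up."""
        for sample in reversed(self.samples):
            if "error" not in sample:
                return sample
        return None
```

When the probe starts failing, the sidecar records the outage and still holds the earlier samples, so responders can see what the service looked like just before it went down.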
Like the bulkhead pattern, the sidecar’s isolation is both logical and technical. The logical benefit is that responders can stay focused on where the issue occurs and bring in support where it’s needed, especially when developers are on-call and need deeper access to monitoring tooling than they would normally have for the broader application.
The list of modern design patterns is growing so rapidly that it’s hard to keep up. Many others bear directly on supporting the application and addressing failures, whether automatically or manually. I’m a big fan of the sidecar and bulkhead patterns as tools to improve application production support.