If you’re working with microservices in a large distributed environment, you’ve probably got your monitoring and logging on lock, and you may even be lucky enough to have properly instrumented APM (distributed tracing) for consumer calls. But, did you know you’re likely still facing an observability gap?
How many incidents have you worked that required hours of sleuthing only to end with a single team needing to roll back a deployment? It’s more common than you may think!
As previously mentioned in the Splunk Blog post “Jenkins, OpenTelemetry, Observability,” unless you’re leveraging Events and Alerting to their utmost potential, you’re likely missing a crucial element of your software lifecycle and CI/CD processes. Event-based CI/CD data can mean the difference between an MTTD/MTTR of minutes and one of hours!
With our new Azure DevOps integrations for sending Events to Splunk Observability and for Alert-based Release Gating, your organization can start integrating CI/CD context into your monitoring practices!
Take a moment to ask yourself:
Events in Splunk Observability are highly visible, easily overlaid on dashboards with event markers / lines, and are essentially “free real estate” with no specific associated charges. Imagine how much more context you can include on your dashboard with Event Markers for deployment start, deployment success, deployment failure, etc. Not just for your own services, but also upstream services you depend on!
Figure 1-1. Quickly view CI/CD events from Azure DevOps overlaid on your dashboard charts
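To make the idea of a deployment event concrete, here is a minimal Python sketch of what a pipeline step could send when a deployment starts, succeeds, or fails. It is not the code behind the Azure DevOps integration. It assumes the Splunk Observability ingest API accepts a JSON list of custom events at `/v2/event` with an `X-SF-Token` header, so check the current API reference before relying on it. The helper names, the `deployment_<stage>` event type, and the dimension and property keys are all hypothetical.

```python
import json
import time
import urllib.request


def build_deployment_event(service, stage, pipeline, timestamp_ms=None):
    """Build one custom event describing a deployment stage.

    The field layout (category, eventType, dimensions, properties,
    timestamp in milliseconds) is an assumption about the ingest API.
    """
    if timestamp_ms is None:
        timestamp_ms = int(time.time() * 1000)
    return {
        "category": "USER_DEFINED",
        "eventType": f"deployment_{stage}",  # hypothetical naming scheme
        "dimensions": {"service": service, "pipeline": pipeline},
        "properties": {"source": "azure-devops"},
        "timestamp": timestamp_ms,
    }


def send_events(events, realm, token):
    """POST a list of events to the (assumed) ingest endpoint.

    Returns the HTTP status code. Network access and a valid access
    token are required, so this function is not exercised here.
    """
    request = urllib.request.Request(
        url=f"https://ingest.{realm}.signalfx.com/v2/event",
        data=json.dumps(events).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-SF-Token": token},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

A pipeline step might call `send_events([build_deployment_event("checkout", "start", "release-42")], realm, token)`, where the service name, pipeline name, realm, and token are placeholders for your own values. Keeping the service in a dimension is what would let a dashboard filter event markers down to your own service or to an upstream one.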
Successful monitoring (and by association incident management) is all about context and communication! By helping teams to quickly establish if a deployment has impacted their service’s performance, or another service’s performance, your software teams can decrease Mean Time To Detect (MTTD) and Mean Time To Recovery (MTTR).
Less jumping between multiple tools and increased context in a single UI is the name of the game! But, there are further benefits to be had with another of our Azure DevOps integrations.
Try out the Splunk Observability Events integration from the Microsoft Marketplace!
As mentioned above, there is great value in creating alerts and/or events related to software deployments. Knowing when your service or upstream services have been deployed is a vital signal for Development, SRE, CI/CD, and DevOps teams. But the next logical step is to be proactive and start gating your releases based on Splunk Observability Alerts.
For example, the configuration shown in Figure 1-2 below has three steps. First, it sends an event to Splunk Observability on pipeline start so that the deployment is marked in Observability. Next, it checks our service’s alert to make sure it isn’t firing. Finally, it checks the upstream service’s alert to make sure it isn’t firing either. Only then does the pipeline move through the rest of the release process.
Figure 1-2. Set up deployment gates based on Splunk Observability Alerts
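The decision a gate like this makes can be sketched in a few lines of Python. This is an illustration of the logic only, not the implementation of the Azure DevOps extension: the `AlertStatus` type, the `evaluate_gate` function, and the `override` flag are hypothetical names, and the alert states are assumed to have already been fetched from Splunk Observability by some earlier step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlertStatus:
    """State of one alert as seen by the gate (hypothetical shape)."""
    name: str
    firing: bool
    upstream: bool = False  # True when the alert belongs to a dependency


def evaluate_gate(alerts, override=False):
    """Decide whether a release may proceed.

    Returns (passed, reasons). The gate passes only when no checked
    alert is firing, unless a person has explicitly overridden it,
    for example to ship a fix during an incident.
    """
    if override:
        return True, ["gate manually overridden"]
    blocking = [alert for alert in alerts if alert.firing]
    reasons = [
        f"{'upstream ' if alert.upstream else ''}alert firing: {alert.name}"
        for alert in blocking
    ]
    return (not blocking), reasons


if __name__ == "__main__":
    checks = [
        AlertStatus("checkout latency", firing=False),
        AlertStatus("payments error rate", firing=True, upstream=True),
    ]
    passed, reasons = evaluate_gate(checks)
    print("proceed" if passed else "blocked", reasons)
```

Returning the reasons alongside the pass/fail result is a deliberate choice: whoever is looking at a blocked release can see immediately whether their own service or an upstream dependency is responsible.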
Gating releases based on alerting for your own services is a natural place for teams to start. But this may be counterproductive during incidents involving your service. The last thing you need during a stressful incident is something preventing you from deploying a fix. In these cases, gates in Azure DevOps pipelines can be easily ignored with a single click in the interface.
More useful in most cases is gating your deployment or release based on the health of upstream services that influence the health of your own software. Below you’ll find a list of examples that may make your release gating practices more effective:
Release Gates are a prime tool for protecting the integrity of your overall software environment. The ability to prevent further changes during ongoing incidents will help protect the availability KPIs of your service. Proactively preventing errant deployments during times of trouble will also likely make you some friends with your co-workers in incident management!
Try out the Splunk Observability Alert Gate integration from the Microsoft Marketplace!
As noted, events and alerts in Splunk Observability have no specific cost associated with their ingest, storage, or usage. This sort of “free real estate” goes untapped by most organizations but can provide enormous value to software teams, incident management, and CI/CD processes.
Events marking deployments, releases, and even infrastructure changes can provide much needed context to your monitoring.
Alerts, commonly used to indicate service health and notify on-call resources, can spread some of that contextual awareness of your software environment to your deployment pipelines.
But, alerts and events need not be constrained to internal factors. Alerting and event reporting based on other Splunk Observability products can help provide even more untapped context to your organization.
Start looking without, in addition to within, to get a more detailed understanding of how a given change to a SaaS service, a vendor’s configuration, or an external process may impact your software!
Excited about context? Interested in getting more monitoring information in a single pane of glass and generally pushing your DevOps (or DevSecOps) Magic™ to the limit? Check out Splunk Observability!
You can sign up to start a free trial of the Splunk Observability Cloud suite of products today!
This blog post was authored by Jeremy Hicks, Observability Field Solutions Engineer at Splunk with special thanks to: Doug Erkkila, Adam Schalock, Todd DeCapua, and Joel Schoenberg at Splunk.