We are excited to announce a strategic partnership with Google Cloud to bring real-time observability into Google Cloud Services and modern applications for our joint customers. Cloud has become essential to modernizing IT environments and enabling the digital initiatives of organizations large and small. Organizations undertake IT modernization, including cloud adoption, to accelerate innovation and increase operational efficiency while optimizing IT spend. According to Gartner [1], for nearly a decade, cloud-first has been the prevailing approach to cloud adoption initiatives as organizations seek to drive digital transformation and modernization. Yet, in a study done by McKinsey, 80% of CIOs report that, despite cloud migration, they have not attained the level of agility and business benefits that they expected through modernization.
Merely “lifting and shifting” applications and data from on-premises services to cloud platforms in a large-scale cloud migration does not unlock the full potential of the cloud. Companies also need to re-platform. For example, using a managed cloud service such as Google Datastore instead of maintaining their own NoSQL database in the cloud spares organizations from spending resources on undifferentiated overhead tasks. Furthermore, as companies mature in their cloud journey, re-architect their applications into microservices, and embrace modern infrastructure such as containers, Kubernetes, and serverless, they stand to gain more benefits.
While modern cloud-native environments unlock the benefits of agility and efficiency, they also introduce complexity. As organizations continue on the journey to cloud-native, the volume of operational data they must deal with expands significantly. Ephemeral infrastructure and the distributed nature of cloud-native applications exponentially increase the cardinality of performance metrics as well.
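To make the cardinality point concrete, here is a minimal sketch in Python. It assumes that every redeployment gives each container a fresh identifier; the metric name, the dimension names, and the redeploy rate are all hypothetical and chosen only for illustration. It counts how many distinct time series a single metric produces in one day for long-lived hosts versus short-lived containers.

```python
def distinct_series(samples):
    """Count unique (metric name, dimension set) pairs, i.e. the metric cardinality."""
    return len({(name, frozenset(dims.items())) for name, dims in samples})


def simulate_day(slots, redeploys_per_day):
    """Emit one cpu.utilization sample per workload instance per deployment."""
    samples = []
    for deploy in range(redeploys_per_day):
        for slot in range(slots):
            # Assumption: each deployment assigns a brand-new instance ID.
            dims = {"service": "checkout", "instance": f"slot{slot}-deploy{deploy}"}
            samples.append(("cpu.utilization", dims))
    return samples


# Ten long-lived hosts that are never replaced during the day.
static_hosts = distinct_series(simulate_day(slots=10, redeploys_per_day=1))

# The same ten slots filled by containers that are redeployed hourly.
ephemeral_containers = distinct_series(simulate_day(slots=10, redeploys_per_day=24))

print(static_hosts, ephemeral_containers)  # 10 240
```

The workload is the same size in both cases, but the ephemeral version leaves 24 times as many series behind for the monitoring system to index and query.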
Adding fuel to the fire, end-users’ expectations have never been higher. Applications need to provide a flawless end-user experience irrespective of the pressures placed on the system by varying load, sudden changes in traffic patterns, or other variables associated with scaling across devices, geographies, etc. According to a recent study by Akamai, a 100-millisecond addition to latency can hurt conversion rates by 7%. Companies often fall short if they retrofit traditional monitoring tools and strategies in modern cloud environments. What they need instead is real-time observability to gain insights on how their entire system is performing before the end user is affected.
Observability enables DevOps teams to understand distributed cloud systems: what’s slow, what’s broken, and, most importantly, why the system is behaving the way it is and what needs to be done to improve performance. The data sources commonly referred to as the three pillars of observability are metrics, traces, and logs. Metrics tell us whether there is a problem, traces point to where in the distributed system the problem might be occurring, and logs help determine its root cause. Splunk’s best-in-class observability brings together SignalFx Infrastructure Monitoring, SignalFx Microservices APM, and Splunk Cloud to deliver data-driven, actionable insights in real time for every question, decision, and action. In the following sections, we will discuss how Splunk’s metrics and tracing solutions provide value to our customers in Google Cloud. To learn how to get data from Google Cloud sources into Splunk Cloud for end-to-end visibility into logs, please read the accompanying blog.
SignalFx Infrastructure Monitoring is a real-time metrics solution purpose-built for the high-cardinality data that ephemeral cloud, container, and serverless environments produce at massive scale. Driven by our patented streaming architecture, our approach to ingesting, storing, and retrieving data is fundamentally different from traditional batch-and-query solutions.
As metric data streams into SignalFx, metadata is separated from metric value data because the two serve separate use cases: human-readable metadata is a central tenet of cloud-native environments, used to search, filter, sort, and group, while metric values are analyzed by the SignalFlow™ engine and streamed directly to the components that need them, such as dashboards, alerts, and automation. Our streaming architecture means that our customers get insights and can take action in real time: dashboards refresh, alerts fire, and automation tasks trigger within seconds, compared to minutes or more with other solutions.
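The split between metadata and values can be pictured with a small Python sketch. This is not SignalFx's implementation or API: the `StreamingStore` class, its method names, and the sample dimensions are hypothetical, and a real system would do far more. It only shows the general idea that dimensions are indexed once per series for search and filtering, while each value is pushed straight to subscribers instead of waiting for a later query.

```python
class StreamingStore:
    """Toy model: metadata is indexed separately, values are streamed to subscribers."""

    def __init__(self):
        self.metadata = {}     # series_id -> dimensions (stored once per series)
        self.index = {}        # (dimension key, value) -> set of series_ids
        self.subscribers = []  # callbacks that receive every value as it arrives

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def ingest(self, metric, dimensions, value, timestamp):
        series_id = (metric, tuple(sorted(dimensions.items())))
        if series_id not in self.metadata:
            # Metadata path: only touched the first time a series is seen.
            self.metadata[series_id] = dict(dimensions)
            for pair in dimensions.items():
                self.index.setdefault(pair, set()).add(series_id)
        # Value path: pushed to dashboards, alerts, and so on right away.
        for callback in self.subscribers:
            callback(series_id, timestamp, value)

    def find(self, key, value):
        """Search series by a dimension without scanning any metric values."""
        return self.index.get((key, value), set())


store = StreamingStore()
alerts = []


def alert_on_high_cpu(series_id, timestamp, value):
    # Hypothetical alert rule: fire when a value exceeds 90.
    if value > 90:
        alerts.append((series_id[0], value))


store.subscribe(alert_on_high_cpu)
store.ingest("cpu.utilization", {"zone": "us-central1-a", "pod": "web-1"}, 42.0, 1)
store.ingest("cpu.utilization", {"zone": "us-central1-a", "pod": "web-2"}, 97.5, 2)

print(len(store.find("zone", "us-central1-a")))  # 2
print(alerts)  # [('cpu.utilization', 97.5)]
```

In this sketch the alert fires during ingestion itself, which is the behavior the streaming approach is meant to provide.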
Our customers get complete flexibility in how they instrument and ingest data. Often, they choose to use a combination of the following:
SignalFx Smart Agent: Open source, lightweight node agent based on collectd that automatically discovers performance metrics for hosts, containers, Kubernetes, deployed applications, and services. The Smart Agent can automatically scrape and ingest metrics from services that emit Prometheus metrics such as Kubernetes, Google Cloud Run, or custom applications.
Google Cloud's operations suite (formerly Stackdriver): Robust integration with Google Cloud Monitoring to collect metrics on Google Cloud Services.
Istio on GKE and Anthos: Out-of-the-box visibility using an open-source telemetry adapter for Istio service mesh.
Wrapper functions: For services where an agent-based approach does not work such as Google Cloud Functions, SignalFx offers wrapper functions to visualize executions, errors, throttles, cold starts, duration, and resource utilization in real-time.
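To illustrate the wrapper-function idea in the last item, here is a minimal Python sketch. It is not the SignalFx wrapper library: the `FunctionMetrics` class, the metric names, and the in-memory `datapoints` list are assumptions made for this example, and a real wrapper would send datapoints to a backend. It shows how wrapping a handler can record invocations, errors, cold starts, and duration without running an agent alongside the function.

```python
import functools
import time


class FunctionMetrics:
    """Hypothetical collector that instruments a serverless handler."""

    def __init__(self):
        self.datapoints = []  # (metric name, value) pairs; stands in for a real backend
        self._cold = True     # True until the first invocation in this process

    def wrap(self, handler):
        @functools.wraps(handler)
        def wrapper(*args, **kwargs):
            if self._cold:
                # The first call in a fresh runtime instance counts as a cold start.
                self.datapoints.append(("function.cold_starts", 1))
                self._cold = False
            self.datapoints.append(("function.invocations", 1))
            start = time.perf_counter()
            try:
                return handler(*args, **kwargs)
            except Exception:
                self.datapoints.append(("function.errors", 1))
                raise  # re-raise so the platform still sees the failure
            finally:
                elapsed_ms = (time.perf_counter() - start) * 1000
                self.datapoints.append(("function.duration_ms", elapsed_ms))

        return wrapper


metrics = FunctionMetrics()


@metrics.wrap
def handle_request(name):
    # Hypothetical handler used only to exercise the wrapper.
    if not name:
        raise ValueError("name is required")
    return f"Hello, {name}"


greeting = handle_request("Planet")
try:
    handle_request("")
except ValueError:
    pass

names = [name for name, _ in metrics.datapoints]
print(greeting)
print(names)
```

After the two calls above, the collector holds one cold start, two invocations, one error, and two duration datapoints.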
To deliver immediate value, SignalFx provides pre-built dashboards for Google Cloud Services such as Compute Engine, App Engine, GKE, Cloud Bigtable, Cloud Functions, Cloud Spanner, Cloud Storage, Cloud Pub/Sub Subscriptions, Cloud Pub/Sub Topics and more.
Adding custom charts for composite metrics to these dashboards is easy and takes just a few clicks.
Our recently launched Kubernetes Navigator enables DevOps and SRE teams to understand and manage the performance of containerized applications using an intuitive, out-of-the-box UI that navigates through the entire GKE environment. AI-driven analytics automatically suggests filters associated with performance anomalies to expedite troubleshooting.
SignalFx Microservices APM provides NoSample™ full-fidelity distributed tracing to observe and analyze every single transaction and capture all outliers and anomalies. AI-driven analytics and directed troubleshooting help DevOps teams to quickly identify and troubleshoot performance anomalies. In addition, SignalFx Microservices APM provides analytics with highly granular details to enable infinite-cardinality exploration. Users can perform breakdown analysis on everything of interest.
Rated as a visionary in the latest Gartner Magic Quadrant for APM, SignalFx Microservices APM uses lightweight, vendor-neutral instrumentation based on OpenTracing and OpenTelemetry to free DevOps teams from heavyweight and proprietary agent-based approaches used by traditional solutions.
Splunk’s observability solutions are SaaS services deployed on Google Cloud to give flexibility to our customers who wish to keep their observability data within Google Cloud. Splunk and Google Cloud are strategically aligned to accelerate your cloud success. Future-proof your observability investment with an enterprise-grade solution trusted by enterprises for the most advanced use cases at any scale.
Get started today with a 14-day free trial. Join us for our weekly live demo to see these solutions live in action.
Watch how the Planet team handled their Google Cloud migration, and how SignalFx helps ensure consistent, reliable operations of both cloud and Kubernetes for container orchestration.
1. Gartner, Move From Cloud First to Cloud Smart to Improve Cloud Journey Success, Henrique Cecci, 25 February 2020.
----------------------------------------------------
Thanks!
Amit Sharma