Splunk APM gets all its trace telemetry data through OpenTelemetry for any service written in Java, Node.js, .NET, Go, Python, Ruby, PHP, or C++. In addition, for applications written in Java, Node.js, and .NET, Splunk offers zero-configuration instrumentation: simply deploy the OpenTelemetry Collector on the machine, and it automatically instruments applications written in these languages to send trace data to Splunk.
Finally, Splunk APM can analyze the performance of an application written in any other language, provided it is instrumented according to the OpenTelemetry standard.
When monitoring monolithic applications, it’s important to know how individual lines of code consume CPU and memory resources. Splunk APM has built-in AlwaysOn code profiling to continuously monitor CPU and memory consumption of code written in Java, .NET, and Node.js. As part of Splunk Observability Cloud, engineers can also view the logs and infrastructure metrics of monolithic applications to get a complete picture of those applications’ performance.
Splunk APM can detect services that take part in an application’s distributed traces even if those services are not instrumented. For these inferred services, Splunk APM provides the RED metrics (Rate, Error, Duration) and shows each inferred service as a dot on the Service Map. If an inferred service runs on public cloud infrastructure, engineers can use Observability Cloud to view its infrastructure metrics alongside its RED metrics and understand its performance.
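To make the RED metrics concrete, here is a minimal Python sketch of how rate, error ratio, and average duration could be derived per service from a batch of spans. It is an illustration only, not Splunk APM's implementation: the `Span` fields, the `red_metrics` function, and the sample service names are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Span:
    """Hypothetical, simplified span record (not an OpenTelemetry type)."""
    service: str
    duration_ms: float
    is_error: bool


def red_metrics(spans, window_seconds):
    """Compute Rate, Error ratio, and average Duration per service."""
    grouped = defaultdict(list)
    for span in spans:
        grouped[span.service].append(span)

    result = {}
    for service, items in grouped.items():
        count = len(items)
        errors = sum(1 for s in items if s.is_error)
        result[service] = {
            "rate": count / window_seconds,      # requests per second
            "error_rate": errors / count,        # fraction of failed requests
            "avg_duration_ms": sum(s.duration_ms for s in items) / count,
        }
    return result


# Sample data observed over an assumed 2-second window.
spans = [
    Span("checkout", 120.0, False),
    Span("checkout", 80.0, True),
    Span("payments-db", 30.0, False),
    Span("payments-db", 50.0, False),
]
metrics = red_metrics(spans, window_seconds=2)
```

A production system would typically report duration percentiles rather than a plain average; the average is used here only to keep the sketch short.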
Cloud-native applications consist of multiple microservices interacting with one another, and understanding the communication patterns between all the different services is difficult. The Service Map simplifies this by automatically plotting all the services on a graph and drawing a connection between two services only if one of them calls the other in a trace. One of the unique capabilities of the Service Map is the solid red dot that highlights the service at the root cause of an issue, so that SREs can identify the problematic service at a glance.
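The rule described above (two services are connected only if one calls the other in a trace) can be sketched in a few lines of Python. This is an assumed, simplified model for illustration, not the Service Map's actual algorithm: the `Span` fields and the `service_edges` function are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    """Hypothetical, simplified span: an ID, its parent's ID, and its service."""
    span_id: str
    parent_id: Optional[str]
    service: str


def service_edges(trace):
    """Return the set of (caller, callee) service pairs found in one trace."""
    by_id = {span.span_id: span for span in trace}
    edges = set()
    for span in trace:
        parent = by_id.get(span.parent_id)
        # Only a parent/child pair that crosses a service boundary is a call
        # between services; spans inside the same service add no edge.
        if parent is not None and parent.service != span.service:
            edges.add((parent.service, span.service))
    return edges


trace = [
    Span("a", None, "frontend"),
    Span("b", "a", "checkout"),
    Span("c", "b", "checkout"),   # internal work inside checkout
    Span("d", "c", "payments"),
]
edges = service_edges(trace)
```

Taking the union of these edge sets across many traces would yield the full graph for an application.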
Tag Spotlight groups traces by the attributes (tags) they have in common, such as the host they’re running on, the version, or HTTP errors (if any), and uses a visual representation to show errors and latency for each group of traces. With this global view, developers can more easily identify the cause of a problem because they can immediately see what the problematic traces have in common.
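As a rough illustration of the grouping idea, the Python sketch below buckets traces by the value of one tag and summarizes errors and latency per bucket. The trace dictionaries, the `group_by_tag` function, and the `<missing>` placeholder are assumptions made for this example and do not reflect how Tag Spotlight is implemented.

```python
from collections import defaultdict


def group_by_tag(traces, tag):
    """Summarize trace count, error count, and worst latency per tag value."""
    groups = defaultdict(
        lambda: {"count": 0, "errors": 0, "max_latency_ms": 0.0}
    )
    for trace in traces:
        # Traces that lack the tag are collected under a placeholder value.
        value = trace["tags"].get(tag, "<missing>")
        group = groups[value]
        group["count"] += 1
        if trace["error"]:
            group["errors"] += 1
        group["max_latency_ms"] = max(group["max_latency_ms"], trace["latency_ms"])
    return dict(groups)


traces = [
    {"tags": {"version": "v1"}, "error": False, "latency_ms": 100.0},
    {"tags": {"version": "v2"}, "error": True, "latency_ms": 900.0},
    {"tags": {"version": "v2"}, "error": True, "latency_ms": 700.0},
    {"tags": {}, "error": False, "latency_ms": 50.0},
]
summary = group_by_tag(traces, "version")
```

In this sample, every failing trace falls in the `v2` group, which is the kind of shared attribute the grouped view is meant to surface.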
With Business Workflows, SREs can group together any combination of microservices that align to key functions performed by the backend (such as checkout or login). Once a workflow is created, SREs can easily filter the Service Map to view the performance of that workflow and set up alerts if performance degrades.
Splunk APM is unique in storing 100% of traces, so that when an issue occurs, engineers can accelerate troubleshooting by simply looking at the traces that relate to that issue instead of having to wait for it to occur again or to reproduce it. With Trace Analyzer, SREs and developers can easily search for any trace based on any combination of tag values (for example, all the traces from a specific user ID), or based on errors or latencies.
In today’s always-on world, customers need their requests to be processed in seconds. Long waiting times lead to significant customer churn. To provide an exceptional customer experience, you cannot afford to wait several minutes to detect issues. Splunk APM uses a streaming analytics engine to detect and alert on any customer issue within seconds.
In order to quickly identify whether a service is experiencing issues due to code performance, Splunk APM has built-in code profiling to continuously monitor CPU and memory consumption of code written in Java, .NET, and Node.js. Unlike many other code profiling solutions, Splunk’s code profiling analyzes running code continuously, so when an issue occurs, engineers can simply call up the profiling data they need with a click.
Service Centric Views provides developers with all the telemetry they need to understand the performance of services they own, including traces, profiles, logs, RED metrics, runtime metrics, infrastructure metrics, release events, and more. With all the data centralized in one dashboard, it’s easier for developers to correlate the data and understand issues in their services.
Related Content helps engineers correlate trace, profiling, infrastructure metrics, and log data for faster troubleshooting. When viewing traces and databases or looking at the Service Map, Splunk APM provides links to the log and metric data that correlate to the exact time frame and issue that the engineer is exploring.
One reason that developers are hesitant to change observability vendors is that each vendor requires its own instrumentation. OpenTelemetry is the de facto open standard for instrumentation, and Splunk Observability Cloud is OpenTelemetry-native. With Splunk, developers have peace of mind knowing that once they instrument their code with OpenTelemetry, they can send their data to any observability vendor without needing to re-instrument if they change tools or as they build new applications.
With OpenTelemetry, developers can add any tag to their services so that if there is an issue later on, they can search for traces based on their tag value. By default, tags are not indexed, and so do not consume a lot of compute resources. Some tags become more useful for troubleshooting over time, so Splunk APM gives engineering teams the ability to index tags for quick recall and analysis using simple point-and-click.
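The trade-off between unindexed and indexed tags can be pictured with a small Python sketch: an unindexed tag is found by scanning every stored trace, while indexing a tag builds a lookup table for fast recall. The `TraceStore` class and its methods are hypothetical and only illustrate the general idea; they say nothing about how Splunk APM stores or indexes tags internally.

```python
from collections import defaultdict


class TraceStore:
    """Hypothetical in-memory store with optional per-tag indexes."""

    def __init__(self):
        self._traces = {}    # trace ID -> {tag name: tag value}
        self._indexes = {}   # tag name -> {tag value -> set of trace IDs}

    def add(self, trace_id, tags):
        self._traces[trace_id] = tags
        # Keep every existing index up to date for newly added traces.
        for tag, index in self._indexes.items():
            if tag in tags:
                index[tags[tag]].add(trace_id)

    def index_tag(self, tag):
        """Build an index for one tag from the traces stored so far."""
        index = defaultdict(set)
        for trace_id, tags in self._traces.items():
            if tag in tags:
                index[tags[tag]].add(trace_id)
        self._indexes[tag] = index

    def find(self, tag, value):
        """Use the index when one exists; otherwise scan all traces."""
        if tag in self._indexes:
            return set(self._indexes[tag].get(value, set()))
        return {tid for tid, tags in self._traces.items() if tags.get(tag) == value}


store = TraceStore()
store.add("t1", {"user_id": "42", "region": "eu"})
store.add("t2", {"user_id": "7"})
scan_result = store.find("user_id", "42")      # no index yet: full scan
store.index_tag("user_id")                     # index the tag on demand
store.add("t3", {"user_id": "42"})
indexed_result = store.find("user_id", "42")   # answered from the index
```

The index costs extra memory and work on every write, which is why building it only for the tags that prove useful is a reasonable design choice.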
Splunk Training is the place for coursework on specific Splunk topics and learning paths to take you from novice to power user. Learn to analyze your applications, create rich reports and visualizations from scratch, and more.
Your success is our top priority. Splunk offers a variety of Support and Professional Services options that address your business needs and help you harness the value of Splunk.