Controlling Trace Metadata in Splunk APM

By Scott Stewart

All companies must strive to protect their users’ sensitive data. Data security is becoming even more important as traditional industries such as health, finance, and telecommunications migrate to the cloud. These companies want to take advantage of monitoring and performance management, but must also work within the laws and existing standards that govern sensitive data and PII (Personally Identifiable Information).

Metrics-based monitoring systems don’t typically run into these kinds of data security challenges, because metrics are naturally aggregated and anonymized: counters, gauges, and histograms are not directly tied to any particular customer or user. But distributed tracing, such as that provided by Splunk APM, shows the details of a particular transaction or operation, potentially with very detailed metadata. It is therefore important to be aware of what data is included on your trace spans.

Splunk APM is powered by the Smart Gateway, which runs within your VPC. As part of our NoSample™ architecture, the Smart Gateway ingests all of your spans generated from distributed tracing. These spans give the Smart Gateway a complete picture of your environment that it can use to create metrics, analyze trace paths, and understand your service architecture. Because the Smart Gateway sees all of your spans, it can also help you protect your data using span metadata obfuscation or removal. This feature allows you to obfuscate or remove specified tags from any spans that contain sensitive data or PII before they leave your environment.

Setting Up Span Metadata Obfuscation

Span Metadata Obfuscation can be extremely useful when you’re using auto-instrumentation to get traces from your code. This could include an agent running alongside your code, such as a Java agent running in a Spring application or a Python agent in your Django client. Or you may have tracing enabled for third-party code, such as a database. In both of these cases, it is difficult or impossible to change what information is being captured in a trace. If there is a URL param or a database query that contains sensitive information, there is no easy way to exclude that information from a trace — until now.

To set up Span Metadata Obfuscation, all you have to do is add an ObfuscateSpanTags section to your Smart Gateway configuration file. This config lets you define a service and operation to match, along with a list of tags that should be obfuscated. The service and operation can each be an exact match, or they can use a wildcard match with “*”.

"ObfuscateSpanTags": [
  {
    "Service": "auth*",
    "Operation": "*user",
    "Tags": ["password", "authToken"]
  },
  {
    "Operation": "executeQuery",
    "Tags": ["SSN"]
  }
]

In this example config, the Smart Gateway matches all services that start with “auth” and any operation that ends with “user”; for example, authCache:getUser and authService:createUser would both match this rule. For the matching spans, the Smart Gateway replaces the values of the “password” and “authToken” tags with “<obfuscated>”. The Smart Gateway also matches the operation “executeQuery” coming from any service, and obfuscates the “SSN” span tag if it is present.

Of course, Span Metadata Obfuscation can also be useful if you have manually instrumented your code. Because all of your traces pass through the Smart Gateway, adding an ObfuscateSpanTags config could be much easier than trying to track down multiple engineers who are responsible for different parts of the code.

In this example, we want to obfuscate which version of the Jaeger client library we are using. Instead of finding every place in our code where a span is created, we can have the Smart Gateway obfuscate the tag with a simple config.

"ObfuscateSpanTags": [
  {
    "Tags": ["jaeger.version"]
  }
]
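
The matching rules described above can be illustrated with a short Python sketch. This is not the Smart Gateway's implementation, only a model of the behavior the article describes. The span layout (a dict with "service", "operation", and "tags" keys) and the helper names are hypothetical. Two assumptions are made explicit: a rule that omits Service or Operation matches every value for that field, and matching ignores case (assumed here so that “*user” matches getUser, as in the article's example).

```python
import re

OBFUSCATED = "<obfuscated>"  # replacement value named in the article


def wildcard_match(pattern, value):
    """Return True if value matches pattern, where "*" matches any run of characters."""
    # Escape the literal pieces and join them with ".*" in place of each "*".
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    # Assumption: case-insensitive matching, so "*user" matches "getUser".
    return re.fullmatch(regex, value, flags=re.IGNORECASE | re.DOTALL) is not None


def rule_matches(rule, service, operation):
    """Check a rule against a span's service and operation names."""
    # Assumption: an omitted "Service" or "Operation" behaves like "*".
    return (wildcard_match(rule.get("Service", "*"), service)
            and wildcard_match(rule.get("Operation", "*"), operation))


def obfuscate_span(span, rules):
    """Return a copy of span with the values of matching tags replaced."""
    tags = dict(span["tags"])  # copy so the caller's span is left untouched
    for rule in rules:
        if rule_matches(rule, span["service"], span["operation"]):
            for tag in rule["Tags"]:
                if tag in tags:  # only tags that are present are rewritten
                    tags[tag] = OBFUSCATED
    return {**span, "tags": tags}


# The rules from the two example configs, written as Python dicts.
RULES = [
    {"Service": "auth*", "Operation": "*user", "Tags": ["password", "authToken"]},
    {"Operation": "executeQuery", "Tags": ["SSN"]},
    {"Tags": ["jaeger.version"]},
]

if __name__ == "__main__":
    example = {
        "service": "authService",
        "operation": "createUser",
        "tags": {"password": "hunter2", "user.id": "42", "jaeger.version": "Go-2.15"},
    }
    print(obfuscate_span(example, RULES)["tags"])
```

In this sketch the tag keys stay on the span and only their values change, which is what distinguishes obfuscation from removing a tag outright.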

Removing Span Tags

The Smart Gateway can also completely remove tags from a span. For example, if you decided that obfuscating the Jaeger client version wasn’t enough, and that you didn’t want the jaeger.version tag at all, you could add a RemoveSpanTags config. This config follows the same matching rules as ObfuscateSpanTags, but it does what the name implies and completely removes the tag from the span.

"RemoveSpanTags": [
  {
    "Tags": ["jaeger.version"]
  }
]

Protecting Your Data Inside the Smart Gateway

When the Smart Gateway ingests a span, it obfuscates or removes the configured tags as one of the first steps in the processing pipeline. This means that the Smart Gateway cannot use any of the configured tags as it processes your span, including generating internal metrics or making retention decisions. The Smart Gateway also removes the configured tags before any spans are buffered to disk, meaning that the Smart Gateway will never persist your sensitive data.
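
The ordering described above can be sketched in Python to show why it matters. This is an illustration under stated assumptions, not the Smart Gateway's code: the function names, the dict-based span layout, the in-memory list standing in for the disk buffer, and the per-service counter standing in for metric generation are all hypothetical. For brevity, the tag sets here apply to every span rather than using service and operation matching.

```python
OBFUSCATED = "<obfuscated>"  # replacement value named in the article


def scrub(span, obfuscate_tags, remove_tags):
    """Return a new span with configured tags obfuscated or removed."""
    tags = {}
    for key, value in span["tags"].items():
        if key in remove_tags:
            continue  # removed tags are dropped entirely
        tags[key] = OBFUSCATED if key in obfuscate_tags else value
    return {**span, "tags": tags}


def process(span, obfuscate_tags, remove_tags, metrics, disk_buffer):
    """Hypothetical pipeline: scrub first, then derive metrics, then buffer."""
    # Scrubbing runs before anything else, so later stages never see raw values.
    clean = scrub(span, obfuscate_tags, remove_tags)
    # Stand-in for internal metric generation: count spans per service.
    metrics[clean["service"]] = metrics.get(clean["service"], 0) + 1
    # Stand-in for buffering to disk: only the scrubbed span is stored.
    disk_buffer.append(clean)
    return clean


if __name__ == "__main__":
    metrics, disk_buffer = {}, []
    raw = {
        "service": "db",
        "operation": "executeQuery",
        "tags": {"SSN": "123-45-6789", "jaeger.version": "Go-2.15", "db.type": "sql"},
    }
    process(raw, {"SSN"}, {"jaeger.version"}, metrics, disk_buffer)
    print(disk_buffer[0]["tags"])
```

Because the metric and buffering steps receive only the scrubbed copy, neither can read or persist the original tag values, which mirrors the guarantee the article describes.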

As businesses migrate to the cloud, they require a monitoring solution that gives them visibility into how their code is running, but must also be careful about where and how their sensitive data is being handled. Splunk APM gives you the ability to find which traces matter, troubleshoot issues, and enhance performance, while also ensuring that you can continue to protect your and your customers’ data.

Request a demo to see Splunk APM in action.
