February 28, 2023

Bring More ML to Splunk: Inference Externally Trained ONNX Models in MLTK 5.4.0

By Poonam Yadav

The latest release of the Splunk Machine Learning Toolkit (MLTK) lets users upload their pre-trained models to MLTK through a simple UI. Once a model is in Splunk, users can apply it to their Splunk data with no modification to their existing workflows. This capability extends MLTK and ML-SPL beyond models trained in MLTK, opening up the large use case of running external models against data inside Splunk. MLTK 5.4.0 is generally available (GA) for both Splunk Cloud Platform and Splunk Enterprise customers.

MLTK is an easy way for Splunk customers to get started with machine learning. The app provides Showcases and Assistants that guide the user through a series of steps to train, assess, and operationalize ML models. It pairs backend ML components with a frontend app experience, abstracting away the complexity of data science notebooks and actual code. MLTK lets users apply machine learning in their Splunk workflows through two ML-SPL commands: fit for training ML models, and apply for running inference. MLTK is popular among Splunk customers and serves key machine learning use cases such as anomaly detection, forecasting, and clustering. It is one of the most downloaded apps on Splunkbase, with over 185K downloads. Additionally, customers run the fit and apply commands bundled with MLTK millions of times every month.

While customer demand for ML has grown rapidly, many Splunk customers have not been able to incorporate ML into their Splunk user journeys. Most MLTK customers want to bring new algorithms or pre-trained models into MLTK. According to our telemetry data, 80% of algorithms run in Splunk are customized. However, users find it challenging to create ML models and ship them for use in Splunk with their Splunk data. To use external models in Splunk, users need to convert the models to an MLTK-supported codec format and import custom Python scripts with root permissions, which can be time-consuming and tedious. This is a major pain point for our customers, who regularly ask for a better way to solve it.

MLTK 5.4.0 addresses these challenges with the option to upload externally trained ONNX models for inference in MLTK. Users can train models in their preferred third-party environments, save them in ONNX format, upload them to MLTK, and run inference on their Splunk data. This way, users can move compute-heavy model training outside the Splunk platform while still operationalizing the models within the Splunk platform on their Splunk data.

The uploaded model goes through a series of validation steps, including checks that the user has the required model upload capabilities and that the model is in the correct file format. After validation and verification, Splunk’s REST API is used to store the model in an MLTK-accessible location within Splunk. Users can then use the model with their Splunk data in the same way they would use a model created with MLTK, a workflow they are already familiar with. This way, users can focus on the important task of creating and training their ML models and leave the complexity of bringing a model into Splunk to MLTK.
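
As a rough illustration of the two checks named above (upload capability and file format), here is a minimal Python sketch. It is not MLTK's implementation: the capability name upload_onnx_model, the function validate_upload, and the extension-only format check are all assumptions made for this example.

```python
from pathlib import Path

# Hypothetical capability name, chosen for illustration only.
UPLOAD_CAPABILITY = "upload_onnx_model"


def validate_upload(user_capabilities, filename):
    """Return a list of validation errors; an empty list means the upload may proceed."""
    errors = []
    # Check 1: the user must hold the model upload capability.
    if UPLOAD_CAPABILITY not in user_capabilities:
        errors.append("user lacks the model upload capability")
    # Check 2: the file must look like an ONNX model (extension check only in this sketch).
    if Path(filename).suffix.lower() != ".onnx":
        errors.append("file is not in ONNX format")
    return errors


# A permitted user uploading an .onnx file passes both checks.
print(validate_upload({"search", UPLOAD_CAPABILITY}, "churn_model.onnx"))
# A user without the capability uploading a non-ONNX file fails both.
print(validate_upload({"search"}, "churn_model.pkl"))
```

A real validator would likely also inspect the file contents rather than trust the extension; the sketch only shows the order of the checks.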

Users Can Now Leverage ONNX in Splunk

Prior to this release, users were limited to the ML algorithms and libraries packaged with MLTK. With the 5.4.0 release, MLTK supports inference with ONNX models. ONNX stands for Open Neural Network Exchange and is a common format for machine learning models. The format lets you create models using a variety of machine learning frameworks, tools, runtimes, and compilers. Users can therefore take advantage of a wider range of tools for training their models, such as TensorFlow, PyTorch, Keras, and MATLAB.

Uploading and Inferencing Models Workflow

The UI for uploading an external model is simple and intuitive. It requires a few parameters from the user that are helpful for verification of the model file and are also used during model inference.

Running model inference follows the same workflow that MLTK and ML-SPL users already know. There is one minor difference: the model name after the apply command needs the onnx: prefix. This tells MLTK and the apply command that the model being used for inference is an ONNX model.
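
To make the naming difference concrete, the short Python sketch below builds the search string for both cases. The helper build_apply_search, the lookup file sensor_data.csv, and the model names are hypothetical and exist only for this example; the onnx: prefix after apply is the part taken from the description above.

```python
def build_apply_search(base_search, model_name, is_onnx=False):
    """Append an ML-SPL apply command, adding the onnx: prefix for uploaded ONNX models."""
    # Models trained with fit are referenced by bare name; ONNX models need the prefix.
    target = f"onnx:{model_name}" if is_onnx else model_name
    return f"{base_search} | apply {target}"


# Hypothetical lookup file and model names, for illustration only.
print(build_apply_search("| inputlookup sensor_data.csv", "my_native_model"))
# | inputlookup sensor_data.csv | apply my_native_model
print(build_apply_search("| inputlookup sensor_data.csv", "my_onnx_model", is_onnx=True))
# | inputlookup sensor_data.csv | apply onnx:my_onnx_model
```

The rest of the search, including any preprocessing before apply, stays the same in both cases.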

User Permissions and Configurations

The MLTK team is mindful of security concerns and has taken steps to ensure that only users with the appropriate permission can upload model files to Splunk. By default, the ability to upload models is disabled for all users. A Splunk admin must explicitly grant users permission to upload model files to Splunk.

More Powerful Anomaly Detection Capabilities

In addition to the pre-trained ONNX model capabilities, MLTK 5.4.0 extends the anomaly detection capabilities available to users with a new algorithm for multivariate outlier detection. Users can now provide a multivariate dataset as input to the new MultivariateOutlierDetection algorithm, which performs a series of steps internally to return the outliers in that dataset.

Next Steps

MLTK 5.4.0 is available today on Splunkbase for use with Splunk Cloud Platform as well as with Splunk Enterprise. For more information on how to use this feature, refer to the MLTK documentation. To get started with this new version today, visit Splunkbase.
