A Picture is Worth a Thousand Logs

By Josh Cowling

Splunk is a fantastic platform for ingesting, storing, searching and analysing logs, metrics and traces from a massive variety of sources. But does that mean we should ignore all the data that doesn’t fall into these categories, such as images and video? Of course not!

Images, audio and other “binary” data sources can hold valuable information in many situations, if only we can find a way to make that data accessible in a meaningful way.

This might seem like a difficult task, but decades of research and development have produced a number of systems and services that can take that kind of data and begin to interpret it in meaningful ways. In this blog, I’d like to tell you about one such approach that can make machine-learned image metadata available to Splunk.

Rekognition is an AWS computer vision service. It allows you to take images or videos stored in AWS S3 and generate various types of metadata about that data or data stream. You might use it to detect faces in an image, and tell if they look happy, or grumpy. You might want to track the appearance of a celebrity in a video, or you might use it to detect and label particular objects within a scene.

The business value of this type of data comes from bringing context to image or video based data sources. For example, custom label detection could be used to implement a camera-based QA process for a manufacturing line, with the data streamed directly into Splunk. Monitoring a camera feed could let you analyse foot traffic patterns in a logistical setting and improve efficiency, or you could analyse content uploaded to your web app at scale. There are plenty of “wilder” ideas as well: you could imagine implementing systems to detect and track wildlife from public webcams.

Today though, my goal is simpler. I’m going to use the power of computer vision to try to work out what Shelly Kornblum, “VP for Marketing Special Ops and Catering” at the Splunk T-Shirt Company, is all about.

The Man, the Myth, the Legend.

Shelly Kornblum

I’ve created a couple of fun projects using this kind of data with the help of some talented colleagues. At .conf19 you may have seen our “Splunk Your Face” photobooth, which took photos of attendees and generated some creative statistics about the Splunkers who took part, such as “how beardy are attendees on average?”. This year I used my laptop camera to track my mood for a week while working, telling me when I looked happy, sad, or confused, which for me was a pretty good look at my own mental health in the time of COVID.

Creating the data pipeline into Splunk to support processing and ingestion of image metadata is remarkably simple if you have some experience with AWS and Boto, the AWS Python SDK.

For my own projects, a simple local Python script captures images from a local camera and uploads them to a secure S3 bucket. A Lambda function is triggered whenever a file is placed in the bucket. It takes the reference to each image and sends it to the Rekognition service.
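
As a rough illustration of the Lambda side of this, here is a minimal sketch. It assumes the standard S3 notification event shape and uses the boto3 Rekognition calls `detect_faces` and `detect_labels`. The function names, the returned dictionary layout, and the `MaxLabels` and `MinConfidence` values are my own choices for the example, not the code from the original project.

```python
from urllib.parse import unquote_plus


def images_from_s3_event(event):
    """Yield (bucket, key) for every object named in an S3 notification event."""
    for record in event.get("Records", []):
        s3 = record["s3"]
        # Object keys arrive URL-encoded in S3 notifications (spaces become "+").
        yield s3["bucket"]["name"], unquote_plus(s3["object"]["key"])


def analyse_image(rekognition, bucket, key):
    """Ask Rekognition for face and label metadata about one stored image."""
    image = {"S3Object": {"Bucket": bucket, "Name": key}}
    # Attributes=["ALL"] asks for the full set of face attributes,
    # including the age range and emotions.
    faces = rekognition.detect_faces(Image=image, Attributes=["ALL"])
    # MaxLabels and MinConfidence are illustrative values only.
    labels = rekognition.detect_labels(Image=image, MaxLabels=20, MinConfidence=50)
    return {
        "bucket": bucket,
        "key": key,
        "FaceDetails": faces.get("FaceDetails", []),
        "Labels": labels.get("Labels", []),
    }


def lambda_handler(event, context, rekognition=None):
    """Entry point: analyse each uploaded image and return the results."""
    if rekognition is None:
        import boto3  # imported lazily so the helpers can be tried without AWS

        rekognition = boto3.client("rekognition")
    return [
        analyse_image(rekognition, bucket, key)
        for bucket, key in images_from_s3_event(event)
    ]
```

Passing the client in as an optional argument is just a convenience: it lets you exercise the handler locally with a stand-in object before wiring it up to a real bucket.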

Rekognition

Once the response (which handily comes packaged up as a JSON object) has been gathered back from Rekognition into our Lambda function, we can forward it on to Splunk via the HTTP Event Collector (HEC).
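
Forwarding to HEC can be sketched with nothing more than the Python standard library. This is an illustration under assumptions: the URL in the usage note and the `aws:rekognition` sourcetype are placeholders I made up, and your own HEC host, port, and token will differ. HEC expects a JSON payload with an `event` field and an `Authorization: Splunk <token>` header.

```python
import json
import urllib.request


def build_hec_request(hec_url, token, event, sourcetype="aws:rekognition"):
    """Build (but do not send) a POST request for the HEC event endpoint."""
    # The sourcetype default is a hypothetical name chosen for this example.
    payload = {"event": event, "sourcetype": sourcetype}
    return urllib.request.Request(
        hec_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Splunk {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_to_splunk(hec_url, token, event, timeout=10):
    """Send one event to HEC and return Splunk's parsed JSON reply."""
    request = build_hec_request(hec_url, token, event)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call might look like `send_to_splunk("https://splunk.example.com:8088/services/collector/event", token, rekognition_response)`, where the hostname is a placeholder.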

The data returned from the Rekognition face API contains a whole lot of metadata about the content of the image. In the examples below, we’ve taken excerpts from a few dashboards we’ve created to analyse images processed by our system.

The DetectLabels API endpoint tells us all of the other labels that the algorithm thinks it can apply to each image. In this case, we’ve used the word cloud app to show our labels with relative sizes and colours that reflect our confidence in each label, and we can render that in our dashboard next to the image itself:

Face rendering
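
To feed something like a word cloud, it helps to turn the DetectLabels response into one small record per label. The sketch below assumes the documented response shape (a `Labels` list whose items carry `Name` and `Confidence`). The output field names `image`, `label` and `confidence` are my own, chosen for illustration.

```python
def label_events(response, image_key):
    """Turn a DetectLabels response into one flat record per label."""
    records = [
        {
            "image": image_key,
            "label": label["Name"],
            "confidence": label["Confidence"],
        }
        for label in response.get("Labels", [])
    ]
    # Most confident labels first, which is handy when eyeballing the output.
    return sorted(records, key=lambda r: r["confidence"], reverse=True)
```

With one event per label, the label name and its confidence can be used directly as the word and its weight in a visualisation.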

The DetectFaces API detects faces in our test images, estimates their age, and tries to tell us what emotions are being displayed by the face in question. So what is Shelly really thinking?

Shelly Kornblum mood detection

I think our image of Shelly here is the key to identifying his motivations. Though his single-minded bright confidence is projected flawlessly, there’s more than a small amount of confusion under the surface. Seems about right to me!
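
Reading a face like this programmatically comes down to picking through one entry of the `FaceDetails` list that DetectFaces returns. Here is a small sketch that assumes the documented `AgeRange` and `Emotions` fields. The summary field names are hypothetical, and the numbers in the usage note are invented rather than Shelly's real scores.

```python
def summarise_face(face):
    """Reduce one FaceDetails entry to age range, top emotion and all scores."""
    emotions = face.get("Emotions", [])
    # default=None covers responses where no emotions were requested or returned.
    top = max(emotions, key=lambda e: e["Confidence"], default=None)
    age = face.get("AgeRange", {})
    return {
        "age_low": age.get("Low"),
        "age_high": age.get("High"),
        "top_emotion": top["Type"] if top else None,
        "top_emotion_confidence": top["Confidence"] if top else None,
        # Keep every score so secondary emotions are still searchable.
        "emotions": {e["Type"]: e["Confidence"] for e in emotions},
    }
```

For a face with `Emotions` of `HAPPY` at 80.0 and `CONFUSED` at 15.0 (made-up values), `top_emotion` would be `HAPPY`, while the `emotions` dictionary still records the confusion lurking underneath.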

And with that, we can now ingest images into an S3 bucket and search and reference their metadata in Splunk. For the projects I’ve been involved in, we’ve gone a little bit further with a few more tweaks:

Re-organise the JSON objects sent from our Lambda function into Splunk, to make them a bit easier to work with using Splunk’s aggregation commands, such as “| stats”.
Use the Python imaging library “PIL” to draw annotations (like the boxes you see in the images above), create cropped versions, and convert the images into base64 encoded text.
Send base64 encoded images directly into Splunk, right alongside their metadata! With some creative JavaScript, you can turn these strings back into images in your dashboards! Super cool!
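
The annotation and encoding tweaks above can be sketched roughly as follows. Rekognition reports each `BoundingBox` as ratios of the image width and height, so the first step is converting those to pixels. The drawing step assumes the Pillow package (the maintained fork of PIL) is installed. All function names here are my own, and this is not the original project code.

```python
import base64
import io


def pixel_box(bounding_box, image_width, image_height):
    """Convert a Rekognition BoundingBox (ratios) to (left, top, right, bottom) pixels."""
    left = int(bounding_box["Left"] * image_width)
    top = int(bounding_box["Top"] * image_height)
    right = int((bounding_box["Left"] + bounding_box["Width"]) * image_width)
    bottom = int((bounding_box["Top"] + bounding_box["Height"]) * image_height)
    return left, top, right, bottom


def to_base64(image_bytes):
    """Encode raw image bytes as base64 text suitable for a JSON event field."""
    return base64.b64encode(image_bytes).decode("ascii")


def annotate_and_encode(image_bytes, bounding_box):
    """Draw one box on the image and return the annotated PNG as base64 text."""
    from PIL import Image, ImageDraw  # Pillow, imported lazily

    image = Image.open(io.BytesIO(image_bytes)).convert("RGB")
    box = pixel_box(bounding_box, image.width, image.height)
    ImageDraw.Draw(image).rectangle(box, outline="red", width=3)
    buffer = io.BytesIO()
    image.save(buffer, format="PNG")
    return to_base64(buffer.getvalue())
```

Keep in mind that base64 text is roughly a third larger than the original bytes, so cropping or shrinking images before encoding helps keep events a sensible size.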

Hopefully, this example has given you some ideas of your own. I wanted to show you that we needn’t be limited to “classic” machine data and traditional data sources. In Splunk’s Data-to-Everything platform, the sky is the limit!

Happy Splunking,

Josh

Josh Cowling

Josh is a technologist, consultant, and entrepreneur based in London. Holding a PhD from Durham University's School of Engineering and Computing Sciences, he has wide experience spanning start-ups and enterprises in research, engineering, consulting, and pre-sales roles. While his background includes research, Josh is primarily focused on understanding, developing, and deploying new technologies that solve real problems and deliver tangible value. Connect with Josh on LinkedIn, especially if you have an interesting challenge in domains like cybersecurity, Splunk, data science, or machine learning.
