I was recently talking to a customer and he mentioned how he needed to do a search in Splunk for events that matched something in a list provided to him in a Microsoft Excel spreadsheet. Basically, he needed to search for events in Splunk that matched a list of IP addresses in the spreadsheet. In order to make the search, he took each item from the spreadsheet and added it to a search. Needless to say, this is painful.
In Splunk, there is a better way.
As a former security practitioner, I had been there and have done similar things. Let’s discuss some things that you can do in Splunk that makes this drop dead easy.
In this post I’m going to cover:
A Graphical User Interface for Editing and Importing Files
First, I highly recommend checking out the lookup editor app. That app is free and it allows you to make new lookup files and edit them in an nice interface. If you want to import a spreadsheet from Excel, all you have to do is save it as a CSV and import it via the app.
To do so, open the Lookup Editor and click the “New” button. Next, click “import from CSV file” at the top right and select your file. This will import the contents of the lookup file into the view. Press save to persist it.
Manipulating Lookup Files in Search
Another thing the customer mentioned to me was that the he needed to cleanup and fix some things in the lookup file before he could use it. He manipulated it manually but the search interface is a great way to modify CSV files.
Let’s consider an example where I want to split out an email address field in a CSV file into separate domain and local fields (e.g. make “john.doe@gmail.com” into “john.doe” and “gmail.com”). To start, I’ll display the lookup file in search using the inputlookup search command (assume I already imported it via the Lookup Editor App):
| inputlookup email_addresses.csv append=t
This results in output that looks like:
first_name | last_name | |
---|---|---|
jdoe@gmail.com | James | Doe |
josephine_darakjy@darakjy.org | Josephine | Darakjy |
I then use the rex search command to split out the local and domain portions:
| inputlookup email_addresses.csv append=t | rex field=email "(?<local_portion>.*)@(?<domain_portion>.*)"
This yields the following:
first_name | last_name | local_portion | domain_portion | |
---|---|---|---|---|
jdoe@gmail.com | James | Doe | jdoe | gmail.com |
josephine_darakjy@darakjy.org | Josephine | Darakjy | josephine_darakjy | darakjy.org |
Now that I have the output what I want, I rewrite it out to a new lookup file (email_addresses_2.csv) using outputlookup:
| inputlookup email_addresses.csv append=t | rex field=email "(?.*)@(?.*)" | outputlookup email_addresses_2.csv
There are many other things you can do with search language and manipulation of CSV files (stripping our rows and columns, reformatting cells, e.g.) so be creative.
Searching Using Lookup Files
Finally, I can use the lookup file to find events that match the given lookup file. I’m using the join command to filter down the events to those that match the lookup:
| search * | join domain [| inputlookup append=t email_addresses.csv | eval domain=domain_portion]
Make sure to change “| search *” to match the events you want to look for matches in (a “*” search is probably a little unnecessary).
Conclusion
If you takeaway nothing from this article, take this: Splunk can do some crazy stuff with lookups. If you find yourself having to do lots of manual work with lookups files, then look for a better way. If you don’t see a good way to do what you want to do, post a question on answers.splunk.com and some of us will help you out.
----------------------------------------------------
Thanks!
Luke Murphey
The Splunk platform removes the barriers between data and action, empowering observability, IT and security teams to ensure their organizations are secure, resilient and innovative.
Founded in 2003, Splunk is a global company — with over 7,500 employees, Splunkers have received over 1,020 patents to date and availability in 21 regions around the world — and offers an open, extensible data platform that supports shared data across any environment so that all teams in an organization can get end-to-end visibility, with context, for every interaction and business process. Build a strong data foundation with Splunk.