I’ve been asked a few times how best to search for events in which a field may take any of many different discrete values. This is essentially an OR (a disjunctive search) in the search language. For example, you can do this:
sourcetype=my_sourcetype (planet=mars OR planet=earth OR planet=saturn)
This works fine when you only have a handful of planets, but what happens if the set of values you want to match changes daily and may contain hundreds of entries? Typing out OR terms for over a hundred entries is impractical. A solution is to keep all the values you want to match in an external file and have the search language use that file as input to the search criteria. With Splunk 4.0, one way to do this out of the box is with the new lookup command. For an introduction to this command, please consult Bob Fox’s blog entry discussing example usage. For now, I will assume you have basic knowledge of its usage and will outline one possible way to handle an OR with many possible values for a field.
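To make the scale problem concrete, here is a minimal Python sketch that assembles the OR expression from a list of values. The helper name build_or_clause is hypothetical and not part of Splunk; it only builds the search string, which you would still have to paste or submit yourself.

```python
def build_or_clause(field, values):
    """Join field=value terms with OR and wrap them in parentheses."""
    terms = [f"{field}={value}" for value in values]
    return "(" + " OR ".join(terms) + ")"


# The three planets from the example search above.
planets = ["mars", "earth", "saturn"]
query = "sourcetype=my_sourcetype " + build_or_clause("planet", planets)
print(query)
```

With three values the result is still readable, but the string grows by one term per value, which is why a file-driven approach is more practical for hundreds of entries.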
First, use field extraction to extract the field in question. For our example I’ll use an IP address field. Next, create a CSV file in your SPLUNK_HOME/etc/apps/<app_name>/lookups/ directory. I created iptable.csv with the following sample content to be used as input.
ip,myip
192.168.1.105,192.168.1.105
10.10.10.2,10.10.10.2
192.168.1.10,192.168.1.10
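If the list of values changes often, a file like this can be generated rather than typed by hand. The following is a minimal Python sketch, assuming the values are already available as a list; write_lookup_csv is a hypothetical helper, and the column names ip and myip simply mirror the sample file. It writes each value into both columns with no padding spaces.

```python
import csv
import io


def write_lookup_csv(values, out, key_field="ip", output_field="myip"):
    """Write a two-column CSV in which both columns hold the same value."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([key_field, output_field])  # header row
    for value in values:
        writer.writerow([value, value])


# Sample values taken from the example file.
ips = ["192.168.1.105", "10.10.10.2", "192.168.1.10"]

# Write to an in-memory buffer here; pass an open file object to write to disk.
buffer = io.StringIO()
write_lookup_csv(ips, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To produce the real file, you could open iptable.csv in your lookups directory with open(path, "w", newline="") and pass that file object in place of the buffer.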
Since I’m not interested in creating a real mapping from one field (ip) to another (myip), I used the same value in both columns to conform to the syntax of the lookup command. Now, in your SPLUNK_HOME/etc/apps/<app_name>/local directory you’ll need to create or modify two files. First, edit transforms.conf.
[search_ip]
filename = iptable.csv
Second, edit props.conf and use your sourcetype to start the stanza. I am using mail as my sourcetype.
[mail]
lookup_ip = search_ip ip OUTPUT myip
Now, from your browser, log into Splunk and reload the props.conf and transforms.conf files to pick up your additions:
sourcetype=mail | extract reload=true
You are now ready to use your file as input to search for all events that contain IP addresses listed in your CSV file. One possible search is:
sourcetype=mail | lookup search_ip ip OUTPUT myip | search myip=*
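The effect of this search can be modeled outside Splunk. The sketch below is a plain-Python approximation of the idea, not Splunk's implementation: it loads the lookup table, attaches myip to events whose ip appears in the table, and keeps only those events, which is roughly what search myip=* does. The events and their msg field are made up for illustration.

```python
import csv
import io

# Same content as the sample iptable.csv.
LOOKUP_CSV = """ip,myip
192.168.1.105,192.168.1.105
10.10.10.2,10.10.10.2
192.168.1.10,192.168.1.10
"""


def load_lookup(text):
    """Build an ip -> myip mapping from CSV text."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["ip"]: row["myip"] for row in reader}


def lookup_and_filter(events, table):
    """Add myip to events found in the table and drop all the others."""
    matched = []
    for event in events:
        myip = table.get(event.get("ip"))
        if myip is not None:  # plays the role of `search myip=*`
            matched.append({**event, "myip": myip})
    return matched


# Hypothetical events; only the first and third have an ip in the table.
events = [
    {"ip": "192.168.1.105", "msg": "delivered"},
    {"ip": "172.16.0.9", "msg": "bounced"},
    {"ip": "10.10.10.2", "msg": "deferred"},
]
hits = lookup_and_filter(events, load_lookup(LOOKUP_CSV))
for hit in hits:
    print(hit["ip"], hit["msg"])
```

The membership test against the table replaces the long chain of OR terms: adding a value to the file changes the result without touching the search.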
The final search command, search myip=*, keeps only the events for which the lookup produced a myip value, that is, events whose ip matches an entry in the file. In essence, this last step does your disjunctive search for you without having to type a long sequence of OR terms. Finally, if you want to search on the top N values (N being an integer) for a field each day, Splunk can help you create the CSV input file. Simply run the following search, assuming you want the top 100 values for ip in our example:
sourcetype=mail | top limit=100 ip | fields + ip
You can then copy and paste the values into your CSV file. In short, today’s blog entry gave you one possible way to use the contents of a file as input for your disjunctive search. There may be other approaches, and you are welcome to discuss them in the comments.