Welcome to the Clara-fication series! I’m Clara Merriman. Through this blog series, I’ll share what I know about various search commands, knowledge objects, and other Splunk-related topics that might need some extra Clara-fication. All puns aside, I hope these posts are both informative and not too dry! Enjoy.
Show of hands, folks: who has used the pivot feature of your favorite spreadsheet software? It’s a pretty handy feature, right? You’re able to take a column from a table and create new columns with the values. You can create new values for those columns based on other fields in your original table. It’s fantastic. So, why can’t you do that with Splunk software? Actually, you can!
There are a few things you can do with the Search Processing Language (SPL) to manipulate parts of a table: you can use the transpose, xyseries, and untable commands, and there’s an additional method I will tell you about involving eval.
Right out of the gate, let’s chat about transpose!
This command basically rotates the table 90 degrees, turning rows into column headers, and column values into row items. This can be pretty nifty when you need to change the way an entire table is laid out in order to improve your visualizations.
This command has no required arguments and a few optional arguments: column_name, header_field, include_empty, and int. They are all simple to use and understand:
column_name defines the name of the first column in the “new” table; defaults to “column”
header_field is the field whose values are used as the new column headers; if it is not supplied, the headers default to “row 1”, “row 2”, and so on
include_empty specifies whether empty values are included; defaults to true
int specifies how many rows to turn into columns; defaults to 5, and 0 means unlimited. You do not need to type int=10; just the number 10 will suffice, and it brings the first 10 rows of the table into the new columns. See the quick sketch after this list for the defaults in action.
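Before we get to real data, here’s a minimal sketch of those defaults using synthetic events from makeresults; the row_num and value field names are made up purely for illustration:
| makeresults count=3 | streamstats count as row_num | eval value=row_num*10 | fields - _time | transpose
This returns a first column named “column” (the column_name default) holding row_num and value, plus new columns named “row 1” through “row 3”. Swapping in transpose 2 column_name=field would keep only the first two rows and rename that first column.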
Consider this search. It shows the throughput metrics for a group of hosts:
index=_internal component=Metrics group=thruput | stats sum(*kbps) as *kbps by host | transpose 0 column_name=metric header_field=host
By adding transpose to this search, the visualization groups the metrics together instead of the hosts, which allows for host-over-host comparison of the same metric. Without the transpose command, the chart is grouped by host instead, which looks much different and isn’t very helpful.
Next, we’ll take a look at xyseries, a.k.a. maketable.
In xyseries, there are three required arguments: x-field, y-field, and y-data-field. Here’s a pro tip: you can have more than three fields listed in the command. I’ll discuss that in a bit. To understand which order to put your fields in, you need to remember what each field means.
The x-field (first field listed) remains a column in the new table; in other words, it is the x-axis field.
The y-field (second field listed) is the field whose values become the new column headers, otherwise referred to as the new field names.
The y-data-field (third field listed) supplies the data values in the table for the new columns you created.
Let’s look at a search and see how it all works together.
A simple stats command yields a table:
index=_internal | stats count by source component
By adding xyseries to that search, you can see that the values from the component column become columns, and the count field becomes the values.
index=_internal | stats count by source component | xyseries source component count
The cool thing about xyseries is that you can add more than one data field, as I mentioned earlier.
Let's start with the following search:
index=_internal | stats count count(eval(log_level="ERROR")) as error_logs by source component
The addition of xyseries to the end of this query yields output similar to using chart or timechart with multiple aggregate fields. By default, columns are named <y-data-field>: <y-field value>.
index=_internal | stats count count(eval(log_level="ERROR")) as error_logs by source component | xyseries source component count error_logs
There are optional arguments for this command: format, grouped, and sep.
The sep option is the string that separates the y-data-field name and the y-field value in the column names. It defaults to “: “.
| xyseries source component count error_logs sep="-"
The format option, if set, takes precedence over sep, and it allows you to reformat the column headers more fully. When you use format, both $VAL$ and $AGG$ are required in the string. They can appear in any order, so the y-field value and the y-data-field name can switch places, the separator can be changed or removed altogether, and you can add other text.
| xyseries source component count error_logs format="$VAL$ - $AGG$ TEST"
The grouped option makes this command either a distributable streaming command or a transforming command. When the option is set to false, the default, xyseries is a streaming command, and it sorts the x-field on the fly to display the last seen value. When you set grouped to true, the search parser expects the data for the x-field to already be sorted, or at least grouped. As these examples show, the results differ depending on whether the data was sorted first.
index=_internal component=* | fields source component _time | xyseries source component _time grouped=t | table source *
This search has multiple rows for the splunkd.log source, as well as other sources, because the data was not sorted before xyseries ran.
index=_internal component=* | fields source component _time | sort 0 source component | xyseries source component _time grouped=t | table source *
This search collapses to one row per source, grabbing the first event for each, because the data was sorted before xyseries ran. Keep in mind that you do not need |sort if you’re using a transforming command, such as stats, before xyseries, because the transforming command already sorts the data.
index=_internal component=* | fields source component _time | xyseries source component _time grouped=f
This search does not set grouped=t and was not sorted beforehand, so xyseries sorts the data itself as it runs.
The inverse of xyseries is a command called untable. You can actually append the untable command to the end of an xyseries command to put the data back into its original format, and vice versa. So, using the first example above, we can add |untable source component count to the end of the search, and the results come back the same as they originally were. Amazing!
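Here is that full round trip spelled out, using the first xyseries search from above:
index=_internal | stats count by source component | xyseries source component count | untable source component count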
The arguments for untable are the same as the arguments for xyseries, except there are no optional arguments, and you cannot use more than three fields. So rebuilding our second search with error_logs would be more difficult: we would only be able to do |untable source component count, and our component column would house both the count and error_logs data points for all components.
index=_internal | stats count count(eval(log_level="ERROR")) as error_logs by source component | xyseries source component count error_logs format="$VAL$ - $AGG$ TEST" | untable source component count
From there, you could write some additional SPL to eval new count and error_logs fields and rebuild the original output. If I had used |fillnull before |untable, it would have brought in all of the values as well, which is a great example of using the xyseries/chart commands and untable together.
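Here is one sketch of how that rebuild could look, assuming the default “: “ column naming rather than the custom format string above: untable the matrix into generic columns, use rex to split the “aggregation: component” headers back apart, and then use the eval curly-bracket trick (covered later in this post) to recreate the count and error_logs fields. The field names metric, value, and agg are arbitrary choices for illustration.
index=_internal | stats count count(eval(log_level="ERROR")) as error_logs by source component | xyseries source component count error_logs | untable source metric value | rex field=metric "^(?<agg>\w+): (?<component>.+)$" | eval {agg}=value | stats first(count) as count first(error_logs) as error_logs by source component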
However, this is not the only use case for untable! You can use untable to create a more condensed table from a table with many columns, essentially making field names into field values. This can help with displaying the data in various types of visualizations, including a plain statistics table. You can also use untable to reformat lookup tables, charts, or timecharts; fill null values and format them back into columns; or format results into a more easily filterable table before formatting them into a chart. Untable really is a very helpful command with a lot of different uses.
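As a quick sketch of that fillnull pattern: chart leaves empty cells where a source never logged a given component, fillnull turns those gaps into zeros, and untable then brings every source/component pair back as a row (rows with empty values would otherwise be dropped):
index=_internal | chart count over source by component | fillnull value=0 | untable source component count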
Everyone knows how useful the eval command is. But there is more you can do with it! You can use curly brackets with eval to create new fields named after the values of an existing field. The value assigned to each new field comes from the right-hand side of the eval statement, whether that is another field, an if statement, and so on. It is important to note that you can also add a string to the field name, such as foo_{field}.
index=_internal component=Metrics group=queue | eval {name}_curr_kb=current_size_kb | timechart max(indexqueue_curr_kb) as max_index_curr_kb max(parsingqueue_curr_kb) as max_parsing_curr_kb max(typingqueue_curr_kb) as max_typing_curr_kb max(tcpin_queue_curr_kb) as max_tcpin_curr_kb by host
In the search above, we create new fields from each value of the existing name field concatenated with _curr_kb. One example of a resulting field is indexqueue_curr_kb, because indexqueue is a value of the name field. The values of these new fields come from the current_size_kb field. The reason this technique works here is that you cannot have multiple fields in the by clause of a timechart, but we want the data split by both name and host.
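To see the curly-bracket mechanics in isolation, here is a tiny sketch with a hard-coded event; the values are made up:
| makeresults | eval name="indexqueue", current_size_kb=42 | eval {name}_curr_kb=current_size_kb
Because name holds “indexqueue”, the second eval creates a field called indexqueue_curr_kb with a value of 42.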
Alright, so what have we learned? There are numerous commands that can be used to change the layout of a table: transpose, untable, xyseries (maketable), and eval {}. These commands are all very useful in their own ways and are great to know and utilize. Now that you’ve read this post, I hope these topics have been Clara-fied! If there are any topics you’d like to see Clara-fied, feel free to comment below and I’ll get to work on it. Thanks!
October 2019, Splunk Enterprise v7.3.1