My Data takes me back to HD Videos

By Nimish Doshi

Last month I wrote about indexing video feeds, featuring Vimeo as the site for HD videos. The idea was to use the Vimeo REST API to gather all the metadata about your favorite Vimeo HD video channels and index it into Splunk, either for historical lookup or as a one-stop dashboard where you can not only view the indexed information but also use a workflow action to watch the video itself.

Click on Show Video

Then the output of the REST API I was calling from Python changed: instead of nicely formatted XML, I was getting one huge line per channel. My code had logic to skip every line containing the words video, videos, or xml?. Naturally, the one huge line contained these words, so it was skipped and nothing got indexed.
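To make the failure concrete, here is a minimal Python sketch of that kind of skip logic. It is not the scripted input's actual code: the function name, the sample XML, and the use of plain substring matching are all assumptions made for illustration.

```python
# Words named in the post; plain substring matching is an assumption.
SKIP_WORDS = ("video", "videos", "xml?")

def keep_line(line: str) -> bool:
    """Return True when the line contains none of the skip words."""
    return not any(word in line for word in SKIP_WORDS)

# Hypothetical sample data: the same elements, first one per line,
# then collapsed into a single line the way the changed feed arrived.
formatted = [
    "<videos>",
    "<video>",
    "<title>Sample</title>",
    "</video>",
    "</videos>",
]
single_line = "".join(formatted)

kept_formatted = [line for line in formatted if keep_line(line)]
kept_single = [line for line in [single_line] if keep_line(line)]

print(kept_formatted)  # only the title line survives the filter
print(kept_single)     # the single huge line is dropped entirely
```

With one element per line the filter removes only the wrapper elements. Once everything arrives on one line, that line always contains a skip word, so nothing is left to index.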

I ended up changing the code in my scripted input to put a newline character in front of every occurrence of < and after every occurrence of >, and then strip out any blank lines. With that change, the code worked as intended and the data was indexed again.
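The fix described above can be sketched in a few lines of Python. This is an illustration under assumptions rather than the code from the app: the function name split_tags and the sample input, including its URL, are hypothetical.

```python
def split_tags(raw: str) -> str:
    """Put each tag and each text value on its own line."""
    # Newline in front of every "<" and after every ">".
    spaced = raw.replace("<", "\n<").replace(">", ">\n")
    # Drop the blank lines that the two replacements leave behind.
    lines = [line for line in spaced.splitlines() if line.strip()]
    return "\n".join(lines)

# Hypothetical one-line response, similar in shape to what the post describes.
raw = (
    "<video><title>Sample</title>"
    "<user_url>http://vimeo.com/example</user_url></video>"
)

result = split_tags(raw)
print(result)
```

Each tag and each value now sits on its own line, so a line-based filter can again drop the wrapper lines while keeping the values.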

My main dashboard was still unpopulated, though. The field extractions used to populate the reports no longer worked because the raw format of the data had changed. Thanks to Splunk’s late binding, which applies field extractions at search time rather than tying them to a database schema, this was easy to fix. PCRE regular expressions accept a prefix, (?m), which can be placed in front of the expression to mark it as a multi-line REGEX. An example from the Vimeo app is:

(?m)<user_url>(?<user_url>.*?)</user_url>

This example extracts the user URL from between the XML elements, and it also works with the prior Vimeo raw format. I did not use the spath command from Splunk 4.3 because I wanted to remain backwards compatible with Splunk 4.2. I also did not use xmlkv, so that drill down with Simple XML dashboards works out of the box.
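If you want to experiment with this kind of extraction outside Splunk, the sketch below does something similar with Python's re module. It is an analogue, not the app's configuration, and two differences are worth stating. Python spells named groups (?P<name>...) rather than (?<name>...). In Python's re, (?m) only changes how ^ and $ behave, and (?s) is the flag that lets . match a newline, so the second pattern uses both. Whether a Splunk extraction needs the same adjustment is not verified here. The sample events and URL are hypothetical.

```python
import re

# The post's pattern, translated only as far as Python's group syntax requires.
POST_STYLE = re.compile(r"(?m)<user_url>(?P<user_url>.*?)</user_url>")

# Variant for Python: (?s) lets "." cross newlines, \s* trims the line breaks.
CROSS_LINE = re.compile(r"(?ms)<user_url>\s*(?P<user_url>.*?)\s*</user_url>")

def extract_user_url(pattern, event):
    """Return the captured user_url, or None when the pattern does not match."""
    match = pattern.search(event)
    return match.group("user_url") if match else None

# Hypothetical events: the prior single-line format and the newline-split one.
old_format = "<user_url>http://vimeo.com/example</user_url>"
new_format = "<user_url>\nhttp://vimeo.com/example\n</user_url>"

for event in (old_format, new_format):
    print(extract_user_url(POST_STYLE, event), extract_user_url(CROSS_LINE, event))
```

In Python, the second pattern returns the URL for both formats, which mirrors the point that one search-time extraction can serve both the old and the new raw data.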

In conclusion, Splunk’s flexibility to ingest any text-based data and to change field extractions at search time let me see the list of videos from my Vimeo channels again, using Splunk as my launch pad. Where will your data take you? Let’s find out at .conf2012. Register today.

Nimish Doshi

Nimish is Director, Technical Advisory for Industry Solutions providing strategic, prescriptive, and technical perspectives to Splunk's largest customers, particularly in the Financial Services Industry. He has been an active author of Splunk blog entries and Splunkbase apps for a number of years.
