Last month I wrote about indexing video feeds, with Vimeo as the featured site for HD videos. The idea was to use the Vimeo REST API to gather all the metadata about your favorite Vimeo HD video channels and index it into Splunk, either for historical lookup or simply to have it available in a one-stop dashboard where you can view the indexed information and also use a workflow action to watch the video itself.
Then the output of the REST API, which I call from Python, changed: instead of nicely formatted XML, I was getting one huge line per channel. My code had logic to skip every line containing the words video, videos, or xml?. Naturally, the one huge line contained these words, so it was skipped and nothing got indexed.
I ended up changing the code in my scripted input to put a newline character in front of every occurrence of < and after every occurrence of >, and then to strip out any blank lines. With that change the code worked as intended again and the data got indexed.
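The fix can be sketched roughly as follows. This is a minimal illustration, not the actual scripted input: the function names, the sample payload, and the exact skip tokens are assumptions modeled on the description above.

```python
# Hypothetical skip tokens, modeled on the words mentioned in the post.
SKIP_WORDS = ("videos", "video", "<?xml")


def keep_line(line):
    """Return True if the line contains none of the skip tokens."""
    return not any(word in line for word in SKIP_WORDS)


def reformat_xml(raw):
    """Put every tag and every text value on its own line, dropping blank lines."""
    spaced = raw.replace("<", "\n<").replace(">", ">\n")
    return [line.strip() for line in spaced.splitlines() if line.strip()]


# Made-up one-line payload, similar in shape to what the API started returning.
raw = ('<?xml version="1.0"?><videos><video>'
       '<user_url>http://vimeo.com/user123</user_url>'
       '</video></videos>')

# As a single line, the whole payload is skipped.
print(keep_line(raw))

# After reformatting, only the wrapper lines are skipped.
for line in reformat_xml(raw):
    if keep_line(line):
        print(line)
```

The replacements run before the filter, so the wrapper tags can still be skipped line by line while the field lines survive.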
My main dashboard, however, was still unpopulated. The field extractions that feed the reports had stopped working because the raw format of the data had changed. Thanks to Splunk’s late binding, which does not tie a field extraction to a database schema, this was easy to fix. PCRE regular expressions accept an inline modifier, (?m), which can be placed at the front of the pattern to turn on multi-line mode. An example from the Vimeo app is:
(?m)<user_url>(?<user_url>.*?)</user_url>
This example extracts the user URL from between the XML elements, and it also works with the prior Vimeo raw format. I did not use the spath command from Splunk 4.3 because I wanted to remain backwards compatible with Splunk 4.2. I also avoided xmlkv so that drill down works with Simple XML dashboards out of the box.
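To try the pattern outside Splunk, here is a rough Python equivalent. Two things differ from the original and are my assumptions, not part of the Vimeo app: Python's re module spells named groups as (?P<name>...), and I added the s flag so that . can cross the newlines introduced by the reformatting step. The sample events are made up.

```python
import re

# Python spelling of the post's pattern. The extra "s" flag (dot matches
# newline) is an assumption added for this sketch; the original uses only (?m).
USER_URL = re.compile(r"(?ms)<user_url>(?P<user_url>.*?)</user_url>")


def extract_user_url(event):
    """Return the user URL found in the event text, or None if absent."""
    match = USER_URL.search(event)
    return match.group("user_url").strip() if match else None


# Hypothetical events: value on the same line as its tags (prior format),
# and value on its own line (after the newline reformatting).
old_format = "<video>\n<user_url>http://vimeo.com/user123</user_url>\n</video>"
new_format = "<video>\n<user_url>\nhttp://vimeo.com/user123\n</user_url>\n</video>"

print(extract_user_url(old_format))
print(extract_user_url(new_format))
```

The non-greedy .*? stops at the first closing tag, so one event holding several user_url elements would yield the first one only.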
In conclusion, Splunk’s flexibility to ingest any text-based data and to change field extractions at search time let me see the list of videos from my Vimeo channels again, with Splunk as my launch pad. Where will your data take you? Let’s find out at .conf2012. Register today.