Last month I wrote about indexing video feeds, featuring Vimeo as the site for HD videos. The idea was to use the Vimeo REST API to gather the metadata for your favorite Vimeo HD video channels and index it into Splunk, either for historical lookup or as a one-stop dashboard where you can view the indexed information and use a workflow action to open the video itself.
Then the response from the REST API, which I call from Python, changed: instead of nicely formatted XML, I was getting one huge line per channel. My code had logic to skip every line containing the words video, videos, or xml?. Naturally, the one huge line contained these words, so it was skipped and nothing got indexed.
I changed the code in my scripted input to put a newline character in front of every occurrence of < and after every occurrence of >, and then to strip out any blank lines. With that change, the code worked as intended and the data was indexed again.
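The reformatting step described above can be sketched in a few lines of Python. This is a minimal illustration, not the actual code from the scripted input: the function name `reformat_xml` and the sample string are hypothetical.

```python
def reformat_xml(raw: str) -> str:
    """Split a one-line XML payload so each tag and value sits on its own line."""
    # Put a newline in front of every "<" and after every ">".
    spaced = raw.replace("<", "\n<").replace(">", ">\n")
    # Strip out lines that are empty or contain only whitespace.
    lines = [line for line in spaced.splitlines() if line.strip()]
    return "\n".join(lines)


# Hypothetical one-line payload, standing in for what the API returned.
sample = "<videos><video><title>Demo</title></video></videos>"
reformatted = reformat_xml(sample)
```

Each tag and each value ends up on a separate line, so line-based filtering can again drop the wrapper tags without discarding the whole payload.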
My main dashboard was still unpopulated, though. The field extractions used to populate the reports no longer worked because the raw format of the data had changed. Thanks to Splunk’s late binding, which applies field extractions at search time instead of tying them to a database schema, this was easy to fix. PCRE supports an inline flag, (?m), which can be placed at the front of a regular expression to put it in multi-line mode. An example from the Vimeo app is:
(?m)<user_url>(?<user_url>.*?)</user_url>
This example extracts the user URL between the XML elements, and it also works with the prior Vimeo raw format. I did not use the spath command from Splunk 4.3 because I wanted to remain backwards compatible with Splunk 4.2, and I did not use xmlkv because I wanted drilldown to work with Simple XML dashboards out of the box.
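To experiment with the extraction outside Splunk, here is a rough Python equivalent. Python's re module is not Splunk's PCRE engine, so this only approximates the behavior: named groups are written (?P<name>...) rather than (?<name>...), and the s flag and the \s* padding are my own additions so the sketch tolerates a value that sits on its own line. The sample events and the function name are hypothetical.

```python
import re

# Python spelling of the Vimeo app's pattern. (?P<name>...) replaces PCRE's
# (?<name>...); the "s" flag and \s* are assumptions added for this sketch.
USER_URL = re.compile(r"(?ms)<user_url>\s*(?P<user_url>.*?)\s*</user_url>")


def extract_user_url(event: str):
    """Return the user URL found in a raw event, or None if it is absent."""
    match = USER_URL.search(event)
    return match.group("user_url") if match else None


# Hypothetical sample events in the old (one-line) and new (split) layouts.
old_style = "<user_url>http://vimeo.com/exampleuser</user_url>"
new_style = "<user_url>\nhttp://vimeo.com/exampleuser\n</user_url>"
```

Calling extract_user_url on either sample yields the same URL, which mirrors the point above that one extraction can serve both raw formats.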
In conclusion, Splunk’s flexibility to ingest any text-based data and to change field extractions at search time let me see my list of videos from my Vimeo channels again, using Splunk as my launch pad. Where will your data take you? Let’s find out at .conf2012. Register today.