Updated 10/26/23: The following process will no longer work for Zoom since webhook validation is now required. Please see the Zoom announcement here.
A few weeks ago, one of our customers that has become more reliant on web-based collaboration technologies was deploying the Splunk Remote Work Insights (RWI) Executive Dashboards. They needed to onboard operational data into Splunk from Zoom, and because of their size, the rate and volume of data we needed to collect were substantial.
The customer already had Internet-facing Heavy Forwarders with open HTTP Event Collector (HEC) ports to ingest various data sources. We installed and configured the webhook-based Splunk Connect for Zoom, but we soon realized we were missing more than 50% of the Zoom events. With around 500,000 employees, one can easily imagine the number of Zoom events per day, especially since so many of those employees were in a mandatory work-from-home situation.
After doing some research I learned that the majority of webhooks perform an HTTP POST with a JSON, XML, or form data content-type. Zoom is no different: when you create a webhook-only app in Zoom, Zoom will send an HTTP POST request payload to the specified endpoint URL. Unfortunately, Zoom only allows one to set the endpoint URL and does not allow one to specify any authentication method such as a HEC token. The output JSON format is also not customizable, so one is not able to use the “collector” endpoint for HEC.
I knew HEC allows for the option to expose a “raw” endpoint that allows us to POST unformatted events, but I still needed to find an authentication solution. I started reviewing Splunk’s HEC documentation and realized there is a parameter that allows one to embed the token for authentication as part of the URL: allowQueryStringAuth. This solution works for both Splunk Enterprise (on-prem) and Splunk Cloud.
I created the following new HEC input (inputs.conf) on one of the Heavy Forwarders:
[http://zoom]
token = <a random guid>
indexes = scratch
index = scratch
sourcetype = zoom:webhook
allowQueryStringAuth = true
disabled = false
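For readers who want to script this step, here is a minimal sketch in Python that generates a random GUID for the token and renders a stanza with the same keys as the one above. The helper name render_hec_stanza is hypothetical and exists only for illustration; nothing here talks to Splunk, it only produces text you could paste into inputs.conf.

```python
import uuid


def render_hec_stanza(name: str, token: str, index: str, sourcetype: str) -> str:
    """Render an inputs.conf HEC stanza as plain text (hypothetical helper)."""
    lines = [
        f"[http://{name}]",
        f"token = {token}",
        f"indexes = {index}",
        f"index = {index}",
        f"sourcetype = {sourcetype}",
        # Lets the token be passed as ?token=... in the URL.
        "allowQueryStringAuth = true",
        "disabled = false",
    ]
    return "\n".join(lines)


# uuid4() returns a random GUID; its string form is 36 characters long.
token = str(uuid.uuid4())
stanza = render_hec_stanza("zoom", token, "scratch", "zoom:webhook")
print(stanza)
```

Treat the generated token like a password, since anyone who knows it can post events to this input.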
(Note: If you are a Splunk Cloud Customer, you must open a Splunk Support ticket to set allowQueryStringAuth to true on your HEC endpoint.)
I next created a Zoom Webhook Only App following the same instructions listed on docs.splunk.com, but with one key change: I used the following endpoint URL:
https://externalsplunkinstance.yourdomain.com:8088/services/collector/raw?token=<a random guid>
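To make the shape of that request concrete, here is a small sketch of what a webhook sender effectively does: an HTTP POST of a JSON body to the raw endpoint, with the token carried in the query string. The function name build_raw_hec_request, the all-zero token, and the sample payload are assumptions for illustration only; real Zoom payloads are larger. The sketch builds the request without sending it, and the commented line shows how you might send it to test your own endpoint.

```python
import json
import urllib.parse
import urllib.request


def build_raw_hec_request(base_url: str, token: str, event: dict) -> urllib.request.Request:
    """Build a POST to the HEC raw endpoint with the token in the query string."""
    query = urllib.parse.urlencode({"token": token})
    url = f"{base_url}/services/collector/raw?{query}"
    body = json.dumps(event).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder host, token, and payload; substitute your own values.
req = build_raw_hec_request(
    "https://externalsplunkinstance.yourdomain.com:8088",
    "00000000-0000-0000-0000-000000000000",
    {"event": "meeting.started"},
)
print(req.get_method(), req.full_url)

# To actually send the test event (requires network access to your HEC port):
# with urllib.request.urlopen(req, timeout=10) as resp:
#     print(resp.status, resp.read().decode("utf-8"))
```

Sending a test event this way before pointing the webhook at the URL is a quick check that the token and allowQueryStringAuth setting are in effect.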
I then ran a Splunk search and was pleased to see that our events were being received! After a few days we also confirmed that we were no longer missing Zoom events. Since the format and the sourcetype are the same as in Splunk Connect for Zoom, we were still able to use Splunk App for Zoom for our visualization needs as well as the RWI Executive Dashboards.
Zoom isn’t the only service whose webhook posts we can receive this way using HEC. Taking this discovery one step further, I was setting up Plex at my home and noticed an area to configure webhooks. I used the same method as above and configured Plex to send webhooks to my Splunk setup. I picked a random movie, pushed play, and noticed that the data looked a little different, as Plex sends a JSON payload with a form-data Content-Type.
----------------------------225309989493785122838026
Content-Disposition: form-data; name="payload"
Content-Type: application/json

{"event":"media.pause","user":true,"owner":false,"Account":{"id":123456,"title":"username"},"Player":{"local":false,"publicAddress":"123.123.123.123","title":"Plex Web (Chrome)","uuid":" /P7vNM5bIEa4LOSJ1qdhEw=="},"Server":{"title":"Vod","uuid":"tv.plex.provider.vod"},"Metadata":{"art":"https://image.tmdb.org/t/p/original/k8sRDJV5CFx91N4gPXh57dthPvx.jpg","attributionLogo":"https://provider-static.plex.tv/vod/partners/logos/crackle.png","guid":"plex://movie/5d9f351fca3253001ef27f1d","key":"/library/metadata/5e911fe10cf8cd003e286fc4","rating":6,"ratingCount":1984,"ratingKey":"5e911fe10cf8cd003e286fc4","title":"Thick as Thieves","titleSort":"thick as thieves","type":"movie","thumb":"https://image.tmdb.org/t/p/original/sgRY2ie8koJxfOScMuvzHQ9TuZX.jpg","duration":6222230,"viewCount":0,"viewOffset":0,"indirect":true,"contentRating":"R","ratingImage":"imdb://image.rating"}}
----------------------------225309989493785122838026--
I was able to strip away the form data using very simple props/transforms, leaving me with a clean JSON object.
props.conf
[plex:webhook]
TRANSFORMS = plex_webhook_clean

transforms.conf
[plex_webhook_clean]
REGEX = (\{.+\})
FORMAT = $1
DEST_KEY = _raw
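To see what that transform does to an event, here is a rough Python simulation of the same regular expression applied to a shortened form-data body. This is only an illustration: Splunk evaluates the REGEX itself at index time, the function name clean_form_data is hypothetical, and the sample payload is trimmed from the one above. Because the dot does not match newlines by default, the greedy match runs from the first opening brace to the last closing brace on the JSON line, which drops the boundary and header lines.

```python
import json
import re

BOUNDARY = "----------------------------225309989493785122838026"

# Shortened version of the Plex form-data body shown earlier.
raw = "\n".join([
    BOUNDARY,
    'Content-Disposition: form-data; name="payload"',
    "Content-Type: application/json",
    "",
    '{"event":"media.pause","user":true,"Account":{"title":"username"}}',
    BOUNDARY + "--",
])


def clean_form_data(raw_event: str) -> str:
    """Keep only the JSON object, mimicking REGEX = (\\{.+\\}) with FORMAT = $1."""
    match = re.search(r"(\{.+\})", raw_event)
    # If nothing matches, leave the event unchanged.
    return match.group(1) if match else raw_event


cleaned = clean_form_data(raw)
event = json.loads(cleaned)  # the cleaned text parses as JSON
print(event["event"])
```

Note that the pattern relies on the JSON arriving on a single line, as it does in the Plex payload shown above; a pretty-printed, multi-line body would need a different expression.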
Using the webhook events from Plex, I was able to easily tell which members of my household (or friends) like to binge-watch.
Another example of collecting webhook data from cloud-based services is IFTTT; see the recent blog here, which uses Arlo’s recipes for IFTTT to post security camera activity over a webhook.
Now, we do need to mention a few security-related caveats…
In summary, the majority of webhooks perform an HTTP POST with a JSON, XML, or form data content-type. Splunk can receive webhooks on the “raw” HEC endpoint, with allowQueryStringAuth = true providing authentication through the URL. If the data needs some cleaning, you can use props/transforms to remove unnecessary characters.