At Splunk .conf22 on the last day of the conference, Christian Crisan and I conducted a live session called Modern Ways to Detect Financial Crime. The session started with Christian picking my pocket for my wallet as I walked up on stage, to show that financial crime is everywhere. Multiple .conf22 attendees in the audience then used the supplied microphones to share their stories of identity theft, fraud and account abuse.
We were told by our conference administration that what happens in Las Vegas stays in Las Vegas, so I will not be able to share those stories here, nor provide insight into the banter and quips about financial crime as the session continued.
Fortunately, on a more serious note, I can provide you with the outline of the session and the steps your company can follow to improve its financial crime detection capabilities. What follows are the incremental steps for achieving a better detection posture through Splunk products such as Splunk Enterprise, Splunk Cloud and Splunk Enterprise Security.
In the past, we simply called this criminal behavior fraud, and fraud is still part of it. However, financial crime is a superset of fraud: it includes fraud, account abuse, money laundering, embezzlement, sanctions violations and a host of other activities. Each of these by itself may not be obvious fraud, but they all fall under the general category of financial crime.
We spent the first few minutes of the session providing some examples of Financial Services Industry (FSI) crime. For instance, the image below is an example of a courier scam, in which an unknown email sender asks you to click on a link to deal with a fraudulent delivery issue. This phishing scam now plays out in text messages on cell phones as well.
We could have spent the whole session going back and forth with the attendees on modern financial crime examples, but then we would not have presented the real focus, which is detecting financial crime.
Almost a decade ago, I wrote about this approach for detecting fraud using the Splunk Search Processing Language (SPL) to create rules that detect fraud with a single search. Since then, this approach has been utilized across multiple industries. The advantages of using any time series data in any format, as opposed to the extract, transform and load (ETL) efforts of the past, were also documented.
To sum up the approach, here it is again, and you can get more examples from the Splunk Essentials for the Financial Services Industry app:
At first glance this seems simplistic, since someone could get an alert just for a changed browser rule. To overcome that, I suggested that rules be combined with AND (implicit in Splunk) and that where clauses be used to filter out exceptions and enforce thresholds. This makes for more complex searches, but they still turn every positive into an alert, with no differentiation between one rule and another, making it impossible to triage and respond in a timely manner. Introducing alert fatigue is not our goal.
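To make the compound-rule idea concrete, here is a minimal sketch; the index, field names and thresholds are hypothetical stand-ins for your own data, not a prescribed rule:

```
index=web_transactions action=login status=success
| stats dc(useragent) as browsers, dc(src_ip) as ips by username
| where browsers > 2 AND ips > 3
```

The base search terms are ANDed implicitly, and the where clause keeps a single browser or address change from firing an alert on its own.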
Former Splunker Haider Al-Seaidy came up with an approach that adds weighted risk scores for each rule. The accumulated value of all the risk scores tells an analyst how certain a crime is, and it allows events to be triaged so that a single OS, browser or address change does not trigger a pot of false positives to investigate. Here’s an example of accumulating risk scores for account takeover events, so that analysts can be confident this is indeed the case when the accumulated risk score is over a threshold.
Notice that any one notable event rule may not indicate nefarious activity, but the sum of the scores across all the rules yields a total score that gives confidence that the user, customer, account ID, institution or entity has been compromised. It also allows analysts to triage and work on the highest-scoring indicators for a use case first. The same approach can be used for wire fraud rules, ATM fraud rules, credit card fraud rules, payment fraud rules and so forth.
Let me put it in technical terms. What Haider suggested is to make each rule a saved search in Splunk, all of which can be executed at once, each returning one user-defined risk score that can be multiplied by a weight reflecting its importance. The grouping of all rules can be done by username, customer, account ID, institution or whatever entity is the common way to identify the account. In SPL, it looks like this:
index=transactions
| fields username
| dedup username
| join type=left username
    [savedsearch RiskScore_ConnectionFromRiskyCountry]
| join type=left username
    [savedsearch RiskScore_ExcessiveLogins]
| join type=left username
    [savedsearch RiskScore_AccountPasswordChange]
…
| fillnull value=0 ```put 0 into risk scores if a saved search returns nothing```
| eval RiskScore_ConnectionFromRiskyCountry=RiskScore_ConnectionFromRiskyCountry*0.75
| eval RiskScore_ExcessiveLogins=RiskScore_ExcessiveLogins*1.5
| eval RiskScore_AccountPasswordChange=RiskScore_AccountPasswordChange*0.90
| eval Total_Risk=RiskScore_ConnectionFromRiskyCountry+RiskScore_ExcessiveLogins+RiskScore_AccountPasswordChange
| table username, Total_Risk
```use a lookup to avoid hard-coded weights, and use machine learning to update weights```
One thing to note is that the weights are hard coded, so I suggest using lookup files to keep them independent of the search. If you are a data scientist, you may want to use baselines, statistics and machine learning approaches to change the weights frequently to match existing conditions. Dynamic weights are beyond the scope of this article.
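As a sketch of the lookup idea: assuming each saved search also emits a rule field naming itself and an unweighted risk_score, a hypothetical lookup file rule_weights.csv (columns rule and weight) keeps the weights out of the search:

```
| lookup rule_weights.csv rule OUTPUT weight
| eval weighted_score=risk_score*weight
```

Changing a weight then means editing the CSV, not redeploying the saved searches.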
The other thing you notice is that a left outer join is used to do the grouping. It is usually suggested to avoid joins and use the stats command for grouping in SPL.
Having said that, Splunker Aalok Mehta, an enterprise architect, suggested changing the join commands to a series of appends that concatenate the results of the saved searches, and then using the Splunk stats command to group by username. This is a performance improvement. Another performance improvement would be to use summary indexes for results that will be queried far back in time (i.e., money laundering use cases), or to use Splunk data models and data model acceleration for faster retrieval of data.
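A hedged sketch of the append-and-stats version, assuming each saved search returns a username and an already-weighted risk_score field:

```
| savedsearch RiskScore_ConnectionFromRiskyCountry
| append [savedsearch RiskScore_ExcessiveLogins]
| append [savedsearch RiskScore_AccountPasswordChange]
| stats sum(risk_score) as Total_Risk by username
| sort - Total_Risk
```

Because stats groups the concatenated rows in one pass, this avoids the repeated join lookups of the earlier search.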
Since .conf, I have given this all some thought and came up with another idea. What if each saved search ran on a staggered schedule (not all at once, but with a few minutes' gap between them), since account takeover is not a near real-time event until it happens, and stored the risk score for each username in a new summary risk index using the collect command? Splunk could then run another scheduled search that uses an N-minute window with the streamstats command to group all risk scores by username from the risk index and fire an alert if the score is above a threshold. Tweaking this approach and architecting it properly for the use case domain (not just account takeover) is a suggested exercise.
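As a sketch of this idea (the index name, window and threshold are illustrative, not prescribed), each scheduled saved search would end by writing its scores to a summary index:

```
| collect index=risk_summary
```

A separate scheduled search would then aggregate scores within a rolling window and alert above a threshold:

```
index=risk_summary
| streamstats time_window=60m sum(risk_score) as rolling_risk by username
| where rolling_risk > 100
```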
Although the ideas above are better than using singular rules, they leave some things to be desired. For one thing, firing an alert may not be the only action. What if there were a framework that does what was described above, but also handles case management, sequencing of events, management of alerts, context for what the user did (not just the risk score, but other details such as what country was used in the wire transfer and for how much, etc.), mapping of rules to security frameworks such as MITRE ATT&CK and the kill chain, and a platform for collecting notable events?
This is called Risk Based Alerting (RBA) in Splunk Enterprise Security (ES), which is the next level of sophistication in detecting financial crime. RBA ships out of the box with Splunk ES, along with the desired features just listed. What it does is collect notable events within a risk index, apply some metadata context, and raise alerts when alert thresholds are met. The alerts themselves can be sent to a SOAR product such as Splunk SOAR to create automated playbooks that respond to the situation. A response could be to block the inbound IP, lock out the account, notify the account holder and notify all involved company parties to analyze the case.
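To give a rough idea of the mechanics, here is a hedged conceptual sketch of aggregating over a risk index, not the packaged ES content itself; the threshold is illustrative:

```
index=risk risk_object_type="user"
| stats sum(risk_score) as total_risk by risk_object
| where total_risk > 100
```

In ES, the risk incident rules that ship with RBA perform this kind of aggregation for you and open notable events for investigation.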
Here’s RBA in a nutshell:
To get a customer started with the account takeover and account abuse use cases with RBA, Splunker Gleb Esman and former Splunker Andrew Morris have created Splunk Fraud Analytics on Splunkbase, which is free to download. This will jumpstart your efforts with a data model, rules and integration with RBA for these use cases.
We have now moved from simple rules, to compound rules, to rules with risk scores, to risk based alerting. Depending on where one is in the financial crime detection journey, each step will get the Financial Crime department further along in their posture.
Since account takeover and money laundering have been mentioned a few times now, the question of how one finds the leader of a fraud ring comes to mind. The stats command in Splunk can help by finding out who is conducting the most transactions to the greatest number of accounts in a given time period.
| stats sum(amount) as amount, count by From_Account, To_Account | sort - count
This looks easy, but it does not give us a graph where A connects to B, B connects to C, and so forth. Also, a raw count of connections may miss many edge cases when a million accounts and a few million transactions are involved.
Splunker and lead data scientist Philipp Drieger came up with the idea of using directed graphs to find possible connections between accounts or users, applying a technique called eigenvector centrality to find the most connected nodes, which may lead to fraud ring leaders. This is highlighted in the picture below by the large dot in the bottom center.
The Splunk search can still be used to find from and to accounts grouped by sender and receiver, but the graph is what does the work. It uses an app called 3D Network Topology Visualization, which can be freely downloaded from Splunkbase, to implement the use case. For details on how this is done for fraud ring detection, please download the free e-book, Bringing the Future Forward, and read the chapter on detecting financial crime using network topology graphs.
What happened in Las Vegas stayed in Las Vegas, so all I could share here were the facts on different ways to detect financial crime using Splunk products.
The advantages of using any time series data at scale, and of quickly pivoting with rules, risk scores and risk based alerting, should be a welcome addition to an organization’s financial crime detection posture. I hope these techniques can be revisited and adopted by your organization, regardless of industry.