Know Your Customer Again Revisited

By Nimish Doshi

At the end of last year, I wrote about using Splunk to monitor the Know Your Customer (KYC) use case, a regulatory requirement for Financial Services Institutions (FSIs) in many countries. The last part of the regulation requires continuous monitoring of your customers’ interactions and transactions.

In any bank, there are many types of transactions, covering core banking, ATM, wire transfers, credit card use, payments, and so on. Every application involved in these activities produces its own time series log data that is used for troubleshooting, security tracking, and analytics. Let’s revisit the example I presented last year. Suppose we are only monitoring a core banking feature for deposits and withdrawals for each customer. The simplest possible representation of this is the example table below, taken from the previous KYC blog.

Timestamp	AccountID	amount
11/2/2022 5:06:30	123	50
11/2/2022 5:06:30	456	6345
11/2/2022 5:06:30	123	53
11/2/2022 5:06:30	456	4353
11/14/2022 9:46:30	123	51
11/14/2022 9:46:30	456	6345

What I suggested last year was to use the Splunk stats command to find the average amount per account ID and then flag any transaction that is more than N standard deviations from that account’s own average. For instance, if account ID 123 usually has an amount around 50 and then suddenly transacts 10,000, this outlier would be easy to find. Yes, we can do this exercise on paper, but with a million accounts, each of which has to be monitored separately, it calls for automated, continuous monitoring. We can then collect the outliers per account in a risk index and score them accordingly for further analysis.
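To make the arithmetic concrete, here is a minimal Python sketch of the same idea, using the rows from the example table as history. It is an illustration only, not the SPL from the earlier post; the function names `build_baselines` and `is_outlier` and the default of 3 standard deviations are assumptions made for this example.

```python
from collections import defaultdict
from statistics import mean, stdev

# Historical (account_id, amount) pairs, taken from the example table.
history = [
    (123, 50), (456, 6345), (123, 53),
    (456, 4353), (123, 51), (456, 6345),
]

def build_baselines(rows):
    """Return {account_id: (mean, stdev)} from each account's own history."""
    amounts = defaultdict(list)
    for account_id, amount in rows:
        amounts[account_id].append(amount)
    # stdev() needs at least two observations, so skip sparser accounts.
    return {
        acct: (mean(vals), stdev(vals))
        for acct, vals in amounts.items()
        if len(vals) >= 2
    }

def is_outlier(baselines, account_id, amount, n_sigma=3.0):
    """True if amount is more than n_sigma standard deviations from the account's mean."""
    if account_id not in baselines:
        return False  # no baseline yet; handling this case is left out of the sketch
    avg, sd = baselines[account_id]
    if sd == 0:
        return amount != avg
    return abs(amount - avg) > n_sigma * sd

baselines = build_baselines(history)
print(is_outlier(baselines, 123, 10_000))  # account 123 usually transacts around 50
print(is_outlier(baselines, 456, 6_000))   # within account 456's usual range
```

The new transaction is scored against a baseline built from earlier data rather than being mixed into it. With only a handful of observations, a single extreme value inflates the standard deviation enough to hide itself.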

Easily Operationalizing the Approach

Everything I suggested above still applies, but we recognize that not everyone knows Splunk Processing Language (SPL) or how to effectively collect this generated data per entity into a Splunk index.

Fortunately, Splunkers Rupert Truman and Josh Cowling created a free Splunkbase application called the Splunk App for Behavior Profiling, which can automate the KYC use case as long as we have the data for each functional banking domain. To continue the discussion with our example, let’s use their app, which is web driven. The only SPL I’ll use is to search for all events for a given sourcetype. My data is fictitious and several years old, but it still illustrates the point.

In the web page, after searching for the data within a time range, we pick a field to group by, which in this case is the unique customer name, and the field to monitor for outliers, which here is the amount field. The web page automatically shows sample results for the search and the selected fields before we continue.

Next, we pick a statistical function for the amount field (average) and split it by each unique customer. We can also do the average in time span buckets such as every hour or day.

Finally, we save this as a rule that runs as a scheduled search and collects the average amount per customer over a given time period.
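For readers who want to see what such a rule computes, the following Python sketch averages the amount per account within daily buckets, using the rows from the example table. It is only an approximation of the idea, not the search the app generates; `average_per_bucket` and its `bucket_format` parameter are names invented for this illustration.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# (timestamp, account_id, amount) rows from the example table.
rows = [
    ("11/2/2022 5:06:30", 123, 50),
    ("11/2/2022 5:06:30", 456, 6345),
    ("11/2/2022 5:06:30", 123, 53),
    ("11/2/2022 5:06:30", 456, 4353),
    ("11/14/2022 9:46:30", 123, 51),
    ("11/14/2022 9:46:30", 456, 6345),
]

def average_per_bucket(rows, bucket_format="%Y-%m-%d"):
    """Average amount per (account_id, time bucket); the default bucket is one day."""
    grouped = defaultdict(list)
    for timestamp, account_id, amount in rows:
        ts = datetime.strptime(timestamp, "%m/%d/%Y %H:%M:%S")
        grouped[(account_id, ts.strftime(bucket_format))].append(amount)
    return {key: mean(vals) for key, vals in grouped.items()}

daily = average_per_bucket(rows)
for (account_id, day), avg_amount in sorted(daily.items()):
    print(account_id, day, avg_amount)
```

Passing `bucket_format="%Y-%m-%d %H"` would give hourly buckets instead of daily ones.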

After the data is automatically collected within a summary index, we can use the web interface workflow for the indicators to score standard deviation outliers, which go to a scoring index where they are stack ranked. This automation can be repeated for each functional domain in the FSI world, such as ATM, credit cards, payments, and wire transfers, which makes continuous monitoring an easier task. The app also provides screens to drill down and investigate any particular entity, which is the customer in our case. There is even a review section to mark whether an entity’s risk scores have been reviewed, which makes the app useful for compliance checks.
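Stack ranking can be pictured with a short Python sketch: add up the risk scores each entity has accumulated across domains and sort from highest to lowest. The entity names, domains, and score values below are entirely hypothetical, and summing is an assumed aggregation chosen for simplicity; the app's own scoring logic may differ.

```python
from collections import defaultdict

# Hypothetical outlier events: (entity, domain, risk_score).
outlier_events = [
    ("alice", "core_banking", 40),
    ("bob", "atm", 10),
    ("alice", "wire_transfer", 35),
    ("carol", "credit_card", 20),
    ("bob", "payments", 15),
]

def stack_rank(events):
    """Sum risk scores per entity and return (entity, total) pairs, highest first."""
    totals = defaultdict(int)
    for entity, _domain, score in events:
        totals[entity] += score
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)

ranking = stack_rank(outlier_events)
for entity, total in ranking:
    print(entity, total)
```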

This part of KYC is set up and ready to go thanks to this app.

Machine Learning

Rupert and Josh’s app also has screens for using machine learning (e.g., a probability density function) to find outliers across all entities, without having to learn data science in depth. One may ask: why not use machine learning to find anomalies within the set of transactions for each customer? This is a matter of practicality, because machine learning typically works by building a model from a dataset and then applying it to future data. Building a million models for a million customers is probably overkill. A more maintainable approach is to cluster customers into segments by an attribute such as typical transaction amount. Some customers will be clustered around an average amount of 50. Others may be clustered around 500. Some may even be clustered around 500,000 as their typical amount. Now, one can build a model per cluster and find outliers per cluster rather than for individual customers. This scales better and is far more manageable.
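The segment-then-model idea can be sketched in Python as follows. This is a deliberately simplified stand-in: bucketing by the order of magnitude of each customer's median amount replaces a real clustering algorithm, and a pooled mean and standard deviation replaces a trained model. The customer names, amounts, and function names are all hypothetical.

```python
import math
from collections import defaultdict
from statistics import mean, median, stdev

# Hypothetical per-customer transaction histories (positive amounts only).
history = {
    "cust_a": [48, 50, 53],
    "cust_b": [45, 55, 51],
    "cust_c": [480, 510, 525],
    "cust_d": [495, 505, 530],
    "cust_e": [480_000, 500_000, 510_000],
    "cust_f": [490_000, 505_000, 520_000],
}

def segment_of(amounts):
    """Segment label: order of magnitude of the customer's median amount."""
    return math.floor(math.log10(median(amounts)))

def build_segment_models(history):
    """Return ({customer: segment}, {segment: (mean, stdev)}) from pooled amounts."""
    assignment = {cust: segment_of(vals) for cust, vals in history.items()}
    pooled = defaultdict(list)
    for cust, vals in history.items():
        pooled[assignment[cust]].extend(vals)
    models = {seg: (mean(vals), stdev(vals)) for seg, vals in pooled.items()}
    return assignment, models

def is_outlier(assignment, models, customer, amount, n_sigma=3.0):
    """Score a new amount against the model of the customer's segment."""
    avg, sd = models[assignment[customer]]
    return abs(amount - avg) > n_sigma * sd

assignment, models = build_segment_models(history)
print(len(models), "models for", len(history), "customers")
print(is_outlier(assignment, models, "cust_a", 10_000))   # far above the ~50 segment
print(is_outlier(assignment, models, "cust_e", 505_000))  # typical for the ~500,000 segment
```

Six customers end up sharing three models here; the same ratio is what keeps the approach manageable when the customer count is in the millions.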

Conclusion

The KYC use case is an important banking regulation, and continuous monitoring is the most vital part of it. This post described an easier way to operationalize monitoring each customer’s transactions, and hence their behavior, for outliers. The Splunk App for Behavior Profiling can be used for a variety of FSI use cases where one is looking for anomalies across a set of entities or within each entity’s own history.

Nimish Doshi

Nimish is Director, Technical Advisory for Industry Solutions providing strategic, prescriptive, and technical perspectives to Splunk's largest customers, particularly in the Financial Services Industry. He has been an active author of Splunk blog entries and Splunkbase apps for a number of years.
