Data analytics and Big Data are the flavor of the moment: technologies that promise to let banks do business more efficiently and effectively. And it’s true. Data analytics and Big Data are powerful tools that can allow banks to make smarter decisions, sell smarter to their customers, and even protect their customers from fraud. But without the right data storage strategy, that power quickly diminishes.
Banks collect a lot of data that can help them, particularly in the all-important battle against fraud. This data can be used to build up a detailed picture of each customer or employee and help to sort suspicious transactions from the genuine. It includes customer, employee, transactional and channel data, as well as metadata such as screen resolution, location, time of activity and device. But much of it is stored on disparate, unconnected systems. The problem is how to store all of it and make it accessible for analysis at the right time, so that fraud attempts can be spotted and blocked before they are completed.
NetGuardians CTO explains why many Big Data projects fail and how to make them work for banking fraud prevention.
Capacity on its own is no longer a problem – the big Tier 1 banks can afford multiple data warehouses, while the cloud makes affordable, scalable storage a reality for the rest. The challenge lies in speedy access at volume – something social-media companies have cracked, and banks would do well to copy.
The window for stopping a fraudulent transaction is tiny – less than a second – so storing key data across a number of slow-to-access databases is not an option. Add to this the incredibly high number of transactions some banks must screen – perhaps 10m a week for the biggest – and banks need carefully constructed, distributed, parallel, interconnected solutions that store the right data in the right place to maximize speed and reduce friction.
For example, customer data such as address and date of birth are not relevant to most transactions, and do not need super-swift access; data about screen resolution and transaction history are, and should be accessible fast. While not all the data is as simple to categorize as these examples, algorithms can help.
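The split described above can be pictured as a simple routing step. The following is only a minimal sketch: the field names and the `HOT_FIELDS` set are hypothetical and hand-picked for illustration, whereas a real deployment might derive the classification from access patterns or an algorithm.

```python
# Hypothetical classification: fields a real-time fraud check is assumed to need.
HOT_FIELDS = {"screen_resolution", "device", "location", "transaction_history"}


def split_by_tier(record):
    """Split one customer record into a fast-access part and a slower-access part."""
    hot = {key: value for key, value in record.items() if key in HOT_FIELDS}
    cold = {key: value for key, value in record.items() if key not in HOT_FIELDS}
    return hot, cold
```

Fields in `hot` would go to the low-latency store, while `cold` fields such as address or date of birth can stay in a slower system.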
Data needs to be mined from multiple sources, combined and enriched. The result is then indexed and stored for fast access using Elasticsearch, which scales horizontally to handle large volumes of transaction events per second by distributing the workload across a cluster.
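As a rough illustration of the combine, enrich and index steps, the sketch below merges a transaction with channel metadata and formats the result as the newline-delimited JSON that the Elasticsearch Bulk API accepts. The schema (`tx_id`, `session_id` and so on), the function names and the index name `transactions` are assumptions made for this example, and sending the payload to a cluster is deliberately left out.

```python
import json


def enrich(transaction, channel_events):
    """Merge a transaction with channel metadata from the same session (hypothetical schema)."""
    meta = channel_events.get(transaction["session_id"], {})
    doc = dict(transaction)
    for field in ("device", "screen_resolution", "location"):
        doc[field] = meta.get(field)
    return doc


def to_bulk_ndjson(docs, index_name="transactions"):
    """Format documents as a Bulk API body: one action line, then one source line, per document."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc["tx_id"]}}))
        lines.append(json.dumps(doc))
    # The Bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"
```

Batching documents this way, rather than indexing them one at a time, is one common way to keep ingestion fast at high transaction volumes.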
An anti-fraud solution should use these combined data sets for analysis and filtering, and use the data they contain for machine learning, visualization and training models that sort normal customer behavior from the suspicious.
By using incoming data to constantly evolve the model, an anti-fraud solution is always ready to compare each new transaction against it. The comparison is ultra-fast because all the necessary data is handled by a distributed and scalable system.
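To make the idea of a constantly evolving model concrete, here is a deliberately simple sketch: a per-customer profile of transaction amounts, updated with Welford's online algorithm and used to score each new transaction. This is a stand-in chosen for illustration, not the method of any particular product; the class name, the `screen` helper and the threshold of 3.0 are all assumptions.

```python
import math
from dataclasses import dataclass


@dataclass
class RunningProfile:
    """Running mean and variance of one customer's transaction amounts."""
    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the mean

    def update(self, amount):
        # Welford's online update: no need to keep past amounts in memory.
        self.count += 1
        delta = amount - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (amount - self.mean)

    def zscore(self, amount):
        # With fewer than two observations there is no variance to compare against.
        if self.count < 2:
            return 0.0
        std = math.sqrt(self.m2 / (self.count - 1))
        return 0.0 if std == 0 else (amount - self.mean) / std


def screen(profile, amount, threshold=3.0):
    """Score the new amount against the profile, then fold it into the profile."""
    suspicious = abs(profile.zscore(amount)) > threshold
    # Simplification: every amount updates the model, including flagged ones.
    profile.update(amount)
    return suspicious
```

Because each update touches only three numbers, the comparison stays cheap however long the customer's history is. A production model would draw on far more signals than the amount alone.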
This type of dynamic storage is preconfigured for the convenience of the end user.
The approach results in up to 83 percent fewer false alerts, and banks typically spend 93 percent less time investigating suspicious activity. They are also likely to stop more fraud before it happens.
So while data analytics and Big Data are enjoying their moment in the sun, without the appropriate data storage solutions they will fail to deliver the promised benefits. It’s time storage, so long taken for granted, came out of the shade to join them.