Autonomous AML: Hype? Or Hope?

Executive Summary

Banks and financial institutions are facing increasing – and daunting – scrutiny globally as anti-money laundering (AML) laws and regulations evolve and inexorably tighten worldwide. They have been pouring pots of money into surveillance systems and operations – for years – just to conform. Yet regulators routinely identify weak controls in compliance programs, and continue to levy huge penalties and sanctions. 

The persistent challenge in AML alert investigation today is the high number of false positives. As many as 90-95% of the alerts generated by traditional, parameter-based transaction monitoring systems are false positives. The rising volume and sophistication of fraudulent activity, together with ever-increasing regulation, are driving up alert volumes. Vexingly, these alerts are handled manually most of the time, which increases staffing requirements and imposes additional operational costs. To boot, investigations are largely manual and dependent on human judgment, and so are exposed to human error.

To overcome these problems, financial firms are considering new technology tools that address operational and cost challenges affecting the transaction monitoring alert investigation process. 

One of these is machine learning. IBM Watson has developed a system using advanced machine learning methodologies that automates alert investigation to flag false positives and bring efficiency to the current AML investigation process. By using machine learning, it is possible to:

  • Significantly reduce the amount of effort spent on alert investigation.
  • Identify approximately 25-40% of false positives without investigation.
  • Build a case management system on the alert pool to better address alert investigation based on an alert risk ranking.


Money laundering and fraud transactions are growing at a fast pace and are valued at $1-2 trillion annually, accounting for 2-5% of global GDP (Source: Bloomberg). To date, the world’s biggest banks have been fined $321 billion. Between 2012 and 2016, banks coughed up roughly $269 billion in fines (Source: Bloomberg).

In the same time period, investments by banks and financial institutions in AML surveillance systems and operations doubled to ~$8 billion. Sure, global financial institutions have made significant progress in their understanding of, and in implementing, AML solutions. However, with billions of transactions being executed across locations and businesses, they are struggling to differentiate between the risky and the legitimate ones. 

Current State Challenges in Handling AML Transaction Monitoring Systems

Banks have embraced AML technology to comply with tougher regulations and to avoid large fines. But they face other challenges, such as large transaction data volumes (in the billions) and required reporting formats. Then there is the variety, veracity and velocity of data. High-volume transaction record data feeds come in different formats and are often of poor quality. Illegitimate records are very difficult to track, and money launderers frequently change transaction patterns to avoid detection.

Banks tend to keep scenario thresholds low in transaction monitoring systems so that they don’t miss genuinely suspicious activity (that is, to avoid false negatives). This results in a substantial increase in the alerts generated. If the investigation of these alerts is not handled efficiently, the consequences can be multi-dimensional: increased cost and resources, rejection of good customers, reputational damage, delays in investigation timelines, and, most importantly, fines, penalties and sanctions.
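To make the threshold trade-off concrete, here is a minimal sketch of a single amount-based scenario. The transaction amounts and the three thresholds are invented for illustration and do not come from any bank or monitoring product; the only point is that lowering the threshold raises the alert count.

```python
# Synthetic, made-up transaction amounts (illustrative only, not real data).
amounts = [120, 450, 980, 1500, 2400, 5200, 8800, 9500, 9900, 12000, 15000, 48000]


def count_alerts(amounts, threshold):
    """Count transactions at or above a hypothetical amount threshold."""
    return sum(1 for amount in amounts if amount >= threshold)


# Sweep from a high threshold to a low one and record the alert volume.
alerts_by_threshold = {t: count_alerts(amounts, t) for t in (10000, 5000, 1000)}

for threshold, n_alerts in alerts_by_threshold.items():
    print(f"threshold={threshold:>6}: {n_alerts} alerts out of {len(amounts)}")
```

In this toy data, each step down in the threshold pulls more ordinary transactions into the alert pool, which is the mechanism behind the false-positive load described above.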

Inefficient AML investigation processes

AML systems are typically rules-based and generate alerts on numerous predefined business rubrics and scenarios, where the ultimate objective is akin to finding a needle in a haystack. These systems have several limitations, with the notable ones being: 

  1. They generate a massive amount of false positives (typically 90-95%) due to the lower thresholds set.
  2. Investigation of alerts is a labor-intensive process and requires a significant amount of time and cost to review and make a disposition.
  3. They can still fail to detect genuinely suspicious activity.
  4. The investigations conducted are significantly manual and dependent on investigator judgment, which leaves room for major human errors.
  5. They are based on static rule-based engines that require constant maintenance and governance, increasing the need for skilled staff.

Moreover, analysis of the alerts generated by an AML system is a time-consuming task and must be completed with a sufficient level of scrutiny to ensure compliance with existing governance processes. And then there is the major challenge of identification and removal of those alerts that are false positives. 

With billions of transactions flowing, the challenge is to prevent false-positive alerts and focus on the potentially risky ones. Thus, there is a need to reduce the cost and time for compliance and the transaction monitoring team, while increasing the robustness of surveillance and reporting. 

OmniLabs Advisory Partners conducted research in 2018 which showed that systems built on advanced machine learning methodologies significantly reduce the effort spent on investigating alerts manually, help banks identify about 25-40% of false positives without investigation, and support a case management system on the alert pool that better addresses alert investigation based on an alert risk ranking.


Machine learning models used to investigate alerts can either be supervised or unsupervised. Unsupervised learning draws inferences from datasets that are not labeled. If historical data pertaining to money laundering is scarce and unreliable, it would be advisable to use unsupervised learning techniques such as Clustering, Link Analysis, and Associative Learning. 
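As a rough illustration of the clustering idea, the sketch below runs a tiny one-dimensional k-means over unlabeled transaction amounts. The amounts, the choice of two clusters, and the starting centroids are all assumptions made for the example; a real system would use many more features and a tested library rather than this hand-rolled loop.

```python
def k_means_1d(values, centroids, iterations=10):
    """Very small 1-D k-means: assign each value to its nearest centroid,
    then move each centroid to the mean of its assigned values."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for value in values:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(value - centroids[i]))
            clusters[nearest].append(value)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters


# Invented, unlabeled amounts for one hypothetical account.
amounts = [100, 120, 90, 110, 105, 5000, 5200, 95]

# Assumption: start one centroid at the smallest and one at the largest value.
centroids, clusters = k_means_1d(amounts, [min(amounts), max(amounts)])

print("typical activity:", sorted(clusters[0]))
print("unusual activity:", sorted(clusters[1]))
```

No labels are needed here: the small cluster of unusually large amounts simply separates from the account's typical activity and can be handed to an investigator for review.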

If enough historical data is available, however, then a model using supervised learning (such as Artificial Neural Networks, Random Forest, Support Vector Machines, and Logistic Regression) would be more useful. In supervised learning, inferences are drawn from labeled training data. 
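For comparison, a supervised approach can be sketched with a hand-written logistic regression trained on labeled historical alerts. Everything in this example is hypothetical: the two features (a scaled amount and a count of prior alerts), the eight labeled records, and the learning rate and epoch count were chosen only so that the sketch runs with the Python standard library.

```python
import math

# Hypothetical labeled history: (features, label).
# Features: [amount in units of 10,000, prior alerts on the account].
# Label: 1 = true positive, 0 = false positive. All values are invented.
history = [
    ([0.2, 0], 0), ([0.5, 0], 0), ([0.8, 1], 0), ([1.0, 0], 0),
    ([3.5, 2], 1), ([4.0, 3], 1), ([2.8, 2], 1), ([5.0, 4], 1),
]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, features):
    """Estimated probability that an alert is a true positive."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def train(data, learning_rate=0.1, epochs=2000):
    """Fit weights by plain stochastic gradient descent on the log loss."""
    weights = [0.0] * len(data[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in data:
            error = predict_proba(weights, bias, features) - label
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias


weights, bias = train(history)

# Score two new, unseen (also invented) alerts.
low = predict_proba(weights, bias, [0.3, 0])
high = predict_proba(weights, bias, [4.25, 3])
print(f"small, first-time alert: {low:.3f}")
print(f"large, repeat alert:     {high:.3f}")
```

The labels do the work here: the model learns from past dispositions which combinations of features tended to be genuine, which is why this route depends on having enough reliable historical data.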

OmniLabs has assessed the current offerings across industry-wide providers. In evaluating the various providers, a common theme emerged: the development of a supervised machine learning algorithm for clients where adequate historical data from banks is available for model development.

The following are the steps involved in developing the model:

  • Historical alerts are labeled as true positive or false positive.
  • Transaction details are used as potential independent variables/features, along with all the tunable and static parameters with regard to each scenario.
  • Feature engineering will come up with appropriate, information-rich and significant variables.
  • An appropriate methodology is selected after understanding the complexity, the data, and the merits and demerits of each classifier.
  • Development data is split into training, validation and testing datasets. Appropriate sampling and data balancing methods are used to indicate the complexity level of an alert.
  • The model is trained using the training data: the feature selection process is automated, the model parameters are estimated, and overfitting is assessed by performing k-fold cross-validation.
  • The entire process is automated, with a focus on prioritizing alerts, for example by creating an Alert Risk Score.

Risk Scoring

Risk scores can be used to define different levels of alert risk (the boundaries and the number of risk levels depend on the bank’s risk appetite).

The output of the model would be an Alert Risk Score (from 1 to 100) that would help optimize or accelerate investigations. For example, if an alert’s Alert Risk Score is 12, then that alert need not be investigated.
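A minimal sketch of how such a score might drive an investigation queue is shown below. It assumes the model emits a probability between 0 and 1 for each alert; the scaling to 1-100, the band boundaries (20 and 60), the band names, and the alert IDs are all hypothetical, since the actual boundaries would depend on the bank’s risk appetite.

```python
def alert_risk_score(probability):
    """Map a model probability in [0, 1] to an integer score from 1 to 100."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return max(1, min(100, round(probability * 100)))


# Hypothetical risk bands as (upper bound, level); a bank would set its own.
RISK_BANDS = [(20, "low"), (60, "medium"), (100, "high")]


def risk_level(score):
    """Return the first band whose upper bound covers the score."""
    for upper_bound, level in RISK_BANDS:
        if score <= upper_bound:
            return level
    raise ValueError("score must be between 1 and 100")


# Invented alert IDs with invented model probabilities.
alerts = {"A-101": 0.12, "A-102": 0.87, "A-103": 0.45}

# Highest-risk alerts go to the front of the investigation queue.
queue = sorted(
    ((alert_risk_score(p), alert_id) for alert_id, p in alerts.items()),
    reverse=True,
)

for score, alert_id in queue:
    print(f"{alert_id}: score={score:>3} level={risk_level(score)}")
```

Under these assumed bands, an alert scoring 12 falls in the lowest level, which is the kind of alert the text above suggests could be set aside without a full investigation.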

In this methodology, the machine learning algorithms act as a layer that recommends each alert as either a false positive or a true positive. The machine learning layer, which is trained on historical alert data, takes the alerts generated in the current month as input to derive its output. The chart below shows the process flow:

 Process flow

Broader impact of the methodology

Machine learning techniques have proven to be well-suited for identifying patterns and trends in large volumes of data. Our methodology has the following advantages: 


  • A robust and efficient system that investigates alerts and identifies false positives accurately.
  • Increases staff productivity, service levels and capacity by more than 35%.
  • 25%-40% reduction for an average alert investigation process globally.
  • Faster processing of alert investigations, which saves time and reduces backlog.

Reduced Errors 

  • Significantly reduces manual monitoring, manual data collection and human error.
  • Takes operational efficiency to another level in compliance and risk management teams.

Cost Reduction 

  • Reduces operational cost by 30-35% across a bank globally.
  • Return on investment within 6-9 months.

Conclusion: Future State

As socio-political and geopolitical climates continue to change rapidly around the globe, fines, penalties, and sanctions pertaining to money laundering will continue to dominate global news despite decades of ever-tougher policies and regulations. Financial institutions will need to continuously explore innovative solutions and new approaches to meet evolving AML requirements, such as better data analytics, improved investigator productivity, lower operational costs, fewer false positives, and more accurate detection of non-conventional behavior.

Machine learning methods are one such innovative approach: they can exploit huge amounts of AML data, learn patterns in it, and adapt to macro environments, behaviors, and regulations in ways that traditional rule-based tools cannot.

OmniLabs believes that IBM Watson has developed a machine learning-based model that assigns each money laundering alert a risk score, which is useful for prioritizing investigation queues for second-level investigators who file suspicious activity reports. Our findings were that IBM’s model uses historical alerts and various transaction variables, along with all the tunable and static parameters for each scenario, to produce an actionable risk score. The IBM Watson model, a cognitive supervised machine learning-based model, delivered very encouraging and actionable results on various key performance metrics: it reduced investigation effort and achieved higher efficiency with a completely automated model that held up to both legal and regulatory scrutiny.

Banks and financial institutions need to develop AML systems that are based on machine learning approaches, and that’s where IBM Watson comes into the picture.