
How Delta Capita helped a Global Investment Bank to Remediate Complex Client Datasets using Machine Learning


The financial services industry has witnessed considerable hype around Machine Learning for several years now. However, a quick Google search will confirm that there are very few concrete examples of it being put into practice in large institutions and delivering tangible results.

We at Delta Capita strongly believe that Machine Learning, when applied correctly, can add significant value in financial services across multiple functions, with high returns on investment. However, in our experience, the quality of the data collected, stored, and utilised is just as important as the Machine Learning model itself.

This article focuses on an age-old problem: unresolved and duplicated data sets. Banks with large datasets spread across multiple data warehouses still struggle to reconcile them into a single standard dataset. By using Machine Learning to automate the validation, cleansing, and authentication of key data sets, our clients can improve data quality and satisfy auditors, while also streamlining processes to free up valuable resource time currently spent chasing down the right versions of data.

De-Duplication of Client Data: A Delta Capita use case

Financial Institutions’ clients can be represented in multiple systems with different points of contact (both internally and externally). It is difficult to find an exact match for many clients given:

  • Variations in spelling of their legal name or common abbreviations used
  • Hierarchies between parent and child companies being incorrectly mapped
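To illustrate the first of these problems, the sketch below matches client names despite spelling variations and legal-entity suffixes. It uses Python's standard-library difflib; the suffix list, normalisation steps, and example names are illustrative assumptions, not details of the model described in this article.

```python
from difflib import SequenceMatcher

# Illustrative (not exhaustive) legal-entity suffixes to strip before comparing.
SUFFIXES = {"ltd", "limited", "plc", "inc", "llc", "gmbh", "sa", "ag"}

def normalise(name: str) -> str:
    """Lowercase, replace punctuation with spaces, and strip legal suffixes."""
    cleaned = "".join(c if c.isalnum() or c.isspace() else " " for c in name.lower())
    return " ".join(t for t in cleaned.split() if t not in SUFFIXES)

def similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1] between two normalised client names."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()

# Example: the same legal entity written two ways scores far higher
# than two unrelated entities.
print(similarity("Acme Holdings Ltd", "ACME Holdings Limited"))  # high
print(similarity("Acme Holdings Ltd", "Zenith Capital Partners"))  # low
```

A production system would combine a score like this with many other signals (addresses, identifiers, hierarchy data) as model features, rather than relying on a single string ratio.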


These discrepancies posed multiple risks, including:

  • Dealing with the incorrect counterparty
  • Failing to track client dealings leading to audit failures
  • Overspending on regulatory activities such as KYC, which can further result in regulatory fines

Delta Capita were approached by our client to implement a solution to this problem.
A dedicated team of Data Scientists, working alongside the client’s data engineering team, developed a Machine Learning model. Training data for the model was generated from small automated systems that the team built and deployed for client-facing stakeholders to populate and authorise.

How did we do it?

Our Data Science team then trained this model and packaged it in a technology-agnostic way, allowing the duplicate-identification tool to be run in a number of ways:

  • A simple standalone GUI
  • A module that can be embedded in core organisational systems
  • A tool which functions from within standard spreadsheet software
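One way such technology-agnostic packaging can look in practice is a plain, dependency-free function that each front end wraps. This is a sketch under assumptions: the function name, similarity measure, and threshold are hypothetical, not the actual Delta Capita implementation.

```python
from difflib import SequenceMatcher
from itertools import combinations

def find_duplicates(names: list[str], threshold: float = 0.9) -> list[tuple[str, str]]:
    """Return pairs of client names whose similarity meets the threshold.

    Because this is plain Python with no framework dependencies, the same
    function can back a standalone GUI, be imported by core systems, or be
    wrapped as a spreadsheet add-in.
    """
    pairs = []
    for a, b in combinations(names, 2):
        if SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold:
            pairs.append((a, b))
    return pairs

if __name__ == "__main__":
    # Example usage, e.g. from a thin command-line or GUI wrapper.
    print(find_duplicates(["Acme Holdings", "ACME Holdings", "Zenith Capital"]))
```

Keeping the scoring logic in one interface-neutral module is what lets the three delivery channels above share a single, consistently trained model.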

As a result, onboarding and KYC teams were assured of a clean client dataset and were able to focus on the core tenets of their work, rather than spending time verifying the accuracy of the data.

Benefits for the Client

How Delta Capita can help you

Reconciling large, complex datasets may seem like a daunting hurdle, but it is an essential requirement for every business. Please reach out to see how Delta Capita can apply our expertise to help you remediate your data effectively, saving money and valuable resource time while reducing regulatory risk.

We also offer a variety of Data Analytics and Data Science solutions to the finance industry. To learn more about our service offerings, enquire here.

This article was co-authored by James Stott, Assistant Manager.