DeepSense loan-default neural net has success in the Netherlands

DeepSense is targeting predictive modeling, machine vision and natural language processing (NLP) in its bid to become a major European force in AI. These are early days, as the company was spun out only in 2018 from Polish data networking software house CodiLime, where it had been operating as the data science wing since 2015. As projects were won and $6mn was raised in funding, the firm spread its wings with Tomasz Kułakowski, former CEO of CodiLime, at the helm and Robert Bogucki as CTO.

The firm has now delivered projects in each of its core areas; in predictive modeling, these followed participation in several competitions. One was to predict the probability that cash loans would be repaid for the Netherlands-based Home Credit Group, which lends mainly to people with little or no credit history and has 15mn customers, half of them in China.

This involved building a neural network from data sources such as the timing of previous late payments and rejections by external credit bureaus, which are hard to capture manually because there are so many variants. DeepSense created a single vector for each customer, combining the different data sources for every month going back 8 years, to form a sparse user image that was fed into the neural network.
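As a rough illustration of the idea, the sketch below lays out one customer's monthly history as a single, mostly empty vector. It is not DeepSense's actual pipeline: the source names, the event format and the `build_user_vector` helper are all hypothetical, and only the eight-year monthly window comes from the description above.

```python
MONTHS = 8 * 12  # 96 monthly slots, matching the eight-year history
# Hypothetical source names, chosen only for illustration.
SOURCES = ["late_payment_days", "bureau_rejections"]


def build_user_vector(events):
    """Flatten one customer's events into a single sparse vector.

    events: iterable of (source, months_ago, value) tuples (assumed format).
    Each source gets its own block of MONTHS slots; most stay at zero.
    """
    vector = [0.0] * (MONTHS * len(SOURCES))
    for source, months_ago, value in events:
        if source not in SOURCES or not 0 <= months_ago < MONTHS:
            continue  # skip unknown sources and events outside the window
        index = SOURCES.index(source) * MONTHS + months_ago
        vector[index] += value
    return vector


example = build_user_vector([
    ("late_payment_days", 3, 12.0),   # 12 days late, three months ago
    ("bureau_rejections", 40, 1.0),   # one rejection, 40 months ago
    ("late_payment_days", 200, 5.0),  # older than eight years, dropped
])
```

Here only two of the 192 slots are non-zero, which is what makes the user image sparse; a real system would likely store it in a sparse format rather than a plain list.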

This exercise raised some interesting points relating to the weightings that should be assigned to different data sources. One point that emerged was that even clients who did default on their loans tended to score OK on most metrics of past behavior, such as past loans, past installments, past credit cards and current credit card balance.

The data analysis singled out those traits that were most common among defaulters, which tended to be a relatively large number of minor indicators rather than one or two major issues. After all, customers who had big problems in the past probably would not have been given a loan, even by an institution specializing in those with low credit ratings.

The DeepSense solution was able to predict whether a borrower would default on a loan with an AUC (Area Under the Curve) of 80%. This means there was an 80% probability that a randomly selected defaulter would be ranked by the model as more likely to default than a randomly selected non-defaulter.
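The pairwise reading of AUC given above can be checked directly on a toy example. The sketch below is a minimal illustration, not DeepSense's evaluation code: the `pairwise_auc` helper, the scores and the labels are all made up, and ties are counted as half a win, which is the usual convention.

```python
def pairwise_auc(scores, labels):
    """AUC as the share of (defaulter, non-defaulter) pairs ranked correctly.

    scores: model scores, higher meaning more likely to default.
    labels: 1 for a defaulter, 0 for a non-defaulter.
    """
    defaulters = [s for s, y in zip(scores, labels) if y == 1]
    repayers = [s for s, y in zip(scores, labels) if y == 0]
    if not defaulters or not repayers:
        raise ValueError("need at least one example of each class")
    wins = 0.0
    for d in defaulters:
        for r in repayers:
            if d > r:
                wins += 1.0   # defaulter correctly ranked as riskier
            elif d == r:
                wins += 0.5   # tie counts as half
    return wins / (len(defaulters) * len(repayers))


# Made-up scores: one defaulter (0.35) is ranked below one repayer (0.6).
scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
auc = pairwise_auc(scores, labels)  # 8 of the 9 pairs are ordered correctly
```

The double loop is quadratic in the number of customers, so it suits only small illustrations; on real data a rank-based computation gives the same number far faster.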

Under computer vision, the firm’s highest-profile project so far has been an application for measurement and analytics group Nielsen, to scan and sort information about particular retail products. The app scans product labels, identifying ingredients and grouping them in a table, which reduces researchers’ workload. Using a mixture of visual and textual information, the algorithms had to cope with poor quality and different packaging shapes, achieving just over 90% accuracy. For NLP, DeepSense has created software to identify propaganda and fake news on Twitter.

The company has so far focused on demonstrating its technology and raising awareness through participation in challenges, with other high-profile partners including Google, Intel and Nvidia. The challenge now is to build on these foundations, win significant contracts and get involved in longer-term projects.