Initializing a Predictive Model for Increasing Return Rates in Debt Collection

Our client, a large service provider for debt collection, needed a reliable method to predict the probability that a debt will be paid and to identify likely defaulters. In a two-stage approach, niologic implemented a predictive solution. It helps the client identify the debtors who are most likely to pay and plan its actions accordingly.


The client focuses on debt collection and distinguishes three types of debtors: those who are likely to pay the full debt, those who pay only part of it, and those who do not pay at all. Its work also includes identifying debts that can be traced back to miscommunication. To meet these requirements, niologic set up a multi-stage predictive model based on customer data and publicly available open data. The model assigns debtors to these categories and predicts the probability of payment or default.


First, niologic used Google Cloud Dataflow to prepare the data and Cloud Data Loss Prevention to remove personally identifiable information (PII), and set up a data warehouse (DWH) in Google BigQuery. After that, biases were removed using geographical information enriched with the OpenStreetMap (OSM) public dataset in BigQuery. Subsequently, initial ideas for feature engineering were prototyped with BigQuery ML and AutoML. For production, niologic decided to use a two-stage classification model, an ensemble approach based on gradient-boosted decision trees. To measure classification quality, niologic used the F1 score and the area under the curve (AUC). Following this, our team deployed model inference using Google Cloud Dataflow. Finally, the batch predictions were stored in the BigQuery DWH.
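
To illustrate the two quality metrics named above, here is a minimal sketch in plain Python. It is not the project's code. It assumes that AUC refers to the area under the ROC curve, that a payment counts as the positive class, and that a threshold of 0.5 turns probabilities into labels. The function names and the toy data are hypothetical.

```python
from typing import Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = debtor paid, 0 = debtor did not pay.
paid = [1, 0, 1, 1, 0, 0]
predicted_probability = [0.9, 0.4, 0.6, 0.3, 0.2, 0.45]

# Assumed decision threshold of 0.5 for turning probabilities into labels.
predicted_label = [1 if p >= 0.5 else 0 for p in predicted_probability]

f1 = f1_score(paid, predicted_label)
auc = roc_auc(paid, predicted_probability)
print(f"F1 = {f1:.3f}, AUC = {auc:.3f}")
```

The F1 score depends on the chosen threshold, while AUC only depends on how the model ranks debtors. Reporting both therefore covers the quality of the final decisions as well as the quality of the underlying probability ranking.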

Results and Customer Value

Niologic successfully implemented a predictive model tailored to the client’s demands. The client is now able to predict the probability of debt payments, which allows it to prioritize debtors who are likely to pay. The solution also helps the client approach debtors who are less likely to pay by predicting the most efficient channel for starting a dialogue. Additionally, the solution provides insight into cases that exist solely because of miscommunication, or a lack of communication, on the part of the creditors. Once such cases are identified, the corresponding debt can be cleared swiftly.