NATE: A New Approach for Explainable Credit Scoring

Sample size: 150,000 | Publication | Evidence: high

Author Information

Author(s): Han Seongil, Jung Haemin

Primary Institution: School of Computing & Mathematical Sciences, University of London, Birkbeck College, London, United Kingdom

Hypothesis

The proposed NATE models will enhance classification performance by capturing non-linearity in imbalanced datasets while providing clear reasons for credit scoring predictions.

Conclusion

NATE significantly outperforms traditional logistic regression in credit risk classification, improving predictive performance and interpretability.

Supporting Evidence

NATE improves AUC by 19.33%, MCC by 71.56%, and F1 Score by 85.33% compared to logistic regression.
NATE enhances interpretability by providing insights into feature contributions.
SMOTE oversampling outperforms NearMiss undersampling in improving classification performance.
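
The three metrics named above can be computed from labels and scores with a few lines of plain Python. The following is a minimal sketch, not the authors' evaluation code: the function names are hypothetical, and `relative_gain` assumes the reported percentages are relative improvements over the logistic regression baseline, which this summary does not state explicitly.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, with 1 as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def f1_score(tp, fp, fn):
    """F1 is the harmonic mean of precision and recall."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; uses all four confusion-matrix cells."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win. This pairwise form is O(n_pos * n_neg),
    which is fine for a small illustration but slow on 150,000 rows.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def relative_gain(baseline, new):
    """Percentage improvement of `new` over `baseline` (assumed definition)."""
    return (new - baseline) / baseline * 100.0


# Toy data for illustration only; these are not values from the study.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.05]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print("F1 :", round(f1_score(tp, fp, fn), 4))
print("MCC:", round(mcc(tp, tn, fp, fn), 4))
print("AUC:", round(auc(y_true, scores), 4))
```

MCC and F1 are commonly reported alongside AUC on imbalanced data because plain accuracy can look high when a model simply predicts the majority class.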

Takeaway

This study created a new method to help banks better understand who might pay back loans by using smart computer models that are easier to explain.

Methodology

The study used a dataset of 150,000 samples, applying oversampling and undersampling techniques to balance classes and employing tree-based ensemble models for classification.
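
To make the two resampling ideas concrete, here is a simplified, dependency-free sketch. It is an illustration under stated assumptions, not the study's pipeline: `smote_like` and `nearmiss1_like` are hypothetical names, the SMOTE variant only interpolates between a minority point and one of its nearest minority neighbours, and the NearMiss variant follows the commonly described version 1 rule (keep the majority points with the smallest average distance to their nearest minority points). The sample points are toy values.

```python
import math
import random


def smote_like(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority points by neighbour interpolation."""
    if len(minority) < 2:
        raise ValueError("need at least two minority points to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours of the chosen point.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


def nearmiss1_like(majority, minority, n_keep, k=3):
    """Keep the n_keep majority points closest, on average, to the minority class."""
    k = min(k, len(minority))

    def avg_dist(point):
        nearest = sorted(math.dist(point, m) for m in minority)[:k]
        return sum(nearest) / len(nearest)

    return sorted(majority, key=avg_dist)[:n_keep]


# Toy two-feature data for illustration only.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
majority = [(5.0, 5.0), (0.5, 0.5), (10.0, 10.0), (2.0, 2.0), (6.0, 6.0), (7.0, 7.0)]

# Oversampling: grow the minority class to the size of the majority class.
oversampled = minority + smote_like(minority, len(majority) - len(minority))

# Undersampling: shrink the majority class to the size of the minority class.
undersampled = nearmiss1_like(majority, minority, len(minority))

print(len(oversampled), len(majority))
print(len(undersampled), len(minority))
```

Either balanced set would then be passed to the classifier. Note that the undersampling route discards majority rows, while the oversampling route keeps every original row and adds synthetic ones.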

Potential Biases

Because SMOTE synthesizes new minority samples by interpolating between existing ones, it may introduce overlapping data points, potentially leading to overfitting.

Limitations

The study is limited to a single dataset, which may affect the generalizability of the findings.

Participant Demographics

The dataset includes demographic information, payment behavior, and delinquency data for credit applicants.

Digital Object Identifier (DOI)

10.1371/journal.pone.0316454
