American Express Default Prediction

Team Members: Jarad Angel, Meredith Wang

Date: Aug - Sep 2022

Whether out at a restaurant or buying tickets to a concert, modern life counts on the convenience of a credit card to make daily purchases. It saves us from carrying large amounts of cash and can also advance a full purchase to be paid off over time. How do card issuers know we'll pay back what we charge? That's a complex problem with many existing solutions, and even more potential improvements to be explored in this competition.

Credit default prediction is central to managing risk in a consumer lending business. It allows lenders to optimize lending decisions, which leads to a better customer experience and sound business economics. Models already exist to help manage risk, but it may be possible to build models that outperform those currently in use.

Business Goals

▪️ Apply our machine learning skills to predict credit default.

▪️ Leverage an industrial-scale dataset to build a machine learning model that challenges the current model in production.

Timeline

▪️ May 25, 2022 - Start Date.

▪️ August 17, 2022 - Entry Deadline. You must accept the competition rules before this date in order to compete.

▪️ August 17, 2022 - Team Merger Deadline. This is the last day participants may join or merge teams.

▪️ August 24, 2022 - Final Submission Deadline.

Data Context

Training, validation, and testing datasets include time-series behavioral data and anonymized customer profile information.

The objective of this competition is to predict the probability that a customer will not pay back their credit card balance in the future, based on their monthly customer profile. The binary target variable is calculated by observing an 18-month performance window after the latest credit card statement: if the customer does not pay the amount due within 120 days of their latest statement date, it is considered a default event.
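To make the target definition concrete, here is a minimal sketch of the labeling rule described above. It is only an illustration: the function name `default_label`, the `paid_on` argument, and the example dates are hypothetical, and treating a payment on exactly day 120 as on time is an assumption, since the description does not say how the boundary is handled. The real target is supplied with the competition data and is not computed by participants.

```python
from datetime import date
from typing import Optional

GRACE_DAYS = 120  # threshold stated in the competition description


def default_label(statement_date: date, paid_on: Optional[date]) -> int:
    """Return 1 (default) or 0 (no default) for one customer.

    statement_date: the customer's latest statement date.
    paid_on: date the amount due was paid, or None if no payment was
             observed inside the 18-month performance window (hypothetical input).
    """
    if paid_on is None:
        # No payment observed at all: counts as a default event.
        return 1
    # Assumption: a payment on day 120 exactly still counts as on time.
    return int((paid_on - statement_date).days > GRACE_DAYS)


latest_statement = date(2018, 3, 15)  # made-up example date
print(default_label(latest_statement, date(2018, 5, 1)))  # paid after 47 days
print(default_label(latest_statement, date(2018, 9, 1)))  # paid after 170 days
print(default_label(latest_statement, None))              # never paid
```

The model itself predicts a probability of this event rather than the 0/1 label.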

The dataset contains aggregated profile features for each customer at each statement date. Features are anonymized and normalized, and fall into the following general categories:

- D_* = Delinquency variables
- S_* = Spend variables
- P_* = Payment variables
- B_* = Balance variables
- R_* = Risk variables
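Because the feature names are anonymized, the prefix is the only hint about what a column measures. The sketch below shows one way to bucket column names by the prefixes listed above. The helper `group_features` and the example column names are illustrative assumptions, not code taken from this repository.

```python
from collections import defaultdict

# Prefix-to-category mapping, taken from the list above.
CATEGORY_BY_PREFIX = {
    "D": "Delinquency",
    "S": "Spend",
    "P": "Payment",
    "B": "Balance",
    "R": "Risk",
}


def group_features(columns):
    """Group column names by the category implied by their prefix.

    Columns that do not match a known prefix (for example an ID column)
    are collected under "Other".
    """
    groups = defaultdict(list)
    for col in columns:
        prefix, sep, _ = col.partition("_")
        category = CATEGORY_BY_PREFIX.get(prefix) if sep else None
        groups[category or "Other"].append(col)
    return dict(groups)


# Illustrative column names only; the real dataset has many more features.
example_columns = ["customer_ID", "S_2", "P_2", "D_39", "B_1", "R_1", "S_3"]
for category, cols in group_features(example_columns).items():
    print(f"{category}: {cols}")
```

Grouping columns this way makes it easier to explore or engineer features one category at a time, for example by comparing all delinquency variables against the target.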

Process

1️⃣ Data Acquisition

acqure.py

2️⃣ Data Preparation

Data Cleaning

3️⃣ Exploratory Analysis

4️⃣ Statistical Testing & Modeling

5️⃣ Modeling Evaluation

Steps to Reproduce

[x] Clone the repo
[x]
[x]
[x]

Key Findings

▪️

Recommendations

▪️
