Linear Regression on House Price Regression Advanced Techniques dataset from Kaggle.

Abuzariii/Kaggle-House-Prices-Advanced-Regression-Techniques

Kaggle-House-Prices-Advanced-Regression-Techniques

I went through all the fundamental steps needed to preprocess a large dataset and then fit a Linear Regression model.

Deep Neural Networks were used to beat the Mean Absolute Error of the baseline model.

The dataset can be downloaded here.

Use the Pandas Profiling notebook only if you want to learn the tool; otherwise use the "01_Linear_Regression.ipynb" file.

This notebook is divided into five parts:

1. Pandas Profiling :

I used the Pandas Profiling library to generate a profiling report in a Colab notebook.

2. Feature Selection :

Feature selection was done based on missing values, feature correlation and Backward Elimination. All these methods are described briefly.
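
As a rough illustration of the first two filters (missing values and correlation with the target), here is a minimal sketch. The helper `select_features`, both thresholds and the toy values are assumptions made for this example and are not taken from the notebook; Backward Elimination is not shown.

```python
import numpy as np
import pandas as pd


def select_features(df, target, max_missing=0.3, min_abs_corr=0.3):
    """Hypothetical helper: keep numeric columns that are mostly complete
    and correlate with the target. Both thresholds are illustrative."""
    # 1. Drop columns whose fraction of missing values exceeds the threshold.
    missing_frac = df.isna().mean()
    kept = df.loc[:, missing_frac <= max_missing]
    # 2. Absolute Pearson correlation of each numeric column with the target.
    corr = kept.select_dtypes(include="number").corr()[target].drop(target).abs()
    # 3. Keep the columns that clear the correlation threshold, strongest first.
    return corr[corr >= min_abs_corr].sort_values(ascending=False).index.tolist()


# Tiny made-up frame: column names mimic the Kaggle data, values are invented.
toy = pd.DataFrame({
    "GrLivArea": [1000, 1500, 2000, 2500, 3000],
    "PoolQC": [np.nan, np.nan, np.nan, np.nan, 1.0],  # 80% missing
    "Noise": [3, 1, 2, 1, 3],                         # barely related to price
    "SalePrice": [100_000, 150_000, 210_000, 240_000, 310_000],
})
selected = select_features(toy, target="SalePrice")
print(selected)
```

On this toy frame only `GrLivArea` survives: `PoolQC` is dropped for being mostly empty and `Noise` for its weak correlation with `SalePrice`.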

3. Data Preprocessing :

Missing values were filled with the column mean, and categorical columns were encoded as integers using pandas' cat.codes.
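
The two preprocessing steps can be sketched as follows. The function name `preprocess` and the toy data are assumptions for illustration; the notebook may apply the same idea column by column instead.

```python
import numpy as np
import pandas as pd


def preprocess(df):
    """Hypothetical helper: mean-fill numeric columns, integer-code the rest."""
    out = df.copy()
    for col in out.columns:
        if pd.api.types.is_numeric_dtype(out[col]):
            # Replace missing numeric values with the column mean.
            out[col] = out[col].fillna(out[col].mean())
        else:
            # Convert to a categorical and keep only its integer codes.
            # Note: missing categorical values are coded as -1.
            out[col] = out[col].astype("category").cat.codes
    return out


# Invented values; column names mimic the Kaggle data.
toy = pd.DataFrame({
    "LotFrontage": [60.0, np.nan, 80.0],
    "Street": ["Pave", "Grvl", "Pave"],
})
clean = preprocess(toy)
print(clean)
```

Keep in mind that integer codes impose an arbitrary order on the categories (here alphabetical), which a linear model will treat as a numeric scale.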

4. Visualization :

Just a trivial visualization of the value distribution among all the columns.
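
One simple way to get such an overview is a grid of histograms, one per numeric column. This is only a sketch on invented data, assuming matplotlib is available; the notebook may use a different plot type.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Invented values; column names mimic the Kaggle data.
toy = pd.DataFrame({
    "GrLivArea": [1000, 1500, 2000, 2500, 3000, 1200],
    "SalePrice": [100_000, 150_000, 210_000, 240_000, 310_000, 130_000],
})

# DataFrame.hist draws one histogram per numeric column and returns the axes.
axes = toy.hist(bins=5, figsize=(8, 3))
plt.tight_layout()
```

In Colab the figure renders inline, so the `matplotlib.use("Agg")` line can be dropped there.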

5. Modeling :

A Linear Regression model was fit to the preprocessed data, and then I used Mean Absolute Error and Mean Squared Error as evaluation metrics.

Spoiler Alert: Mean Absolute Error stood at 21k against an average Sale Price value of 180k, which is roughly 12% of the average.
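
For reference, the fit-and-evaluate step could look roughly like the sketch below. It assumes scikit-learn, a held-out test split, and synthetic data generated on the spot, so the numbers it prints are unrelated to the results above.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the preprocessed data (assumption: one feature,
# a linear price trend plus Gaussian noise).
rng = np.random.default_rng(0)
X = rng.uniform(500, 3000, size=(200, 1))
y = 50_000 + 70 * X[:, 0] + rng.normal(0, 20_000, size=200)

# Hold out 20% of the rows for evaluation (split ratio is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = LinearRegression().fit(X_train, y_train)
pred = model.predict(X_test)

mae = mean_absolute_error(y_test, pred)  # average absolute error, in price units
mse = mean_squared_error(y_test, pred)   # average squared error
relative_mae = mae / y_test.mean()       # MAE as a fraction of the mean price

print(f"MAE: {mae:,.0f}  MSE: {mse:,.0f}  MAE / mean price: {relative_mae:.1%}")
```

Dividing the MAE by the mean target is the same comparison as 21k against 180k: it puts the error on a scale that is easy to judge.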
