Cervical Cancer Risk Classification

Challenge

Part 1: EDA. Extract the data from the CSV file and use exploratory techniques to gain insights.
Part 2: ETL. Clean the data and prepare it to be loaded in the next step.
Part 3: Modeling. Create several machine learning models to predict the classification, choose the model you consider best, justify your choice, and prepare it as far as possible for deployment.
Part 4: Interpret. Explain your results using metrics and visualization techniques, and write a short post that explains the project and your work to a non-technical audience.

Part 1-3.1: EDA, ETL and Modeling: The solution is in the notebook risk_cervical_cancer.ipynb.
Part 3.2: Model: The best model(s) from the notebook are saved in model.py.
Part 3.2: Deployment: Deployment with FastAPI. The solution is in api.py. The model is saved in model.py.
Part 4: The report is in the REPORT.md file.
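
The saved model files in this repository follow a selected_<target>_model.joblib naming pattern. The following is a minimal sketch of how such files could be located and loaded, assuming joblib is installed and the files sit in the working directory. The helper names model_path and load_model are hypothetical and are not necessarily what model.py does.

```python
from pathlib import Path

# Target names taken from the saved model file names in this repository.
TARGETS = ["Biopsy", "Citology", "Hinselmann", "Schiller", "any_positive"]


def model_path(target: str, base_dir: str = ".") -> Path:
    """Return the expected path of the saved model for one target."""
    if target not in TARGETS:
        raise ValueError(f"Unknown target: {target!r}")
    return Path(base_dir) / f"selected_{target}_model.joblib"


def load_model(target: str, base_dir: str = "."):
    """Load one saved model (hypothetical helper, not the project's API)."""
    # Imported lazily so that model_path works even without joblib installed.
    import joblib  # third-party; assumed to be available in the environment

    return joblib.load(model_path(target, base_dir))
```

For example, load_model("Biopsy") would read selected_Biopsy_model.joblib from the current directory.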

Install the requirements: pip install -r requirements.txt
Run the API: uvicorn api:app --reload (or, to debug: uvicorn api:app --log-level debug)
Run the test: python test_api.py (known bug: the dataset labels and the API data validation disagree on " " versus "_" in the model's feature keys, so one has to be replaced with the other before the keys match)
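
The key mismatch described above can be worked around by renaming keys before data crosses the API boundary. This is a minimal sketch, assuming the only difference between the two sets of keys is spaces versus underscores. The function normalize_keys and the sample record are hypothetical and are not part of api.py or model.py.

```python
def normalize_keys(record: dict, old: str = " ", new: str = "_") -> dict:
    """Return a copy of record with `old` replaced by `new` in every key."""
    return {key.replace(old, new): value for key, value in record.items()}


# Illustrative feature names and values only, not real patient data.
sample = {"Number of sexual partners": 2, "Age": 30}

# Spaces -> underscores, e.g. to match validated API field names.
api_payload = normalize_keys(sample)

# Underscores -> spaces, e.g. to match the dataset column labels again.
model_input = normalize_keys(api_payload, "_", " ")
```

The round trip only restores the original keys if no dataset label already contains an underscore, so an explicit mapping between the two sets of names may be the safer choice.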

Repository files:
data
visualization
.gitignore
Dockerfile
LICENSE
README.md
REPORT.md
__init__.py
api.py
model.py
requirements.txt
risk_cervical_cancer.ipynb
selected_Biopsy_model.joblib
selected_Citology_model.joblib
selected_Hinselmann_model.joblib
selected_Schiller_model.joblib
selected_any_positive_model.joblib
test_api.py