Python ETL pipeline that crawls data from Weather Forecast, transforms it, and loads it into MySQL

bijeck/ETL-WeatherForecast

ETL Weather Forecast

CI Coverage Status Python 3.6

Introduction

A Python ETL pipeline that crawls data from Weather Forecast, transforms it, and loads it into MySQL.

Authors: Bijeck


Technologies

  • Mock
  • MySQL
  • CI Python Code Validation
  • Dimensional Model
  • Pytest

Requirements

The project relies on a number of open source projects to work properly:

  • MySQL - Runs SQL queries and stores the data
  • Python - Main programming language of the project
  • MySQL Workbench - Used to work with the database and view the data

You also need to sign up for a RapidAPI account and subscribe to the Weather Map API.


Project Folder

  • mock: contains mock data for testing
  • src: contains the source files
  • src/etl: contains the ETL files
  • tests: contains the test files
  • tests/etl: contains the ETL test files
  • config.json: contains the configuration for the MySQL server and the X-RapidAPI-Key for the Weather Map API
  • requirements.txt: lists the required Python packages
  • .github/workflows/python-app.yml: workflow file that runs CI on GitHub
  • database.sql: database script
  • weather_schema.png: diagram of the weather database schema

Create Environment

Make sure virtualenv is installed. If it is not, run:

pip install virtualenv

After unzipping the project, create a virtual environment:

cd ETL_SuMP

virtualenv venv

Then activate the virtual environment and install the packages:

# For Mac or Linux
source venv/bin/activate

# For Windows
venv\Scripts\activate.bat

Installation

Install the Python packages the project needs:

pip install -r requirements.txt

Configuration

Configure your MySQL server in config.json:

Key         Value
host        localhost
user        root
password    yourpassword
database    databasename

Add your API key for the Weather Map API to config.json so the application can run:

Key               Value
X-RapidAPI-Key    key
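
As a rough illustration of how these settings might be read, the sketch below parses the file and fails early when a key is missing. It assumes config.json is a flat JSON object holding the five keys from the tables above. The real file layout may differ (for example, nested sections), and the names `REQUIRED_KEYS`, `parse_config` and `load_config` are hypothetical, not taken from the project.

```python
import json

# Assumed flat layout: the keys listed in the Configuration tables.
REQUIRED_KEYS = ("host", "user", "password", "database", "X-RapidAPI-Key")


def parse_config(text):
    """Parse JSON text and fail early if a required key is missing."""
    config = json.loads(text)
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise KeyError("config is missing keys: " + ", ".join(missing))
    return config


def load_config(path="config.json"):
    """Read the configuration file from disk (hypothetical helper)."""
    with open(path, encoding="utf-8") as handle:
        return parse_config(handle.read())
```

Checking the keys up front surfaces a missing password or API key before any database connection or API request is attempted.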

Run Project

Create database and table:

python src\db.py

Run project:

python src\main.py

Enter your location to fetch data:

Enter your location: london

The project then keeps fetching data for your location every 30 seconds.

Note: You can terminate the project by pressing Ctrl + C.
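
The fetch-and-wait behaviour described above could be structured roughly as follows. This is a minimal sketch, not the project's actual main.py: `poll`, `fetch`, and the `sleep` and `max_runs` parameters are hypothetical names, and the last two exist only so the loop can be exercised without real waiting.

```python
import time


def poll(fetch, interval_seconds=30, sleep=time.sleep, max_runs=None):
    """Call fetch() repeatedly, pausing between calls, until Ctrl + C."""
    runs = 0
    try:
        while max_runs is None or runs < max_runs:
            fetch()
            runs += 1
            sleep(interval_seconds)
    except KeyboardInterrupt:
        # Ctrl + C raises KeyboardInterrupt in the main thread; stop quietly.
        pass
    return runs
```

Catching `KeyboardInterrupt` lets the loop end without a traceback when the user presses Ctrl + C.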


Run Test

Run all tests:

pytest -v

Run only the tests that match a keyword (examples: get, extract, transform, error):

pytest -k keywords -v

Run the tests under coverage to collect coverage results for the whole project:

coverage run -m pytest
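
For orientation, a test in this style might look like the sketch below, which replaces the live API call with a mock so no network access or API key is needed. Everything here is illustrative: `extract_weather`, its `fetch` argument, and the payload fields are hypothetical and not taken from the project's src/etl or tests/etl code.

```python
from unittest.mock import Mock


def extract_weather(fetch, location):
    """Hypothetical extract step: ask the API client for one location's data."""
    return fetch(location)


def test_extract_weather_uses_client():
    # The mock stands in for the real API client and returns canned data.
    fake_fetch = Mock(return_value={"location": "london", "temp": 12.0})
    result = extract_weather(fake_fetch, "london")
    fake_fetch.assert_called_once_with("london")
    assert result["temp"] == 12.0
```

Because the test name contains "extract", `pytest -k extract -v` would select it.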

Slowly Changing Dimensions

SCD Type 1

  • Applied to the city_dim table.
  • The attributes of the old record are overwritten by the new record with the same city_id.
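
The overwrite-in-place behaviour can be sketched as below. To keep the example runnable without a MySQL server it uses Python's built-in sqlite3 module as a stand-in, and the `name` and `country` columns and the `upsert_city` helper are assumptions, not the project's real schema or code. On MySQL the same effect is commonly written as a single `INSERT ... ON DUPLICATE KEY UPDATE` statement.

```python
import sqlite3


def upsert_city(conn, city_id, name, country):
    """SCD Type 1: overwrite the attributes of an existing city_id in place."""
    cur = conn.execute(
        "UPDATE city_dim SET name = ?, country = ? WHERE city_id = ?",
        (name, country, city_id),
    )
    if cur.rowcount == 0:
        # No row with this city_id yet, so insert it.
        conn.execute(
            "INSERT INTO city_dim (city_id, name, country) VALUES (?, ?, ?)",
            (city_id, name, country),
        )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE city_dim (city_id INTEGER PRIMARY KEY, name TEXT, country TEXT)"
)
upsert_city(conn, 1, "london", "GB")
upsert_city(conn, 1, "London", "GB")  # same city_id: the old values are replaced
rows = conn.execute("SELECT city_id, name, country FROM city_dim").fetchall()
```

After both calls the table still holds a single row for city_id 1, carrying the newer attribute values. No history is kept.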

SCD Type 2

  • Applied to the weather_fact table.
  • Each record has a current_flag column that marks the current weather of a city.
  • When new weather data is inserted, its current_flag is set to Y and the previous record's flag is set to N, so the historical weather data of a city is kept.
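
A minimal sketch of the flag handling described above, again using the built-in sqlite3 module as a stand-in for MySQL so that it runs on its own. Only `weather_fact`, `city_id` and `current_flag` come from this README. The `temperature` and `recorded_at` columns and the `insert_weather` helper are assumptions.

```python
import sqlite3


def insert_weather(conn, city_id, temperature, recorded_at):
    """SCD Type 2: keep old rows, but mark only the newest one as current."""
    with conn:  # commit both statements together, or roll both back
        conn.execute(
            "UPDATE weather_fact SET current_flag = 'N' "
            "WHERE city_id = ? AND current_flag = 'Y'",
            (city_id,),
        )
        conn.execute(
            "INSERT INTO weather_fact "
            "(city_id, temperature, recorded_at, current_flag) "
            "VALUES (?, ?, ?, 'Y')",
            (city_id, temperature, recorded_at),
        )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE weather_fact ("
    "city_id INTEGER, temperature REAL, recorded_at TEXT, current_flag TEXT)"
)
insert_weather(conn, 1, 10.5, "2024-01-01 10:00:00")
insert_weather(conn, 1, 12.0, "2024-01-01 10:00:30")
history = conn.execute(
    "SELECT temperature, current_flag FROM weather_fact "
    "WHERE city_id = 1 ORDER BY recorded_at"
).fetchall()
```

Running the update and the insert in one transaction avoids a window in which a city has no current row, or two of them.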

Note

You can use the database.sql file to create the database. It also contains data for you.

Note: Replace the database name in the file with your preferred name.

