Extract information from the internet using Web Scraping in order to acquire datasets describing Rwanda's popularity. Perform an exploratory data analysis in order to visualize Rwanda's popularity growth.

Molo-M/Rwanda-s-Fame-Analysis-

Rwanda's Fame (Analysis)

Use web scraping to collect datasets that describe Rwanda's popularity, then perform an exploratory data analysis to visualize how that popularity has grown.

Tasks:

This project has three parts: web scraping, analysis, and presentation. Using web scraping techniques, I will scrape social media, news sites, and travel review websites to acquire the datasets used in the analysis stage. I will then analyze the data in Jupyter Notebooks and present the findings in PowerPoint.

Part 1: Web Scraping

Scrape BBC:

With the help of Beautiful Soup, scrape all mentions of 'Rwanda' from the search tab of the BBC website.

Dataset acquired: BBC_Data

Webscraping Program: Rwanda BBC Webscraping.py
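As an illustration of the Beautiful Soup step, here is a minimal sketch that parses one page of search results into rows. It is not the code in the script above. The markup in `SAMPLE_HTML`, the `li.result` selector, the placeholder headlines, and the function name `parse_results` are all assumptions made for the example. The real BBC search page uses its own markup, so the selectors would need to be read from the live page.

```python
from bs4 import BeautifulSoup

# Made-up markup standing in for one page of search results (assumption:
# the real site's tags and class names will differ).
SAMPLE_HTML = """
<ul>
  <li class="result">
    <a href="/news/example-1">Example headline about Rwanda</a>
    <time datetime="2022-06-20">20 June 2022</time>
  </li>
  <li class="result">
    <a href="/news/example-2">Second example mentioning Rwanda</a>
    <time datetime="2023-01-05">5 January 2023</time>
  </li>
  <li class="result">
    <a href="/news/example-3">Unrelated example headline</a>
  </li>
</ul>
"""


def parse_results(html, keyword="Rwanda"):
    """Return one dict per search result whose title mentions the keyword."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for item in soup.select("li.result"):  # hypothetical selector
        link = item.find("a")
        if link is None:
            continue
        title = link.get_text(strip=True)
        if keyword.lower() not in title.lower():
            continue
        time_tag = item.find("time")
        rows.append({
            "title": title,
            "url": link.get("href"),
            "date": time_tag.get("datetime") if time_tag else None,
        })
    return rows


rows = parse_results(SAMPLE_HTML)
for row in rows:
    print(row)
```

In a real run, the HTML would come from an HTTP request for each search results page, and the rows would be written out to the dataset file.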

Scrape Al Jazeera:

With the help of Beautiful Soup, scrape all mentions of 'Rwanda' from the search tab of the Al Jazeera website.

Dataset acquired: Aljazeera Data

Webscraping Program: Rwanda Aljazeera Webscraping.py

Scrape Euro News:

With the help of Beautiful Soup, scrape all mentions of 'Rwanda' from the search tab of the Euro News website.

Dataset acquired: Euro News Data

Webscraping Program: Rwanda_euro_news Webscraping.py

Scrape Reddit:

We will retrieve the data from the subreddit 'r/Rwanda' using the Reddit API. This is faster than scraping the subreddit's pages directly.

Dataset acquired: Reddit Data

Webscraping Program: Rwanda Reddit Scrape.py
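To show why an API is more convenient than parsing pages, here is a minimal sketch that flattens a Reddit listing into rows. It is not taken from the script above, which may use a different client such as PRAW or the authenticated OAuth API. The sketch assumes Reddit's public JSON listing endpoint (`/r/<subreddit>/new.json`), which can be rate-limited. The names `fetch_listing`, `listing_to_rows`, the user agent string, and the values in `SAMPLE_LISTING` are hypothetical.

```python
import json
import urllib.request
from datetime import datetime, timezone


def fetch_listing(subreddit="Rwanda", limit=100,
                  user_agent="rwanda-fame-analysis-sketch/0.1"):
    """Download one page of the newest posts as parsed JSON (needs network)."""
    url = f"https://www.reddit.com/r/{subreddit}/new.json?limit={limit}"
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def listing_to_rows(listing):
    """Flatten a listing payload into one dict per post."""
    rows = []
    for child in listing["data"]["children"]:
        post = child["data"]
        created = datetime.fromtimestamp(post["created_utc"], tz=timezone.utc)
        rows.append({
            "title": post["title"],
            "score": post["score"],
            "num_comments": post["num_comments"],
            "date": created.date().isoformat(),
        })
    return rows


# Made-up payload with the same shape as a listing response, so the
# flattening step can be tried without a network connection.
SAMPLE_LISTING = {
    "data": {
        "after": None,
        "children": [
            {"data": {"title": "Example post", "score": 12,
                      "num_comments": 3, "created_utc": 1672531200}},
        ],
    }
}

rows = listing_to_rows(SAMPLE_LISTING)
print(rows)
```

Each listing page also carries an `after` token in `listing["data"]`. Passing it back as a query parameter requests the next page, which is how a full crawl of the subreddit would proceed.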

Scrape Booking.com:

With the help of Selenium, we shall scrape information on all the stays in Rwanda from the Booking.com website.

Dataset acquired: Booking.com Data

Webscraping Program: booking.com scraping.py
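The Selenium-based scrapers for the travel sites follow a common pattern: load the page in a real browser, wait for the listing cards to render, then read the text out of each card. Here is a minimal sketch of that pattern, not the code in the script above. The function names `scrape_stays` and `parse_price` are hypothetical, and the CSS selectors are passed in as parameters because the real ones must be read from the live page. It assumes Chrome and a matching driver are available.

```python
import re


def parse_price(text):
    """Pull the first number out of a price label, or None if there is none."""
    match = re.search(r"\d[\d,]*(?:\.\d+)?", text)
    return float(match.group().replace(",", "")) if match else None


def scrape_stays(url, card_selector, name_selector, price_selector, timeout=15):
    """Open the page in Chrome and return one dict per listing card."""
    # Imported here so parse_price can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        # Listing cards are rendered by JavaScript, so wait until they exist.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_all_elements_located((By.CSS_SELECTOR, card_selector))
        )
        stays = []
        for card in driver.find_elements(By.CSS_SELECTOR, card_selector):
            # find_element raises NoSuchElementException if a card lacks a field.
            name = card.find_element(By.CSS_SELECTOR, name_selector).text
            price = card.find_element(By.CSS_SELECTOR, price_selector).text
            stays.append({"name": name, "price": parse_price(price)})
        return stays
    finally:
        driver.quit()  # always close the browser, even on errors


print(parse_price("US$1,250"))
print(parse_price("Sold out"))
```

Keeping the text cleaning in `parse_price` separate from the browser code means it can be tried on example strings without launching a browser.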

Scrape AirBnB:

With the help of Selenium, we shall scrape information on all the stays in Rwanda from the AirBnB website.

Dataset acquired: AirBnB Data

Webscraping Program: airbnb scraper.py

Scrape TripAdvisor:

With the help of Selenium, we shall scrape information on all the stays in Rwanda from the TripAdvisor website.

Dataset acquired: TripAdvisor Data

Webscraping Program: Tripadvisor Rwanda stays.py

Part 2: Exploratory Data Analysis

Perform an initial analysis of the data collected so far. This will help us understand the data and prepare for the full exploratory analysis.

Notebook: Analysis
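One simple way to look at popularity growth is to count mentions per year and compare consecutive years. Here is a minimal sketch of that idea using only the standard library. It is not taken from the notebook. The function names and the dates in `sample_dates` are made up for illustration, and it assumes each scraped row carries an ISO-formatted date.

```python
from collections import Counter
from datetime import date


def mentions_per_year(iso_dates):
    """Count how many dated items fall in each year, skipping missing dates."""
    counts = Counter(date.fromisoformat(d).year for d in iso_dates if d)
    return dict(sorted(counts.items()))


def year_over_year_change(counts):
    """Difference in mention count between each year and the year before it."""
    years = sorted(counts)
    return {year: counts[year] - counts[prev]
            for prev, year in zip(years, years[1:])}


# Made-up dates standing in for the "date" column of a scraped dataset.
sample_dates = ["2021-03-02", "2022-06-20", "2022-11-01",
                "2023-01-05", "2023-04-17", "2023-09-30", None]

counts = mentions_per_year(sample_dates)
growth = year_over_year_change(counts)
print(counts)
print(growth)
```

In a Jupyter notebook the same counts could be produced with a pandas group-by and plotted as a line or bar chart.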

NB: This project is still ongoing.
