Welcome to My First Scraper


Introduction

meme_scraping

GitHub's trending page

Technical specifications

Using the Python libraries requests and beautifulsoup4, return a CSV of the top 25 trending repositories from GitHub.

  1. Request (with requests)
  2. Extract (with beautifulsoup4)
  3. Transform
  4. Format

Part 0: Request. Write a function prototyped def request_github_trending(url); it returns the result of the request.
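A minimal sketch of Part 0, assuming the trending URL is passed in by the caller:

```python
import requests

def request_github_trending(url):
    """Fetch the given URL and return the HTTP response object."""
    response = requests.get(url)
    response.raise_for_status()  # fail loudly on a non-2xx status
    return response
```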

Part 1: Extract. Write a function prototyped def extract(page) that uses find_all to collect every repository-row instance from the HTML and returns them. You should use BeautifulSoup. :-)
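One possible implementation, assuming each repository row on the trending page is an article element with the Box-row class (this selector is not part of the exercise and may change if GitHub updates its markup):

```python
from bs4 import BeautifulSoup

def extract(page):
    """Parse the response HTML and return all repository-row elements."""
    soup = BeautifulSoup(page.text, "html.parser")
    # Assumption: trending repositories are rendered as <article class="Box-row">.
    return soup.find_all("article", class_="Box-row")
```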

Part 2: Transform. Write a function prototyped def transform(html_repos) taking an array of all the repository-row HTML instances. It will return an array of hashes following this format: [{'developer': NAME, 'repository_name': REPOS_NAME, 'nbr_stars': NBR_STARS}, ...]
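A sketch of the transform step. The tag and attribute choices below (an h2 heading holding "developer / repository" and a stargazers link holding the star count) are assumptions about the trending-page markup, not something the exercise guarantees:

```python
def transform(html_repos):
    """Turn each repository row into a dict of developer, repository name and stars."""
    repositories = []
    for repo in html_repos:
        # Assumption: the row's <h2> text reads "developer / repository_name".
        title = repo.find("h2").get_text(strip=True)
        developer, repository_name = [part.strip() for part in title.split("/", 1)]
        # Assumption: the star count sits in the link ending in "/stargazers".
        stars_link = repo.find("a", href=lambda h: h and h.endswith("/stargazers"))
        nbr_stars = stars_link.get_text(strip=True) if stars_link else "0"
        repositories.append({
            "developer": developer,
            "repository_name": repository_name,
            "nbr_stars": nbr_stars,
        })
    return repositories
```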

Part 3: Format. Write a function prototyped def format(repositories_data) taking the array of repository hashes, transforming it into a CSV string, and returning it. Each column is separated by , and each line by \n. The columns are Developer,Repository Name,Number of Stars.
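A sketch of the format step, assuming the header row Developer,Repository Name,Number of Stars is emitted as the first line:

```python
def format(repositories_data):
    """Build the CSV string: a header line, then one comma-separated line per repository."""
    lines = ["Developer,Repository Name,Number of Stars"]
    for repo in repositories_data:
        lines.append(f"{repo['developer']},{repo['repository_name']},{repo['nbr_stars']}")
    return "\n".join(lines)
```

Chained together, the four parts form the whole pipeline (the trending URL here is an example value):

```python
csv_output = format(transform(extract(request_github_trending("https://github.com/trending"))))
print(csv_output)
```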


Demo version
