The objective of this project is to extract textual data from articles provided in given URLs and perform text analysis to compute various metrics.

shubhamparmar1/ArticleStatsInsight


ArticleStatsInsight

Objective

The objective of this project is to extract the text of articles from a given list of URLs and analyze it to compute several metrics, including sentiment scores, readability scores, and other textual statistics.

Data Extraction

Input

The article URLs are listed in the Input.xlsx file. For each URL, the program extracts the article text and saves it to a text file named after its URL_ID.
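
The input step could be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the function names `load_urls` and `save_article`, the `articles` output folder, and the `URL` column name are assumptions (only `URL_ID` and `Input.xlsx` are named above).

```python
from pathlib import Path


def load_urls(xlsx_path: str) -> list[tuple[str, str]]:
    """Return (URL_ID, URL) pairs from the input workbook."""
    # Imported here so save_article can be used without pandas installed.
    # Reading .xlsx files also requires an engine such as openpyxl.
    import pandas as pd

    frame = pd.read_excel(xlsx_path)
    # Assumption: the sheet has columns named "URL_ID" and "URL".
    return [(str(row.URL_ID), str(row.URL)) for row in frame.itertuples(index=False)]


def save_article(url_id: str, text: str, out_dir: str = "articles") -> Path:
    """Write one article's text to <out_dir>/<URL_ID>.txt and return the path."""
    folder = Path(out_dir)
    folder.mkdir(parents=True, exist_ok=True)
    target = folder / f"{url_id}.txt"
    target.write_text(text, encoding="utf-8")
    return target
```

A typical loop would call `load_urls("Input.xlsx")`, download and extract each page, then pass the result to `save_article(url_id, text)`.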

Extraction Process

  • Only the article title and text are extracted.
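
One way to keep only the title and body text with BeautifulSoup is sketched below. The page layout is an assumption: the headline is taken from the first `<h1>` and the body from the `<p>` elements inside `<article>`. Real sites may need different selectors, and the function name `extract_title_and_text` is hypothetical.

```python
from bs4 import BeautifulSoup


def extract_title_and_text(html: str) -> tuple[str, str]:
    """Return (title, body text) from an article page's HTML."""
    soup = BeautifulSoup(html, "html.parser")

    # Scripts and styles never hold article prose, so drop them first.
    for tag in soup(["script", "style"]):
        tag.decompose()

    # Assumption: the headline is the first <h1> on the page.
    heading = soup.find("h1")
    title = heading.get_text(strip=True) if heading else ""

    # Assumption: the body sits in <article>; otherwise fall back to <body>.
    container = soup.find("article") or soup.body or soup
    # Collapse runs of whitespace inside each paragraph and skip empty ones.
    paragraphs = [" ".join(p.get_text().split()) for p in container.find_all("p")]
    text = "\n".join(p for p in paragraphs if p)
    return title, text
```

The HTML itself would come from something like `requests.get(url, timeout=10).text`; restricting the search to one container is what keeps navigation and footer paragraphs out of the saved file.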

Data Analysis

The program then analyzes each extracted text to compute the variables specified in the Output Data Structure.xlsx file.
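
The exact variables are defined in Output Data Structure.xlsx and are not reproduced here, so the sketch below only illustrates the general shape of such an analysis. The metric names, the formulas, and the tiny word lists are all assumptions, and plain regular expressions stand in for NLTK's tokenizers so that the example runs without downloading NLTK data.

```python
import re

# Hypothetical, tiny word lists for illustration only; a real run would load
# full positive and negative dictionaries.
POSITIVE = {"good", "great", "gain", "improve", "success"}
NEGATIVE = {"bad", "loss", "fail", "risk", "decline"}


def analyze(text: str) -> dict:
    """Compute a few illustrative metrics for one article."""
    # Regex stand-ins for NLTK's word and sentence tokenizers.
    words = re.findall(r"[A-Za-z']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]

    positive = sum(w in POSITIVE for w in words)
    negative = sum(w in NEGATIVE for w in words)

    return {
        "word_count": len(words),
        "sentence_count": len(sentences),
        "positive_score": positive,
        "negative_score": negative,
        # Lies between -1 and 1; the small constant avoids division by zero.
        "polarity_score": (positive - negative) / (positive + negative + 1e-6),
        "avg_sentence_length": len(words) / max(len(sentences), 1),
    }
```

Each returned dictionary can become one row of a Pandas DataFrame keyed by URL_ID, which is then written out in the column order the output specification requires.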

Dependencies

  • BeautifulSoup
  • NLTK
  • Pandas
  • Requests
