Downloads the subtitles of IMDB Top 100 movies.
Lists all the words in the subtitles and their counts in an Excel file, filtering out stopwords (e.g. "the", "a/an", "him").
Generates word/count charts based on the data obtained.
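
The word-counting and stopword-filtering step can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual code: the function name count_words, the tiny STOPWORDS set, and the regex-based tokenizer are all hypothetical and chosen only for the example.

```python
import re
from collections import Counter

# Hypothetical stopword set for illustration only; the list the project
# actually uses is not shown in this README.
STOPWORDS = {"the", "a", "an", "him"}


def count_words(text, stopwords=STOPWORDS):
    """Return a Counter of lowercase words in text, skipping stopwords."""
    # A word is a run of letters, optionally followed by an apostrophe
    # and more letters (so "don't" stays one token).
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    return Counter(w for w in words if w not in stopwords)


if __name__ == "__main__":
    sample = "The man gave him a map. The map was old."
    # Print the most frequent remaining words with their counts.
    for word, count in count_words(sample).most_common(3):
        print(word, count)
```

A Counter built this way can be sorted by frequency with most_common(), which is a convenient shape for writing word/count rows to a spreadsheet or feeding a chart.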
- Python >= 3.6
- Run pip install -r requirements.txt to install the other dependencies.
Run python src/subtitle_analyzer.py from the project directory.
The generated Excel file and chart images are written to the data_output directory under the project directory.