The YouTube Data Harvesting and Warehousing project is a Streamlit application that uses the Google API to extract information from YouTube channels. The extracted data is stored in a MongoDB database and can then be migrated to a MySQL data warehouse, all from within the same Streamlit application.
- Key Technologies and Skills
- Usage
- Features
- Data Retrieval from the YouTube API
- Storing Data in MongoDB
- Data Migration to a MySQL Data Warehouse
- Contribution Guidelines
- Contact Information
- Python Scripting
- Data Collection
- API Integration
- Streamlit
- Data Management using MongoDB and MySQL
To use this project, follow these steps:
- Install the required packages:
pip install -r requirements.txt
- Run the Streamlit app:
streamlit run app.py
- Open the app in your web browser. You can access it by opening a new tab and entering the following URL:
http://localhost:8501
- Retrieve data from the YouTube API, including channel information, playlists, videos, and comments.
- Store the collected data in a MongoDB database.
- Migrate the data to a MySQL data warehouse.
- Analyze and visualize data using Streamlit and other Python libraries.
- Perform queries on the MySQL data warehouse.
- Gain insights into channel metrics, video metrics, and top-5 data for the selected channel.
- Run 10 predefined queries for immediate insights into the data.
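As an illustration of the "top 5" feature, here is a minimal sketch of how one such query might be run against the MySQL warehouse. The `videos` table, its `title`, `views`, and `channel_id` columns, and the `top_videos` helper are all assumptions made for this example; the project's actual schema and queries may differ. The cursor is any DB-API cursor that uses `%s` placeholders, such as one from `mysql-connector-python`.

```python
from __future__ import annotations

from typing import Any

# Hypothetical schema: a `videos` table with `title`, `views`, `channel_id`.
TOP_VIDEOS_SQL = (
    "SELECT title, views FROM videos "
    "WHERE channel_id = %s ORDER BY views DESC LIMIT %s"
)


def top_videos(cursor: Any, channel_id: str, limit: int = 5) -> list:
    """Return the `limit` most viewed videos of one channel."""
    # Values are passed as parameters rather than formatted into the SQL string.
    cursor.execute(TOP_VIDEOS_SQL, (channel_id, limit))
    return list(cursor.fetchall())
```

The returned rows can be passed to `pandas.DataFrame` and displayed in the app with `st.dataframe`.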
The project uses the Google API to retrieve data from YouTube channels, including detailed information on channels, playlists, videos, and comments. The API responses are consolidated into a structured format that serves as the basis for further processing and analysis.
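The channel part of this retrieval step could look roughly like the sketch below. It assumes the `google-api-python-client` package and a valid API key. The function names and the keys of the returned record are hypothetical, chosen for this example rather than taken from the project's code.

```python
from __future__ import annotations

from typing import Any


def fetch_channel_response(api_key: str, channel_id: str) -> dict[str, Any]:
    """Call channels.list for one channel (needs network access and an API key)."""
    # Imported lazily so parse_channel() works without the client installed.
    from googleapiclient.discovery import build

    youtube = build("youtube", "v3", developerKey=api_key)
    request = youtube.channels().list(
        part="snippet,statistics,contentDetails", id=channel_id
    )
    return request.execute()


def parse_channel(response: dict[str, Any]) -> dict[str, Any] | None:
    """Flatten a channels.list response into one record (hypothetical keys)."""
    items = response.get("items", [])
    if not items:
        return None  # no channel matched the given ID
    item = items[0]
    stats = item.get("statistics", {})
    return {
        "channel_id": item["id"],
        "channel_name": item["snippet"]["title"],
        # The API returns counts as strings; a hidden subscriber count is absent.
        "subscribers": int(stats.get("subscriberCount", 0)),
        "views": int(stats.get("viewCount", 0)),
        "video_count": int(stats.get("videoCount", 0)),
        # Playlist that lists every upload of the channel.
        "uploads_playlist_id": item["contentDetails"]["relatedPlaylists"]["uploads"],
    }
```

Keeping the parsing separate from the network call makes it possible to test the transformation with a saved response. Playlists, videos, and comments can be fetched in the same style through the `playlistItems`, `videos`, and `commentThreads` resources.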
The retrieved data is stored in a MongoDB database with proper user authorization. If the data already exists in the database, inserting it again overwrites the existing record, so the database always holds the most recently collected version and no duplicates accumulate.
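One way to get this overwrite-on-reinsert behaviour is an upsert keyed on the channel ID, sketched below with `pymongo`. The connection URI, the `youtube` database and `channels` collection names, and the `channel_id` key are assumptions for illustration only; the project may organise its collections differently.

```python
from __future__ import annotations

from typing import Any


def connect_collection(uri: str, database: str = "youtube", name: str = "channels") -> Any:
    """Open a MongoDB collection (database and collection names are assumed)."""
    # Imported lazily so upsert_channel() can be exercised without pymongo.
    from pymongo import MongoClient

    return MongoClient(uri)[database][name]


def upsert_channel(collection: Any, channel: dict[str, Any]) -> None:
    """Insert the channel document, or replace it if it is already stored."""
    document = dict(channel)  # copy so the caller's dict is not modified
    document["_id"] = channel["channel_id"]
    # upsert=True inserts when no document matches and replaces otherwise.
    collection.replace_one({"_id": document["_id"]}, document, upsert=True)
```

Using the channel ID as `_id` means a second harvest of the same channel replaces the earlier document instead of creating a duplicate.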
The application lets users transfer data from MongoDB to a MySQL data warehouse. Users choose the specific channel they want to migrate from the displayed list, which allows a tailored approach to data handling. During migration, the collected data is transformed and restructured to fit the relational format of a MySQL database, ready for further analysis and insights.
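The transformation step can be sketched as follows: each MongoDB document is reduced to a fixed tuple of columns and written with a parameterised insert. The `channels` table, its column names, and both helper functions are hypothetical. The connection is assumed to be a DB-API connection such as one returned by `mysql.connector.connect`.

```python
from __future__ import annotations

from typing import Any, Iterable

# Hypothetical target table: channels(channel_id PRIMARY KEY, ...).
CHANNEL_COLUMNS = ("channel_id", "channel_name", "subscribers", "views", "video_count")

UPSERT_CHANNEL_SQL = (
    "INSERT INTO channels (channel_id, channel_name, subscribers, views, video_count) "
    "VALUES (%s, %s, %s, %s, %s) "
    "ON DUPLICATE KEY UPDATE "
    "channel_name = VALUES(channel_name), subscribers = VALUES(subscribers), "
    "views = VALUES(views), video_count = VALUES(video_count)"
)


def channel_to_row(document: dict[str, Any]) -> tuple:
    """Keep only the warehouse columns, in table order; missing keys become None."""
    return tuple(document.get(column) for column in CHANNEL_COLUMNS)


def migrate_channels(documents: Iterable[dict[str, Any]], connection: Any) -> int:
    """Write the selected channel documents to MySQL and return the row count."""
    rows = [channel_to_row(doc) for doc in documents]
    if not rows:
        return 0
    cursor = connection.cursor()
    try:
        cursor.executemany(UPSERT_CHANNEL_SQL, rows)
        connection.commit()
    finally:
        cursor.close()
    return len(rows)
```

Extra MongoDB fields such as `_id` are dropped by `channel_to_row`, and the `ON DUPLICATE KEY UPDATE` clause keeps a repeated migration of the same channel from failing on the primary key.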
Contributions to this project are highly encouraged. If you come across any challenges or have ideas for enhancements, we invite you to submit a pull request. Your input is valuable to us, and we appreciate your contributions.
Email: sec19ee048@sairamtap.edu.in
LinkedIn: www.linkedin.com/in/priyanga070302
If you have any more questions or need further information, please don't hesitate to get in touch. We're here to help and answer any inquiries you may have.