- About the Project
- Datasets
- Screenshots
- Tech Stack
- Features
- Color Reference
- Environment Variables
- Getting Started
- Usage
- Roadmap
- Contributing
- FAQ
- License
- Contact
- Acknowledgements
I built this project using a CRNN trained with CTC loss. The OCR system supports two scripts: English and Devanagari. It uses Otsu thresholding to binarize the image and contour chain approximation to locate text regions, and it makes predictions on the detected regions.
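As a rough illustration of two ideas named above, the sketch below shows Otsu thresholding on a list of grayscale values and the greedy collapse step commonly used to read out CTC predictions. It is a minimal pure-Python sketch, not the project's code: the function names `otsu_threshold` and `ctc_greedy_decode` and the choice of `0` as the blank label are assumptions made for this example, and the project itself may rely on library routines instead.

```python
def otsu_threshold(pixels):
    """Return the Otsu threshold for an iterable of 8-bit grayscale values."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg = 0    # number of pixels at or below the candidate threshold
    sum_bg = 0  # intensity sum of those pixels
    for t in range(256):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue  # no pixels in the lower class yet
        w_fg = total - w_bg
        if w_fg == 0:
            break  # every pixel is already in the lower class
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # Between-class variance (up to a constant factor); Otsu maximizes it.
        between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t


def ctc_greedy_decode(frame_labels, blank=0):
    """Collapse repeated labels, then drop blanks (assumed blank index: 0)."""
    decoded = []
    prev = None
    for label in frame_labels:
        if label != prev and label != blank:
            decoded.append(label)
        prev = label
    return decoded


# Example usage with made-up data.
pixels = [10] * 50 + [200] * 50
t = otsu_threshold(pixels)
binary = [p > t for p in pixels]  # True marks the brighter class
labels = ctc_greedy_decode([0, 1, 1, 0, 1, 2, 2, 0])
```

The blank label is what lets CTC output the same character twice in a row: the two runs of `1` in the example are separated by a blank, so both survive the collapse.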
For the English model, I used the IAM English handwriting dataset, a widely recognized dataset for training and evaluating handwritten text recognition systems. To train the Devanagari model, I manually collected and labeled a dataset of Bhagwat Gita scriptures. The Bhagwat Gita is a sacred Hindu text written in the Devanagari script (Hindi with some Sanskrit), which makes it a challenging, domain-specific dataset for the Devanagari OCR task.
Modeling stacks
Database
Client
DevOps
- OCR model developed from scratch rather than relying on third-party models
- Detects and extracts text from images and exports it in the desired format
- Supports two scripts: English and Devanagari
| Color | Hex |
| --- | --- |
| Primary Color | |
| Secondary Color | |
| Accent Color | |
| Text Color | |
To run this project, add the following environment variables to your .env file:
Cloudinary API_KEY for storing uploaded images
Your email host password, used for recovering forgotten passwords
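A small startup check can make a missing variable fail early instead of at the first upload or password-reset email. The sketch below assumes the values from .env are already in the process environment (pipenv loads a .env file automatically for `pipenv shell` and `pipenv run`). The variable names `CLOUDINARY_API_KEY` and `EMAIL_HOST_PASSWORD` and the helper `load_required_env` are hypothetical; use whatever names the project's settings actually read.

```python
import os

# Hypothetical variable names, chosen for this example only.
REQUIRED_VARS = ("CLOUDINARY_API_KEY", "EMAIL_HOST_PASSWORD")


def load_required_env(names=REQUIRED_VARS, environ=None):
    """Return the required variables as a dict, or raise if any are missing."""
    environ = os.environ if environ is None else environ
    missing = [name for name in names if not environ.get(name)]
    if missing:
        raise RuntimeError(
            "Missing environment variables: " + ", ".join(missing)
        )
    return {name: environ[name] for name in names}
```

Passing a plain dict as `environ` makes the check easy to exercise in tests without touching the real environment.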
This project uses Docker for containerization. Make sure you have Docker installed on your machine. If not, you can download it from the official Docker website.
Clone the project
git clone https://github.com/AnisH1427/Multilingual-OCR-FYP.git
Go to the project directory
cd Multilingual-OCR-FYP
Install pipenv if you haven't already
pip install pipenv
Install dependencies
pipenv install
Activate the pipenv shell
pipenv shell
Start the server
python manage.py runserver
Build the Docker image
docker build -t your-image-name .
Check the Docker images
docker images
Run the Docker container
docker run -d -p 8080:80 your-image-name
To deploy this project, you can use the Docker container you built in the previous step.
To run tests, use the following command
python manage.py test
- Data Acquisition
- Research
- Design Architecture
- Test with Different Hyperparameters
- Keep Training and Improving
- Design REST API
- Design User Interface
- Backend Setup
- Deploy Model
- Performance Monitoring Using TensorBoard
This project was developed solely by me as my Final Year Project.
Please read the Code of Conduct
- What limitations of current OCR systems keep intelligent OCR an active research topic?
  - Answer 1
- Why don't OCR systems perform equally well across different languages?
  - Answer 2
- What can be done to improve existing OCR systems?
  - Answer 3
Distributed under the MIT License.
Your Name - @linkedin - anishkhatioda@outlook.com, anishkhatioda@gmail.com
Project Link: https://github.com/AnisH1427/Multilingual-OCR-FYP
- Biru Shrestha - Project Supervisor
- Uttam Acharya - Reader