PyTorch Image Caption case with Bidirectional LSTM and Flickr8k Dataset

nunenuh/imgcap.pytorch

Image Caption With PyTorch

This repository is still under construction.

This repository shows how to build a neural network with PyTorch that generates a caption from an image. The datasets I use are the Flickr8k and Flickr30k image caption datasets. The model is split into an encoder and a decoder to keep the code easier to read.

Image source: encoder/decoder illustration from the Udacity Computer Vision Nanodegree project.

The image above is an illustration of the network. It does not match this project exactly; it is only meant to show the general idea of what the network does.
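The encoder/decoder split described above can be sketched in PyTorch roughly as follows. This is an illustrative sketch, not the repository's actual code: the class names, layer sizes, and the tiny stand-in CNN are assumptions (a real encoder would typically reuse a pretrained ResNet from torchvision), and the `bidirectional` flag mirrors the bidirectional LSTM mentioned in the title.

```python
import torch
import torch.nn as nn


class EncoderCNN(nn.Module):
    """Extracts a fixed-size feature vector from an image.

    A tiny stand-in CNN; a real implementation would usually wrap a
    pretrained ResNet and replace its final classification layer."""

    def __init__(self, embed_size):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # -> (B, 16, 1, 1)
        )
        self.fc = nn.Linear(16, embed_size)

    def forward(self, images):               # images: (B, 3, H, W)
        feats = self.conv(images).flatten(1)  # (B, 16)
        return self.fc(feats)                 # (B, embed_size)


class DecoderLSTM(nn.Module):
    """Produces caption logits from image features plus caption tokens."""

    def __init__(self, embed_size, hidden_size, vocab_size, bidirectional=False):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_size)
        self.lstm = nn.LSTM(embed_size, hidden_size,
                            batch_first=True, bidirectional=bidirectional)
        out_dim = hidden_size * (2 if bidirectional else 1)
        self.fc = nn.Linear(out_dim, vocab_size)

    def forward(self, features, captions):   # captions: (B, T) token ids
        emb = self.embed(captions)            # (B, T, E)
        # Feed the image feature as the first "token" of the sequence.
        inputs = torch.cat([features.unsqueeze(1), emb], dim=1)
        out, _ = self.lstm(inputs)            # (B, T+1, H * num_directions)
        return self.fc(out)                   # (B, T+1, vocab_size)
```

During training the decoder is fed the ground-truth caption (teacher forcing) and the logits are scored with cross-entropy; at inference time tokens are instead generated one at a time.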

Project Structure


Usage

  1. Clone the repository

  2. Download the dataset

  3. Train the model

  4. Test the model
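The steps above might look like the following in a shell. The script names and flags below are assumptions for illustration (the repository does not document its exact entry points), so check the actual file names after cloning; the Kaggle dataset slugs come from the links in this README.

```shell
# 1. Clone the repository
git clone https://github.com/nunenuh/imgcap.pytorch.git
cd imgcap.pytorch

# 2. Download the dataset (e.g. via the Kaggle CLI)
kaggle datasets download -d nunenuh/flickr8k -p data/ --unzip

# 3. Train the model (hypothetical script name and flags)
python train.py --data-dir data/flickr8k --epochs 20

# 4. Test the model on a sample image (hypothetical script name)
python test.py --image sample.jpg --checkpoint checkpoints/best.pth
```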

Flickr Dataset Source

The datasets I use in this repository can be downloaded from these Kaggle repositories:

Flickr8k Dataset : https://www.kaggle.com/nunenuh/flickr8k
Flickr30k Dataset : https://www.kaggle.com/nunenuh/flickr30k

These datasets are hosted in my Kaggle data repository; I modified the original datasets to match what this repository needs.
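Since the datasets were restructured for this repository, the first preprocessing step is typically building a vocabulary from the caption file. The sketch below assumes the captions are available as plain strings (the exact file layout of the Kaggle datasets is not documented here); the special tokens and the frequency threshold are conventions used by most captioning pipelines, not the repository's confirmed choices.

```python
from collections import Counter


def build_vocab(caption_lines, freq_threshold=2):
    """Map each sufficiently frequent word to an integer id.

    caption_lines: iterable of raw caption strings."""
    counter = Counter()
    for line in caption_lines:
        counter.update(line.lower().split())
    # Special tokens used by most captioning pipelines.
    vocab = {"<pad>": 0, "<sos>": 1, "<eos>": 2, "<unk>": 3}
    for word, freq in counter.items():
        if freq >= freq_threshold:
            vocab[word] = len(vocab)
    return vocab


def numericalize(caption, vocab):
    """Convert a caption string to token ids, wrapped in <sos>/<eos>."""
    unk = vocab["<unk>"]
    ids = [vocab.get(w, unk) for w in caption.lower().split()]
    return [vocab["<sos>"]] + ids + [vocab["<eos>"]]
```

Rare words fall below the threshold and map to `<unk>`, which keeps the decoder's output layer from growing with every one-off token in the captions.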

Pretrained Model

Code Originally from

https://github.com/aladdinpersson/Machine-Learning-Collection/tree/master/ML/Pytorch/more_advanced/image_captioning

https://github.com/yunjey/pytorch-tutorial/tree/master/tutorials/03-advanced/image_captioning
