Image Captionbot for Assistive Technology
Abstract
Generating short descriptions of images is a difficult task because of the complexity of image features and the vastness of language contexts. An image may contain a wide variety of information, so extracting the context of the information contained in the image and generating a sentence from that context is a complex task. However, such a system can help blind people understand their surroundings without assistance from others. Deep learning techniques have emerged as an effective approach for building this kind of system. In this project, we will use VGG16, one of the best-performing CNN architectures for image classification, to extract features from images. An embedding layer and an LSTM will be used for generating the text description, and these two networks will be combined to form an image caption generation network. We will then train the model on data prepared from the Flickr8k dataset. The trained model will be used to generate captions for new images, and each generated caption will be converted to audio to assist the blind.
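The encoder-decoder design described above can be sketched at a small scale. The following NumPy toy, with made-up dimensions and random untrained weights (all names and sizes here are illustrative assumptions, not the project's actual implementation), shows the data flow: a VGG16-style feature vector initializes the decoder state, and an embedding layer plus an LSTM cell greedily emits token ids through a vocabulary projection.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes (assumptions for illustration): VGG16's fc2 layer
# yields a 4096-d feature; vocabulary and hidden size are kept tiny.
FEAT, EMBED, HIDDEN, VOCAB = 4096, 32, 64, 10

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class LSTMCell:
    """Minimal LSTM cell with random (untrained) weights."""
    def __init__(self, in_dim, hid):
        self.W = rng.normal(0, 0.1, (4 * hid, in_dim + hid))
        self.b = np.zeros(4 * hid)

    def step(self, x, h, c):
        # Gates computed from the concatenated input and previous state.
        z = self.W @ np.concatenate([x, h]) + self.b
        i, f, o, g = np.split(z, 4)
        i, f, o, g = sigmoid(i), sigmoid(f), sigmoid(o), np.tanh(g)
        c = f * c + i * g
        h = o * np.tanh(c)
        return h, c

# Image side: a placeholder for a VGG16 feature vector, projected into
# the decoder's hidden space (this stands in for "combining" the networks).
img_feat = rng.normal(size=FEAT)
W_img = rng.normal(0, 0.01, (HIDDEN, FEAT))

# Text side: embedding table, LSTM cell, and vocabulary projection.
embed = rng.normal(0, 0.1, (VOCAB, EMBED))
W_out = rng.normal(0, 0.1, (VOCAB, HIDDEN))
cell = LSTMCell(EMBED, HIDDEN)

def greedy_caption(feature, max_len=5, start_token=0):
    """Greedily decode token ids; with random weights the output is noise."""
    h = np.tanh(W_img @ feature)   # initial hidden state from the image
    c = np.zeros(HIDDEN)
    token, caption = start_token, []
    for _ in range(max_len):
        h, c = cell.step(embed[token], h, c)
        token = int(np.argmax(W_out @ h))  # pick the most likely next word
        caption.append(token)
    return caption

caption = greedy_caption(img_feat)
print(caption)  # five token ids in [0, VOCAB)
```

In a trained system, the random matrices would be learned from the Flickr8k image-caption pairs, and the token ids would map back to vocabulary words before being passed to a text-to-speech engine.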