Sequence-to-Sequence Image Caption Generator

Rehab Alahmadi; Chung Hyuk Park; James Hahn

doi:10.1117/12.2523174

Conference proceeding

ELEVENTH INTERNATIONAL CONFERENCE ON MACHINE VISION (ICMV 2018), Vol.11041, pp.110410C-110410C-7

Proceedings of SPIE

01/01/2019

DOI: https://doi.org/10.1117/12.2523174

Subjects: Computer Science; Computer Science, Artificial Intelligence; Optics; Physical Sciences; Science & Technology; Technology

Abstract

Image captioning has recently received much attention from the artificial intelligence (AI) research community. Most current work follows the encoder-decoder machine translation model to generate captions for images automatically, typically using a Convolutional Neural Network (CNN) as the image encoder and a Recurrent Neural Network (RNN) as the decoder that generates the caption. In this paper, we propose a sequence-to-sequence model that instead uses an RNN as the image encoder while still following the encoder-decoder machine translation model, such that the input to the model is a sequence of images representing the objects in the image. These objects are ordered according to their order in the captions. We demonstrate the results of the model on the Flickr30K dataset and compare them with state-of-the-art methods that use the same dataset. The proposed model outperformed the state-of-the-art methods on all metrics.
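
The ordering step mentioned in the abstract can be illustrated with a minimal sketch. It assumes each detected object carries a single-word text label and that a label is matched to the caption by exact word match. The abstract does not say how objects are matched to caption words or how unmentioned objects are handled, so the function name `order_objects_by_caption`, the matching rule, and the choice to place unmentioned objects last are all hypothetical.

```python
import re


def order_objects_by_caption(object_labels, caption):
    """Sort object labels by where each is first mentioned in the caption.

    Assumption (not from the paper): labels are single words matched
    exactly against lowercased caption tokens. Labels that never appear
    in the caption go last, in their original detection order.
    """
    tokens = re.findall(r"[a-z]+", caption.lower())

    # Record the index of the first occurrence of every caption word.
    first_pos = {}
    for index, token in enumerate(tokens):
        first_pos.setdefault(token, index)

    unmentioned = len(tokens)  # sorts after every real caption position
    # sorted() is stable, so labels with equal keys keep detection order.
    return sorted(
        object_labels,
        key=lambda label: first_pos.get(label.lower(), unmentioned),
    )


caption = "A dog chases a ball across the grass"
detected = ["grass", "ball", "dog", "fence"]
ordered = order_objects_by_caption(detected, caption)
print(ordered)  # ['dog', 'ball', 'grass', 'fence']
```

In a pipeline of the kind the abstract describes, the image crops for the ordered objects would then be fed one by one to the RNN encoder. That part is omitted here because the abstract gives no architectural details.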

Details

Title: Sequence-to-Sequence Image Caption Generator
Creators: Rehab Alahmadi (George Washington University); Chung Hyuk Park (George Washington University); James Hahn (George Washington University)
Contributors: A Verikas; D P Nikolaev; P Radeva; J Zhou
Publication Details: ELEVENTH INTERNATIONAL CONFERENCE ON MACHINE VISION (ICMV 2018), Vol.11041, pp.110410C-110410C-7
Series: Proceedings of SPIE
Publisher: Spie-Int Soc Optical Engineering
Number of pages: 7
Identifiers: 9930597608331
Academic Unit: Taibah University
Language: English
Resource Type: Conference proceeding