Encoder-Decoder Model for Automatic Video Captioning Using Yolo Algorithm

Hanan Nasser Alkalouti; Mayada Ahmed AL Masre

doi:10.1109/IEMTRONICS52119.2021.9422600

Conference proceeding

2021 IEEE International IOT, Electronics and Mechatronics Conference (IEMTRONICS), pp.1-4

21/04/2021

DOI: https://doi.org/10.1109/IEMTRONICS52119.2021.9422600

Keywords: Deep learning; Image recognition; Linguistics; Mechatronics; Memory management; Natural Language Processing (NLP); Object detection; Video captioning; Visualization; You Only Look Once (YOLO)

Abstract

Humans use informed visual perception to generate sentences, bridging the gap between recognizing visual features (images) and the linguistic expression (words) that describes them. Videos are one example: humans can describe the content of a video in meaningful sentences, based on their understanding of that content, as a caption for the video. Automating video captioning, however, is challenging because it confronts the model with two problems: object detection and sentence generation. This research aims to develop an Encoder-Decoder model that automates video captioning with deep learning in two steps. First, the KATNA model selects the most significant frames from the video and removes redundant ones. Second, two deep learning algorithms are combined: the You Only Look Once (YOLO) algorithm recognizes objects in the video frames, and the Long Short-Term Memory (LSTM) algorithm generates the video caption. The proposed model describes the video's content in a meaningful sentence and shows good accuracy and efficiency. Unlike other video captioning approaches, which use other deep learning techniques, it applies YOLO to the MSVD dataset.
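
The two-step flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the frame-difference heuristic is a simple stand-in for the key-frame step and is not the KATNA algorithm, and `detect` and `decode` are hypothetical placeholder callables standing where a YOLO detector and an LSTM decoder would go. Frames are assumed to be flattened lists of grayscale pixel values.

```python
from typing import Callable, List, Sequence

Frame = Sequence[float]  # assumption: a frame flattened to grayscale pixel values


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Average per-pixel absolute difference between two equally sized frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def select_key_frames(frames: List[Frame], threshold: float) -> List[int]:
    """Return indices of frames that differ enough from the last kept frame.

    Illustrative heuristic only; this is not how KATNA selects frames.
    """
    if not frames:
        return []
    kept = [0]  # always keep the first frame
    for i in range(1, len(frames)):
        if mean_abs_diff(frames[i], frames[kept[-1]]) > threshold:
            kept.append(i)
    return kept


def caption_video(
    frames: List[Frame],
    detect: Callable[[Frame], List[str]],  # hypothetical: stands in for YOLO
    decode: Callable[[List[str]], str],    # hypothetical: stands in for the LSTM
    threshold: float = 10.0,
) -> str:
    """Data flow: key frames -> object labels (encoder side) -> sentence (decoder side)."""
    labels: List[str] = []
    for i in select_key_frames(frames, threshold):
        for label in detect(frames[i]):
            if label not in labels:  # keep each label once, in first-seen order
                labels.append(label)
    return decode(labels)


if __name__ == "__main__":
    # Toy 4-pixel frames: two near-identical dark frames, then two bright ones.
    toy = [[0, 0, 0, 0], [1, 0, 0, 0], [100, 100, 100, 100], [101, 100, 100, 100]]
    toy_detect = lambda f: ["dog"] if f[0] >= 100 else ["man"]
    toy_decode = lambda labels: " ".join(labels)
    print(select_key_frames(toy, 10.0))
    print(caption_video(toy, toy_detect, toy_decode))
```

Passing the detector and decoder as callables keeps the sketch independent of any particular YOLO or LSTM library; a real system would substitute trained models for both.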

Details

Title: Encoder-Decoder Model for Automatic Video Captioning Using Yolo Algorithm
Creators: Hanan Nasser Alkalouti (Faculty); Mayada Ahmed AL Masre (Faculty)
Publication Details: 2021 IEEE International IOT, Electronics and Mechatronics Conference (IEMTRONICS), pp.1-4
Publisher: IEEE
Identifiers: 9935471408331
Academic Unit: King Abdulaziz University
Language: English
Resource Type: Conference proceeding