Bi-Modal Transformer-Based Approach for Visual Question Answering in Remote Sensing Imagery
Journal article   Peer reviewed

Yakoub Bazi, Mohamad Mahmoud Al Rahhal, Mohamed Lamine Mekhalfi, Mansour Abdulaziz Al Zuair and Farid Melgani
IEEE Transactions on Geoscience and Remote Sensing, Vol. 60, pp. 1-11
2022

Abstract

Co-attention; Computer vision; Feature extraction; Head; Remote sensing; Self-attention; Task analysis; Transformers; Vision-language models; Visual question answering (VQA); Visualization
