Information fusion in visual question answering: A Survey

Dongxiang Zhang; Rui Cao; Sai Wu

doi:https://doi.org/10.1016/j.inffus.2019.03.005


Information Fusion (18.60)

Volume: 52, Pages: 268 - 280

Published: Dec 1, 2019

Abstract

Visual question answering is the task of automatically answering natural language questions about the content of an image or video. The task is challenging because it requires understanding the semantic information in both the textual and visual channels, as well as their interplay. A typical solver is composed of three components: feature extraction from each individual modality, feature fusion between the visual and textual channels, and answer prediction based on...
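The three-component structure named in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the survey's method: the encoders are stand-ins that return fixed random vectors, the fusion step is an element-wise product (one simple choice among many), and the answer vocabulary, dimensions, and weights are hypothetical.

```python
import math
import random


def extract_features(dim, seed):
    """Stand-in for a single-modality encoder (hypothetical).

    A real system would run an image or text model here; this just
    returns a reproducible random vector of length `dim`.
    """
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def fuse(visual, textual):
    """Fuse two feature vectors by element-wise product (an assumed choice)."""
    if len(visual) != len(textual):
        raise ValueError("feature vectors must have the same length")
    return [v * t for v, t in zip(visual, textual)]


def predict_answer(fused, weights, answers):
    """Score each candidate answer linearly, then apply a softmax."""
    scores = [sum(w * x for w, x in zip(row, fused)) for row in weights]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(answers)), key=probs.__getitem__)
    return answers[best], probs


DIM = 8                                  # hypothetical feature size
ANSWERS = ["yes", "no", "two", "red"]    # hypothetical answer vocabulary

visual = extract_features(DIM, seed=0)   # component 1: visual channel
textual = extract_features(DIM, seed=1)  # component 1: textual channel
fused = fuse(visual, textual)            # component 2: feature fusion

# Untrained random classifier weights, one row per candidate answer.
weight_rng = random.Random(2)
weights = [[weight_rng.uniform(-1.0, 1.0) for _ in range(DIM)] for _ in ANSWERS]

answer, probs = predict_answer(fused, weights, ANSWERS)  # component 3
```

Because the weights are untrained, the predicted answer is arbitrary; the sketch only shows how the three stages hand data to one another.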

