Information fusion in visual question answering: A Survey
Abstract
Visual question answering automatically answers natural language questions according to the content of an image or video. The task is challenging because it requires understanding the semantic information in both the textual and visual channels, as well as their interplay. A typical solver is composed of three components: feature extraction from each individual modality, feature fusion between the visual and textual channels, and answer prediction based on...
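The three-component structure described in the abstract can be illustrated with a toy sketch. It is only an assumption-laden illustration, not the survey's method: the vocabulary, answer set, image feature vector, random weights, and all function names are hypothetical, and element-wise multiplication stands in for whichever fusion scheme a real solver would use.

```python
import math
import random

random.seed(0)  # fixed seed so the toy weights are reproducible

def linear(x, w):
    """Multiply vector x by weight matrix w (a list of rows)."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

def rand_matrix(rows, cols):
    """Random weights; a real solver would learn these from data."""
    return [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

VOCAB = ["what", "color", "is", "the", "cat", "dog"]  # hypothetical vocabulary
ANSWERS = ["black", "white", "brown"]                 # hypothetical answer set

def question_features(question):
    """Component 1 (text): bag-of-words counts as a stand-in for a text encoder."""
    words = question.lower().split()
    return [float(words.count(w)) for w in VOCAB]

def fuse(q_feat, v_feat, w_q, w_v):
    """Component 2: project both modalities to a shared space, multiply element-wise."""
    q = linear(q_feat, w_q)
    v = linear(v_feat, w_v)
    return [a * b for a, b in zip(q, v)]

def predict(fused, w_out):
    """Component 3: score each candidate answer and normalize with softmax."""
    return softmax(linear(fused, w_out))

HIDDEN = 4
image_feat = [0.2, 0.7, 0.1]  # Component 1 (vision): placeholder for image features
w_q = rand_matrix(HIDDEN, len(VOCAB))
w_v = rand_matrix(HIDDEN, len(image_feat))
w_out = rand_matrix(len(ANSWERS), HIDDEN)

q_feat = question_features("what color is the cat")
probs = predict(fuse(q_feat, image_feat, w_q, w_v), w_out)
best = ANSWERS[probs.index(max(probs))]
```

Because the weights are random, the chosen answer carries no meaning; the sketch only shows how data flows from per-modality features through fusion to a distribution over candidate answers.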
Paper Details
Title
Information fusion in visual question answering: A Survey
Published Date
Dec 1, 2019
Journal
Volume
52
Pages
268 - 280