Compositional Temporal Visual Grounding of Natural Language Event Descriptions.
Abstract
Temporal grounding entails establishing a correspondence between natural language event descriptions and their visual depictions. Compositional modeling becomes central: we first ground atomic descriptions ("girl eating an apple," "batter hitting the ball") to short video segments, and then establish the temporal relationships between the segments. This compositional structure enables models to recognize a wider variety of events not seen during...
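The two-stage idea in the abstract (ground each atomic description to a short segment, then relate the segments in time) can be illustrated with a toy sketch. This is not the paper's model: the names `Segment`, `ground_atomic`, `before`, and `ground_composite`, the score table, and the second description "runner reaching first base" are all hypothetical, and the scores stand in for whatever a learned grounding model would produce.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """A short video segment, in seconds."""
    start: float
    end: float


# Hypothetical grounding scores: description -> [(candidate segment, score)].
# A real system would compute these from video and text features.
detections = {
    "batter hitting the ball": [(Segment(2.0, 4.0), 0.9), (Segment(7.0, 9.0), 0.3)],
    "runner reaching first base": [(Segment(6.0, 8.5), 0.8), (Segment(0.0, 1.0), 0.1)],
}


def ground_atomic(description, table):
    """Stage 1: pick the highest-scoring segment for one atomic description."""
    best_segment, _ = max(table[description], key=lambda pair: pair[1])
    return best_segment


def before(a, b):
    """Temporal relation: segment a ends no later than segment b starts."""
    return a.end <= b.start


def ground_composite(first, second, table):
    """Stage 2: ground 'first, then second' by composing atomic groundings.

    Returns the span covering both segments if the 'before' relation holds,
    otherwise None.
    """
    seg_a = ground_atomic(first, table)
    seg_b = ground_atomic(second, table)
    if before(seg_a, seg_b):
        return Segment(seg_a.start, seg_b.end)
    return None


result = ground_composite(
    "batter hitting the ball", "runner reaching first base", detections
)
```

Because the composite event is assembled from atomic groundings, a pairing that never appeared as a whole can still be localized as long as each atomic description can be grounded on its own.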
Paper Details
Title
Compositional Temporal Visual Grounding of Natural Language Event Descriptions.
Published Date
Dec 4, 2019