Connective Cognition Network for Directional Visual Commonsense Reasoning
Aming Wu · Linchao Zhu · Yahong Han · Yi Yang

Wed Dec 11 05:00 PM -- 07:00 PM (PST) @ East Exhibition Hall B + C #113
Visual commonsense reasoning (VCR) has been introduced to boost research on cognition-level visual understanding, i.e., a thorough understanding of correlated details of the scene plus an inference with related commonsense knowledge. Recent studies in neuroscience have suggested that brain function or cognition can be described as a global and dynamic integration of local neuronal connectivity, which is context-sensitive to specific cognition tasks. Inspired by this idea, we propose a connective cognition network (CCN) for VCR that dynamically reorganizes visual neuron connectivity, contextualized by the meaning of questions and answers. Concretely, we first develop visual neuron connectivity to fully model correlations of visual content. Then, a contextualization process is introduced to fuse the sentence representation with that of visual neurons. Finally, based on the output of contextualized connectivity, we propose directional connectivity to infer answers or rationales. Experimental results on the VCR dataset demonstrate the effectiveness of our method. In particular, in $Q \to AR$ mode, our method is around 4% higher than the state-of-the-art method.
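For readers unfamiliar with the VCR evaluation modes mentioned in the abstract: Q → A scores answer selection, QA → R scores rationale selection given the correct answer, and Q → AR counts an example as correct only when both the answer and the rationale are chosen correctly. The following is a minimal sketch of that scoring, not the authors' evaluation code. The function name `vcr_accuracies` and the toy predictions are hypothetical and chosen purely for illustration.

```python
def vcr_accuracies(answer_pred, answer_gold, rationale_pred, rationale_gold):
    """Compute accuracy for the three VCR modes from per-example choice indices.

    All four arguments are equal-length sequences of integer choice indices.
    This helper is illustrative only; it is not taken from the paper.
    """
    n = len(answer_gold)
    if n == 0 or not (len(answer_pred) == len(rationale_pred) == len(rationale_gold) == n):
        raise ValueError("inputs must be non-empty and of equal length")

    # Per-example correctness for each sub-task.
    answer_ok = [p == g for p, g in zip(answer_pred, answer_gold)]
    rationale_ok = [p == g for p, g in zip(rationale_pred, rationale_gold)]

    # Q -> AR is correct only when both sub-tasks are correct for the same example.
    joint_ok = [a and r for a, r in zip(answer_ok, rationale_ok)]

    return {
        "Q->A": sum(answer_ok) / n,
        "QA->R": sum(rationale_ok) / n,
        "Q->AR": sum(joint_ok) / n,
    }


# Toy, made-up predictions for four examples (not results from the paper).
scores = vcr_accuracies(
    answer_pred=[0, 1, 2, 3],
    answer_gold=[0, 1, 2, 0],
    rationale_pred=[1, 1, 0, 2],
    rationale_gold=[1, 0, 0, 2],
)
print(scores)
```

Because the joint mode requires both choices to be right on the same example, its accuracy can never exceed either sub-task accuracy, which is why gains in Q → AR mode are typically the hardest to obtain.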

Author Information

Aming Wu (Tianjin University)
Linchao Zhu (University of Technology Sydney)
Yahong Han (Tianjin University, China)
Yi Yang (UTS)

More from the Same Authors