
Authors – Bo Xu, Junzhe Zheng, Jiayuan He, Yuxuan Sun, Hongfei Lin, Liang Zhao, Feng Xia
Keywords – meme understanding, metaphor, multimodal
Summary – Understanding memes is challenging because they often convey metaphorical information that requires deep interpretation. Previous studies incorporate human-annotated metaphors as textual features in machine learning models but largely ignore the link between a metaphor and its corresponding visual elements. This paper proposes MMMC (Multimodal Metaphorical feature for Meme Classification), which jointly models textual and visual metaphorical features for better meme understanding. Using a text-conditioned generative adversarial network (GAN), MMMC generates visual features from the linguistic cues of metaphorical concepts and integrates them with the textual features for classification. Experiments on the MET-Meme dataset show that MMMC significantly outperforms existing methods on emotion classification and intention detection.
ACM MM 2024, Multimodal Reasoning & Inference
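The core pipeline described in the summary, generating a visual feature conditioned on a metaphor's text embedding and fusing it with the text feature for classification, can be sketched as below. This is a minimal illustrative sketch, not the authors' implementation: the GAN generator is reduced to a single nonlinear map, and all dimensions, weights, and names (`generate_visual`, `classify`, `W_gen`, `W_cls`) are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (not taken from the paper).
TEXT_DIM, VIS_DIM, NUM_CLASSES = 16, 8, 4

# Stand-in "generator": maps a metaphor's text embedding to a
# synthetic visual feature, playing the role of the text-conditioned GAN.
W_gen = rng.normal(size=(TEXT_DIM, VIS_DIM))

def generate_visual(text_emb):
    """Synthesize a visual feature conditioned on the text embedding."""
    return np.tanh(text_emb @ W_gen)

# Fusion classifier: concatenate textual and generated visual features,
# then apply a linear layer to produce class logits
# (e.g. emotion categories or intention labels).
W_cls = rng.normal(size=(TEXT_DIM + VIS_DIM, NUM_CLASSES))

def classify(text_emb):
    fused = np.concatenate([text_emb, generate_visual(text_emb)])
    logits = fused @ W_cls
    return int(np.argmax(logits))

text_emb = rng.normal(size=TEXT_DIM)  # stand-in metaphor embedding
label = classify(text_emb)            # predicted class index in 0..NUM_CLASSES-1
```

In the paper, the generator and classifier would be trained jointly (the GAN with its discriminator, the classifier with supervised labels); the sketch only shows the inference-time data flow from linguistic cues to fused multimodal prediction.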