Enhance Multimodal Transformer With External Label And In-domain Pretrain: Hateful Meme Challenge Winning Solution

Ron Zhu · arXiv 2020 · 46 citations

Tags: Image Text Integration, Model Architecture, Visual Contextualization

Hateful meme detection is a recently introduced research area that requires both visual and linguistic understanding of the meme, as well as background knowledge, to perform well on the task. This technical report summarises the first-place solution of the Hateful Meme Detection Challenge 2020, which extends state-of-the-art visual-linguistic transformers to tackle this problem. At the end of the report, we also point out shortcomings of the current methodology and possible directions for improvement.
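The core idea of combining visual and linguistic signals for a binary hateful/not-hateful decision can be illustrated with a minimal late-fusion sketch. This is not the report's actual model (which extends visual-linguistic transformers); the feature dimensions, weight initialisation, and the simple concatenate-then-project head are all illustrative assumptions:

```python
import numpy as np

# Hypothetical sketch: late fusion of precomputed image and text features
# for binary hateful-meme classification. The 2048-d image features and
# 768-d text features are assumed sizes, not taken from the paper.
rng = np.random.default_rng(0)

img_feat = rng.normal(size=(4, 2048))   # batch of 4 image feature vectors
txt_feat = rng.normal(size=(4, 768))    # matching text feature vectors

# Fuse by concatenation, then project to a single logit per meme.
fused = np.concatenate([img_feat, txt_feat], axis=-1)  # shape (4, 2816)
W = rng.normal(size=(2048 + 768, 1)) * 0.01            # toy linear head
logits = fused @ W
probs = 1.0 / (1.0 + np.exp(-logits))                  # P(hateful) per meme
```

A full solution would replace the toy linear head with a jointly pretrained visual-linguistic transformer, which is where the in-domain pretraining and external labels discussed in the report come in.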

Similar Work