This paper is available on arXiv under a CC 4.0 license.
Authors:
(1) Shreyash Mishra, IIIT Sri City, India (equal contribution);
(2) S Suryavardan, IIIT Sri City, India (equal contribution);
(3) Megha Chakraborty, University of South Carolina, USA;
(4) Parth Patwa, UCLA, USA;
(5) Anku Rani, University of South Carolina, USA;
(6) Aman Chadha, Stanford University, USA, and Amazon AI, USA (work does not relate to position at Amazon);
(7) Aishwarya Reganti, CMU, USA;
(8) Amitava Das, University of South Carolina, USA;
(9) Amit Sheth, University of South Carolina, USA;
(10) Manoj Chinnakotla, Microsoft, USA;
(11) Asif Ekbal, IIT Patna, India;
(12) Srijan Kumar, Georgia Tech, USA.
Conclusion and Future Work and References
In this paper, we summarize the approaches used by the participants in the Memotion 3 task and analyze the results. Due to the multi-modal nature of the dataset, all teams use pre-trained image and text embedding models. However, each team presents a novel model pipeline. The highest scores achieved in Task A, Task B and Task C of Memotion 3.0 are 34.41%, 79.77% and 59.82% respectively, which shows that there is significant room for improvement. On analyzing the results and the misclassified examples in the test set, we find that "Sarcasm" and "Humour" are difficult to identify, especially in code-mixed memes.
While we address Hindi-English code-mixed memes in this paper, future work could explore other languages and language pairs. A unified baseline model for analyzing memes in multiple languages is another interesting possibility.
[1] R. Dawkins, The selfish gene, Granada Publishing Lim, 1979.
[2] A. Marwick, Memes, Contexts 12 (2013) 12–13.
[3] N. Akhther, Internet memes as form of cultural discourse: A rhetorical analysis on facebook, 2018. doi:10.31234/osf.io/sx6t7.
[4] S. Suryawanshi, B. R. Chakravarthi, M. Arcan, P. Buitelaar, Multimodal meme dataset (MultiOFF) for identifying offensive content in image and text, in: TRAC, 2020.
[5] S. Mishra, S. Suryavardan, P. Patwa, M. Chakraborty, A. Rani, A. Reganti, A. Chadha, A. Das, A. Sheth, M. Chinnakotla, et al., Memotion 3: Dataset on sentiment and emotion analysis of codemixed hindi-english memes, arXiv preprint arXiv:2303.09892 (2023).
[6] C. Sharma, D. Bhageria, W. Scott, S. PYKL, A. Das, T. Chakraborty, et al., SemEval-2020 task 8: Memotion analysis- the visuo-lingual metaphor!, in: SemEval, 2020.
[7] P. Patwa, S. Ramamoorthy, N. Gunti, S. Mishra, S. Suryavardan, A. Reganti, A. Das, T. Chakraborty, A. Sheth, A. Ekbal, et al., Findings of memotion 2: Sentiment and emotion analysis of memes, in: Proceedings of De-Factify: Workshop on Multimodal Fact Checking and Hate Speech Detection, CEUR, 2022.
[8] R. Socher, A. Perelygin, J. Wu, J. Chuang, C. D. Manning, A. Ng, C. Potts, Recursive deep models for semantic compositionality over a sentiment treebank, in: EMNLP, 2013.
[9] A. I. Saad, Opinion mining on us airline twitter data using machine learning techniques, in: 2020 16th international computer engineering conference (ICENCO), IEEE, 2020.
[10] M. Alzyout, E. A. Bashabsheh, H. Najadat, A. Alaiad, Sentiment analysis of arabic tweets about violence against women using machine learning, in: 12th ICICS, 2021.
[11] E. Prabhakar, M. Santhosh, A. H. Krishnan, T. Kumar, R. Sudhakar, Sentiment analysis of us airline twitter data using new adaboost approach, (IJERT) 7 (2019).
[12] S. T. Kokab, S. Asghar, S. Naz, Transformer-based deep learning models for the sentiment analysis of social media data, Array 14 (2022) 100157.
[13] S. G. Tesfagergish, J. Kapočiūtė-Dzikienė, R. Damaševičius, Zero-shot emotion detection for semi-supervised sentiment analysis using sentence transformers and ensemble learning, Applied Sciences (2022).
[14] K. L. Tan, C. P. Lee, K. M. Lim, K. S. M. Anbananthen, Sentiment analysis with ensemble hybrid deep learning model, IEEE Access 10 (2022) 103694–103704.
[15] L. Yue, W. Chen, X. Li, W. Zuo, M. Yin, A survey of sentiment analysis in social media, Knowledge and Information Systems 60 (2019).
[16] K. Chakraborty, S. Bhattacharyya, R. Bag, A survey of sentiment analysis from social media data, IEEE Transactions on Computational Social Systems 7 (2020) 450–464.
[17] L. Chiruzzo, S. Castro, M. Etcheverry, D. Garat, J. J. Prada, A. Rosá, Overview of haha at iberlef 2019: Humor analysis based on human annotation, in: IberLEF@SEPLN, 2019.
[18] E. Öhman, M. Pàmies, K. Kajava, J. Tiedemann, Xed: A multilingual dataset for sentiment analysis and emotion detection, 2020. arXiv:2011.01612.
[19] F. A. Acheampong, C. Wenyu, H. Nunoo-Mensah, Text-based emotion detection: Advances, challenges, and opportunities, Engineering Reports (2020).
[20] Z. Waseem, D. Hovy, Hateful symbols or hateful people? predictive features for hate speech detection on Twitter, in: NAACL, 2016.
[21] M. Zampieri, S. Malmasi, P. Nakov, S. Rosenthal, et al., Semeval-2019 task 6: Identifying and categorizing offensive language in social media (offenseval), arXiv:1903.08983 (2019).
[22] P. Patwa, M. Bhardwaj, V. Guptha, G. Kumari, S. Sharma, S. PYKL, A. Das, A. Ekbal, M. S. Akhtar, T. Chakraborty, Overview of constraint 2021 shared tasks: Detecting english covid-19 fake news and hindi hostile posts, in: Combating Online Hostile Posts in Regional Languages during Emergency Situation, Springer International Publishing, 2021, pp. 42–53.
[23] R. Kumar, A. K. Ojha, S. Malmasi, M. Zampieri, Benchmarking aggression identification in social media, in: TRAC workshop, 2018.
[24] R. Kumar, A. K. Ojha, S. Malmasi, M. Zampieri, Evaluating aggression identification in social media, in: TRAC workshop, 2020.
[25] R. Kumar, S. Ratan, S. Singh, E. Nandi, L. N. Devi, et al., The ComMA dataset v0.2: Annotating aggression and bias in multilingual social media discourse, in: LREC, 2022.
[26] B. Gambäck, U. K. Sikdar, Using convolutional neural networks to classify hate-speech, in: Proceedings of the first workshop on abusive language online, 2017, pp. 85–90.
[27] A. Ribeiro, N. Silva, Inf-hateval at semeval-2019 task 5: Convolutional neural networks for hate speech detection against women and immigrants on twitter, in: SemEval, 2019.
[28] K. Winter, R. Kern, Know-center at semeval-2019 task 5: multilingual hate speech detection on twitter using cnns, in: SemEval, 2019.
[29] P. Patwa, S. Pykl, A. Das, P. Mukherjee, V. Pulabaigari, Hater-O-genius aggression classification using capsule networks, in: Proceedings of the 17th International Conference on Natural Language Processing (ICON), 2020.
[30] A. C. Mazari, N. Boudoukhani, A. Djeffal, Bert-based ensemble learning for multi-aspect hate speech detection, Cluster Computing (2023) 1–15.
[31] N. S. Samghabadi, P. Patwa, S. Pykl, P. Mukherjee, A. Das, T. Solorio, Aggression and misogyny detection using bert: A multi-task approach, in: Proceedings of the second workshop on trolling, aggression and cyberbullying, 2020.
[32] J. Risch, R. Krestel, Bagging BERT models for robust aggression identification, in: Proceedings of the Second Workshop on Trolling, Aggression and Cyberbullying, 2020.
[33] S. Nagar, F. A. Barbhuiya, K. Dey, Towards more robust hate speech detection: using social context and user data, Social Network Analysis and Mining 13 (2023) 47.
[34] A. Laddha, M. Hanoosh, D. Mukherjee, P. Patwa, A. Narang, Understanding chat messages for sticker recommendation in messaging apps, Proceedings of the AAAI Conference on Artificial Intelligence 34 (2020). doi:10.1609/aaai.v34i08.7019.
[35] A. Laddha, M. Hanoosh, D. Mukherjee, P. Patwa, A. Narang, Large scale multilingual sticker recommendation in messaging apps, AI Magazine 42 (2022). doi:10.1609/aaai.12023.
[36] P. Patwa, G. Aguilar, S. Kar, S. Pandey, S. PYKL, B. Gambäck, T. Chakraborty, T. Solorio, A. Das, SemEval-2020 task 9: Overview of sentiment analysis of code-mixed tweets, in: Proceedings of the Fourteenth Workshop on Semantic Evaluation, 2020.
[37] B. R. Chakravarthi, et al., Findings of the shared task on offensive language identification in Tamil, Malayalam, and Kannada, in: Proceedings of the First Workshop on Speech and Language Technologies for Dravidian Languages, 2021.
[38] B. R. Chakravarthi, V. Muralidaran, R. Priyadharshini, J. P. McCrae, Corpus creation for sentiment analysis in code-mixed tamil-english text, arXiv:2006.00206 (2020).
[39] B. R. Chakravarthi, N. Jose, S. Suryawanshi, E. Sherly, J. P. McCrae, A sentiment analysis dataset for code-mixed malayalam-english, arXiv preprint arXiv:2006.00210 (2020).
[40] A. Hande, R. Priyadharshini, B. R. Chakravarthi, KanCMD: Kannada CodeMixed dataset for sentiment analysis and offensive language detection, in: Workshop on Computational Modeling of People’s Opinions, Personality, and Emotion’s in Social Media, 2020.
[41] S. Dowlagar, R. Mamidi, Graph convolutional networks with multi-headed attention for code-mixed sentiment analysis, in: Proceedings of the First Workshop on Speech and Language Technologies for Dravidian Languages, 2021, pp. 65–72.
[42] J. Risch, A. Stoll, M. Ziegele, R. Krestel, hpidedis at germeval 2019: Offensive language identification using a german bert model, in: KONVENS, 2019.
[43] D. Tula, P. Potluri, S. Ms, S. Doddapaneni, P. Sahu, R. Sukumaran, P. Patwa, Bitions@DravidianLangTech-EACL2021: Ensemble of multilingual language models with pseudo labeling for offence detection in Dravidian languages, in: Proceedings of the First Workshop on Speech and Language Technologies for Dravidian Languages, 2021.
[44] D. Tula, M. S. Shreyas, V. Reddy, P. Sahu, S. Doddapaneni, P. Potluri, R. Sukumaran, P. Patwa, Offence detection in dravidian languages using code-mixing index-based focal loss, SN Computer Science 3 (2022). doi:10.1007/s42979-022-01190-1.
[45] Y. Ma, L. Zhao, J. Hao, XLP at SemEval-2020 task 9: Cross-lingual models with focal loss for sentiment analysis of code-mixing language, in: SemEval, 2020.
[46] M. Ali, S. T. Kandukuri, S. Manduru, P. Patwa, A. Das, Pesto: Switching point based dynamic and relative positional encoding for code-mixed languages (student abstract), AAAI (2022). doi:10.1609/aaai.v36i11.21587.
[47] A. Hu, S. Flaxman, Multimodal sentiment analysis to explore the structure of emotions, in: KDD, 2018.
[48] R. Jha, V. Kaki, V. Kolla, S. Bhagat, P. Patwa, A. Das, S. Pal, Image2tweet: Datasets in Hindi and English for generating tweets from images, in: Proceedings of the 18th International Conference on Natural Language Processing (ICON), 2021.
[49] R. Gomez, J. Gibert, L. Gomez, D. Karatzas, Exploring hate speech detection in multimodal publications, 2019. arXiv:1910.03814.
[50] B. Zadeh, et al., Multimodal language analysis in the wild: CMU-MOSEI dataset and interpretable dynamic fusion graph, in: ACL, 2018.
[51] D. Kiela, H. Firooz, A. Mohan, V. Goswami, A. Singh, P. Ringshia, D. Testuggine, The hateful memes challenge: Detecting hate speech in multimodal memes, Neurips (2020).
[52] S. Suryawanshi, B. R. Chakravarthi, Findings of the shared task on troll meme classification in Tamil, in: Speech and Language Technologies for Dravidian Languages, 2021.
[53] E. Hossain, O. Sharif, M. M. Hoque, MUTE: A multimodal dataset for detecting hateful memes, in: Proceedings of the 2nd AACL Student Research Workshop, 2022.
[54] Z. Xie, L. Liu, Y. Wu, L. Zhong, L. Li, Learning text-image joint embedding for efficient cross-modal retrieval with deep feature engineering, ACM Transactions on Information Systems 40 (2021) 1–27. doi:10.1145/3490519.
[55] V. Krishna, S. Suryavardan, S. Mishra, S. Ramamoorthy, P. Patwa, M. Chakraborty, A. Chadha, A. Das, A. Sheth, Imaginator: Pre-trained image+text joint embeddings using word-level grounding of images, 2023. arXiv:2305.10438.
[56] N. Gunti, S. Ramamoorthy, P. Patwa, A. Das, Memotion analysis through the lens of joint embedding (student abstract), Proceedings of the AAAI Conference on Artificial Intelligence 36 (2022). doi:10.1609/aaai.v36i11.21616.
[57] J. Lu, D. Batra, D. Parikh, S. Lee, Vilbert: Pretraining task-agnostic visiolinguistic representations for vision-and-language tasks, 2019. arXiv:1908.02265.
[58] L. H. Li, M. Yatskar, D. Yin, C.-J. Hsieh, K.-W. Chang, Visualbert: A simple and performant baseline for vision and language, 2019. arXiv:1908.03557.
[59] S. Ramamoorthy, N. Gunti, S. Mishra, S. Suryavardan, A. Reganti, P. Patwa, A. Das, T. Chakraborty, A. Sheth, A. Ekbal, et al., Memotion 2: Dataset on sentiment and emotion analysis of memes, in: Proceedings of De-Factify: Workshop on Multimodal Fact Checking and Hate Speech Detection, 2022.
[60] K. N. Phan, G.-S. Lee, H.-J. Yang, S.-H. Kim, Little flower at memotion 2.0 2022: Ensemble of multi-modal model using attention mechanism in memotion analysis, in: Proceedings of De-Factify: Workshop on Multimodal Fact Checking and Hate Speech Detection, 2022.
[61] T. Morishita, G. Morio, S. Horiguchi, H. Ozaki, T. Miyoshi, Hitachi at SemEval-2020 task 8: Simple but effective modality ensemble for meme emotion recognition, in: SemEval, 2020.
[62] Y. Guo, J. Huang, Y. Dong, M. Xu, Guoym at SemEval-2020 task 8: Ensemble-based classification of visuo-lingual metaphor in memes, in: SemEval, 2020.
[63] G.-A. Vlad, G.-E. Zaharia, D.-C. Cercel, et al., Upb at semeval-2020 task 8: Joint textual and visual modeling in a multi-task learning architecture for memotion analysis, 2020. arXiv:2009.02779.
[64] T. T. Nguyen, N. T. Pham, N. D. Nguyen, et al., Hcilab at memotion 2.0 2022: Analysis of sentiment, emotion and intensity of emotion classes from meme images using single and multi modalities, in: Proceedings of De-Factify: Workshop on Multimodal Fact Checking and Hate Speech Detection, 2022.
[65] G. G. Lee, M. Shen, Amazon pars at memotion 2.0 2022: Multi-modal multi-task learning for memotion 2.0 challenge, in: Proceedings of De-Factify: Workshop on Multimodal Fact Checking and Hate Speech Detection, 2022.
[66] M. Bhange, N. Kasliwal, Hinglishnlp: Fine-tuned language models for hinglish sentiment detection, arXiv preprint arXiv:2008.09820 (2020).
[67] A. Dosovitskiy, L. Beyer, A. Kolesnikov, et al., An image is worth 16x16 words: Transformers for image recognition at scale, arXiv:2010.11929 (2020).
[68] W. Yu, D. Kolossa, wentaorub at Memotion 3: Ensemble learning for multi-modal meme classification, in: Proceedings of De-Factify 2: Workshop on Multimodal Fact Checking and Hate Speech Detection, CEUR, 2023.
[69] A. Radford, J. W. Kim, C. Hallacy, A. Ramesh, et al., Learning transferable visual models from natural language supervision, in: ICML, 2021.
[70] X. Li, X. Yin, C. Li, P. Zhang, et al., Oscar: Object-semantics aligned pre-training for vision-language tasks, 2020. arXiv:2004.06165.
[71] Y.-C. Tang, K.-D. Wang, T.-Y. Ou, W.-C. Peng, NYCU_TWO at Memotion 3: Good foundation, good teacher, then you have good meme analysis, in: Proceedings of De-Factify 2: Workshop on Multimodal Fact Checking and Hate Speech Detection, CEUR, 2023.
[72] Z. Liu, Y. Lin, Y. Cao, H. Hu, Y. Wei, Z. Zhang, S. Lin, B. Guo, Swin transformer: Hierarchical vision transformer using shifted windows, 2021. arXiv:2103.14030.
[73] X. Guo, J. Ma, A. Zubiaga, NUAA-QMUL-AIIT at Memotion 3: Multi-modal fusion with squeeze-and-excitation for internet meme emotion analysis, in: Proceedings of De-Factify 2: Workshop on Multimodal Fact Checking and Hate Speech Detection, CEUR, 2023.
[74] Y. Liu, M. Ott, N. Goyal, J. Du, M. Joshi, D. Chen, O. Levy, M. Lewis, L. Zettlemoyer, V. Stoyanov, Roberta: A robustly optimized bert pretraining approach (2019).
[75] X. Zhai, J. Puigcerver, A. Kolesnikov, et al., A large-scale study of representation learning with the visual task adaptation benchmark, 2020. arXiv:1910.04867.
[76] G. Ke, Q. Meng, T. Finley, et al., Lightgbm: A highly efficient gradient boosting decision tree, in: Neurips, 2017.
[77] K. He, X. Zhang, S. Ren, J. Sun, Deep residual learning for image recognition, in: CVPR, 2016.
[78] F. Schroff, D. Kalenichenko, J. Philbin, FaceNet: A unified embedding for face recognition and clustering, in: CVPR, 2015.