
Understanding Counterspeech for Online Harm Mitigation: Conclusion, Acknowledgements, and References


Authors:

(1) Yi-Ling Chung, The Alan Turing Institute ([email protected]);

(2) Gavin Abercrombie, The Interaction Lab, Heriot-Watt University ([email protected]);

(3) Florence Enock, The Alan Turing Institute ([email protected]);

(4) Jonathan Bright, The Alan Turing Institute ([email protected]);

(5) Verena Rieser, The Interaction Lab, Heriot-Watt University and now at Google DeepMind ([email protected]).

Table of Links

Abstract and 1 Introduction

2 Background

3 Review Methodology

4 Defining counterspeech

4.1 Classifying counterspeech

5 The Impact of Counterspeech

6 Computational Approaches to Counterspeech and 6.1 Counterspeech Datasets

6.2 Approaches to Counterspeech Detection and 6.3 Approaches to Counterspeech Generation

7 Future Perspectives

8 Conclusion, Acknowledgements, and References

8 Conclusion

Online hate speech is a pressing global issue, prompting scientists and practitioners to examine potential solutions. Counterspeech, content that directly rebuts hateful content, is one promising avenue. While AI researchers are already beginning to explore opportunities to automate the generation of counterspeech to mitigate hate at scale, research from the social sciences points to many nuances concerning the impact of counterspeech that need to be considered before such an intervention is deployed. Taking an interdisciplinary approach, we have attempted to synthesize the growing body of work in the field. Our analysis of the extant literature suggests that findings on the efficacy of counterspeech depend heavily on several factors, including methodological ones such as study design and outcome measures, and features of the counterspeech itself such as the speaker, the target of hate, and the strategy employed. While some work finds counterspeech effective in reducing further hate from perpetrators and increasing feelings of empowerment among bystanders and targets, other work finds that counterspeech can backfire and encourage more hate. To understand the advantages and disadvantages of counterspeech more deeply, we suggest that empirical research focus on testing counterspeech interventions that are scalable, durable, reliable, and specific in real-world settings. Researchers should also agree on the key outcome variables of interest in order to understand the optimal social conditions for producing counterspeech at scale by automating its generation. We hope that this review helps make sense of the variety of types of counterspeech studied to date and prompts future collaborations between social and computer scientists working to ameliorate the negative effects of online hate.
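To make the automation direction concrete, the following is a minimal sketch of counterspeech generation with an off-the-shelf pretrained language model via the Hugging Face transformers library. It is illustrative only: the "gpt2" checkpoint and the prompt format are placeholders rather than the method of any paper cited in this review, and a real system would be fine-tuned on curated hate/counterspeech pairs (such as the CONAN-style datasets discussed in Section 6) and paired with human oversight before deployment.

```python
# Illustrative sketch only: generate a candidate counterspeech response with a
# generic pretrained causal language model. The checkpoint ("gpt2") and the
# prompt template are placeholders, not the approach of any cited paper.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"  # placeholder; in practice, a model fine-tuned on hate/counterspeech pairs

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

def generate_counterspeech(hate_message: str, max_new_tokens: int = 60) -> str:
    """Return one candidate counterspeech response to a hateful message."""
    # Simple instruction-style prompt; real systems condition on curated
    # training pairs and often add knowledge grounding and safety filters.
    prompt = f"Hate speech: {hate_message}\nRespectful, fact-based counterspeech:"
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,                        # sampling yields varied responses
        top_p=0.9,
        pad_token_id=tokenizer.eos_token_id,   # GPT-2 has no pad token by default
    )
    # Drop the prompt tokens and return only the generated continuation.
    continuation = outputs[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(continuation, skip_special_tokens=True).strip()

if __name__ == "__main__":
    print(generate_counterspeech("Immigrants are ruining this country."))
```

Even with a stronger model, outputs of this kind would still require the human review and outcome evaluation discussed above before being used as an intervention.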

Acknowledgements

We thank Bertie Vidgen for valuable feedback on the initial structure of this manuscript and Hannah Rose Kirk for her help with the collection of target literature.

References

Adak, S., Chakraborty, S., Das, P., Das, M., Dash, A., Hazra, R., Mathew, B., Saha, P., Sarkar, S., and Mukherjee, A. (2022). Mining the online infosphere: A survey. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 12(5):e1453.


Alsagheer, D., Mansourifar, H., and Shi, W. (2022). Counter hate speech in social media: A survey. arXiv preprint arXiv:2203.03584.


Ardia, D. S. (2009). Free speech savior or shield for scoundrels: An empirical study of intermediary immunity under Section 230 of the Communications Decency Act. Loyola of Los Angeles Law Review, 43:373.


Ashida, M. and Komachi, M. (2022). Towards automatic generation of messages countering online hate speech and microaggressions. In Proceedings of the Sixth Workshop on Online Abuse and Harms (WOAH).


Bartlett, J. and Krasodomski-Jones, A. (2015). Counter-speech: Examining content that challenges extremism online. DEMOS, October.


Belz, A., Thomson, C., and Reiter, E. (2023). Missing information, unresponsive authors, experimental flaws: The impossibility of assessing the reproducibility of previous human evaluations in NLP. In The Fourth Workshop on Insights from Negative Results in NLP, pages 1–10, Dubrovnik, Croatia. Association for Computational Linguistics.


Benesch, S. (2014a). Countering dangerous speech: New ideas for genocide prevention. Washington, DC: US Holocaust Memorial Museum.


Benesch, S. (2014b). Defining and diminishing hate speech. State of the world’s minorities and indigenous peoples, 2014:18–25.


Benesch, S., Ruths, D., Dillon, K. P., Saleem, H. M., and Wright, L. (2016). Counterspeech on Twitter: A field study. Dangerous Speech Project.


Berend, G. (2022). Combating the curse of multilinguality in cross-lingual WSD by aligning sparse contextualized word representations. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 2459–2471, Seattle, United States. Association for Computational Linguistics.


Bertoldi, N., Cettolo, M., and Federico, M. (2013). Cache-based online adaptation for machine translation enhanced computer assisted translation. In MT-Summit, pages 35–42.


Bilewicz, M., Tempska, P., Leliwa, G., Dowgiałło, M., Tańska, M., Urbaniak, R., and Wroczyński, M. (2021). Artificial intelligence against hate: Intervention reducing verbal aggression in the social network environment. Aggressive Behavior, 47(3):260–266.


Birhane, A., Isaac, W., Prabhakaran, V., Diaz, M., Elish, M. C., Gabriel, I., and Mohamed, S. (2022). Power to the people? Opportunities and challenges for participatory AI. In Equity and Access in Algorithms, Mechanisms, and Optimization, EAAMO ’22, New York, NY, USA. Association for Computing Machinery.


Blodgett, S. L., Barocas, S., Daumé III, H., and Wallach, H. (2020). Language (technology) is power: A critical survey of “bias” in NLP. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 5454–5476, Online. Association for Computational Linguistics.


Bonaldi, H., Dellantonio, S., Tekiroğlu, S. S., and Guerini, M. (2022). Human-machine collaboration approaches to build a dialogue dataset for hate speech countering. arXiv preprint arXiv:2211.03433.


Buerger, C. (2021a). Counterspeech: A literature review. Available at SSRN 4066882.


Buerger, C. (2021b). #iamhere: Collective counterspeech and the quest to improve online discourse. Social Media + Society, 7(4):20563051211063843.


Buerger, C. (2022). Why they do it: Counterspeech theories of change. Available at SSRN 4245211.


Bélanger, J. J., Nisa, C. F., Schumpe, B. M., Gurmu, T., Williams, M. J., and Putra, I. E. (2020). Do counter-narratives reduce support for ISIS? Yes, but not for their target audience. Frontiers in Psychology, 11.


Carthy, S. L., Doody, C. B., Cox, K., O’Hora, D., and Sarma, K. M. (2020). Counter-narratives for the prevention of violent radicalisation: A systematic review of targeted interventions. Campbell Systematic Reviews, 16(3):e1106.


Carthy, S. L. and Sarma, K. M. (2021). Countering terrorist narratives: Assessing the efficacy and mechanisms of change in counter-narrative strategies. Terrorism and Political Violence, 0(0):1–25.


Cettolo, M., Bertoldi, N., and Federico, M. (2014). The repetition rate of text as a predictor of the effectiveness of machine translation adaptation. In Proceedings of the 11th Biennial Conference of the Association for Machine Translation in the Americas (AMTA 2014), pages 166–179.


Chakravarthi, B. R. (2022). Multilingual hope speech detection in English and Dravidian languages. International Journal of Data Science and Analytics, 14(4):389–406.


Chaudhary, M., Saxena, C., and Meng, H. (2021). Countering online hate speech: An NLP perspective. arXiv preprint arXiv:2109.02941.


Chung, Y.-L., Guerini, M., and Agerri, R. (2021a). Multilingual counter narrative type classification. In Proceedings of the 8th Workshop on Argument Mining, pages 125–132, Punta Cana, Dominican Republic. Association for Computational Linguistics.


Chung, Y.-L., Kuzmenko, E., Tekiroğlu, S. S., and Guerini, M. (2019). CONAN - COunter NArratives through nichesourcing: a multilingual dataset of responses to fight online hate speech. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 2819–2829, Florence, Italy. Association for Computational Linguistics.


Chung, Y.-L., Tekiroğlu, S. S., and Guerini, M. (2020). Italian counter narrative generation to fight online hate speech. In Proceedings of the Seventh Italian Conference on Computational Linguistics, Online.


Chung, Y.-L., Tekiroğlu, S. S., and Guerini, M. (2021b). Towards knowledge-grounded counter narrative generation for hate speech. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, pages 899–914, Online. Association for Computational Linguistics.


Chung, Y.-L., Tekiroğlu, S. S., Tonelli, S., and Guerini, M. (2021c). Empowering NGOs in countering online hate messages. Online Social Networks and Media, 24:100150.


Citron, D. K. and Norton, H. (2011). Intermediaries and hate speech: Fostering digital citizenship for our information age. BUL Rev., 91:1435.


Open Science Collaboration (2015). Estimating the reproducibility of psychological science. Science, 349(6251):aac4716.


European Commission (2017). Communication from the Commission to the European Parliament, the Council, the European Economic and Social Committee and the Committee of the Regions.


Conneau, A., Khandelwal, K., Goyal, N., Chaudhary, V., Wenzek, G., Guzmán, F., Grave, E., Ott, M., Zettlemoyer, L., and Stoyanov, V. (2020). Unsupervised cross-lingual representation learning at scale. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 8440–8451, Online. Association for Computational Linguistics.


de la Cueva, J. and Méndez, E. (2022). Open science and intellectual property rights. How can they better interact? State of the art and reflections. Report of study. European Commission.


Dennis, S., Garrett, P., Yim, H., Hamm, J., Osth, A. F., Sreekumar, V., and Stone, B. (2019). Privacy versus open science. Behavior Research Methods, 51:1839–1848.


Derksen, M. and Morawski, J. (2022). Kinds of replication: Examining the meanings of “conceptual replication” and “direct replication”. Perspectives on Psychological Science, 17(5):1490–1505. PMID: 35245130.


Dinan, E., Abercrombie, G., Bergman, A., Spruit, S., Hovy, D., Boureau, Y.-L., and Rieser, V. (2022). SafetyKit: First aid for measuring safety in open-domain conversational systems. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 4113–4133, Dublin, Ireland. Association for Computational Linguistics.


Dušek, O. and Kasner, Z. (2020). Evaluating semantic accuracy of data-to-text generation with natural language inference. In Proceedings of the 13th International Conference on Natural Language Generation, pages 131–137, Dublin, Ireland. Association for Computational Linguistics.


Ernst, J., Schmitt, J. B., Rieger, D., Beier, A. K., Vorderer, P., Bente, G., and Roth, H.-J. (2017). Hate beneath the counter speech? A qualitative content analysis of user comments on YouTube related to counter speech videos. Journal for Deradicalization, (10):1–49.


Fanton, M., Bonaldi, H., Tekiroğlu, S. S., and Guerini, M. (2021). Human-in-the-loop for data collection: a multi-target counter narrative dataset to fight online hate speech. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 3226–3240, Online. Association for Computational Linguistics.


Ferguson, K. (2016). Countering violent extremism through media and communication strategies: A review of the evidence.


Fortuna, P., Soler-Company, J., and Wanner, L. (2021). How well do hate speech, toxicity, abusive and offensive language classification models generalize across datasets? Information Processing & Management, 58(3):102524.


Frenkel, S. and Conger, K. (2022). Hate Speech’s Rise on Twitter Is Unprecedented, Researchers Find. The New York Times.


Garland, J., Ghazi-Zahedi, K., Young, J.-G., Hébert-Dufresne, L., and Galesic, M. (2020). Countering hate on social media: Large scale classification of hate and counter speech. In Proceedings of the Fourth Workshop on Online Abuse and Harms, pages 102–112, Online. Association for Computational Linguistics.


Garland, J., Ghazi-Zahedi, K., Young, J.-G., Hébert-Dufresne, L., and Galesic, M. (2022). Impact and dynamics of hate and counter speech online. EPJ Data Science, 11(1):3.


Gehman, S., Gururangan, S., Sap, M., Choi, Y., and Smith, N. A. (2020). RealToxicityPrompts: Evaluating neural toxic degeneration in language models. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 3356–3369, Online. Association for Computational Linguistics.


Goffredo, P., Basile, V., Cepollaro, B., and Patti, V. (2022). Counter-TWIT: An Italian corpus for online counterspeech in ecological contexts. In Proceedings of the Sixth Workshop on Online Abuse and Harms (WOAH), pages 57–66, Seattle, Washington (Hybrid). Association for Computational Linguistics.


Goldman, E. (2018). An overview of the United States' Section 230 internet immunity. Available at SSRN 3306737.


Google Jigsaw (2022). Perspective API. Accessed: 26 May 2023.


Hangartner, D., Gennaro, G., Alasiri, S., Bahrich, N., Bornhoft, A., Boucher, J., Demirci, B. B., Derksen, L., Hall, A., Jochum, M., Munoz, M. M., Richter, M., Vogel, F., Wittwer, S., Wüthrich, F., Gilardi, F., and Donnay, K. (2021). Empathy-based counterspeech can reduce racist hate speech in a social media field experiment. Proceedings of the National Academy of Sciences, 118(50):e2116310118.


He, B., Ziems, C., Soni, S., Ramakrishnan, N., Yang, D., and Kumar, S. (2022). Racism is a virus: Anti-Asian hate and counterspeech in social media during the COVID-19 crisis. In Proceedings of the 2021 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM '21, pages 90–94, New York, NY, USA. Association for Computing Machinery.


Iqbal, K., Zafar, S. K., and Mehmood, Z. (2019). Critical evaluation of Pakistan's counter-narrative efforts. Journal of Policing, Intelligence and Counter Terrorism, 14(2):147–163.


Jay, T. (2009). Do offensive words harm people? Psychology, Public Policy, and Law, 15(2):81.


Ji, Z., Lee, N., Frieske, R., Yu, T., Su, D., Xu, Y., Ishii, E., Bang, Y. J., Madotto, A., and Fung, P. (2023). Survey of hallucination in natural language generation. ACM Comput. Surv., 55(12).


Kennedy, C. J., Bacon, G., Sahn, A., and von Vacano, C. (2020). Constructing interval variables via faceted Rasch measurement and multitask deep learning: a hate speech application. arXiv preprint arXiv:2009.10277.


Kirk, H., Birhane, A., Vidgen, B., and Derczynski, L. (2022). Handling and presenting harmful text in NLP research. In Findings of the Association for Computational Linguistics: EMNLP 2022, pages 497–510, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.


Leader Maynard, J. and Benesch, S. (2016). Dangerous speech and dangerous ideology: An integrated model for monitoring and prevention. Genocide Studies and Prevention, 9(3).


Lee, H., Na, Y. J., Song, H., Shin, J., and Park, J. C. (2022). ELF22: A context-based counter trolling dataset to combat internet trolls. In Proceedings of the 13th Language Resources and Evaluation Conference (LREC 2022), Marseille, France, June 20-25, 2022, pages 3530–3541. European Language Resources Association.


Leonhard, L., Rueß, C., Obermaier, M., and Reinemann, C. (2018). Perceiving threat and feeling responsible. How severity of hate speech, number of bystanders, and prior reactions of others affect bystanders' intention to counterargue against hate speech on Facebook. Studies in Communication and Media, 7(4):555–579.


Lin, C.-Y. (2004). ROUGE: A package for automatic evaluation of summaries. In Text Summarization Branches Out, pages 74–81, Barcelona, Spain. Association for Computational Linguistics.


Lin, H., Nalluri, P., Li, L., Sun, Y., and Zhang, Y. (2022). Multiplex anti-Asian sentiment before and during the pandemic: Introducing new datasets from Twitter mining. In Proceedings of the 12th Workshop on Computational Approaches to Subjectivity, Sentiment & Social Media Analysis, pages 16–24, Dublin, Ireland. Association for Computational Linguistics.


Litaker, J. R., Lopez Bray, C., Tamez, N., Durkalski, W., and Taylor, R. (2022). COVID-19 vaccine acceptors, refusers, and the moveable middle: A qualitative study from Central Texas. Vaccines, 10(10).


Liu, C.-W., Lowe, R., Serban, I., Noseworthy, M., Charlin, L., and Pineau, J. (2016). How NOT to evaluate your dialogue system: An empirical study of unsupervised evaluation metrics for dialogue response generation. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pages 2122–2132, Austin, Texas. Association for Computational Linguistics.


Mathew, B., Kumar, N., Goyal, P., Mukherjee, A., et al. (2018). Analyzing the hate and counter speech accounts on Twitter. arXiv preprint arXiv:1812.02712.


Mathew, B., Saha, P., Tharad, H., Rajgaria, S., Singhania, P., Maity, S. K., Goyal, P., and Mukherjee, A. (2019). Thou shalt not hate: Countering online hate speech. In Proceedings of the International AAAI Conference on Web and Social Media, volume 13, pages 369–380.


Moher, D., Liberati, A., Tetzlaff, J., and Altman, D. G. (2009). Preferred reporting items for systematic reviews and meta-analyses: The PRISMA statement. Annals of Internal Medicine, 151(4):264–269. PMID: 19622511.


Munger, K. (2017). Tweetment effects on the tweeted: Experimentally reducing racist harassment. Political Behavior, 39(3):629–649.


BBC News (2018). MPs 'being advised to quit Twitter' to avoid online abuse. BBC News.


Nie, F., Yao, J.-G., Wang, J., Pan, R., and Lin, C.-Y. (2019). A simple recipe towards reducing hallucination in neural surface realisation. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 2673–2679, Florence, Italy. Association for Computational Linguistics.


Novikova, J., Dušek, O., Cercas Curry, A., and Rieser, V. (2017). Why we need new evaluation metrics for NLG. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 2241–2252, Copenhagen, Denmark. Association for Computational Linguistics.


Obermaier, M., Schmuck, D., and Saleem, M. (2021). I'll be there for you? Effects of Islamophobic online hate speech and counter speech on Muslim in-group bystanders' intention to intervene. New Media & Society, 0(0):14614448211017527.


Papineni, K., Roukos, S., Ward, T., and Zhu, W.-J. (2002). BLEU: a method for automatic evaluation of machine translation. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, pages 311–318. Association for Computational Linguistics.


Poole, E., Giraud, E. H., and de Quincey, E. (2021). Tactical interventions in online hate speech: The case of #stopislam. New Media & Society, 23(6):1415–1442.


Priyadharshini, R., Chakravarthi, B. R., Cn, S., Durairaj, T., Subramanian, M., Shanmugavadivel, K., U Hegde, S., and Kumaresan, P. (2022). Overview of abusive comment detection in Tamil-ACL 2022. In Proceedings of the Second Workshop on Speech and Language Technologies for Dravidian Languages, pages 292–298, Dublin, Ireland. Association for Computational Linguistics.


Procter, R., Webb, H., Jirotka, M., Burnap, P., Housley, W., Edwards, A., and Williams, M. (2019). A study of cyber hate on Twitter with implications for social media governance strategies. arXiv preprint arXiv:1908.11732.


Qian, J., Bethke, A., Liu, Y., Belding, E., and Wang, W. Y. (2019). A benchmark dataset for learning to intervene in online hate speech. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 4755–4764, Hong Kong, China. Association for Computational Linguistics.


Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., and Sutskever, I. (2019). Language models are unsupervised multitask learners. OpenAI Blog, 1(8).


Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., and Liu, P. J. (2020). Exploring the limits of transfer learning with a unified text-to-text transformer. Journal of Machine Learning Research, 21(140):1–67.


Reynolds, L. and Tuck, H. (2016). The counter-narrative monitoring & evaluation handbook. Institute for Strategic Dialogue.


Riedl, M. J., Masullo, G. M., and Whipple, K. N. (2020). The downsides of digital labor: Exploring the toll incivility takes on online comment moderators. Computers in Human Behavior, 107:106262.


Saha, K., Chandrasekharan, E., and De Choudhury, M. (2019). Prevalence and psychological effects of hateful speech in online college communities. In Proceedings of the 10th ACM Conference on Web Science, WebSci ’19, page 255–264, New York, NY, USA. Association for Computing Machinery.


Saha, P., Singh, K., Kumar, A., Mathew, B., and Mukherjee, A. (2022). CounterGeDi: A controllable approach to generate polite, detoxified and emotional counterspeech.


Saltman, E., Kooti, F., and Vockery, K. (2021). New models for deploying counterspeech: Measuring behavioral change and sentiment analysis. Studies in Conflict & Terrorism, 0(0):1–24.


Saltman, E. M. and Russell, J. (2014). White paper–the role of Prevent in countering online extremism. Quilliam publication.


Sap, M., Swayamdipta, S., Vianna, L., Zhou, X., Choi, Y., and Smith, N. A. (2022). Annotators with attitudes: How annotator beliefs and identities bias toxic language detection. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 5884–5906, Seattle, United States. Association for Computational Linguistics.


Schieb, C. and Preuss, M. (2016). Governing hate speech by means of counterspeech on Facebook. In 66th ICA annual conference, at Fukuoka, Japan, pages 1–23.


Siegel, A. A. (2020). Online hate speech. Social media and democracy: The state of the field, prospects for reform, pages 56–88.


Silverman, T., Stewart, C. J., Birdwell, J., and Amanullah, Z. (2016). The impact of counter-narratives. Institute for Strategic Dialogue.


Simmons, J. P., Nelson, L. D., and Simonsohn, U. (2012). A 21 word solution. Available at SSRN 2160588.


Smits, J. M. (2000). The good Samaritan in European private law; on the perils of principles without a programme and a programme for the future.


Snyder, C. R., Rand, K. L., and Sigmon, D. R. (2018). Hope Theory: A Member of the Positive Psychology Family. In The Oxford Handbook of Hope, pages 257–276. Oxford University Press.


Solaiman, I., Brundage, M., Clark, J., Askell, A., Herbert-Voss, A., Wu, J., Radford, A., Krueger, G., Kim, J. W., Kreps, S., et al. (2019). Release strategies and the social impacts of language models. arXiv preprint arXiv:1908.09203.


Stroebe, W. (2008). Strategies of attitude and behaviour change.


Stroebe, W., Postmes, T., and Spears, R. (2012). Scientific misconduct and the myth of self-correction in science. Perspectives on Psychological Science, 7(6):670–688.


Stroud, S. R. and Cox, W. (2018). The Varieties of Feminist Counterspeech in the Misogynistic Online World, pages 293–310. Springer International Publishing, Cham.


Tekiroğlu, S. S., Bonaldi, H., Fanton, M., and Guerini, M. (2022). Using pre-trained language models for producing counter narratives against hate speech: a comparative study. In Findings of the Association for Computational Linguistics: ACL 2022, pages 3099–3114, Dublin, Ireland. Association for Computational Linguistics.


Tekiroğlu, S. S., Chung, Y.-L., and Guerini, M. (2020). Generating counter narratives against online hate speech: Data and strategies. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 1177–1190, Online. Association for Computational Linguistics.


Toliyat, A., Levitan, S. I., Peng, Z., and Etemadpour, R. (2022). Asian hate speech detection on Twitter during COVID-19. Frontiers in Artificial Intelligence, 5.


Tuck, H. and Silverman, T. (2016). The counter-narrative handbook. Institute for Strategic Dialogue.


Vidgen, B., Hale, S., Guest, E., Margetts, H., Broniatowski, D., Waseem, Z., Botelho, A., Hall, M., and Tromble, R. (2020). Detecting East Asian prejudice on social media. In Proceedings of the Fourth Workshop on Online Abuse and Harms, pages 162–172, Online. Association for Computational Linguistics.


Vidgen, B., Margetts, H., and Harris, A. (2019). How much online abuse is there? A systematic review of evidence for the UK. Alan Turing Institute Policy Briefing.


Vidgen, B., Nguyen, D., Margetts, H., Rossini, P., and Tromble, R. (2021). Introducing CAD: the contextual abuse dataset. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 2289–2303, Online. Association for Computational Linguistics.


Wang, K. and Wan, X. (2018). SentiGAN: Generating sentimental texts via mixture adversarial networks. In IJCAI, pages 4446–4452.


Wright, L., Ruths, D., Dillon, K. P., Saleem, H. M., and Benesch, S. (2017). Vectors for counterspeech on Twitter. In Proceedings of the First Workshop on Abusive Language Online, pages 57–62.


Yin, W. and Zubiaga, A. (2021). Towards generalisable hate speech detection: a review on obstacles and solutions. PeerJ Computer Science, 7:e598.


Yu, X., Blanco, E., and Hong, L. (2022). Hate speech and counter speech detection: Conversational context does matter. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 5918–5930, Seattle, United States. Association for Computational Linguistics.


Zellers, R., Holtzman, A., Rashkin, H., Bisk, Y., Farhadi, A., Roesner, F., and Choi, Y. (2019). Defending against neural fake news. In Wallach, H., Larochelle, H., Beygelzimer, A., d'Alché-Buc, F., Fox, E., and Garnett, R., editors, Advances in Neural Information Processing Systems, volume 32. Curran Associates, Inc.


Zhou, C., Neubig, G., Gu, J., Diab, M., Guzmán, F., Zettlemoyer, L., and Ghazvininejad, M. (2021). Detecting hallucinated content in conditional neural sequence generation. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, pages 1393–1404, Online. Association for Computational Linguistics.


Zhu, W. and Bhat, S. (2021). Generate, prune, select: A pipeline for counterspeech generation against online hate speech. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, pages 134–149, Online. Association for Computational Linguistics.


This paper is available on arXiv under the CC BY-SA 4.0 DEED license.

