From virtual assistants to content moderation, sentiment analysis has a wide range of use cases. AI models that can recognize emotion and opinion have applications in numerous industries, so there is a large and growing interest in creating emotionally intelligent machines. The same can be said for research in natural language processing (NLP). To highlight some of the work being done in the field, below are five essential papers on sentiment analysis and sentiment classification.

1. Deep Learning for Hate Speech Detection in Tweets

One of the most useful applications of sentiment classification models is the detection of hate speech. Recently, there have been numerous reports of the harsh lives of content moderation staff. With the advancement of automated hate speech detection and other content moderation models, hopefully human moderators will no longer need to filter graphic content.

In this paper, the team defines the task of hate speech detection as classifying whether a particular Twitter post is racist, sexist, or neither. To do so, the researchers experiment on a dataset of 16,000 tweets. Within the dataset, 1,972 tweets are labeled as racist and 3,383 are labeled as sexist. The remaining tweets are classified as having neither racist nor sexist sentiment. In the end, the research shows that certain deep learning techniques prove more effective than current n-gram methods for the detection of hate speech.

Published / Last Updated: June 1st, 2017

Authors and Contributors: Pinkesh Badjatiya (IIIT-H), Shashank Gupta (IIIT-H), Manish Gupta (Microsoft), Vasudeva Varma (IIIT-H)

2. DepecheMood++: a Bilingual Emotion Lexicon

There are two main avenues through which you can acquire a lexicon: creation (often employing crowdsourced annotators), or derivation from preexisting annotated corpora.
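The second avenue, deriving a lexicon from annotated text, can be sketched roughly as follows. This is a toy illustration and not the DepecheMood pipeline: the corpus, the emotion labels, the `derive_lexicon` function, and its `min_count` parameter are all hypothetical.

```python
from collections import Counter, defaultdict

# Toy annotated corpus (hypothetical): each document carries one emotion label.
ANNOTATED_DOCS = [
    ("the team won the final and fans cheered", "joy"),
    ("fans cheered as the parade passed", "joy"),
    ("the storm destroyed homes and families mourned", "sadness"),
    ("families mourned the victims of the storm", "sadness"),
]


def derive_lexicon(docs, min_count=2):
    """Map each word to a distribution over emotion labels.

    min_count acts as a simple frequency cut-off: words that appear
    in fewer than min_count documents are dropped as too rare to trust.
    """
    doc_freq = Counter()
    emotion_counts = defaultdict(Counter)
    for text, emotion in docs:
        # Count each word once per document.
        for word in set(text.lower().split()):
            doc_freq[word] += 1
            emotion_counts[word][emotion] += 1

    lexicon = {}
    for word, counts in emotion_counts.items():
        if doc_freq[word] < min_count:
            continue
        total = sum(counts.values())
        lexicon[word] = {emo: n / total for emo, n in counts.items()}
    return lexicon


lexicon = derive_lexicon(ANNOTATED_DOCS)
```

In this toy run, `lexicon["cheered"]` maps entirely to joy, a word like "the" that appears under both labels is split evenly, and words seen in only one document are dropped by the cut-off.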
In this paper, the researchers test whether simple techniques (document filtering, frequency cut-off, and text pre-processing) can improve a state-of-the-art lexicon called DepecheMood. The lexicon, built from annotated news articles, was originally created by Staiano and Guerini in 2014 for emotion analysis. The researchers explain how they have built upon it. The new version released in this study, DepecheMood++, is available in both English and Italian.

Published / Last Updated: October 8th, 2018

Authors and Contributors: Oscar Araque (Polytechnic University of Madrid), Lorenzo Gatti (University of Twente), Marco Guerini (AdeptMind Scholar, Bruno Kessler Institute), Jacopo Staiano (Recital AI)

3. Expressively Vulgar: The Socio-dynamics of Vulgarity

Considering that most thoughts can easily be reworded to avoid vulgar language, the use of explicit words indicates a strong desire to send a specific message. In this study, researchers at the University of Texas at Austin and the University of Pennsylvania conducted a large-scale, data-driven analysis of vulgar words in Twitter posts. More specifically, their research analyzes the socio-cultural and pragmatic aspects of vulgar language in tweets. The team seeks to answer the following questions:

Is the expression of vulgarity and its function different across author demographic traits?
Does vulgarity impact perception of sentiment?
Does modeling vulgarity explicitly help sentiment prediction?

The researchers compiled a dataset of 6,800 tweets and had them labeled for sentiment by nine annotators on a five-point scale. Notably, the data also includes demographics (gender, age, education, income, religious background, and political ideology) of those who posted the tweets. This is one of the few open datasets that includes not only Twitter posts but also detailed information about each poster.
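To make the shape of such a dataset concrete, here is a minimal sketch of a single record. The class name, field names, the 1-to-5 numeric encoding of the five-point scale, and the use of a plain mean to combine the nine ratings are all assumptions for illustration, not details taken from the paper.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class AnnotatedTweet:
    """One tweet with its sentiment ratings and poster demographics (hypothetical schema)."""

    text: str
    ratings: list  # nine annotator ratings, assumed to be encoded as integers 1 to 5
    demographics: dict = field(default_factory=dict)

    def __post_init__(self):
        # Enforce the annotation setup described in the article.
        if len(self.ratings) != 9:
            raise ValueError("expected nine annotator ratings")
        if any(r not in range(1, 6) for r in self.ratings):
            raise ValueError("ratings must be on a 1-5 scale")

    def mean_sentiment(self):
        # Averaging is an assumed aggregation, chosen only for simplicity.
        return mean(self.ratings)


tweet = AnnotatedTweet(
    text="placeholder tweet text",
    ratings=[2, 2, 3, 1, 2, 2, 3, 2, 1],
    demographics={"gender": "unspecified", "age": 30},
)
```

Keeping the individual ratings, instead of storing only an aggregate, leaves room to study how annotators disagree about tweets that contain vulgar words.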
This is also one of the first-ever studies on how modeling vulgar words could boost sentiment analysis performance.

Published / Last Updated: August, 2018

Authors and Contributors: Isabela Cachola, Eric Holgate, and Junyi Jessy Li (University of Texas at Austin), Daniel Preotiuc-Pietro (University of Pennsylvania)

4. Multilingual Twitter Sentiment Classification: The Role of Human Annotators

Of the papers on sentiment analysis in this list, this is the only study that highlights the importance of human annotators. In this experiment on automated Twitter sentiment classification, researchers from the Jožef Stefan Institute analyze a large dataset of sentiment-annotated tweets in multiple languages. Specifically, the team labeled 1.6 million tweets in 13 different languages. Using these annotated tweets as training data, the team built multiple automatic sentiment classification models.

Their experiments led to a number of interesting conclusions. First, the researchers state that there is no statistically significant difference between the performance of the top classification models. Next, the general accuracy of the classification models does not correlate with performance when applied to the ordered three-class sentiment classification problem. Lastly, they state that it is more effective to focus on the quality of the training data than on the type of classification model used.

Published / Last Updated: May 5th, 2016

Authors and Contributors: Igor Mozetič, Miha Grčar, and Jasmina Smailovič, from the Department of Knowledge Technologies at the Jožef Stefan Institute

5. MELD: A Multimodal Multi-Party Dataset for Emotion Recognition

In this paper, the authors explain the growing popularity of research around emotion recognition in conversations (ERC). They also state that there is a lack of large-scale emotional conversational databases in the field.
To remedy this, the researchers propose the Multimodal EmotionLines Dataset (MELD), an extension and enhancement of the original EmotionLines dataset. MELD includes 13,000 utterances from 1,433 dialogues from the popular TV series Friends. The dataset focuses on dialogues with more than two speakers, and each utterance has been annotated with emotion and sentiment labels.

The original dataset, EmotionLines, contains only the text of the dialogues, so it can only be used for textual analysis. The major enhancement in MELD is the addition of audio and visual modalities: it includes the words being said, the tone of voice they are spoken in, and the facial expression of the speaker. Using this dataset, the researchers establish a strong baseline for emotion recognition in dialogues with more than two speakers.

Published / Last Updated: June 4th, 2019

Authors and Contributors: Soujanya Poria (SUTD), Devamanyu Hazarika (National University of Singapore), Navonil Majumder (National Polytechnic Institute of Mexico), Gautam Naik (Nanyang Technological University), Erik Cambria (Nanyang Technological University), Rada Mihalcea (University of Michigan)

The goal of creating emotionally intelligent machines is an ambitious one. Sentiment analysis and sentiment classification are necessary steps toward that goal. Hopefully, the papers on sentiment analysis above help strengthen your understanding of the work currently being done in the field.

Lead image via Marcus Winkler on Unsplash

This article was also published on: https://lionbridge.ai/articles/5-essential-papers-on-sentiment-analysis/


5 Must-Read Research Papers on Sentiment Analysis for Data Scientists
