The goal of our project is to measure how well people can distinguish between human-created and AI-generated content, how confident they are in that ability, and to share ways to improve both. We chose to work on this project because AI is becoming involved in nearly every aspect of our lives, and that, of course, includes our media.
Manipulated photos and edits were being accepted as the truth long before AI was part of the picture, and now that AI is more prevalent than ever, we are seeing it used extensively in this spread of disinformation, from deepfakes on down. Because it is difficult to measure how many people believe disinformation in general, we chose to focus on a more niche category: impersonation.
Most people in the United States know of or follow at least one celebrity, with a somewhat smaller share being avid fans. We figured that to get the most accurate information, we needed someone whom a lot of people would know of and follow, so we decided to go with Taylor Swift.
A 2023 survey by Morning Consult (Blancaflor & Briggs, 2023) found that 53% of Americans call themselves Taylor Swift fans, with 16% calling themselves “avid fans.” While it may seem difficult to link a pop star to protecting ourselves against AI illusions, there is a connection: if fans of Taylor Swift, especially “avid fans,” are unable to tell the difference between their idol and an AI-generated version of her, then the world is on a path to a lot of trouble. Returning to our goal for this project, we want to spread information that could help battle disinformation and protect people from these AI illusions.
Relevant Work
The article published by the Bowling Green State University Media Newsroom reports on research by student Andrew Samo and Professor Scott Highhouse that examines how well people can tell the difference between AI-generated art and human-made art. In their study, participants were shown artwork created by both humans and an AI system without being told which was which. Samo and Highhouse found that participants correctly identified the source of each piece only about 50 to 60 percent of the time, which is close to random chance.
Their study was interested not only in whether people could classify the artwork but also in whether participants preferred human-created or AI-created work. The results showed a clear preference for human artwork. Participants often could not articulate the reason for their preference, but the human pieces produced stronger emotional reactions: responses to the human artwork scored higher on qualities such as self-reflection and nostalgia, which in turn generated more positive impressions than the AI-generated pieces.
This article is relevant to our project because part of our objective is to evaluate how confident people are in their ability to distinguish between human and AI-created content. Like the BGSU study, our project asked participants to detect the difference between human and AI-generated material, but we used song lyrics instead of visual art. We also expanded on their approach by asking participants to rate their confidence before making their classifications. Another difference in our method is that we revealed the correct answers afterward. This allowed us to measure how their confidence changed once they learned whether they had identified the material correctly or incorrectly.
The study conducted by Fraser et al. also relates to our project because it focuses on detecting text produced by large language models and evaluates how effectively humans can do this. The main difference is that their study examined several technical detection approaches, including watermark detection, statistical and stylistic analysis, language model-based classification, off-the-shelf tools such as CopyLeaks and ZeroGPT, and human evaluation. Fraser et al. found that human detection was the least reliable of these methods, and most judgments were essentially random guesses about whether the text had been generated by an AI system or a human writer.
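To give a concrete sense of one of the technical approaches Fraser et al. survey, the sketch below illustrates the idea behind green-list watermark detection: a watermarking generator secretly favors a pseudo-random “green” subset of the vocabulary at each step, and a detector checks whether a text contains suspiciously many green tokens. The hashing scheme and the 50/50 vocabulary split here are our own simplifying assumptions for illustration, not the specific systems evaluated in their study.

    import hashlib
    import math

    def is_green(prev_token: str, token: str) -> bool:
        """Pseudo-randomly assign `token` to the "green list", seeded by the previous token."""
        digest = hashlib.sha256(f"{prev_token}|{token}".encode()).digest()
        return digest[0] % 2 == 0  # roughly half of all tokens are green in any context

    def watermark_z_score(tokens: list[str]) -> float:
        """z-score of the observed green-token count against the 50% chance baseline."""
        n = len(tokens) - 1  # number of (previous token, current token) pairs
        hits = sum(is_green(prev, tok) for prev, tok in zip(tokens, tokens[1:]))
        return (hits - 0.5 * n) / math.sqrt(0.25 * n)

    # Text from a watermarked generator would favor green tokens, pushing the
    # z-score well above zero; ordinary human text should hover near zero.
    print(watermark_z_score("the quick brown fox jumps over the lazy dog".split()))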
Project Description
The goal of our project is to measure how well people can detect the difference between human- and AI-created content and to understand how confident they feel about that ability. We also aim to identify ways to improve both detection accuracy and confidence. To do this, we first needed to learn how confident our family, friends, and classmates were in their ability to tell whether a piece of content was made by a human or by an AI system.
We chose to focus on song lyrics and thought it would be interesting to test whether people could distinguish between lyrics written by Taylor Swift and lyrics generated by an AI chatbot that had been instructed to imitate her writing style. We selected Taylor Swift as the human example because of her large and dedicated fan base and because her new album, “The Life of a Showgirl,” had recently been released at the time of our project.
To gather data, we created a survey that asked participants a series of questions: their familiarity with Taylor Swift’s music, their confidence in recognizing AI-generated content, and their confidence again after seeing whether their answers were correct.
Results/Findings
The 52 people who completed the survey reported an average score of 3 on a scale of 1 to 5 when asked how familiar they were with Taylor Swift and how often they listened to her music. When participants were first asked to rate their confidence in their ability to detect the difference between AI-generated and human-created content, 2 participants selected a confidence level of 1, 11 selected 2, 13 selected 3, 17 selected 4, and 8 selected 5 (51 of the 52 participants answered this question). The average confidence rating was 3.35, indicating that, on average, respondents felt moderately confident in their ability to identify whether a work was created by an AI system or by a human before they were tested.
The next question presented two short sets of lyrics: one taken from an actual Taylor Swift song, the other written by an AI model instructed to imitate her style. Participants were asked to choose which lyric was the authentic Taylor Swift lyric. Of the 51 recorded responses, 22 participants chose correctly and 29 selected the AI-generated lyric instead, meaning slightly fewer than half identified the correct answer.
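As a quick sanity check, this result can be compared against pure guessing with a two-sided binomial test. The short Python sketch below (using scipy, and assuming a 50% chance baseline) recomputes the accuracy from the counts above; the resulting p-value is well above conventional significance thresholds, meaning the responses are statistically indistinguishable from coin flips, much like the near-chance results in the studies discussed earlier.

    from scipy.stats import binomtest

    # 22 of the 51 recorded answers picked the genuine Taylor Swift lyric.
    result = binomtest(k=22, n=51, p=0.5, alternative="two-sided")
    print(f"observed accuracy: {22 / 51:.1%}")                     # about 43%
    print(f"p-value against chance guessing: {result.pvalue:.3f}")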
Immediately after this, the correct answers were revealed, and participants were asked again to rate their confidence in detecting AI-generated content. This time, 6 participants selected a confidence level of 1, 11 selected 2, 18 selected 3, 10 selected 4, and 6 selected 5. The chart below presents these post-assessment confidence ratings alongside the initial confidence scores.
This chart provides a clear visual representation of the change in respondents’ confidence in detecting AI content. In the first self-rating, most participants placed themselves at a medium-to-high level of confidence, creating a left-skewed shape in the bar graph. In the second self-rating, however, the number of participants reporting high confidence decreased, and the number reporting lower confidence increased. This shows that even a simple assessment of their ability to determine whether a line of text was AI-generated or human-created was enough to lower participants’ confidence overall.
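The drop also shows up in the averages. The sketch below recomputes the mean confidence before and after from the counts reported above (each set sums to the 51 recorded ratings):

    # Confidence counts on the 1-5 scale, taken from the survey results above.
    pre = {1: 2, 2: 11, 3: 13, 4: 17, 5: 8}    # before the lyric question
    post = {1: 6, 2: 11, 3: 18, 4: 10, 5: 6}   # after seeing the correct answer

    def mean_rating(counts: dict[int, int]) -> float:
        n = sum(counts.values())
        return sum(level * count for level, count in counts.items()) / n

    print(f"mean confidence before: {mean_rating(pre):.2f}")   # 3.35, matching the report
    print(f"mean confidence after:  {mean_rating(post):.2f}")  # about 2.98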
Impact
We intended this project to help others assess their own ability to determine whether what they see on the internet is content created by actual humans or generated by AI algorithms, and to provide information on how to improve that ability.
Although the survey we used to gather our data was simple and concise, it showed the majority of respondents that AI-generated content may be harder to detect than they had initially believed.
AI tools are improving at enormous speed, and their ability to mimic human speech, artwork, and video content is already uncanny. As AI continues to advance, our ability to tell AI-generated content apart from human-created content will continue to diminish, even as many people still believe they can reliably detect whether the content they see online is AI-generated.
This leaves a large portion of society vulnerable to believing the AI-generated misinformation that plagues the internet. Hence, our goal with this project is to help others realize that it is no longer an easy task to say whether an article or a YouTube short is AI-generated or human-created.
Despite these advancements in AI, there are still ways to judge more accurately whether the content you’re consuming was created by a real human or not.
According to the article “4 Tips for Distinguishing AI-Generated Text from Human-Written Text” (AI Content Managers, 2024), one of the most important things to look for when evaluating whether a text was generated by an AI system is any sign of personalized thinking or individual mannerisms from the author. Shehar Yar, CEO of Software House, explains that human writing often contains personal opinions, emotions, and unique perspectives that AI responses do not consistently display. Spencer Christian, founder of Christian Companion App, adds that human text usually shows more variation in sentence structure and carries hints of the writer’s personality. AI-generated text, on the other hand, tends to be more uniform in style, grammar, and structure.
Anna Bernstein, head of prompt engineering at Copy.ai, also notes a key difference in how humans and AI approach longer pieces of writing. Humans typically begin and end large writing projects with statements that reflect a non-linear thought process. This happens because the writer already has a complete mental picture of what they want to express and introduces each idea in a way that fits that overall vision. AI systems, by contrast, develop ideas in a more step-by-step fashion, which creates a more linear and streamlined feel in AI-generated text.
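These cues can even be roughed out in code. The sketch below computes one naive stylistic feature, variation in sentence length, in the spirit of the observation that AI text tends to be more uniform in structure; it is only an illustration of the idea, not a reliable detector on its own.

    import re
    import statistics

    def sentence_length_variation(text: str) -> float:
        """Coefficient of variation of sentence lengths, measured in words."""
        sentences = [s for s in re.split(r"[.!?]+\s*", text) if s.strip()]
        lengths = [len(s.split()) for s in sentences]
        if len(lengths) < 2:
            return 0.0
        return statistics.stdev(lengths) / statistics.mean(lengths)

    # Bursty, human-like rhythm mixes short and long sentences, raising the score;
    # uniformly sized sentences keep it low.
    sample = ("Short one. Then a much longer, winding sentence that meanders "
              "for a while before it finally ends. Another short one.")
    print(f"variation score: {sentence_length_variation(sample):.2f}")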
Of course, just because a video or a passage was created using AI tools does not make it malicious or bad to view. The problem arises when AI content is used to spread misinformation and flood internet sources with false data, which makes it all the more important to be able to identify this form of content and ensure that what you consume is genuine and not misleading. Based on the data gathered from our survey, we were able to reach our goal of raising awareness of how difficult it can be to detect AI content and helping others increase their ability to do so.
References
AI Content Managers. (2024, October 3). 4 tips for distinguishing AI-generated text from human-written text. https://aicontentmanagers.com/qa/4-tips-for-distinguishing-ai-generated-text-from-human-written-text/
BGSU Media Newsroom. (2023, December). BGSU research finds people struggle to identify the difference between AI and human art, but prefer genuine human-made works. Bowling Green State University. https://www.bgsu.edu/news/online-media-newsroom/2023/12/bgsu-research-finds-people-struggle-to-identify-the-difference-b.html
Blancaflor, S., & Briggs, E. (2023, March 14). A demographic deep dive into the Taylor Swift fandom. Morning Consult. https://pro.morningconsult.com/instant-intel/taylor-swift-fandom-demographic
Fraser, K. C., Dawkins, H., & Kiritchenko, S. (2025). Detecting AI-generated text: Factors influencing detectability with current methods. Journal of Artificial Intelligence Research, 82, 2233–2278. https://doi.org/10.1613/jair.1.16665
