Last year we published Empath, a tool for text analysis, which was fortunate enough to win a Best Paper award at CHI. Empath allows researchers to analyze text over a much larger set of categories than are available in existing lexicons (for example, “violence”, “depression”, or “politics”), and it can generate new lexicons on demand using a model based on neural embeddings and crowdsourcing.
We’ve since released Empath as an open source Python library, and we’d love to have more researchers apply it to their work. Given recent discussions in the media about the unprecedented language that President Trump used in his inauguration speech, this seemed a good opportunity to demonstrate what Empath can do.
So, how should we analyze an inauguration speech? There are many possible questions we might ask, but I’m going to focus on how President Trump’s inauguration speech differs from the one President Obama gave as he began his first term in 2009. In general, adopting a point of comparison makes lexical analyses easier to interpret. For example, consider the claims: “Trump’s speech is angry” and “Trump’s speech is angrier than Obama’s speech”. The threshold for an angry speech is unclear (this is somewhat philosophical: at what point do we consider a speech angry?), but it is simple to determine whether one speech is angrier than another. In this case, our comparison will ask: how do the signals Empath identifies in Trump’s speech compare with the same signals in Obama’s?
Empath walks over the words in each speech and counts the number of words that fall into each of its lexical categories. For example, the word “bleed” would increment the categories for hurt and violence, and the word “hope” would increment the categories for optimism and positive emotion.
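To make the counting step concrete, here is a minimal sketch in plain Python. It is not Empath's implementation: the four-category `TOY_LEXICON` and the `count_categories` helper are hypothetical stand-ins I made up for illustration, and Empath's real categories are far larger.

```python
import re

# Hypothetical mini-lexicon (category -> words), invented for this example.
TOY_LEXICON = {
    "hurt": {"bleed", "wound"},
    "violence": {"bleed", "fight"},
    "optimism": {"hope", "promise"},
    "positive_emotion": {"hope", "happiness"},
}

def count_categories(text, lexicon=TOY_LEXICON):
    """Count how many tokens in `text` fall into each lexicon category."""
    # Lowercase the text and split it into simple word tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = {category: 0 for category in lexicon}
    for token in tokens:
        for category, words in lexicon.items():
            if token in words:
                # A single word may increment several categories at once.
                counts[category] += 1
    return counts

counts = count_categories("We bleed, but we hope.")
print(counts)
```

Here “bleed” increments both hurt and violence, and “hope” increments both optimism and positive emotion, so each of the four toy categories ends up with a count of one.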
I then imported the resulting category counts into Google Docs and, after a bit of data wrangling, came up with the following chart:
Here the x-axis depicts a normalized word count for each category (the number of words that fall into each category, divided by the total number of words in the speech).
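The normalization is a plain division, sketched below. The `normalize_counts` helper, the example counts, and the sample text are all hypothetical and chosen only to show the arithmetic; they are not figures from either speech. (The Empath library's `analyze` method also accepts a `normalize=True` argument, which is the route I would expect most people to take in practice.)

```python
import re

def normalize_counts(category_counts, text):
    """Divide each category count by the total number of word tokens in `text`."""
    total_words = len(re.findall(r"[a-z']+", text.lower()))
    if total_words == 0:
        # Avoid dividing by zero on empty input.
        return {category: 0.0 for category in category_counts}
    return {category: count / total_words
            for category, count in category_counts.items()}

# Made-up counts for a made-up four-word text.
example_text = "hope and more hope"
example_counts = {"optimism": 2, "violence": 0}
normalized = normalize_counts(example_counts, example_text)
print(normalized)
```

Normalizing this way keeps a longer speech from scoring higher in every category simply because it contains more words.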
So, what do we make of this? My immediate reaction is that, in many ways, these speeches are similar. For example, both Trump and Obama use language that strongly signals government, positive emotion, power, strength, and politics. To a lesser extent, both speeches also convey other signals you would expect to see, such as military, economics, work, or terrorism. There is a certain amount of tradition that underlies an inauguration speech, no matter who is giving it.
But the differences between the speeches are also compelling. While both Presidents adopt language of achievement (e.g., “win” or “accomplish”), Trump uses these words much more often than Obama. Similarly, Obama’s speech contains relatively little language of dispute (e.g., “disagree”, “insist”, or “fight”) or aggression (e.g., “dangerous”, “angry”), whereas Trump’s speech is much stronger in these signals. On the other hand, Obama’s speech contains an enormous amount of optimism in comparison with Trump’s, despite similar overall signals for positive emotion.
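If you would rather surface such differences programmatically than read them off a chart, one possible approach is to rank categories by the gap between the two normalized scores. The sketch below assumes you already have one dictionary of normalized scores per speech. The `rank_differences` helper is hypothetical, and the numbers are invented placeholders, not the actual values for either speech.

```python
def rank_differences(scores_a, scores_b):
    """Return (category, score_a - score_b) pairs, largest absolute gap first."""
    categories = set(scores_a) | set(scores_b)
    # A category missing from one speech is treated as a score of zero.
    diffs = {c: scores_a.get(c, 0.0) - scores_b.get(c, 0.0) for c in categories}
    return sorted(diffs.items(), key=lambda item: abs(item[1]), reverse=True)

# Placeholder normalized scores, invented for illustration only.
speech_a = {"achievement": 0.04, "optimism": 0.01, "government": 0.03}
speech_b = {"achievement": 0.02, "optimism": 0.04, "government": 0.03}

ranked = rank_differences(speech_a, speech_b)
for category, gap in ranked:
    print(f"{category}: {gap:+.3f}")
```

A positive gap means the category is stronger in the first speech, a negative gap means it is stronger in the second, and categories near zero are the ones the two speeches share.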
You’ll find more nuanced considerations of individual passages elsewhere, but here is an excerpt from President Trump’s speech, which I’d consider representative of the overall tone:
Mothers and children trapped in poverty in our inner cities; rusted-out factories scattered like tombstones across the landscape of our nation; an education system flush with cash, but which leaves our young and beautiful students deprived of knowledge; and the crime and gangs and drugs that have stolen too many lives and robbed our country of so much unrealized potential.
This American carnage stops right here and stops right now.
And similarly, an excerpt from President Obama’s:
On this day, we gather because we have chosen hope over fear, unity of purpose over conflict and discord. On this day, we come to proclaim an end to the petty grievances and false promises, the recriminations and worn-out dogmas that for far too long have strangled our politics. We remain a young nation. But in the words of Scripture, the time has come to set aside childish things. The time has come to reaffirm our enduring spirit; to choose our better history; to carry forward that precious gift, that noble idea passed on from generation to generation: the God-given promise that all are equal, all are free, and all deserve a chance to pursue their full measure of happiness.
Now, maybe you’ve already read these speeches; maybe you have your own interpretation. But the key benefit of Empath is that you can discover high-level, lexical signals without actually looking closely at the speeches. Here, that may seem lazy: when you are simply concerned with interpreting two speeches, it’s easy enough to read them both, at length. But as the text you are interested in grows larger (millions of comments on Reddit, for example, or every New York Times article ever published), reading and interpreting all the text yourself becomes impossible. That’s when a tool like Empath can step in to aid your analysis.