Communications Psychology
. 2026 Feb 26;4:65. doi: 10.1038/s44271-026-00430-x

Natural language reveals that political partisans are more affectively aligned over political issues than partisan identities

Nakwon Rim 1,2, Joshua Conrad Jackson 3, Marc G Berman 1,4, Yuan Chang Leong 1,2,4
PMCID: PMC13062097  PMID: 41748913

Abstract

Affective polarization, defined as the dislike between opposing political groups, is a growing global threat. While much of the focus has been on partisan identities, political divisions may also be driven by affective divergence around political issues, where partisans express opposing feelings toward topics they disagree about. To compare identity-based and issue-based affective alignment, we used word embeddings to analyze two large datasets comprising ~300 million comments from partisan Reddit communities and ~7 million articles from partisan news outlets. We first quantified affective alignment by measuring the valence associations of identity and issue words. In both datasets, affective alignment was greater around political issues than around partisan identities. To validate these findings using a context-sensitive approach, we also used a large language model to rate the valence of identity and issue words in Reddit comments. We again observed stronger affective agreement around issues than identities. These results reveal that even though partisans hold strong negative attitudes toward opposing partisans, the emotional divide around political issues is less pronounced, suggesting opportunities for bridging partisan differences through issue-focused dialog. Our study offers scalable, quantitative tools for understanding the emotional dimensions of political polarization and highlighting pathways to reduce its impact.

Subject terms: Psychology, Human behaviour


Large-scale computational analysis across Reddit comments and news articles finds partisan language to be less affectively divided over political issues than identity labels, suggesting meaningful affective alignment on contentious issues despite partisan animosity.

Introduction

It is easy to find grand claims about the state of political polarization in the United States. Conservatives claim that “Democrats love abortion”1 and “hate the Constitution”2. On the other side of the aisle, liberals assert that “Republicans love guns”3 and “hate immigrants”4. At first glance, these claims are consistent with extensive polling data showing that Republicans and Democrats disagree about abortion, guns, immigration, and a range of other issues5,6. However, these statements go one step further. By using emotionally charged words such as “love” and “hate”, they suggest that liberals and conservatives do not just disagree on policy but feel fundamentally different emotions toward these issues. Are liberals and conservatives emotionally divided over the same issues, with one side loving what the other side hates? Or are affective associations with these issues less polarized than we might think? We address these questions using a natural language approach, which allows us to examine emotional divides in real-world political discourse.

Dissociating identity-based and issue-based affective alignment

Certain forms of emotional divide in the United States are well-established. Affective polarization, defined as the growing hostility toward opposing political groups, is consistently documented across survey and behavioral measures. For example, when conservatives and liberals complete “feeling thermometers,” which measure emotional warmth toward different political groups, they consistently express strong dislike for opposing partisans7–9. While the emotional divide between partisans is typically discussed in terms of these partisan identities, partisans may also experience similarly polarized emotional responses toward political issues. We refer to the degree to which these emotional reactions to political issues are aligned between partisans as “issue-based affective alignment”, in contrast to “identity-based affective alignment”, where emotional responses are directed toward partisan groups.

There are compelling reasons to expect that affective divergence around political issues may be as pronounced as that surrounding partisan identities. Debates on key political topics, such as gun control and abortion, are not simply about policy differences; they tap into core values, beliefs, and moral frameworks that drive emotional reactions10–12. For example, conservatives may have strong positive feelings about gun rights, viewing them as symbols of personal freedom and security, while liberals often view guns primarily through the lens of violence, leading to more negative emotional responses. Similarly, liberals might feel more positive about abortion, viewing it as a matter of reproductive rights and personal choice, while conservatives may experience more negative emotions toward it, seeing it as morally wrong because it denies the right to life. These divergent emotional reactions would indicate that partisans not only disagree on policy positions but may also have opposing emotional associations with the issues.

At the same time, there is also reason to believe that issue-based affective alignment is stronger than identity-based affective alignment. Emotional responses to issues can be more nuanced than simply agreeing or disagreeing with a particular policy13,14. While partisans may take opposing positions on policies, they may nevertheless share emotional associations toward these issues. For example, a liberal individual might strongly support access to abortion and the right to choose while still feeling negative about the issue, viewing it as a difficult and emotionally charged decision. Similarly, a conservative individual might support gun rights as a symbol of personal freedom but still feel uneasy about the prevalence of gun violence in society. In this view, the emotional responses attached to issues may not be as starkly divided as those attached to partisan identities.

Understanding whether partisans’ policy disagreements are accompanied by deep emotional divides is crucial for assessing the nature and depth of political polarization in the U.S. Recent evidence suggests that liberals and conservatives may not be as divided as media sources and politicians suggest they are, a phenomenon known as false polarization15,16. If affective alignment around issues proves relatively high, our results could serve as a foundation for tools that promote political compromise and cooperation. If, on the other hand, liberals and conservatives are affectively misaligned over a range of political issues, our evidence would help gauge the prospects of polarization-reduction initiatives. Since emotions motivate political engagement17,18, shape policy preferences10,11,19, and reduce willingness to compromise20,21, significant affective divergence around issues could entrench policy divides. By investigating the emotional divides around political issues, we can better identify and intervene in the mechanisms that fuel partisan conflict.

Measuring affective associations using large-scale text analyses

As political discourse has increasingly moved online, social media and digital news platforms have become primary spaces for political engagement and information consumption22. Understanding how political discourse evolves on these platforms is therefore critical to uncovering the forces that drive public opinion and political polarization. Recent advances in natural language processing (NLP) provide an opportunity to study political discourse on social media and news platforms at scale. Although polling data is valuable, people’s explicit responses to polls can be susceptible to response biases and demand characteristics23. Analyzing natural language reveals how partisans speak about political issues and partisan identities with each other in real-world contexts over extended periods of time.

Word embedding models have been used to quantify the conceptual associations between words from statistical regularities in natural language24. Recent studies have found that conceptual associations reflected in word embedding models mirror deeply entrenched societal biases, such as those related to gender, race, and other cultural dimensions25–31. For example, Caliskan and colleagues25 demonstrated that word embeddings reveal stronger positive associations with European American-sounding names relative to African American-sounding names, echoing patterns of implicit racial bias.

In a similar way, we can apply word embeddings to assess whether words related to political issues (e.g., “firearms,” “abortion”) and partisan identities (e.g., “progressives,” “conservatives”) carry positive or negative valence associations and whether these associations differ depending on whether the text is conservative or liberal-leaning. More recently, large language models (LLMs) have offered a complementary tool to word embedding analysis for analyzing affective content, providing fine-grained judgments about the emotional tone of language in context32. Together, these approaches allow us to quantify affective associations with both political identities and issues, revealing whether affective divergence is primarily centered around partisan groups, political issues, or both.

The present research

In this study, we used the common yardstick of natural language to measure how much partisans are aligned or misaligned in their affective associations of partisan identities and political issues. Specifically, we measured affective alignment by assessing the correlation of affective associations among liberals versus conservatives. For example, if both liberals and conservatives view “police” with similar valence, they are affectively aligned on attitudes towards the police. In contrast, if liberals view the concept of “police” as more negative but conservatives view it as more positive, they are affectively misaligned. Using this framework, we address key questions about how affective divergence manifests in natural language. Do partisans primarily express divergent emotions toward opposing partisan groups, or are these emotional divides also as evident in how they discuss political issues?

We surveyed affective alignment in two distinct contexts: social media and news media. Existing theories point to both news media and social media as potential catalysts of rising political polarization. Some perspectives suggest that social media increases affective polarization because engagement-based social media algorithms amplify negative emotions33,34. Other perspectives claim that news media exacerbate political polarization because news outlets are biased to support conservative or liberal stances on a range of issues, promoting issue polarization35–37. By observing our effects in both settings, we can compare how discourse about partisan identities and political issues plays out across social media communities and news outlets.

Methods

The Reddit word2vec models

The Reddit comment corpora from 2016 to 2021 were downloaded from the Pushshift Reddit dataset38 on 2022/06/23. From this dataset, we first collected a list of subreddits that were labeled as liberal or conservative (-leaning) by Waller and Anderson39. Waller and Anderson calculated the partisan bias scores based on user/subreddit interaction data (i.e., in what subreddits users commented) without considering the semantic content of the comments. As such, our analyses of the semantic content of the comments were not confounded by the computation of the partisan bias score. Following the approach by the authors, we defined conservative subreddits as those with a partisan bias score of 1 standard deviation higher than the mean across all subreddits (e.g., r/Republican, r/The_Donald), and liberal subreddits as those with a partisan bias score of 1 standard deviation lower than the mean (e.g., r/democrats, r/hillaryclinton).

In addition to the partisan bias, we restricted the analyses to subreddits that discuss political topics. To achieve this, we utilized results from Rajadesingan and colleagues40, who estimated the proportion of political comments for each subreddit. This was done with a classifier trained to distinguish between political and non-political comments on Reddit. Utilizing their results, we restricted our analyses to subreddits where at least 50% of the sampled comments were classified as political in nature.

After identifying subreddits that were ideologically biased and political in nature, we collected all comments from the identified subreddits. We preprocessed the comments in multiple steps, including removing comments from bot accounts, detecting collocations, and removing non-alphanumeric characters (see Supplementary Note 1). After preprocessing, we filtered out subreddits with fewer than 10,000,000 word tokens in total or fewer than 30,000 unique word tokens that appeared at least five times. This was to ensure that there was sufficient data to train the word embedding models. This resulted in 69 subreddits listed in Supplementary Table 1, with 28 conservative subreddits and 41 liberal subreddits. The comments from these subreddits were then used to train the word2vec models, separately for each subreddit. There were, in total, 9,438,650,006 word tokens from 294,476,146 comments across 69 subreddits.

The word2vec models were trained using the Gensim41 package (version 4.3.0) in Python (version 3.9.12). There are numerous hyperparameters that affect the resulting embeddings of a word2vec model. To ensure that our results were not unduly affected by the choice of hyperparameters, we trained 64 different word2vec models for each subreddit using a soft grid search across three hyperparameters (context window size, number of negative words sampled, and downsampling rate of frequent words; see Supplementary Note 1). For each word token, the valence score (see “Building, projecting onto, and validating the valence anchored vector” subsection below) was averaged across 64 models for the main analysis of the paper. The results reported in the paper using Reddit models are robust across models with different hyperparameters evaluated individually.

The News word2vec models

We used pre-trained models from Rozado and al-Gharbi42 for the news corpora analysis, downloaded from https://zenodo.org/record/4797464#.Ys5UpYRBzIw. The authors collected news and opinion articles from news outlets’ websites and public repositories. The articles were preprocessed using steps including removing HTML data, markup syntax, and non-alphanumeric characters. After preprocessing, the authors trained word2vec models on news and opinion articles from 2015 to 2019, individually for each news outlet. Analogous to our Reddit models, the Gensim package in Python was used to train the models. Out of the 47 models, we analyzed 38 models trained on a source that the original authors classified as liberal(-leaning) or conservative(-leaning) based on the media bias rating from AllSides.com Media Bias Chart version 1.143. AllSides.com classifies outlets as “Left”, “Lean Left”, “Moderate”, “Lean Right”, and “Right.” We treated outlets rated as “Left” or “Lean Left” as liberal and outlets rated as “Right” or “Lean Right” as conservative. In total, there were 15 conservative news outlets and 23 liberal news outlets, which are listed in Supplementary Table 2. There were, in total, 6,749,781 articles across 38 outlets. Additional details on the News models are reported in the original paper42.

The common vocabulary sets

Since each model was trained on a different corpus, each model had a different vocabulary. Because we compare across these models, we restricted our analyses to word tokens that were common across all models: 14,607 word tokens for the Reddit models and 17,456 word tokens for the News models.
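In code, the common vocabulary reduces to a set intersection; with Gensim 4.x models, each model's vocabulary would be `model.wv.key_to_index` (the helper name here is ours):

```python
def common_vocabulary(vocabularies):
    """Keep only word tokens present in every model's vocabulary.

    `vocabularies` is a list of iterables of word tokens; with Gensim 4.x
    models, each element would be `model.wv.key_to_index`.
    """
    sets = [set(v) for v in vocabularies]
    return set.intersection(*sets)
```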

Building, projecting onto, and validating the valence anchored vector

To calculate the “valence score” of words, we projected vectors of each word onto an “anchored vector”44 corresponding to the valence dimension, following an approach validated in previous work45–47. We first defined two sets of word vectors that correspond to each “pole” of the valence dimension. For example, we selected words such as “pleasant”, “happy”, and “joy” to define a positive valence pole, and words such as “unpleasant”, “unhappy”, and “sad” to define a negative valence pole. The words for each pole were selected by the first author by examining all words in the common vocabulary sets and were discussed, revised, and agreed upon by all authors. The words were selected before any computations involving them were performed. The words defining each pole (33 positive and 40 negative) are listed in Supplementary Table 3.

After finalizing the word lists for each pole, we next calculated the pole vectors. Similar to the approach taken by the distributed dictionary representations (DDR) method48, each pole vector was computed as the average of the normalized word vectors for the words in that set. Using the pole vectors, the valence anchored vector was calculated by subtracting the normalized negative valence pole vector from the normalized positive valence pole vector. Finally, we calculated the valence score of a word as the position of its word vector when projected onto the valence anchored vector, computed as the scalar projection of the word vector onto the valence anchored vector.
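The pole-averaging and scalar-projection steps can be sketched with NumPy as follows. The function names are ours; the inputs are raw word vectors taken from a trained embedding model.

```python
import numpy as np

def normalize(v):
    """Scale a vector to unit length."""
    return v / np.linalg.norm(v)

def pole_vector(word_vectors):
    """Average of the normalized word vectors defining one pole (DDR-style)."""
    return np.mean([normalize(v) for v in word_vectors], axis=0)

def valence_score(word_vector, positive_pole_vectors, negative_pole_vectors):
    """Scalar projection of a word vector onto the valence anchored vector,
    defined as the normalized positive pole minus the normalized negative pole."""
    anchor = (normalize(pole_vector(positive_pole_vectors))
              - normalize(pole_vector(negative_pole_vectors)))
    return float(np.dot(word_vector, anchor) / np.linalg.norm(anchor))
```

By construction, words closer to the positive pole receive higher scores and words closer to the negative pole receive lower scores.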

To validate that the valence scores reflect expected semantic associations, we calculated the valence score of words from the NRC-VAD Lexicon49 that were in the common vocabulary set (Reddit models: 7878 words, News models: 9242 words). We then calculated the correlation (Pearson’s r) of the valence rating of the words given by human participants with the calculated valence score. All tests were done separately for each corpus, and only words that appeared in the common vocabulary set but were not in the set of words used to define the poles were included. In all corpora, the human valence rating and the valence score were significantly correlated (Reddit models: average r = 0.40, all ps < 0.001, Supplementary Figs. 1A and 2; News models: average r = 0.51, all ps < 0.001, Supplementary Figs. 1B and 3; see Supplementary Figs. 2 and 3 for the r and p values for each corpus).

Selection of issue and identity words

To operationalize issue-based and identity-based affective alignment, we first identified words closely associated with political issues and partisan identities. The research team selected seven issue domains commonly examined in studies of U.S. political polarization: abortion, immigration, the Constitution, guns, the LGBTQ+ community, police and criminals, and religion. For partisan identities, we included the two major political groups in the United States: Republicans and Democrats. Consistent with the procedure used to define the valence vector, candidate words for each domain were initially compiled by the first author from the common vocabulary sets. These lists were then reviewed, revised, and finalized through consensus among all authors. The resulting sets comprised 118 issue words (Supplementary Table 4) and 36 identity words (Supplementary Table 5).

Calculating affective alignments between corpora

For each pair of corpora, we calculated the “affective alignment” of issue and identity words by calculating the rank correlation (Spearman’s ρ) of the valence scores of issue and identity words, respectively. When the affective alignment was calculated between corpora pairs with opposing partisan bias (i.e., pairs of one conservative corpus and one liberal corpus), we refer to this as “cross-partisan affective alignment”. Similarly, we refer to affective alignment between corpora pairs that share the partisan bias (i.e., pairs of one conservative corpus and another conservative corpus or of one liberal corpus and another liberal corpus) as “within-partisan affective alignment”.
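With valence scores in hand, affective alignment for one corpus pair is a single rank correlation; a minimal sketch using SciPy (the function name is ours):

```python
from scipy.stats import spearmanr

def affective_alignment(valence_a, valence_b):
    """Rank correlation (Spearman's rho) between the valence scores of the
    same word list computed in two different corpora."""
    rho, _ = spearmanr(valence_a, valence_b)
    return rho
```

The same function applies to issue and identity words alike; only the input word list changes.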

Comparing cross-partisan affective alignments

We compared the cross-partisan affective alignment of issue and identity words using a permutation test. We first calculated the absolute mean difference in cross-partisan affective alignment between issue and identity words. We then randomly swapped the values associated with issue and identity words within each corpus pair and recalculated the absolute mean difference 10,000 times to generate a null distribution. Statistical significance was assessed by testing whether the observed absolute mean difference exceeded the null distribution, with the p-value computed as (1 + number of null values that are the same or larger than the tested value)/10,001. This process was done separately for the Reddit and News models. The p-value calculation method is used in all other permutation tests in this paper.
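A sketch of this label-swapping permutation test, following the p-value formula above. The function name, seeding, and use of NumPy's default generator are our choices, not details from the paper.

```python
import numpy as np

def swap_permutation_test(issue_vals, identity_vals, n_perm=10_000, seed=0):
    """Permutation test on paired issue vs. identity alignment values.

    Randomly swaps the issue/identity labels within each corpus pair and
    compares the observed absolute mean difference against the null,
    with p = (1 + #null >= observed) / (n_perm + 1).
    """
    rng = np.random.default_rng(seed)
    issue_vals = np.asarray(issue_vals, dtype=float)
    identity_vals = np.asarray(identity_vals, dtype=float)
    observed = abs(issue_vals.mean() - identity_vals.mean())
    count = 0
    for _ in range(n_perm):
        swap = rng.random(issue_vals.size) < 0.5   # swap labels per pair
        a = np.where(swap, identity_vals, issue_vals)
        b = np.where(swap, issue_vals, identity_vals)
        if abs(a.mean() - b.mean()) >= observed:
            count += 1
    return (1 + count) / (n_perm + 1)
```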

Comparing within-partisan and cross-partisan affective alignments

We compared the within-partisan and cross-partisan affective alignment using a permutation test. We first calculated the absolute mean difference between within-partisan and cross-partisan affective alignment. We then compared the observed absolute mean difference to a null distribution generated by randomly permuting partisan labels (liberal or conservative) of the corpora and recalculating the metric 10,000 times. This permutation test was done separately for the Reddit and News models, and for issue and identity words. We additionally tested whether the gap between cross-partisan and within-partisan alignment was larger for identity words than for issue words using a permutation test. The null distribution was generated by swapping the identity and issue values within each corpus pair 10,000 times.

Comparing affective alignment of partisan models with pretrained models

Using the same procedures described above, we first computed valence scores for the identity and issue words in three widely used pretrained word embedding models: the word2vec model trained on the Google News corpus24, the GloVe model trained on the Twitter corpus50, and the fastText model trained on Common Crawl and Wikipedia corpora51,52. Only the words appearing in each of the models were used for the valence score calculation. We then calculated the affective alignment between each pretrained model and partisan model by computing the rank correlation between their valence scores for the issue and identity words. Finally, we tested whether the observed absolute mean difference in affective alignment between issue and identity words was statistically greater than expected under the null. Specifically, we used a permutation test where the null distribution was generated by randomly swapping issue and identity values within each partisan-pretrained model pair 10,000 times.

Classifying corpus partisanship using affective alignments

We used a similarity-based classifier53,54 to assess if the valence scores of issue and identity words were sufficient to disambiguate the corpus partisanship. A leave-one-out cross-validation framework was used to calculate the classification accuracy, separately for issue and identity words. For each cross-validation fold, we held out the valence scores of one corpus and calculated the average valence score for each word separately for the remaining liberal and conservative corpora. We then calculated the rank correlation between the valence scores of the held-out model and the liberal and conservative averages. If the correlation was higher with the conservative average, the held-out model was classified as a conservative model, and vice versa. In other words, the held-out model was classified based on the affective alignment with the average valence scores of conservative and liberal models. This process was repeated, holding out a different model in each fold. The classification accuracy was then averaged across all cross-validation folds. Finally, we tested whether the observed classification accuracy was statistically greater than expected under the null. Specifically, we used a permutation test in which the null distribution was generated by randomly permuting corpus labels and recalculating classification accuracy 10,000 times.
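The leave-one-out procedure above can be sketched as follows, assuming valence scores arranged as an (n_corpora, n_words) array and partisan labels as strings (the function name and label strings are ours):

```python
import numpy as np
from scipy.stats import spearmanr

def loo_classification_accuracy(valence, labels):
    """Leave-one-out similarity-based classification of corpus partisanship.

    `valence` is an (n_corpora, n_words) array of valence scores and
    `labels` an array of "lib"/"con" strings. Each held-out corpus is
    assigned to whichever partisan average it rank-correlates with more.
    """
    valence = np.asarray(valence, dtype=float)
    labels = np.asarray(labels)
    correct = 0
    for i in range(len(labels)):
        held_in = np.ones(len(labels), dtype=bool)
        held_in[i] = False
        lib_avg = valence[held_in & (labels == "lib")].mean(axis=0)
        con_avg = valence[held_in & (labels == "con")].mean(axis=0)
        rho_lib, _ = spearmanr(valence[i], lib_avg)
        rho_con, _ = spearmanr(valence[i], con_avg)
        pred = "con" if rho_con > rho_lib else "lib"
        correct += pred == labels[i]
    return correct / len(labels)
```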

Comparing within-partisan and cross-partisan affective disagreements using LLaMA 3.3

In addition to the word embedding models, we used the large language model LLaMA 3.355 to estimate valence ratings for Reddit comments referencing two issue words expected to carry higher valence for conservatives (“religion” and “gun”), two issue words expected to carry higher valence for liberals (“abortion” and “immigrant”), and two identity words (“republican” and “democrat”). LLaMA 3.3 is an instruction-tuned large language model with approximately 70 billion parameters. We used the model and corresponding tokenizer released by Meta (meta-llama/Llama-3.3-70B-Instruct), accessed via Hugging Face (https://huggingface.co). The model was loaded with 8-bit quantization and run with bfloat16 precision. Due to computational constraints, it was not feasible to analyze all comments containing these words in our corpus (which includes over 20 million such comments). Therefore, we sampled 440,033 comments containing one of the target words for analysis, stratified by time period (month) and subreddit (total number of sampled comments: “abortion” = 61,368, “immigrant” = 65,875, “religion” = 69,239, “gun” = 78,321, “republican” = 82,674, “democrat” = 82,556).

For the sampled comments, we used LLaMA 3.3 to estimate the valence rating of the corresponding target word within each comment. We constructed a system prompt that explained the concept of valence and the meaning of a 1–5 Likert-scale rating, and instructed the model to output only a single score (“You are a social psychology research assistant trained to rate the valence toward certain topics that the text is showing. Valence is the pleasantness or unpleasantness of an emotional stimulus, in this case expressed through text. Use a 1–5 Likert scale where 1 = very unpleasant, 2 = unpleasant, 3 = neutral, 4 = pleasant, 5 = very pleasant.\nConsider the valence toward the topic, not just the valence of the overall text.\nProvide only the score.”). This system prompt, along with a user prompt with the target word and the full comment (“What is the valence toward {target word} in the following text?\n {comment}”), was provided as input to the LLaMA model. We extracted the model’s predicted probabilities for each of the tokens “1”, “2”, “3”, “4”, and “5” as the immediate next token. These probabilities were then normalized by dividing each by the total probability across the five tokens. Finally, we computed a weighted average of the ratings, where each rating was weighted by its corresponding normalized probability, as the final valence rating.
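The probability-weighted rating step can be sketched as below. This assumes, as in the procedure above, that "1" through "5" are each single tokens; note that applying a softmax restricted to those five logits is mathematically equivalent to taking the full-vocabulary probabilities and renormalizing over the five rating tokens. The function name and inputs are illustrative.

```python
import numpy as np

def expected_valence_rating(next_token_logits, rating_token_ids):
    """Probability-weighted 1-5 valence rating from next-token logits.

    `rating_token_ids` holds the vocabulary ids of the tokens "1".."5".
    A softmax restricted to these five logits equals renormalizing the
    full-vocabulary probabilities over the five rating tokens.
    """
    logits = np.asarray(next_token_logits, dtype=float)
    scores = logits[np.asarray(rating_token_ids)]
    probs = np.exp(scores - scores.max())   # numerically stable softmax
    probs /= probs.sum()
    return float(np.dot(np.arange(1, 6), probs))
```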

Using the valence ratings for each comment, we averaged the ratings by target word and subreddit to assess the mean valence toward each target word within each subreddit. The observed absolute difference of the mean valence of each target word between liberal and conservative subreddits was compared to a null distribution generated by permuting the partisan labels of the subreddits 10,000 times.

Finally, we calculated the affective disagreement of a word between two subreddits as the absolute difference of the mean valence rating of the word. Analogous to the word2vec analysis, the affective disagreement between subreddits with opposing partisan bias is referred to as cross-partisan affective disagreement, while the affective disagreement between subreddits that share the partisan bias is referred to as within-partisan affective disagreement. We compared within-partisan and cross-partisan affective disagreement using the same permutation procedure as in the word2vec affective alignment analyses. We first calculated the observed absolute mean difference in affective disagreement between within-partisan and cross-partisan pairs. We then randomly permuted subreddit labels 10,000 times and recalculated the metric to generate a null distribution, against which the observed absolute mean difference was compared. This was done separately for each target word. We additionally tested whether the gap between cross-partisan and within-partisan disagreement was greater for the issue word than for the identity word, separately for each issue-identity word pair. This was done by a permutation test, where the null distribution was generated by swapping the values for the identity and issue words 10,000 times.

Ethical framework

The research procedure was approved by the University of Chicago Institutional Review Board.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Results

We used word2vec24 models, a widely used type of word embedding model, as the main tool of analysis. The first set of models (“Reddit models”) was trained on user comments on Reddit between 2016 and 2021. We restricted our analyses to communities (“subreddits”) that were political in nature40 and exhibited a strong liberal (n = 41) or conservative (n = 28) partisan bias according to a previous large-scale analysis of Reddit39. The second set of models (“News models”) was trained on news and opinion articles from 38 print media outlets between 2015 and 2019 by Rozado and al-Gharbi42. The outlets were grouped into conservative (n = 15) and liberal (n = 23) outlets based on the Media Bias rating provided by Allsides.com43. Fig. 1 gives a pictorial overview of the corpora, and Supplementary Tables 1 and 2 list the corpora.

Fig. 1. Overview of the corpora.


A Word embedding models were trained on sixty-nine subreddit communities. Blue and red circles denote the liberal and conservative subreddits, respectively. The x-axis denotes the partisan bias scores calculated by Waller and Anderson39, where positive numbers indicate conservative leanings while negative numbers indicate liberal leanings. Subreddits with a partisan bias score of 1 standard deviation above the mean across all subreddits were categorized as conservative subreddits, while those with a partisan bias score of 1 standard deviation below the mean were categorized as liberal. The y-axis denotes the number of comments in each subreddit. B Word embedding models were trained on thirty-eight news outlets by Rozado and al-Gharbi42. The x-axis denotes the partisan bias scores given by Allsides.com43. Outlets with a partisan bias score less than or equal to -1 were categorized as liberal outlets, and outlets with a partisan bias score greater than or equal to 1 were categorized as conservative outlets. The y-axis denotes the number of articles for each outlet.

Cross-partisan affective alignment is greater for political issues than for partisan identity across media platforms

Word embedding models represent individual words as vectors in a high-dimensional vector space. Within the embedding space of each model, we constructed an “anchored vector”44 corresponding to the valence dimension (negative to positive) by subtracting the average vector for a set of negative words (e.g., “unpleasant,” “unhappy,” and “sad”; see Supplementary Table 3) from the average vector for a set of positive words (e.g., “pleasant,” “happy,” and “joy”). From this valence anchored vector, we calculated a “valence score” for each word that we can compare between the liberal and conservative corpora (see “Methods” and Supplementary Figs. 1–3 for metric calculation and validation). We ran our analyses for two sets of words. The first set consisted of 118 issue words that relate to 7 politically polarized topics: abortion, constitution, guns, immigration, the LGBTQ+ community, police and criminals, and religion (see Supplementary Table 4). The second set consisted of 36 identity words that relate to partisan social identities: Republican and Democrat (see Supplementary Table 5).

Our first set of analyses directly compared the magnitude of identity-based and issue-based cross-partisan affective alignment, or the similarity in the rank order of valence scores of identity or issue words between partisan groups. As an illustration, Fig. 2A, B show the valence scores of a subset of issue and identity words in a conservative (r/Republican) and a liberal (r/democrats) subreddit. Figure 2C shows the correlation between valence scores of all issue and identity words for the two subreddits. Here, issue words have similar valence scores across the two subreddits, indicating relatively high affective alignment. In contrast, valence scores of identity words diverged more between the subreddits, indicating relatively low affective alignment. We used the affective alignment between every pair of subreddits with a different political bias as a measure of cross-partisan affective alignment.
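The alignment measure, the similarity in the rank order of valence scores between two corpora, amounts to a Spearman rank correlation over a shared word list. A minimal sketch, ignoring ties for brevity (in practice, a library routine such as scipy.stats.spearmanr handles ties):

```python
def affective_alignment(scores_a, scores_b, words):
    """Spearman rank correlation of valence scores for a shared word list,
    where scores_a and scores_b map each word to its valence score."""
    def ranks(vals):
        order = sorted(range(len(vals)), key=lambda i: vals[i])
        r = [0] * len(vals)
        for rank, i in enumerate(order):
            r[i] = rank
        return r

    a = ranks([scores_a[w] for w in words])
    b = ranks([scores_b[w] for w in words])
    n = len(words)
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))  # Spearman's formula (no ties)
```

An alignment of 1 indicates identical rank orderings of valence scores across the two corpora; −1 indicates fully reversed orderings.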

Fig. 2. Comparison of the cross-partisan affective alignments.

Fig. 2

Valence score of a subset of A issue words (gray shapes) and B identity words (red and blue shapes) in a conservative (r/Republican; red line) and a liberal (r/democrats; blue line) subreddit. The same scale was used for the x-axis across (A, B). C Correlation of valence scores of issue (left panel) and identity (right panel) words between the two subreddits. Data points indicate individual words. Unique shapes correspond to the words shown in (A, B). Circles indicate all other issue words (left panel) and identity words (right panel). In the right panel, red and blue data points indicate words related to Republicans and Democrats, respectively. D Cross-partisan affective alignment is higher for issue words than identity words. The left panel shows the results for Reddit models (n = 1148 subreddit pairs), and the right panel shows the results for News models (n = 345 outlet pairs). In both panels, the left category shows cross-partisan affective alignments of issue words, and the right category shows those of identity words. Half-violins show Gaussian kernel-density estimates, and the box plots indicate the median (center line), interquartile range (box), and 1.5× interquartile range (whiskers). Each gray line represents a cross-partisan corpus pair. The black circles, connected by the black line, show the mean affective alignment across all cross-partisan corpus pairs. ***p < 0.001.

In the Reddit corpora, cross-partisan affective alignment was, on average, higher for issue words than identity words (average ρissue = 0.69, average ρidentity = 0.37, average difference = 0.32, 95% bootstrap CI [0.31, 0.33], permutation test p < 0.001). The same pattern of results was observed in the News corpora (average ρissue = 0.71, average ρidentity = 0.44, average difference = 0.27, 95% bootstrap CI [0.25, 0.29], permutation test p < 0.001; Fig. 2D). In other words, across both Reddit communities and news outlets, cross-partisan affective alignment was greater for political issues than partisan identities, suggesting that partisans are more emotionally divided in their attitudes toward each other than they are on issues.
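The significance tests reported here can be approximated with a paired permutation test on the per-pair difference between issue and identity alignment. The sketch below assumes one (issue, identity) alignment value per corpus pair and permutes by randomly flipping the sign of each pair's difference; the exact permutation scheme used in the paper may differ (see "Methods").

```python
import random

def paired_permutation_test(issue, identity, n_perm=10000, seed=0):
    """Two-sided permutation p-value for mean(issue - identity) across
    corpus pairs, flipping the sign of each pair's difference at random."""
    rng = random.Random(seed)
    diffs = [i - j for i, j in zip(issue, identity)]
    observed = sum(diffs) / len(diffs)
    count = 0
    for _ in range(n_perm):
        perm = sum(d if rng.random() < 0.5 else -d for d in diffs) / len(diffs)
        if abs(perm) >= abs(observed):
            count += 1
    return count / n_perm
```

Under the null hypothesis of no issue-identity difference, sign flips leave the mean difference unchanged in distribution, so a large observed mean yields a small p-value.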

Partisans were affectively misaligned over political issues on Reddit

Our prior results, however, do not imply that liberals and conservatives were in complete affective alignment on political issues. For example, Fig. 2A shows small deviations where the word “abortion” is relatively more positive in the liberal subreddit, and relatively more negative in the conservative subreddit, consistent with known partisan positions towards the issue56. Are affective associations more aligned within partisan groups than between them? To test this possibility, we compared the within-partisan affective alignment against the cross-partisan alignment separately for each set of words.

We computed the within-partisan affective alignment as the rank correlation between the valence scores of subreddit pairs with the same political leaning. Higher within-partisan affective alignments compared to cross-partisan affective alignments would suggest that subreddits with the same political leaning tended to share similar affective associations. On average, within-partisan affective alignment was higher than cross-partisan affective alignment for both issue words (average ρwithin = 0.72, average ρcross = 0.69, average difference = 0.03, 95% bootstrap CI [0.02, 0.05], permutation test p < 0.001) and identity words (average ρwithin = 0.50, average ρcross = 0.37, average difference = 0.13, 95% bootstrap CI [0.04, 0.23], permutation test p < 0.001; Fig. 3A), suggesting that partisans on Reddit were affectively misaligned over both partisan identities and political issues. However, even though both identity and issue words showed a significant misalignment, the gap between cross-partisan and within-partisan affective alignment was still larger for identity words than for issue words (average difference = 0.10, 95% bootstrap CI [0.08, 0.11], permutation test p < 0.001; Supplementary Fig. 4A).

Fig. 3. Comparison of within-partisan and cross-partisan affective alignments.

Fig. 3

A Within-partisan affective alignment is greater than cross-partisan affective alignment for both issue and identity words in the Reddit models. The left panel shows the results for issue words, and the right panel shows the results for identity words. In both panels, the left category shows within-partisan subreddit pairs (n = 1198), and the right category shows cross-partisan subreddit pairs (n = 1148). Half-violins show Gaussian kernel-density estimates, and the box plots indicate the median (center line), interquartile range (box), and 1.5× interquartile range (whiskers). Each gray circle represents a corpus pair. The black circles, connected by the black line, show the average affective alignment across all corpus pairs. B Within-partisan affective alignment is greater than cross-partisan affective alignment for identity words in the News models. The plot elements are identical to (A), but for the News models. In both panels, the left category shows within-partisan outlet pairs (n = 358), and the right category shows cross-partisan outlet pairs (n = 345). ***p < 0.001.

For the News models, within-partisan affective alignment was higher than cross-partisan affective alignment for identity words (average ρwithin = 0.56, average ρcross = 0.44, average difference = 0.12, 95% bootstrap CI [0.06, 0.20], permutation test p < 0.001), but we did not find a significant difference for issue words (average ρwithin = 0.72, average ρcross = 0.71, average difference = 0.02, 95% bootstrap CI [0.01, 0.03], permutation test p = 0.132; Fig. 3B). Furthermore, the gap between cross-partisan and within-partisan alignment was larger for identity words than for issue words (average difference = 0.10, 95% bootstrap CI [0.08, 0.13], permutation test p < 0.001; Supplementary Fig. 4B).

As words related to the LGBTQ+ community may also represent a political identity with partisan relevance, we repeated all analyses reported above after removing these words from the issue word list. The overall pattern of results remained unchanged: issue-based affective alignment remained consistently higher than identity-based alignment across both Reddit and News models, and we again observed a significant difference between within-partisan and cross-partisan affective alignment for both identity and issue words in the Reddit models, but only found a significant difference for identity words in the News models (see Supplementary Note 2). This indicates that the observed distinction between identity- and issue-based affective alignment is robust to the exclusion of LGBTQ+ community words from the issue category.

Overall, our results indicate that affective alignment was stronger for political issues than for partisan identities, consistent with the cross-partisan affective alignment findings. At the same time, partisans on Reddit still exhibited affective misalignment on political issues, showing that affective associations for political issues can diverge between political groups, though to a lesser extent than those for partisan identities.

Affective alignment between partisan and pretrained models is greater for political issues than for partisan identities

In addition to comparing the magnitude of identity-based and issue-based affective alignment between liberal and conservative corpora, we also examined how aligned affective associations in partisan discourse are with general discourse. Specifically, we compared the valence scores of identity and issue words in partisan corpora to those in large-scale, general corpora that were not derived from explicitly political communities. This analysis allows us to assess whether affective meanings surrounding political identities and issues differ not only between partisans but also from the broader linguistic baseline, and whether this divergence is greater for partisan identity or political issues.

We first calculated the valence score of issue and identity words in three widely used word embedding models trained on large, general corpora that differ in data source and architecture: (1) the word2vec model trained on the Google News corpus24, (2) the GloVe model trained on the Twitter corpus50, and (3) the fastText model trained on Common Crawl and Wikipedia corpora51,52. We then computed the affective alignment between each of these general-language models and the partisan models using the same rank-correlation approach described earlier.

Echoing the results from comparing liberal and conservative corpora, the affective alignment between partisan Reddit models and Google News word2vec models was higher for issue words than for identity words (average ρissue = 0.63, average ρidentity = 0.19, average difference = 0.45, 95% bootstrap CI [0.40, 0.49], permutation test p < 0.001), and the same pattern held when comparing against the GloVe (average ρissue = 0.54, average ρidentity = 0.39, average difference = 0.15, 95% bootstrap CI [0.09, 0.21], permutation test p < 0.001) or fastText models (average ρissue = 0.70, average ρidentity = 0.44, average difference = 0.25, 95% bootstrap CI [0.21, 0.30], permutation test p < 0.001; Fig. 4A). The pattern was also consistent when comparing partisan News models with pretrained models (Google News word2vec: average ρissue = 0.56, average ρidentity = 0.33, average difference = 0.22, 95% bootstrap CI [0.17, 0.28], permutation test p < 0.001; GloVe: average ρissue = 0.45, average ρidentity = 0.11, average difference = 0.35, 95% bootstrap CI [0.28, 0.41], permutation test p < 0.001; fastText: average ρissue = 0.62, average ρidentity = 0.22, average difference = 0.40, 95% bootstrap CI [0.35, 0.45], permutation test p < 0.001; Fig. 4B). These results suggest that valence associations in partisan discourse diverge more strongly from those of general language for partisan identities than for political issues, providing additional evidence that affective alignment is greater for political issues than for partisan identities.

Fig. 4. Comparison of the affective alignments between partisan and pretrained models.

Fig. 4

A Affective alignment of issue words is greater than affective alignment of identity words when comparing partisan Reddit models with pretrained models. The left, middle, and right columns show results using word2vec trained on Google News, GloVe trained on Twitter, and fastText trained on Common Crawl and Wikipedia, respectively, as the pretrained models. In all panels, the left category shows affective alignments of issue words, and the right category shows the affective alignments of identity words. Half-violins show Gaussian kernel-density estimates, and the box plots indicate the median (center line), interquartile range (box), and 1.5× interquartile range (whiskers). Each gray line represents a partisan-pretrained model pair (n = 69 per panel). The black circles, connected by the black line, show the mean affective alignment across all pairs. B Affective alignment of issue words is greater than affective alignment of identity words when comparing partisan News models with pretrained models. The plot elements are identical to (A), but for the News models (n = 38 partisan-pretrained model pairs per panel). ***p < 0.001.

Predicting corpus partisanship from affective alignment of partisan identities and political issues

To further assess the divergence in affective associations between liberal and conservative corpora, we used a classification approach to determine whether affective alignment could reliably indicate the partisanship of a corpus. Specifically, we adopted a similarity-based classification approach53,54, which categorized conservative and liberal corpora based on whether the valence scores of the words in the corpora were more similar to the average conservative or liberal corpora. The classification was performed following a leave-one-out cross-validation framework, separately for issue and identity words.
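A minimal version of this leave-one-out, similarity-based classifier can be sketched as follows. We assume each corpus is represented by its vector of word valence scores and that similarity to each group's average profile is measured by Pearson correlation; the labels and data here are illustrative, and the paper's exact implementation may differ (see "Methods").

```python
import numpy as np

def loo_classify(score_matrix, labels):
    """Leave-one-out classification of corpora by partisanship.

    score_matrix: (n_corpora, n_words) array of valence scores.
    labels: list of "lib" / "con" strings, one per corpus.
    Each held-out corpus is assigned the label of whichever group's
    average valence-score profile it correlates with more strongly.
    Returns the leave-one-out accuracy.
    """
    n = len(labels)
    correct = 0
    for i in range(n):
        sims = {}
        for grp in ("lib", "con"):
            members = [j for j in range(n) if j != i and labels[j] == grp]
            centroid = score_matrix[members].mean(axis=0)
            sims[grp] = np.corrcoef(score_matrix[i], centroid)[0, 1]
        pred = max(sims, key=sims.get)
        correct += (pred == labels[i])
    return correct / n
```

Because the held-out corpus is excluded when computing each group centroid, the accuracy estimate is not inflated by the corpus classifying itself.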

For the Reddit models, classification accuracy was above chance when relying on the valence scores of both identity words (63/69 correct, accuracy = 91.30%, 95% Wilson CI [82.30%, 95.95%], permutation test p < 0.001) and issue words (60/69 correct, accuracy = 86.96%, 95% Wilson CI [77.03%, 92.98%], permutation test p < 0.001; Fig. 5A). This indicates that on Reddit, affective associations of both identity and issue words were sufficiently consistent within partisan groups and distinct between partisan groups such that it was possible to reliably distinguish between the partisan communities.

Fig. 5. Leave-one-out-cross-validation classification results.

Fig. 5

A Affective alignment of both issue and identity words can classify the partisanship of Reddit models. The colored horizontal lines represent the observed accuracies of classifying the Reddit models. The histograms show null distributions from permuting the partisan labels. All histograms are based on 10,000 samples. The lines and histograms for the issue words are shown on the left, while those for identity words are shown on the right. B Affective alignment of identity words can classify the partisanship of News models. The plot elements are identical to (A), but for the News models. ***p < 0.001.

For the News models, classification accuracy was above chance when relying on the valence scores of identity words (33/38 correct, accuracy = 86.84%, 95% Wilson CI [72.67%, 94.25%], permutation test p < 0.001), but was not significantly higher than chance for issue words (25/38 correct, accuracy = 65.79%, 95% Wilson CI [49.89%, 78.79%], permutation test p = 0.082; Fig. 5B). Altogether, the classification results provide converging evidence of identity-based affective misalignment across both Reddit and news media, with evidence of issue-based affective misalignment primarily on Reddit.

LLM-based sentiment ratings reveal stronger affective disagreement around partisan identities than political issues

Static word embeddings, such as word2vec, can capture general affective associations in the corpora, but lack contextual sensitivity for individual sentences. For example, in the sentence “People do not happily celebrate after having an abortion,” positive affect words such as “happily” or “celebrate” appear near “abortion”. This may bias the embedding model toward assigning a positive valence score to “abortion,” even though the intended sentiment is negative. In other words, static embeddings assign a single vector to a word and might not be sensitive to variation in valence that depends on sentence-level meaning. To address this limitation and validate the robustness of our results, we adopted a more context-aware approach using a large language model (LLM). Specifically, we used LLaMA 3.355, an open-source, pre-trained LLM, to estimate the sentiment expressed toward specific identity and issue words within comment contexts. This approach allowed the estimated sentiment to reflect the meaning communicated in each comment, which provides a more precise and less noisy measure of valence than estimates based solely on aggregated co-occurrence patterns. We focused this analysis on Reddit, as the full text of individual comments was available, unlike in the news corpora, where only word embeddings were accessible.

We selected two issue words expected to be more positive to conservatives (“religion” and “gun”), two issue words expected to be more positive to liberals (“abortion” and “immigrant”), and two identity words (“republican” and “democrat”). We then sampled 440,033 comments containing one of these six target words, stratified across time periods and subreddits. Next, we used LLaMA 3.3 to assess the word’s sentiment in each comment by prompting “What is the valence toward {target word} in the following text?” on a 1–5 Likert scale from very unpleasant to very pleasant, and deriving the rating from next-token probabilities (Fig. 6A; see “Methods”). Finally, we averaged ratings across sampled comments by target word and subreddit to assess valence toward each target word within each subreddit.
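Deriving the rating from next-token probabilities reduces to an expectation over the five rating tokens. A minimal sketch, assuming the model's probabilities for the tokens "1" through "5" have already been extracted (the probabilities below are illustrative, not model output):

```python
def valence_rating(token_probs):
    """Probability-weighted valence rating on a 1-5 scale.

    token_probs: dict mapping the rating tokens "1".."5" to the model's
    next-token probabilities. Probabilities are renormalized over the
    five rating tokens before taking the weighted average.
    """
    total = sum(token_probs[str(k)] for k in range(1, 6))
    return sum(k * token_probs[str(k)] / total for k in range(1, 6))
```

Compared to taking the single most probable rating token, this expectation preserves graded information when the model spreads probability mass across adjacent ratings.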

Fig. 6. Comparison of within-partisan and cross-partisan affective disagreements.

Fig. 6

A Procedure for estimating valence ratings from comments. The LLaMA 3.3 model was prompted with instructions, the target word, and the comment text. The model’s predicted probabilities for tokens “1”, “2”, “3”, “4”, and “5” as the immediate next token were extracted. A weighted average of these ratings, using the corresponding probabilities as weights, was calculated as the final valence rating. B Cross-partisan affective disagreement is greater than within-partisan affective disagreement for identity words and a subset of issue words. The left and middle columns show results for issue words, and the right column shows the results for the identity words. In all panels, the left category shows within-partisan subreddit pairs (n = 1198), and the right category shows cross-partisan subreddit pairs (n = 1148). Half-violins show Gaussian kernel-density estimates, and the box plots indicate the median (center line), interquartile range (box), and 1.5× interquartile range (whiskers). The gray circles represent the absolute difference in average valence rating for each corpus pair. The black circles, connected by the black line, show the average value across all corpus pairs. *p < 0.05, ***p < 0.001.

We first tested whether average valence ratings for identity and issue words differed between liberal and conservative subreddits. As expected, the identity word “republican” had higher mean valence in conservative subreddits than in liberal ones (Mcon = 1.90, Mlib = 1.67, average difference = 0.22, 95% bootstrap CI [0.15, 0.29], permutation test p < 0.001), while “democrat” showed the opposite pattern (Mcon = 1.95, Mlib = 2.24, average difference = 0.29, 95% bootstrap CI [0.19, 0.40], permutation test p < 0.001). We observed a similar pattern for two issue words expected to be more positively associated among conservatives (“religion”: Mcon = 2.12, Mlib = 2.03, average difference = 0.09, 95% bootstrap CI [0.00, 0.18], permutation test p = 0.042; “gun”: Mcon = 1.74, Mlib = 1.66, average difference = 0.08, 95% bootstrap CI [0.02, 0.15], permutation test p = 0.012). The word “abortion” had higher valence in liberal subreddits than in conservative ones (Mcon = 1.85, Mlib = 1.96, average difference = 0.11, 95% bootstrap CI [0.03, 0.19], permutation test p = 0.015), also aligning with expectations. However, we did not find a significant difference in valence for “immigrant” between the two groups (Mcon = 2.12, Mlib = 2.16, average difference = 0.05, 95% bootstrap CI [−0.05, 0.14], permutation test p = 0.358). All results described here are displayed in Supplementary Fig. 5. Overall, these results suggest that LLM valence ratings generally reflected expected partisan differences in affective associations, supporting their use as a complementary, context-sensitive measure alongside word embeddings.

We next computed within-partisan and cross-partisan affective disagreement for each word by measuring the average absolute difference in mean valence ratings of subreddit pairs. Consistent with the word embedding analysis, cross-partisan disagreement was significantly greater than within-partisan disagreement for both identity words (“republican”: average |d|within = 0.16, average |d|cross = 0.25, average difference = 0.09, 95% jackknife CI [0.03, 0.15], permutation test p < 0.001; “democrat”: average |d|within = 0.24, average |d|cross = 0.34, average difference = 0.10, 95% jackknife CI [0.03, 0.16], permutation test p < 0.001). Among issue words expected to be more positively valenced among conservatives, we observed small but significant differences (“religion”: average |d|within = 0.20, average |d|cross = 0.22, average difference = 0.02, 95% jackknife CI [−0.02, 0.05], permutation test p = 0.022; “gun”: average |d|within = 0.14, average |d|cross = 0.16, average difference = 0.02, 95% jackknife CI [−0.01, 0.05], permutation test p = 0.028). However, for issue words expected to be more favorable to liberals, we did not find significant differences between within-partisan and cross-partisan disagreement (“abortion”: average |d|within = 0.21, average |d|cross = 0.21, average difference = −0.00, 95% jackknife CI [−0.03, 0.03], permutation test p = 0.966; “immigrant”: average |d|within = 0.24, average |d|cross = 0.23, average difference = −0.01, 95% jackknife CI [−0.02, 0.01], permutation test p = 0.545; Fig. 6B).

Crucially, across all comparisons, identity words showed a significantly larger gap between cross- and within-partisan disagreement than issue words (“republican” - “religion”: average difference = 0.07, 95% bootstrap CI [0.06, 0.08], permutation test p < 0.001; “republican” - “gun”: average difference = 0.07, 95% bootstrap CI [0.06, 0.09], permutation test p < 0.001; “republican” - “abortion”: average difference = 0.09, 95% bootstrap CI [0.07, 0.10], permutation test p < 0.001; “republican” - “immigrant”: average difference = 0.08, 95% bootstrap CI [0.07, 0.10], permutation test p < 0.001; “democrat” - “religion”: average difference = 0.08, 95% bootstrap CI [0.05, 0.10], permutation test p < 0.001; “democrat” - “gun”: average difference = 0.08, 95% bootstrap CI [0.06, 0.10], permutation test p < 0.001; “democrat” - “abortion”: average difference = 0.10, 95% bootstrap CI [0.07, 0.11], permutation test p < 0.001; “democrat” - “immigrant”: average difference = 0.09, 95% bootstrap CI [0.06, 0.11], permutation test p < 0.001; Supplementary Fig. 6). These results replicate our findings from the word embedding analysis and provide convergent support for our conclusion that affective alignment is stronger for political issues than partisan identities.

Discussion

Are partisans in the United States emotionally divided on issues as much as they are divided on how they feel about each other? One possibility is that partisans experience divergent emotions about political issues such as immigration, the police, and abortion, which in turn can exacerbate partisan divisions. Another possibility is that affective alignment is relatively strong in the domain of political issues, at least compared to affective alignment on the basis of partisan identity. Here, we tested these possibilities using large corpora that recorded how people discussed politics in the news and on social media. We found that, across both platform types, conservatives and liberals show greater affective alignment along political issues than along partisan identity. Partisans on Reddit were affectively misaligned along both partisan identity and political issues, while in news media, we only found significant affective misalignment around partisan identities.

Our research builds on a growing body of studies examining political polarization through computational content analysis, which have enabled large-scale analyses of how traits, values, and emotions are expressed in political communication5760. Prior computational content analyses of affective polarization have largely focused on detecting sentiment, either in general or directed toward political groups, revealing how partisans express hostility toward outgroups42,61. However, these approaches have paid less attention to affect directed at substantive issues that polarize partisans. We extend this work by distinguishing between identity-based and issue-based affective alignment and by quantifying each across both social and news media.

Prior research on political issues has primarily examined partisans’ policy positions, or whether they support or oppose specific policies6. However, policy positions do not necessarily capture the underlying emotional responses that partisans have toward these issues, leaving it unclear whether their feelings about issues are as divergent as their opinions. Our findings revealed that across both Reddit and news media, issue words exhibited relatively high cross-partisan affective alignment, challenging common partisan narratives that exaggerate emotional differences. Whereas political pundits often make claims about how Democrats “love abortion” and how Republicans “hate immigrants,” our results suggest a higher degree of affective alignment on these issues. For example, the word “abortion” was associated with relatively negative valence in both conservative and liberal subreddits. Our results thus highlight a potential disconnect between how partisans feel about key issues and how those emotions are portrayed in political rhetoric.

Whereas issue-based affective alignment was stronger than identity-based affective alignment in both social and news media, issue-based affective misalignment was still detectable on social media. Reddit employs user-driven content moderation (e.g., upvotes/downvotes) and engagement-based algorithms (e.g., posts with more comments may be more visible), which can amplify content that resonates emotionally with partisan communities33,34. These features could make partisans engage more frequently with affectively charged content that aligns with their views. The presence of issue-based affective misalignment on Reddit suggests that social media dynamics may broaden the scope of emotional divergence beyond partisan identities to include political issues themselves.

While issue-based affective misalignment was relatively weak, we found strong evidence of identity-based affective misalignment across both platforms. Prior work has demonstrated affective polarization over partisan identities primarily through self-report surveys, where participants rate their feelings toward members of opposing political groups79. Our findings extend this prior work by showing that affective polarization is not just present in self-reported attitudes but also emerges organically in how people discuss politics freely in online spaces and news media. Unlike the private responses to survey questions, Reddit comments and news articles are publicly visible and read by many others. Language on these forums can therefore serve as a mechanism through which affective polarization is further entrenched via social influence and ideological sorting62,63.

The main methodological approach employed in our work, analyzing large-scale text data through word embeddings, offers a valuable framework for examining distinct emotional, ideological, and social dimensions of political polarization. First, our method is low-cost compared to traditional survey methods, which often require extensive resources for data collection and participant recruitment. Second, the same model can also be used to explore different research questions without new rounds of data collection. For example, the method can be extended to study other dimensions of polarization, such as those along moral values64,65 or perceived threats66,67. Thus, the embedding models offer a generalizable method for studying a wide range of corpora across different online forums, social media platforms, and even offline dialogs, providing a common approach for researchers to study polarization across contexts. Furthermore, by training models on corpora from different time periods, our approach can also be used to study how patterns of polarization evolve with cultural changes and important events26. More broadly, insights gained from our computational approach can complement existing research methods by generating hypotheses that can subsequently be tested through surveys and experiments.

To complement our word embedding analysis, we introduced an LLM-based approach to estimate sentiment in context. While embedding-based models are efficient and scalable for analyzing large corpora, they treat word meanings as fixed and independent of context. In contrast, LLMs can incorporate the surrounding linguistic context to better assess how a given word is used in a specific sentence. Due to computational constraints, we limited this analysis to a subset of Reddit comments and a small number of identity and issue words. Nevertheless, we found convergence between the two methods, where both indicated stronger affective alignment around political issues than partisan identities, which enhances the robustness of our conclusions.

Limitations

One limitation of our approach is that it does not capture the underlying reasons behind affective alignment. For example, a liberal and a conservative might both express negative sentiment toward abortion, but the focus of their negativity may differ. A liberal might be attuned to the challenges faced by the mother, whereas a conservative might focus on the embryo and the moral implications of the right to life. These differences in the causes of emotion could make finding consensus difficult, even when affective alignment is present. However, if partisans share similar affective responses, these emotional commonalities could nevertheless serve as a valuable starting point for reducing polarization. For instance, recognizing that liberals do not “love abortion” and that both sides see it as a difficult and emotionally charged issue may help shift discourse away from divisive rhetoric. Similar patterns have been observed in online mental health communities, where Republicans and Democrats, despite exhibiting distinct cultural communication styles, share overlapping expressions of distress and vulnerability that can foster moments of cross-partisan empathy68. Acknowledging shared sentiment, rather than framing the debate in terms of absolute opposition, could shift political discourse toward areas of mutual concern, such as expanding support for pregnant individuals.

Another important limitation of this work concerns the representativeness of online discourse. The affective patterns observed in Reddit and news corpora may not fully reflect “real-life” political emotions or the broader distribution of public sentiment. While Reddit constitutes a substantial portion of online discourse, its structure promotes anonymity, which can encourage both extreme and performative expressions of emotion. Furthermore, its user base tends to be younger and more male than the general population. The inclusion of articles from news outlets provides a useful point of comparison that is less affected by these biases, but both sources still capture only a subset of political communication. Accordingly, our results should be interpreted as reflecting affective dynamics within these specific online contexts rather than as a direct estimate of population-level polarization. As previously mentioned, future research could extend this approach to other platforms, languages, and offline conversational settings to assess how robust these affective patterns are across different communicative and social environments.

Finally, this study focused exclusively on textual data and therefore did not investigate nonverbal or multimodal expressions of affect. Emotional communication in political contexts often involves paralinguistic cues such as vocal tone and facial expressions. While text remains a central channel for political discourse, particularly in online settings, integrating nonverbal and multimodal data could offer a richer picture of affective dynamics. Collecting and annotating such multimodal data presents methodological challenges, including the need for large-scale audiovisual datasets and reliable affective annotation. Nonetheless, extending this framework to video or audio data from political debates, interviews, or campaign events would provide valuable opportunities to examine identity-based and issue-based affective alignment while incorporating nonverbal and multimodal signals of affect.

Conclusion

In sum, our study harnesses the power of language models to study affective alignment at scale. We demonstrate that issue-based affective alignment is significantly stronger than identity-based affective alignment, a finding that may provide an entry point for interventions aimed at reducing the growing political divides in our society. Looking ahead, the methodology and findings presented in this paper provide a foundation for future research on more effective strategies to mitigate polarization, such as interventions that reduce affective polarization by increasing cross-partisan goodwill and empathy69 or by correcting misperceptions of the magnitude of issue-based affective misalignment70. By applying computational techniques to real-world discourse, our work provides a scalable framework for assessing affective alignment and evaluating the potential effectiveness of depolarization strategies.
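To make the projection-based logic of this framework concrete, the following minimal sketch measures word valence by projecting word vectors onto an axis running from negative to positive anchor words. The toy 3-dimensional vectors and the anchor, issue, and identity word lists are illustrative placeholders chosen for this example, not the study's actual embeddings or lexica (which were trained on partisan corpora and used validated valence norms).

```python
import numpy as np

def valence_axis(emb, pos_words, neg_words):
    """Valence axis: mean of positive-anchor vectors minus mean of negative-anchor vectors."""
    pos = np.mean([emb[w] for w in pos_words], axis=0)
    neg = np.mean([emb[w] for w in neg_words], axis=0)
    return pos - neg

def valence_score(emb, word, axis):
    """Cosine similarity between a word's vector and the valence axis."""
    v = emb[word]
    return float(np.dot(v, axis) / (np.linalg.norm(v) * np.linalg.norm(axis)))

# Toy 3-d vectors standing in for embeddings trained on a partisan corpus.
emb = {
    "good":      np.array([1.0, 0.2, 0.0]),
    "wonderful": np.array([0.9, 0.1, 0.1]),
    "bad":       np.array([-1.0, 0.1, 0.0]),
    "terrible":  np.array([-0.9, 0.2, 0.1]),
    "abortion":  np.array([0.4, 0.5, 0.2]),   # example issue word
    "democrat":  np.array([-0.3, 0.6, 0.1]),  # example identity word
}

axis = valence_axis(emb, ["good", "wonderful"], ["bad", "terrible"])
issue_score = valence_score(emb, "abortion", axis)
identity_score = valence_score(emb, "democrat", axis)
```

In the full framework, scores like these would be computed separately from embeddings trained on left-leaning and right-leaning corpora, and affective alignment would then be assessed by comparing the two sets of valence scores across identity versus issue words.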

Supplementary information

Reporting Summary (2.5MB, pdf)

Acknowledgements

We thank Drs. James Evans and Alex Shaw for helpful feedback on the project. We gratefully acknowledge the developers and communities whose efforts produced the software used in this paper (see Supplementary Note 3). This research was supported by seed grants from the University of Chicago Data and Democracy Initiative to Y.C.L. and J.C.J., and by a National Science Foundation Smart and Connected Communities (NSF S&CC) grant #1952050 from the Division of Computer and Network Systems (CNS) to M.G.B. This work was completed in part with resources provided by the University of Chicago Research Computing Center. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Author contributions

N.R. contributed to conceptualization, data curation, formal analysis, investigation, methodology, software, visualization, writing—original draft, and writing—review & editing. J.C.J. contributed to conceptualization, funding acquisition, methodology, supervision, and writing—review & editing. M.G.B. contributed to conceptualization, funding acquisition, methodology, supervision, and writing—review & editing. Y.C.L. contributed to conceptualization, funding acquisition, methodology, supervision, writing—original draft, and writing—review & editing.

Peer review

Peer review information

Communications Psychology thanks the anonymous reviewers for their contribution to the peer review of this work. Primary handling editors: Philipp Schmid and Marike Schiffer. A peer review file is available.

Data availability

The data to replicate the results are available at https://osf.io/w2t6q/.

Code availability

The code to replicate the results is available at https://osf.io/w2t6q/.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Contributor Information

Nakwon Rim, Email: nwrim@uchicago.edu.

Yuan Chang Leong, Email: ycleong@uchicago.edu.

Supplementary information

The online version contains supplementary material available at 10.1038/s44271-026-00430-x.

References

  • 1.Ben Shapiro [@benshapiro]. Democrats love abortion. The age of ‘safe, legal and rare’ is long gone. Now it’s ‘abortionists are the greatest heroes of American civilization.’ Twitter https://x.com/benshapiro/status/1768350649765233129 (2024).
  • 2.Mark Levin: Biden, Democrats hate the Constitution. Fox News https://www.foxnews.com/video/6354185050112 (2024).
  • 3.Berlatsky, N. Trump and Republicans don’t hate gun control because of the NRA. They just love guns. NBC News https://www.nbcnews.com/think/opinion/trump-republicans-don-t-hate-gun-control-because-nra-they-ncna1057841 (2019).
  • 4.Dias, I. At the center of the right-wing revival? Hating immigrants. Mother Jones https://www.motherjones.com/politics/2024/08/natcon-immigration-new-right-jd-vance-nationalism/ (2024).
  • 5.Pew Research Center. Political Polarization in the American Public. https://www.pewresearch.org/politics/2014/06/12/political-polarization-in-the-american-public/ (2014).
  • 6.Abramowitz, A. I. & Saunders, K. L. Is polarization a myth? J. Polit. 70, 542–555 (2008).
  • 7.Iyengar, S., Lelkes, Y., Levendusky, M., Malhotra, N. & Westwood, S. J. The origins and consequences of affective polarization in the United States. Annu. Rev. Polit. Sci. 22, 129–146 (2019).
  • 8.Iyengar, S., Sood, G. & Lelkes, Y. Affect, not ideology: a social identity perspective on polarization. Public Opin. Q. 76, 405–431 (2012).
  • 9.Finkel, E. J. et al. Political sectarianism in America. Science 370, 533–536 (2020).
  • 10.Haidt, J. The emotional dog and its rational tail: a social intuitionist approach to moral judgment. Psychol. Rev. 108, 814–834 (2001).
  • 11.Lodge, M. & Taber, C. S. The Rationalizing Voter (Cambridge University Press, 2013).
  • 12.Westen, D. The Political Brain: The Role of Emotion in Deciding the Fate of the Nation (PublicAffairs, 2008).
  • 13.Meffert, M. F., Guge, M. & Lodge, M. Good, bad, and ambivalent: the consequences of multidimensional political attitudes. in Studies in Public Opinion: Attitudes, Nonattitudes, Measurement Error, and Change (eds Saris, W. E. & Sniderman, P. M.) 63–92 (Princeton University Press, 2018).
  • 14.Gainous, J., Martinez, M. D. & Craig, S. C. The multiple causes of citizen ambivalence: attitudes about social welfare policy. J. Elect. Public Opin. Parties 20, 335–356 (2010).
  • 15.Levendusky, M. S. & Malhotra, N. Misperceptions of partisan polarization in the American public. Public Opin. Q. 80, 378–391 (2016).
  • 16.Wilson, A. E., Parker, V. A. & Feinberg, M. Polarization in the contemporary political and media landscape. Curr. Opin. Behav. Sci. 34, 223–228 (2020).
  • 17.Spring, V. L., Cameron, C. D. & Cikara, M. The upside of outrage. Trends Cogn. Sci. 22, 1067–1069 (2018).
  • 18.Brader, T. Striking a responsive chord: how political ads motivate and persuade voters by appealing to emotions. Am. J. Polit. Sci. 49, 388–405 (2005).
  • 19.Lavine, H., Thomsen, C. J., Zanna, M. P. & Borgida, E. On the primacy of affect in the determination of attitudes and behavior: the moderating role of affective-cognitive ambivalence. J. Exp. Soc. Psychol. 34, 398–421 (1998).
  • 20.Halperin, E. & Pliskin, R. Emotions and emotion regulation in intractable conflict: studying emotional processes within a unique context. Polit. Psychol. 36, 119–150 (2015).
  • 21.Clifford, S. How emotional frames moralize and polarize political attitudes. Polit. Psychol. 40, 75–91 (2019).
  • 22.Zhuravskaya, E., Petrova, M. & Enikolopov, R. Political effects of the internet and social media. Annu. Rev. Econ. 12, 415–438 (2020).
  • 23.Schaffner, B. F. & Luks, S. Misinformation or expressive responding? What an inauguration crowd can tell us about the source of political misinformation in surveys. Public Opin. Q. 82, 135–147 (2018).
  • 24.Mikolov, T., Chen, K., Corrado, G. & Dean, J. Efficient estimation of word representations in vector space. Preprint at https://arxiv.org/abs/1301.3781v3 (2013).
  • 25.Caliskan, A., Bryson, J. J. & Narayanan, A. Semantics derived automatically from language corpora contain human-like biases. Science 356, 183–186 (2017).
  • 26.Garg, N., Schiebinger, L., Jurafsky, D. & Zou, J. Word embeddings quantify 100 years of gender and ethnic stereotypes. Proc. Natl. Acad. Sci. USA 115, E3635–E3644 (2018).
  • 27.Bolukbasi, T., Chang, K.-W., Zou, J. Y., Saligrama, V. & Kalai, A. T. Man is to computer programmer as woman is to homemaker? Debiasing word embeddings. in Advances in Neural Information Processing Systems Vol. 29 (Curran Associates, Inc., 2016).
  • 28.Bailey, A. H., Williams, A. & Cimpian, A. Based on billions of words on the internet, people = men. Sci. Adv. 8, eabm2463 (2022).
  • 29.Lewis, M. & Lupyan, G. Gender stereotypes are reflected in the distributional structure of 25 languages. Nat. Hum. Behav. 4, 1021–1028 (2020).
  • 30.Charlesworth, T. E. S., Yang, V., Mann, T. C., Kurdi, B. & Banaji, M. R. Gender stereotypes in natural language: word embeddings show robust consistency across child and adult language corpora of more than 65 million words. Psychol. Sci. 32, 218–240 (2021).
  • 31.Charlesworth, T. E. S., Caliskan, A. & Banaji, M. R. Historical representations of social groups across 200 years of word embeddings from Google Books. Proc. Natl. Acad. Sci. USA 119, e2121798119 (2022).
  • 32.Rathje, S. et al. GPT is an effective tool for multilingual psychological text analysis. Proc. Natl. Acad. Sci. USA 121, e2308950121 (2024).
  • 33.Brady, W. J., Jackson, J. C., Lindström, B. & Crockett, M. J. Algorithm-mediated social learning in online social networks. Trends Cogn. Sci. 27, 947–960 (2023).
  • 34.McLoughlin, K. L. & Brady, W. J. Human-algorithm interactions help explain the spread of misinformation. Curr. Opin. Psychol. 56, 101770 (2024).
  • 35.Levendusky, M. How Partisan Media Polarize America (University of Chicago Press, Chicago, 2013).
  • 36.DellaVigna, S. & Kaplan, E. The Fox News effect: media bias and voting. Q. J. Econ. 122, 1187–1234 (2007).
  • 37.Martin, G. J. & Yurukoglu, A. Bias in cable news: persuasion and polarization. Am. Econ. Rev. 107, 2565–2599 (2017).
  • 38.Baumgartner, J., Zannettou, S., Keegan, B., Squire, M. & Blackburn, J. The Pushshift Reddit dataset. Proc. Int. AAAI Conf. Web Soc. Media 14, 830–839 (2020).
  • 39.Waller, I. & Anderson, A. Quantifying social organization and political polarization in online platforms. Nature 600, 264–268 (2021).
  • 40.Rajadesingan, A., Budak, C. & Resnick, P. Political discussion is abundant in non-political subreddits (and less toxic). Proc. Int. AAAI Conf. Web Soc. Media 15, 525–536 (2021).
  • 41.Řehůřek, R. & Sojka, P. Software framework for topic modelling with large corpora. in Proc. LREC 2010 Workshop on New Challenges for NLP Frameworks 45–50 (ELRA, 2010).
  • 42.Rozado, D. & al-Gharbi, M. Using word embeddings to probe sentiment associations of politically loaded terms in news and opinion articles from news media outlets. J. Comput. Soc. Sci. 5, 427–448 (2022).
  • 43.Updated Media Bias Chart: Version 1.1. AllSides https://www.allsides.com/blog/updated-allsides-media-bias-chart-version-11 (2019).
  • 44.Teitelbaum, L. & Simchon, A. Neural text embeddings in psychological research: a guide with examples in R. Psychol. Methods 10.1037/met0000768 (2025).
  • 45.Kozlowski, A. C., Taddy, M. & Evans, J. A. The geometry of culture: analyzing the meanings of class through word embeddings. Am. Sociol. Rev. 84, 905–949 (2019).
  • 46.Grand, G., Blank, I. A., Pereira, F. & Fedorenko, E. Semantic projection recovers rich human knowledge of multiple object features from word embeddings. Nat. Hum. Behav. 6, 975–987 (2022).
  • 47.An, J., Kwak, H. & Ahn, Y.-Y. SemAxis: a lightweight framework to characterize domain-specific word semantics beyond sentiment. in Proc. 56th Annual Meeting of the Association for Computational Linguistics Vol. 1 (Long Papers) (eds Gurevych, I. & Miyao, Y.) 2450–2461 (Association for Computational Linguistics, 2018).
  • 48.Garten, J. et al. Dictionaries and distributions: combining expert knowledge and large scale textual data content analysis: distributed dictionary representation. Behav. Res. 50, 344–361 (2018).
  • 49.Mohammad, S. Obtaining reliable human ratings of valence, arousal, and dominance for 20,000 English words. in Proc. 56th Annual Meeting of the Association for Computational Linguistics Vol. 1 (Long Papers) (eds Gurevych, I. & Miyao, Y.) 174–184 (Association for Computational Linguistics, 2018).
  • 50.Pennington, J., Socher, R. & Manning, C. GloVe: global vectors for word representation. in Proc. 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP) 1532–1543 (Association for Computational Linguistics, 2014).
  • 51.Bojanowski, P., Grave, E., Joulin, A. & Mikolov, T. Enriching word vectors with subword information. Transactions of the Association for Computational Linguistics 5, 135–146 (2017).
  • 52.Grave, E., Bojanowski, P., Gupta, P., Joulin, A. & Mikolov, T. Learning word vectors for 157 languages. in Proc. Eleventh International Conference on Language Resources and Evaluation (LREC 2018) (eds Calzolari, N. et al.) (European Language Resources Association, 2018).
  • 53.Yeshurun, Y. et al. Same story, different story: the neural representation of interpretive frameworks. Psychol. Sci. 28, 307–319 (2017).
  • 54.Lyu, Y., Su, Z., Neumann, D., Meidenbauer, K. L. & Leong, Y. C. Hostile attribution bias shapes neural synchrony in the left ventromedial prefrontal cortex during ambiguous social narratives. J. Neurosci. 44, e1252232024 (2024).
  • 55.Grattafiori, A. et al. The Llama 3 herd of models. Preprint at 10.48550/arXiv.2407.21783 (2024).
  • 56.Pew Research Center. Public opinion on abortion. https://www.pewresearch.org/religion/fact-sheet/public-opinion-on-abortion/ (2025).
  • 57.Sterling, J., Jost, J. T. & Bonneau, R. Political psycholinguistics: a comprehensive analysis of the language habits of liberal and conservative social media users. J. Personal. Soc. Psychol. 118, 805–834 (2020).
  • 58.Carrella, F. et al. Different honesty conceptions align across US politicians’ tweets and public replies. Nat. Commun. 16, 1409 (2025).
  • 59.Brady, W. J., Wills, J. A., Jost, J. T., Tucker, J. A. & Van Bavel, J. J. Emotion shapes the diffusion of moralized content in social networks. Proc. Natl. Acad. Sci. USA 114, 7313–7318 (2017).
  • 60.Hackenburg, K., Brady, W. J. & Tsakiris, M. Mapping moral language on US presidential primary campaigns reveals rhetorical networks of political division and unity. PNAS Nexus 2, pgad189 (2023).
  • 61.Garzón-Velandia, D. C. & Pennebaker, J. W. A linguistic strategy to measure negative affective polarization through text content. J. Lang. Soc. Psychol. 44, 881–906 (2025).
  • 62.Törnberg, P. How digital media drive affective polarization through partisan sorting. Proc. Natl. Acad. Sci. USA 119, e2207159119 (2022).
  • 63.Brady, W. J., Crockett, M. J. & Van Bavel, J. J. The MAD model of moral contagion: the role of motivation, attention, and design in the spread of moralized content online. Perspect. Psychol. Sci. 15, 978–1010 (2020).
  • 64.Lakoff, G. Moral Politics: How Liberals and Conservatives Think (University of Chicago Press, 2016).
  • 65.Graham, J., Haidt, J. & Nosek, B. A. Liberals and conservatives rely on different sets of moral foundations. J. Personal. Soc. Psychol. 96, 1029–1046 (2009).
  • 66.Jost, J. T., Glaser, J., Kruglanski, A. W. & Sulloway, F. J. Political conservatism as motivated social cognition. Psychol. Bull. 129, 339–375 (2003).
  • 67.Hibbing, J. R., Smith, K. B. & Alford, J. R. Differences in negativity bias underlie variations in political ideology. Behav. Brain Sci. 37, 297–307 (2014).
  • 68.Pendse, S. R., Rochford, B., Kumar, N. & De Choudhury, M. The role of partisan culture in mental health language online. Proc. ACM Hum.-Comput. Interact. 9, CSCW248:1–CSCW248:42 (2025).
  • 69.Santos, L. A., Voelkel, J. G., Willer, R. & Zaki, J. Belief in the utility of cross-partisan empathy reduces partisan animosity and facilitates political persuasion. Psychol. Sci. 33, 1557–1573 (2022).
  • 70.Voelkel, J. G. et al. Megastudy testing 25 treatments to reduce antidemocratic attitudes and partisan animosity. Science 386, eadh4764 (2024).
