Dictionary of Affect in Language (DAL)

A sentiment lexicon designed to quantify core affective dimensions across diverse everyday English words 🗣️.

DAL summary

Composition:

  • Approximately 8.7k words, each rated on pleasantness, activation, and imagery using a 3-point scale.

  • The processed dictionary contains 7,858 entries (7,825 unigrams and 33 bigrams), each with a composite sentiment score incorporating pleasantness and activation.

Creation Methodology:

  • Words sourced from the Brown corpus of everyday English

  • Rated by 100+ student volunteers in psychology courses

  • Volunteers assessed random sets of 100 words at a time

Evaluation: Whissell (2009) validated the dictionary using four methods:

  1. Connotative Validity: Rated by 100+ student volunteers in psychology courses. Participants were presented with random sets of 100 words at a time. They rated each word on three dimensions using a 3-point scale: pleasantness (1 being unpleasant, 3 being pleasant), activation (1 being passive, 3 being active), and imagery (1 indicating difficulty envisioning, 3 indicating easy to envision).

  2. Discriminant Validity: Correlations among the three dictionary dimensions (Pleasantness, Activation, Imagery) were statistically significant (p < .01) but very weak. The Pleasantness-Activation correlation was 0.097, Pleasantness-Imagery was 0.062, and Activation-Imagery was 0.049. This suggests that each dimension captures largely distinct information.

  3. Concordance Validity: DAL ratings were compared to ANEW. The correlation between DAL Pleasantness and ANEW Valence was 0.83. The DAL Activation and ANEW Arousal correlation was 0.60. (Note: Pleasantness-Valence and Activation-Arousal capture the same underlying emotions). Correlations were stronger for Pleasantness than Activation, likely because Pleasantness is more salient.

  4. Contextual Validity: The matching rate (word overlap) was tested using 16 additional ~100-word text samples from print sources (news, magazines, books and song lyrics). The original DAL had a 19% average match rate versus 90% for the revised DAL. This 90% match rate was also seen in a large and diverse natural language corpus (n=348,000 words) characterised by Whissell (1998).
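The matching rate described under Contextual Validity can be illustrated with a minimal sketch. It assumes the rate is simply the share of word tokens in a text that are found in the dictionary; the tokenizer, the function name `match_rate`, and the toy lexicon are illustrative assumptions, not Whissell's procedure or part of sentibank.

```python
import re

def match_rate(text, lexicon):
    """Share of word tokens in `text` that appear in `lexicon` (a set of lowercase words)."""
    # Assumed tokenizer: lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    matched = sum(1 for token in tokens if token in lexicon)
    return matched / len(tokens)

# Toy lexicon and sample, made up for illustration.
toy_lexicon = {"the", "cat", "sat", "on", "mat"}
sample = "The cat sat on the red mat."
rate = match_rate(sample, toy_lexicon)
print(f"{rate:.0%}")  # 6 of 7 tokens matched
```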

Usage Guidance: A useful baseline sentiment measure for everyday English words. It can be used as a general-purpose lexicon, or paired with domain-specific lexicons for tailored analyses. Access the pleasantness/activation-based scores via sentibank.archive.load().dict("DAL_v2009_norm"), or the pleasantness/imagery-based scores via sentibank.archive.load().dict("DAL_v2009_boosted").

📋 Introduction

Whissell’s Dictionary of Affect in Language (DAL) was originally designed to quantify the pleasantness and activation of 4,500 emotional words. However, Whissell (2009) revised DAL to increase its applicability to samples of everyday English words, not just emotional ones.

📚 Original Dictionary

The words were originally taken from the Brown corpus, a collection designed to capture everyday uses of natural language. To select the words, Whissell (2009) applied two criteria: the words had to appear at least 10 times in the entire corpus, and they had to be present in two or more distinct samples within the corpus.
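The two selection criteria can be sketched as follows. This is a minimal illustration that assumes each corpus sample is already tokenized into a list of lowercase words; `select_words` and the toy samples are hypothetical and are not taken from Whissell (2009) or from sentibank.

```python
from collections import Counter

def select_words(samples, min_count=10, min_samples=2):
    """Return words occurring at least `min_count` times overall
    and in at least `min_samples` distinct samples."""
    total = Counter()   # occurrences across the whole corpus
    spread = Counter()  # number of distinct samples containing the word
    for tokens in samples:
        total.update(tokens)
        spread.update(set(tokens))
    return {
        word
        for word in total
        if total[word] >= min_count and spread[word] >= min_samples
    }

# Toy corpus of two samples, made up for illustration.
toy_samples = [
    ["time"] * 6 + ["river"] * 12,
    ["time"] * 5 + ["idea"] * 3,
]
selected = select_words(toy_samples)
print(selected)
```

In this toy corpus, "river" is frequent enough but appears in only one sample, and "idea" is too rare, so only "time" passes both criteria.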

After applying these criteria, there was a final dataset of 8,742 words. Over 100 university students, enrolled in psychology courses, then evaluated these words on three dimensions: pleasantness, activation, and imagery. Pleasantness refers to the perceived positivity or negativity of a word. Activation indicates the perceived arousal or energy level. Imagery relates to how easily people could form a mental picture of the word.

Unlike the early version of DAL (Whissell, 1989), which used a 7-point scale, the revised DAL (Whissell, 2009) adopted a simplified 3-point scale. Prior research by Preston and Colman (2000, cited in Whissell, 2009, p.511) showed some gains in reliability and validity as the number of response categories increased, but the improvements were small[1]. Since these marginal gains did not outweigh the faster administration and simplicity of fewer scale points, Whissell opted for a 3-point scale to simplify dictionary creation and use.

Participants were presented with random sets of 100 words at a time. They rated each word on three dimensions using a 3-point scale: pleasantness (1 being unpleasant, 3 being pleasant), activation (1 being passive, 3 being active), and imagery (1 indicating difficulty envisioning, 3 indicating easy to envision). Volunteers were told to avoid the middle category unless confident a word belonged there. On average, each word received ratings from nine different volunteers for pleasantness and activation, and from five for imagery. The DAL integrated mean values derived from the assessments provided by volunteers, where the continuous scores fell within the range of [1, 3].
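As a small illustration of how mean ratings on the 3-point scale yield continuous scores in [1, 3], the sketch below averages a set of raw ratings per dimension. The ratings and the function name `average_ratings` are made up for this example and do not come from the DAL data.

```python
from statistics import mean

# Hypothetical raw ratings (1, 2, or 3) from several volunteers for one word.
raw_ratings = {
    "sunshine": {
        "pleasantness": [3, 3, 2, 3],
        "activation": [2, 3, 2, 1],
        "imagery": [3, 3, 3],
    },
}

def average_ratings(raw):
    """Average each dimension's ratings per word."""
    return {
        word: {dim: mean(scores) for dim, scores in dims.items()}
        for word, dims in raw.items()
    }

averaged = average_ratings(raw_ratings)
print(averaged["sunshine"])
```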

from sentibank import archive 

load = archive.load()
DAL = load.origin("DAL_v2009")
DAL (Whissell, 1989; Whissell, 2009)
word pleasantness activation imagery

🧹 Processed Dictionary

First-Pass Processing

Since the dictionary was not limited to emotional words, DAL contained many function words such as prepositions (e.g., “when”), conjunctions (e.g., “and”), negations (e.g., “not”), and pronouns (e.g., “it”). These words had significantly lower imagery scores, indicating they were more abstract (Whissell, 2009, p.513). Words with an imagery score of 1.0 were unlikely to convey emotional meaning, so the 885 words meeting this criterion were removed, leaving 7,858 words in the lexicon.
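The first-pass filter can be sketched as below. The rows mirror the column names of the original table (word, pleasantness, activation, imagery), but the values are invented for illustration, and `drop_lowest_imagery` is a hypothetical helper rather than a sentibank function.

```python
# Hypothetical rows; the ratings are made up and not taken from DAL.
rows = [
    {"word": "and", "pleasantness": 1.8, "activation": 1.6, "imagery": 1.0},
    {"word": "garden", "pleasantness": 2.7, "activation": 1.9, "imagery": 3.0},
    {"word": "idea", "pleasantness": 2.4, "activation": 2.0, "imagery": 1.4},
]

def drop_lowest_imagery(rows, floor=1.0):
    """Keep only rows whose imagery score is above the scale minimum."""
    return [row for row in rows if row["imagery"] > floor]

kept = drop_lowest_imagery(rows)
print([row["word"] for row in kept])  # ['garden', 'idea']
```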

Two different processed dictionaries have been compiled based on this reduced Dictionary of Affect in Language (DAL):

  1. DAL_v2009_norm uses a composite sentiment score derived from the pleasantness and activation dimensions, drawing on Whissell’s (2009) conceptualization of a two-dimensional affective space.

  2. DAL_v2009_boosted lexicon takes a more experimental approach, exploring an alternative representation of sentiment scores based on pleasantness and imagery dimensions. This representation draws on findings from Warriner, Kuperman and Brysbaert (2013) showing correlations between valence and imageability. It also implements the selective sentiment boosting method proposed by Hutto and Gilbert (2014).

Based on prior research on affective spaces (Russell, 1978; Posner, Russell and Peterson, 2005; cited in Whissell, 2009, p.510), Whissell highlighted pleasantness and activation as the two primary dimensions. The author suggested that pleasantness and activation scores define point estimates of the emotional undertones of language in a two-dimensional space. In other words, these estimates can be depicted as vectors, with vector length indicating strength (see Fig.1, Whissell, 2009, p.519).

In line with this concept, DAL_v2009_norm represents an overall sentiment score using a vector norm that incorporates both pleasantness and activation dimensions. To standardise the vector norm values, min-max scaling is applied to get sentiment scores ranging [-4, 4].
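A minimal sketch of this idea is shown below. The text does not spell out exactly how the norm is computed, so the sketch makes two loudly labeled assumptions: it uses the Euclidean norm of the raw (pleasantness, activation) ratings, and it applies min-max scaling over the words at hand to map the norms onto [-4, 4]. The rows and function names are illustrative, not sentibank's implementation.

```python
import math

def min_max_scale(values, lo=-4.0, hi=4.0):
    """Linearly map `values` so the smallest becomes `lo` and the largest `hi`."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        # Degenerate case: all values equal, so place them at the midpoint.
        return [(lo + hi) / 2 for _ in values]
    return [lo + (v - vmin) * (hi - lo) / (vmax - vmin) for v in values]

def norm_scores(rows):
    """Assumed composite: Euclidean norm of (pleasantness, activation), then min-max scaled."""
    norms = [math.hypot(row["pleasantness"], row["activation"]) for row in rows]
    scaled = min_max_scale(norms)
    return {row["word"]: score for row, score in zip(rows, scaled)}

# Hypothetical ratings on the [1, 3] scale, made up for illustration.
toy_rows = [
    {"word": "gloom", "pleasantness": 1.1, "activation": 1.3},
    {"word": "walk", "pleasantness": 2.1, "activation": 2.0},
    {"word": "party", "pleasantness": 2.9, "activation": 2.8},
]
scores = norm_scores(toy_rows)
print(scores)
```

With this construction the word with the shortest vector maps to -4 and the one with the longest vector maps to 4; everything in between is placed linearly.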

Compared to the more straightforward DAL_v2009_norm, which is based solely on Whissell’s work, DAL_v2009_boosted combines insights from two additional papers in an attempt to enrich the sentiment encoding. Warriner, Kuperman and Brysbaert (2013, pp.1197-1199) reported correlations of valence, arousal, and dominance with semantic variables such as imageability. Their valence and arousal align with DAL’s pleasantness and activation. As imageability increased from a rating of 5 on a 7-point scale, words became more positive (higher valence). This suggests the possibility of representing sentiment scores based on pleasantness and imagery.

However, the association between imageability and valence was positive only for imageability ratings of 5 and above. Following Hutto and Gilbert (2014), we therefore selectively “boost” the sentiment: a weighted increment of 0.293 is added to or subtracted from the scaled pleasantness score (range [-4, 4]), depending on the imagery rating rescaled to [1, 7]. For words with scaled imagery ≥5 and <5.65, 0.264 (= 0.9 × 0.293) is added/subtracted. For scaled imagery ≥5.65 and <6.3, 0.278 (= 0.95 × 0.293) is added/subtracted. For scaled imagery ≥6.3, the full 0.293 is added/subtracted. The resulting scores span [-4.293, 4.293] and are rescaled to [-4, 4]. Note that this representation has not been validated as thoroughly as DAL_v2009_norm. Use it with discretion, for example when imagery levels are particularly relevant to your analysis.
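The boosting rule can be sketched as follows. The thresholds and weights come from the description above; two points are assumptions of this sketch rather than statements from the source: the increment is added for positive pleasantness scores and subtracted for negative ones (a score of exactly 0 is left unchanged), and the final rescaling is a simple linear factor of 4 / 4.293. The function names are hypothetical.

```python
BOOST = 0.293

def boost_weight(imagery_7pt):
    """Weight applied to the 0.293 increment, given imagery rescaled to [1, 7]."""
    if imagery_7pt >= 6.3:
        return 1.0
    if imagery_7pt >= 5.65:
        return 0.95
    if imagery_7pt >= 5.0:
        return 0.9
    return 0.0  # below 5: no boost

def boosted_score(pleasantness_scaled, imagery_7pt):
    """Boost a pleasantness score in [-4, 4], then rescale back to [-4, 4]."""
    increment = boost_weight(imagery_7pt) * BOOST
    if pleasantness_scaled > 0:
        boosted = pleasantness_scaled + increment  # assumed: push positives up
    elif pleasantness_scaled < 0:
        boosted = pleasantness_scaled - increment  # assumed: push negatives down
    else:
        boosted = pleasantness_scaled
    # Assumed linear rescaling from [-4.293, 4.293] to [-4, 4].
    return boosted * 4.0 / (4.0 + BOOST)

print(boosted_score(2.0, 3.0))   # low imagery: no boost, only rescaled
print(boosted_score(2.0, 5.2))   # boosted by 0.9 * 0.293 before rescaling
print(boosted_score(-4.0, 6.5))  # full boost on a negative score
```

Because every score is rescaled, words without a boost end up slightly closer to zero than their original pleasantness score, while strongly imageable words at the extremes still reach ±4.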

Note

[1] When the number of response categories increased from 2 to 7, reliability rose only from .88 to .93, and validity from .83 to .87.