Load Dictionaries

📚 Available Dictionaries

sentibank offers a comprehensive collection of sentiment dictionaries, including 15 original dictionaries and 43 preprocessed versions. To access the original dictionaries, you can use the predefined lexicon identifier following the {NAME}_{VERSION} convention.

For the preprocessed dictionaries, there are two naming conventions:

  1. {NAME}_{VERSION}: This convention indicates that only compulsory processing has been applied to the base lexicon.

  2. {NAME}_{VERSION}_{refined}: This structure specifies additional transformations or discretionary refinements that have been applied to the base lexicon.

For example, NoVAD_v2013_boosted applies arousal-based adjustments to intensify extreme valence values and dampen neutral ones, providing a richness-preserving single score.
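The naming convention above can also be unpacked programmatically. The following is a minimal sketch and not part of sentibank: `parse_identifier` and `LexiconId` are hypothetical names, and the regular expression assumes that every version tag has the form `v` followed by four digits, as in the identifiers listed on this page.

```python
import re
from typing import NamedTuple, Optional


class LexiconId(NamedTuple):
    """Parts of an identifier following {NAME}_{VERSION}[_{refined}]."""
    name: str
    version: str
    refinement: Optional[str]


# Assumption: the version is always "v" plus a four-digit year (e.g. v2013).
_PATTERN = re.compile(r"(?P<name>.+?)_(?P<version>v\d{4})(?:_(?P<refinement>.+))?")


def parse_identifier(identifier: str) -> LexiconId:
    """Split an identifier into name, version and optional refinement."""
    match = _PATTERN.fullmatch(identifier)
    if match is None:
        raise ValueError(f"Unrecognised lexicon identifier: {identifier!r}")
    # A group that did not take part in the match is returned as None.
    return LexiconId(match["name"], match["version"], match["refinement"])


print(parse_identifier("NoVAD_v2013_boosted"))
print(parse_identifier("VADER_v2014"))
```

An identifier with no third component, such as `VADER_v2014`, yields `refinement=None`, which corresponds to the first convention (compulsory processing only).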

The available predefined lexicon identifiers and their corresponding dictionaries are given in the List of Available Dictionaries below.

List of Available Dictionaries

| Sentiment Dictionary | Affiliated Institution (Principal Investigator) | Description | Genre | Domain | Predefined Identifiers (preprocessed) |
| --- | --- | --- | --- | --- | --- |
| AFINN (Nielsen, 2011) | DTU Informatics (Technical University of Denmark) | General-purpose lexicon with sentiment ratings for common emotion words. | Social Media | General | AFINN_v2009, AFINN_v2011, AFINN_v2015 |
| Aigents+ (Raheman et al., 2022) | Autonio Foundation | Lexicon optimised for social media posts related to cryptocurrencies. | Social Media | Cryptocurrency | Aigents+_v2022 |
| ANEW (Bradley and Lang, 1999) | NIMH Center for Emotion and Attention (University of Florida) | Provides normative emotional ratings across pleasure, arousal, and dominance dimensions. | General | Psychology | ANEW_v1999_simple, ANEW_v1999_weighted |
| Dictionary of Affect in Language (DAL) (Whissell, 1989; Whissell, 2009) | Laurentian University | Lexicon designed to quantify pleasantness, activation, and imagery dimensions across diverse everyday English words. | General | General | DAL_v2009_norm, DAL_v2009_boosted |
| Discrete Emotions Dictionary (DED) (Fioroni et al., 2022) | Gallup | Lexicon focused on precisely distinguishing four key discrete emotions in political communication. | News | Political Science | DED_v2022 |
| General Inquirer (Stone et al., 1962) | Harvard University | Lexicon capturing broad psycholinguistic dimensions across semantics, values and motivations. | General | Psychology, Political Science | HarvardGI_v2000 |
| Henry (Henry, 2006) | University of Miami | Lexicon designed for analysing tone in earnings press releases. | Corporate Communication (Earnings Press Releases) | Finance | Henry_v2006 |
| MASTER (Loughran and McDonald, 2011; Bodnaruk, Loughran and McDonald, 2015) | University of Notre Dame | Financial lexicons covering expressions common in business writing. | Regulatory Filings (10-K) | Finance | MASTER_v2022 |
| Norms of Valence, Arousal and Dominance (NoVAD) (Warriner, Kuperman and Brysbaert, 2013; Warriner and Kuperman, 2014) | McMaster University | A lexicon of 14,000 common English lemmas across valence, arousal, and dominance dimensions. | General | Psychology | NoVAD_v2013_norm, NoVAD_v2013_boosted |
| OpinionLexicon (Hu and Liu, 2004) | University of Illinois Chicago | Opinion words tailored for sentiment analysis of product reviews. | Product Reviews | Consumer Products | OpinionLexicon_v2004 |
| SenticNet (Cambria et al., 2010; Cambria, Havasi and Hussain, 2012; Cambria, Olsher and Rajagopal, 2014; Cambria et al., 2016, 2018, 2020, 2022) | Sentic Research Group (Initiative at Massachusetts Institute of Technology, currently maintained by Nanyang Technological University) | Conceptual lexicon providing multidimensional sentiment analysis for commonsense concepts and expressions. | General | General | SenticNet_v2010, SenticNet_v2012, SenticNet_v2012_attributes, SenticNet_v2012_semantics, SenticNet_v2014, SenticNet_v2014_attributes, SenticNet_v2014_semantics, SenticNet_v2016, SenticNet_v2016_attributes, SenticNet_v2016_mood, SenticNet_v2016_semantics, SenticNet_v2018, SenticNet_v2018_attributes, SenticNet_v2018_mood, SenticNet_v2018_semantics, SenticNet_v2020, SenticNet_v2020_attributes, SenticNet_v2020_mood, SenticNet_v2020_semantics, SenticNet_v2022, SenticNet_v2022_attributes, SenticNet_v2022_mood, SenticNet_v2022_semantics |
| SentiWordNet (Esuli and Sebastiani, 2006; Baccianella, Esuli and Sebastiani, 2010) | Institute of Information Science and Technologies (Consiglio Nazionale delle Ricerche) | Lexicon associating WordNet synsets with positive, negative, and objective scores. | General | General | SentiWordNet_v2010_simple, SentiWordNet_v2010_logtransform |
| VADER (Hutto and Gilbert, 2014) | Georgia Institute of Technology | General-purpose lexicon optimised for social media and microblogs. | Social Media | General | VADER_v2014 |
| WordNet-Affect (Strapparava and Valitutti, 2004; Valitutti, Strapparava and Stock, 2004; Strapparava, Valitutti and Stock, 2006) | Institute for Scientific and Technological Research (Fondazione Bruno Kessler) | Hierarchically organised affective labels providing a granular emotional dimension. | General | Psychology | WordNet-Affect_v2006 |

📖 Load Preprocessed Dictionaries

The sentibank.archive module provides access to 15 original and 43 preprocessed sentiment dictionaries. To load a preprocessed dictionary in dict format:

from sentibank import archive

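# Create the loader, then fetch the lexicon by its predefined identifier
load = archive.load()
vader = load.dict("VADER_v2014")  # dict mapping each entry to its sentiment score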
{'$:': -1.5,
 '%)': -0.4,
 '%-)': -1.5,
 '&-:': -0.4,
 '&:': -0.7,
 "( '}{' )": 1.6,
 '(%': -0.9,
 ...}
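Because the preprocessed lexicon is a plain dict, it can be queried with ordinary dictionary operations. The sketch below is illustrative only: `score_tokens` is a hypothetical helper rather than a sentibank function, whitespace tokenisation and summing the scores are simplifying assumptions, and the small `sample` dict copies a few entries from the output above so that the snippet runs without the package installed.

```python
from typing import Iterable, Mapping


def score_tokens(tokens: Iterable[str], lexicon: Mapping[str, float]) -> float:
    """Sum the scores of tokens found in the lexicon; unknown tokens add 0."""
    return sum((lexicon.get(token, 0.0) for token in tokens), 0.0)


# A few entries copied from the VADER_v2014 output shown above.
# In practice the full dict would come from archive.load().dict("VADER_v2014").
sample = {"$:": -1.5, "%)": -0.4, "&:": -0.7}

# Naive whitespace tokenisation (an assumption made for this sketch).
total = score_tokens("price dropped $: %)".split(), sample)
print(round(total, 2))
```

Tokens that are absent from the lexicon ("price" and "dropped" here) contribute nothing, so only the two emoticons affect the total.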

📘 Load Original Dictionaries

To load the original (unprocessed) dictionary as a pd.DataFrame:

from sentibank import archive

# Create the loader, then fetch the unprocessed lexicon as a DataFrame
load = archive.load()
afinn = load.origin("AFINN_v2015")
        lexicon  score
0       abandon     -2
1     abandoned     -2
2      abandons     -2
3      abducted     -2
4     abduction     -2
...         ...    ...
3377      yucky     -2
3378      yummy      3
3379     zealot     -2
3380    zealots     -2
3381    zealous      2

3382 rows × 2 columns
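If a mapping from entry to score is needed from an original dictionary, the DataFrame can be converted with standard pandas operations. The sketch below assumes the two column names shown in the AFINN_v2015 output above (`lexicon` and `score`) and builds a small stand-in frame, `afinn_sample`, from a few of those rows so that it runs without sentibank. Other original dictionaries may use different columns.

```python
import pandas as pd

# Stand-in for load.origin("AFINN_v2015"), using rows from the output above.
afinn_sample = pd.DataFrame(
    {
        "lexicon": ["abandon", "abandoned", "yummy", "zealous"],
        "score": [-2, -2, 3, 2],
    }
)

# Build an entry -> score mapping from the two columns.
afinn_dict = dict(zip(afinn_sample["lexicon"], afinn_sample["score"]))

# Ordinary DataFrame filtering also applies, e.g. keeping positive entries only.
positive = afinn_sample[afinn_sample["score"] > 0]
print(positive["lexicon"].tolist())  # ['yummy', 'zealous']
```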