Henry#
Financial sentiment dictionary designed for analysing tone in earnings press releases.#
Henry summary
Composition:
189 unigram entries (104 positive and 85 negative)
Binary Class (positive, negative)
Creation Methodology:
Compiled from 1,366 earnings press releases.
Selected words based on contextual analysis, focusing on directional collocates.
Evaluation: Henry (2006) validated the dictionary in two stages. First, contextual analysis showed that 77-100% of the left collocates of upward directional words and 68-89% of the left collocates of downward directional words matched the assumed polarity. This established the reliability of including “directional” words (e.g. “increased” and “decreased”), suggesting that they do convey the intended positive and negative sentiment. Second, using the validated Positive and Negative word lists, the author investigated whether capital market behaviour is influenced by the tone and other stylistic attributes of earnings press releases. Cumulative abnormal returns (CARs) were defined as the accumulated return in excess of the CRSP equal-weighted market portfolio over a 3-day event window from t-1 to t+1, where t is the earnings release date.
Linear regression analysis demonstrated a significant positive association between CARs and Tone (p=0.02). Tone is calculated as the normalised difference between positive and negative words in the earnings release, capturing the relative balance of sentiment. This approach, though seemingly simplistic, has shown utility in prior research (Abrahamson and Amir, 1996; Clatworthy and Jones, 2003; Hussainey et al., 2003; Uang et al., 2005; cited in Henry, 2006, pp.14-15).
This relationship persisted when controlling for Words_AL (length of the text), indicating that Tone impacts market response independently of other stylistic attributes in earnings press releases. The second analysis added a level of rigour, providing confidence that the sentiment dictionary is well-suited for capturing sentiment polarity in a financial context.
Usage Guidance: Tailored for sentiment analysis in earnings press releases. Excellent starting point with demonstrated reliability in a financial context. Access the processed dictionary via sentibank.archive.load().dict("Henry_v2006").
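The Tone measure described in the evaluation above can be sketched as follows. This is a minimal illustration that assumes Tone equals (positive - negative) / (positive + negative) word counts; the exact normalisation is an assumption here. The two small word sets contain only the ten directional words discussed on this page and stand in for the full lists, and the function name `tone` is hypothetical.

```python
import re

# Illustrative subset only: the ten directional words discussed on this page.
# The full dictionary has 104 positive and 85 negative entries.
POSITIVE = {"increased", "increase", "growth", "more", "up"}
NEGATIVE = {"decreased", "decrease", "down", "less", "lower"}


def tone(text: str) -> float:
    """Normalised difference between positive and negative word counts.

    Assumption: Tone = (pos - neg) / (pos + neg).
    Returns 0.0 when the text contains no dictionary word.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    pos = sum(token in POSITIVE for token in tokens)
    neg = sum(token in NEGATIVE for token in tokens)
    total = pos + neg
    return (pos - neg) / total if total else 0.0


sample = "Revenue increased and margins were up, but earnings were lower."
score = tone(sample)  # two positive hits, one negative hit
```

Under this assumed formula the score lies between -1 (only negative words) and +1 (only positive words), so texts of different lengths remain comparable.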
Introduction#
Earnings press releases are important disclosures in which companies communicate financial performance to investors. Henry (2006) developed a financial sentiment dictionary to examine whether textual tone and style in earnings announcements influence investor decisions. Henry’s (2006) methodology distinguishes itself by disambiguating directional terminology based on co-occurring context in order to validate polarity assumptions.
Original Dictionary#
Henry (2006) examined earnings press releases issued by firms, rather than annual report disclosures or press coverage. This was mainly because firms can strategically choose the tone of an earnings press release (Henry, 2006, p.8). For instance, firms may emphasise favourable aspects, compare against selective benchmarks, or use language with greater positive emotional affect.
The sample included 1,366 annual releases from 562 technology and telecommunications firms between 1998 and 2002, sourced from Factiva and LexisNexis Academic. This industry and time period were chosen because of the stock market uncertainty involved, a setting in which investors rely more on non-financial disclosures (Amir and Lev, 1996, cited in Henry, 2006, p.17).
Though not explicitly stated, the Positive_Tone and Negative_Tone word lists appear to have been compiled from the press releases themselves. Synonymy and polysemy were likely not major issues because the narrow domain limits ambiguity. For example, in earnings press releases the word “net” is most likely used as the antonym of “gross” rather than in any other sense. Thus, disambiguating affect rather than sense was the priority, specifically distinguishing positive and negative “directional” words based on the financial items they are associated with.
Directional words like “increased” and “decreased” can be ambiguous in earnings releases. For example, “increased expenses” has a negative tone despite the positive word “increased.” To disambiguate directional words, Henry (2006) examined the context of each occurrence in the 1,366 press releases. Specifically, Henry (2006) measured the percentage of times each word appeared with “inherently desirable” (e.g. revenue) or “inherently undesirable” (e.g. expenses) financial items. These items were selected from common equity valuation models in Stowe et al. (2002, cited in Henry, 2006, p.34), such as the dividend discount model or free cash flow models. Using inputs from established valuation models provides an objective way to classify items as “desirable” or “undesirable” based on their relationship with firm value. For instance, under these models, all else being equal, increased revenue contributes directly to a higher firm valuation, while increased expenses detract from firm value. Relying on standardised models helped avoid subjective assessments of whether a financial item is inherently good or bad for firm valuation.
To determine whether each directional word is generally used in a positive or negative sense, Henry (2006) obtained the “collocates” of each directional term in the tone thesaurus by analysing a ±3 word window. Empirical evidence shows that people can identify the sense of a word from a relatively narrow collocate horizon, i.e. a window of ±2 words of context (Miller and Leacock, 2000, cited in Henry, 2006, p.33). Henry (2006) used a slightly wider horizon of ±3 words to capture more context and allow more robust affect disambiguation. The approach identifies the percentage of times each directional term co-occurs with inherently desirable or undesirable financial items in the corpus of earnings releases. This contextual analysis mirrors statistical disambiguation methods that rely on the strong correlation between words and their surrounding textual features (Manning and Schütze, 1999). Because word sense is relatively unambiguous in the narrow sublanguage of earnings releases, the study focuses on disambiguating affect rather than meaning.
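The windowed collocate analysis can be sketched roughly as below. This is only an illustration of the idea, not Henry's (2006) actual procedure: the item sets `DESIRABLE` and `UNDESIRABLE`, the tokeniser, and both function names are assumptions made for the example, and sentence boundaries are ignored for simplicity.

```python
import re
from collections import Counter

# Hypothetical item lists for illustration; Henry (2006) derived the actual
# items from equity valuation models.
DESIRABLE = {"revenue", "revenues", "sales", "earnings"}
UNDESIRABLE = {"expenses", "costs", "losses"}


def collocates(tokens, target, window=3):
    """Yield (side, word) for each word within `window` positions of `target`."""
    for i, token in enumerate(tokens):
        if token != target:
            continue
        for word in tokens[max(0, i - window):i]:      # L3 to L1
            yield "left", word
        for word in tokens[i + 1:i + 1 + window]:      # R1 to R3
            yield "right", word


def positive_share(text, target, window=3):
    """Per side, the share of non-neutral collocates that are desirable items."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter()
    for side, word in collocates(tokens, target, window):
        if word in DESIRABLE:
            counts[side, "pos"] += 1
        elif word in UNDESIRABLE:
            counts[side, "neg"] += 1
        # any other word is treated as a neutral collocate and skipped
    shares = {}
    for side in ("left", "right"):
        nonneutral = counts[side, "pos"] + counts[side, "neg"]
        shares[side] = counts[side, "pos"] / nonneutral if nonneutral else None
    return shares


text = "Revenues increased sharply. Sales increased while expenses increased too."
shares = positive_share(text, "increased")
```

A share well above one half on both sides would support treating the target word as positive; a side with no non-neutral collocates returns `None` rather than a misleading zero.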
In this examination, of the 9,682 left collocates examined for “increased,” 83% of non-neutral uses related to desirable items such as revenues (695 times). Similarly, of the 5,457 right collocates for “increased”, 66% of non-neutral uses conveyed positive affect, commonly through items such as revenues (143 times) and sales (98 times). Across all upward directional terms, 77-100% of left collocates and 66-93% of right collocates indicated a positive context. Across all downward directional terms, 68-89% of left collocates and 45-94% of right collocates indicated a negative context. The full results are shown in the tables below.
Positive: upward directional words | L3 to L1: Total examined | L3 to L1: Non-neutral | L3 to L1: Non-neutral collocates consistent with positive affect | R1 to R3: Total examined | R1 to R3: Non-neutral | R1 to R3: Non-neutral collocates consistent with positive affect |
---|---|---|---|---|---|---|
INCREASED | 9682 | 3118 | 2602 (83%) | 5457 | 909 | 606 (66%) |
INCREASE | 4965 | 612 | 472 (77%) | 6703 | 1503 | 1121 (75%) |
GROWTH | 7848 | 2073 | 2029 (98%) | 5414 | 836 | 777 (93%) |
MORE | 3634 | 544 | 517 (95%) | 4845 | 392 | 339 (86%) |
UP | 3168 | 530 | 530 (100%) | 2577 | 161 | 110 (68%) |
Negative: downward directional words | L3 to L1: Total examined | L3 to L1: Non-neutral | L3 to L1: Non-neutral collocates consistent with negative affect | R1 to R3: Total examined | R1 to R3: Non-neutral | R1 to R3: Non-neutral collocates consistent with negative affect |
---|---|---|---|---|---|---|
DECREASE | 1120 | 82 | 73 (89%) | 1761 | 378 | 309 (82%) |
DECREASED | 1831 | 798 | 547 (68%) | 922 | 71 | 32 (45%) |
DOWN | 1338 | 166 | 145 (87%) | 1021 | 183 | 173 (94%) |
LESS | 1117 | 411 | 278 (67%) | 1478 | 217 | 99 (46%) |
LOWER | 1121 | 136 | 102 (75%) | 1395 | 600 | 375 (63%) |
The predominantly positive use of upward directional words and negative use of downward directional words supported their assumed polarity groupings within the 105-word Positive_Tone and 85-word Negative_Tone lists, respectively.
from sentibank import archive
load = archive.load()
Henry = load.origin("Henry_v2006")
Processed Dictionary#
From the original word lists, no notable changes were made except removing the duplicate “leading” in the Positive_Tone word list.
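Removing a duplicate entry of this kind can be illustrated with an order-preserving de-duplication. The snippet below is a sketch rather than the library's actual processing code; the short `raw` list is a made-up excerpt and `deduplicate` is a hypothetical helper name.

```python
def deduplicate(words):
    """Drop repeated entries while keeping first-occurrence order."""
    # dict keys are unique and preserve insertion order (Python 3.7+)
    return list(dict.fromkeys(words))


# Made-up excerpt for illustration, not the actual Positive_Tone list.
raw = ["growth", "leading", "up", "leading"]
clean = deduplicate(raw)
```

Applied to the full 105-entry Positive_Tone list, dropping the single repeated “leading” would leave the 104 positive entries reported in the summary.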
Note#
[1] The CRSP equal-weighted market portfolio is essentially an index representing the overall stock market performance. The 3-day event window refers to the day before the earnings release (t-1), the day of the earnings release (t), and the day after (t+1).
The cumulative abnormal return (CAR) captures how a company’s stock performed over this 3-day period surrounding the earnings announcement, compared with how the overall market performed. Specifically, the CAR is the company’s cumulative stock return over the event window minus the return of the market portfolio over the same window. So if the company’s stock price rises by more than the overall market over those 3 days, it has a positive abnormal return, indicating that the market reacted favourably to its earnings announcement.
In other words, the CAR shows if the stock price movement was abnormally high or low relative to the broader market movement around the time of the earnings release.
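As a worked illustration of the definition above, the sketch below sums daily stock returns in excess of the market over the three event days. Summing simple daily differences (rather than compounding) is an assumption made here for clarity, the return figures are invented, and the function name is hypothetical.

```python
def cumulative_abnormal_return(stock_returns, market_returns):
    """Sum of daily (stock - market) returns over the event window.

    Assumption: simple daily returns are summed, not compounded.
    """
    if len(stock_returns) != len(market_returns):
        raise ValueError("return series must cover the same days")
    return sum(s - m for s, m in zip(stock_returns, market_returns))


# Invented daily returns for t-1, t and t+1.
stock = [0.010, 0.040, -0.005]
market = [0.002, 0.010, 0.001]
car = cumulative_abnormal_return(stock, market)  # about 0.032, i.e. 3.2%
```

Here the stock outperforms the market by roughly 3.2 percentage points over the window, which would count as a favourable reaction to the announcement.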