Henry#

Financial sentiment dictionary designed for analysing tone in earnings press releases 📱.#

Henry summary

Composition:

  • 189 unigram entries (104 positive and 85 negative)

  • Binary Class (positive, negative)

Creation Methodology:

  • Compiled from 1,366 earnings press releases.

  • Selected words based on contextual analysis, focusing on directional collocates.

Evaluation: Henry (2006) validated the dictionary in two stages. Initially, contextual analysis revealed that 77-100% of upward directional collocates and 68-89% of downward directional collocates aligned with assumptions. The first analysis established the reliability of adding ‘directional’ words (i.e “increased” and “decreased”), suggesting that these words indeed convey the intended positive and negative sentiment. Using these validated Positive and Negative word lists, the author investigated whether capital market behaviour is influenced by the tone and other stylistic attributes of earnings press releases. Abnormal market returns (CARs) were defined as the accumulated return exceeding the CRSP equal-weighted market portfolio over a 3-day event window from t-1 to t+1, where t is the earnings release date.

Linear regression analysis demonstrated a significant positive association between CARs and Tone (p=0.02). Tone is calculated as the normalised difference between positive and negative words in the earnings release, capturing the relative balance of sentiment. This approach, though seemingly simplistic, has shown utility in prior research (Abrahamson and Amir, 1996; Clatworthy and Jones, 2003; Hussainey et al., 2003; Uang et al., 2005; cited in Henry, 2006, pp.14-15).

This relationship persisted when controlling for Words_AL (length of the text), indicating that Tone impacts market response independently of other stylistic attributes in earnings press releases. The second analysis added a level of rigour, providing confidence that the sentiment dictionary is well-suited for capturing the sentiment polarity in financial context.

Usage Guidance: Tailored for sentiment analysis in earnings press releases. Excellent starting point with demonstrated reliability in financial context. Access processed dictionary via sentibank.archive.load().dict("Henry_v2006").

📋 Introduction#

Earnings press releases are important disclosures where companies communicate financial performance to investors. Henry (2006) developed a financial sentiment dictionary to examine if textual tone and styles in earnings announcements influence investor decisions. Henry’s (2006) methodology distinguishes itself through the disambiguation of directional terminology based on co-occurring context to validate polarity assumptions.

📚 Original Dictionary#

Henry (2006) examined earnings press releases by the firm, rather than annual report disclosure or press coverage. This was mainly due to the fact that firms can strategically choose the tone of an earnings press (Henry, 2006, p.8). For instance, firms may emphasise favourable aspects, compare against selective benchmarks, or use language with greater positive emotional affect.

The sample included 1,366 annual releases from 562 technology and telecommunications firms between 1998-2002, sourced from Factiva and LexisNexis Academic. This industry and time period was chosen due to the stock market uncertainty, when investors rely more on non-financial disclosures (Amir and Lev, 1996, cited in Henry, 2006, p.17).

Though not explicitly stated, the word lists of Positive_Tone and Negative Tone were seemingly compiled from the press releases. Synonymy and polysemy were likely not major issues since the narrow domain limited ambiguity. For example, in earnings press releases, the word “net” is more likely to be an antonym of “gross”. Thus, disambiguating affect rather than sense was the priority, specifically distinguishing positive and negative ‘directional’ words based on financial item associations.

Directional words like “increased” and “decreased” can be ambiguous in earnings releases. For example, “increased expenses” has a negative tone despite the positive word “increased.” To disambiguate directional words, Henry (2006) examined the context of each occurrence in the 1,366 press releases. Specifically, Henry (2006) explored the percentage of times each word appeared with ‘inherently desirable’ (i.e revenue) or ‘inherently undesirable’ (i.e expenses) financial items. These items were selected from common equity valuation models in Stowe et al. (2002, cited in Henry, 2006, p.34), such as the dividend discount model or free cash flow models. Using inputs from established valuation models provides an objective way to classify items as ‘desirable’ or ‘undesirable’ based on their relationship with firm value. For instance, under these models, all else being equal, increased revenue directly contributes to higher firm valuation, while increased expenses detract from firm value. Relying on standardised models helped avoiding subjective assessments of whether a financial item is inherently good or bad for firm valuation.

To determine whether a particular directional word is generally used in a positive or negative sense, ‘collocates’ for each of the directional terms in the tone thesaurus is obtained by analysing a ±3 word window. Empirical evidence shows that people can identify the sense of a word when given a relatively narrow collocate horizon (i.e a window of ±2 words of the context (Miller and Leacock, 2000, cited in Henry, 2006, p.33). The approach to affect disambiguation in Henry (2006) uses a slightly wider collocate horizon of ±3 words in order to capture more context. This approach was analogous to statistical disambiguation techniques.

To determine the positive or negative affect of each directional word, Henry (2006) examined ‘collocates’ within a ±3 word window, wider than the ±2 word window found sufficient for sense disambiguation (Miller and Leacock, 2000, cited in Henry, 2006, p.33). Analysing this expanded contextual horizon allowed for more robust affect disambiguation. The approach identifies the percentage of times each directional term co-occurs with inherently desirable or undesirable financial items in the corpus of earnings releases. This contextual analysis technique mirrored statistical disambiguation methods that rely on the strong correlation between words and their surrounding textual features (Manning and SchĂŒtze, 1999). Leveraging the narrow sublanguage of earnings reports, where word sense is relatively unambiguous, the study focuses specifically on disambiguating affect rather than meaning.

In such examination, of the 9,682 left collocates examined for “increased,” 83% of non-neutral uses related to desirable items like revenues (695 times). Similarly, of 5,457 right collocates for “increased”, 66% conveyed positive affect, commonly through items like revenues (143 times) and sales (98 times). Across all upward directional terms, 77-100% of left collocates and 66-93% of right collocates indicated a positive context. And across all downward directional terms, 68%-89% of left colocates and 45%-94% of right collocates indicated a negative context. The entire result is shown in the table below.

L3 to L1 Collocates

R1 to R3 Collocates

POSITIVE: UPWARD DIRECTIONAL WORDS

Total examined

Nonneutral

Non-neutral collocates consistent with positive affect

Total examined

Nonneutral

Non-neutral collocates consistent with positive affect

INCREASED

9682

3118

2602 (83%)

5457

909

606 (66%)

INCREASE

4965

612

472 (77%)

6703

1503

1121 (75%)

GROWTH

7848

2073

2029 (98%)

5414

836

777 (93%)

MORE

3634

544

517 (95%)

4845

392

339 (86%)

UP

3168

530

530 (100%)

2577

161

110 (68%)

L3 to L1 Collocates

R1 to R3 Collocates

NEGATIVE: DOWNWARD DIRECTIONAL WORDS

Total examined

Nonneutral

Non-neutral collocates consistent with negative affect

Total examined

Nonneutral

Non-neutral collocates consistent with negative affect

DECREASE

1120

82

73 (89%)

1761

378

309 (82%)

DECREASED

1831

798

547 (68%)

922

71

32 (45%)

DOWN

1338

166

145 (87%)

1021

183

173 (94%)

LESS

1117

411

278 (67%)

1478

217

99 (46%)

LOWER

1121

136

102 (75%)

1395

600

375 (63%)

The consistent preferential use of upward and downward directionals affirmed their assumed polarity groupings within the 105-lexicon Positive_Tone and 85-lexicon Negative_Tone lists respectively.

from sentibank import archive

load = archive.load()
Henry = load.origin("Henry_v2006")
Henry (Henry, 2006)
word polarity
Loading... (need help?)

đŸ§č Processed Dictionary#

From the original word lists, no notable changes were made except removing the duplicate “leading” in the Positive_Tone word list.

Note#

[1] The CRSP equal-weighted market portfolio is essentially an index representing the overall stock market performance. The 3-day event window refers to the day before the earnings release (t-1), the day of the earnings release (t), and the day after (t+1).

The abnormal market return (CAR) captures how a company’s stock price performed over this 3-day period surrounding the earnings announcement, compared to how the overall market performed. Specifically, the CAR is the company’s cumulative stock returns over the event window minus the returns of the market portfolio over the same window. So if the company’s stock price increases more than the overall market over those 3 days, it would have a positive abnormal return, indicating the market reacted favourably to its earnings announcement.

In other words, the CAR shows if the stock price movement was abnormally high or low relative to the broader market movement around the time of the earnings release.