Part 3: Practical Examples


Regression Analysis Following Li (2008)

See: Li, F. (2008). Annual report readability, current earnings, and earnings persistence. Journal of Accounting and Economics, 45(2–3), 221–247.


Regression Equation

Table 3, Li (2008)


Earnings is operating earnings (data178 of Compustat) scaled by book value of assets.

Length is the natural logarithm of the total number of words in the annual report.
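
In simplified form, the specification used in this tutorial can be sketched as follows. This assumes Length is the dependent variable and leaves out the control variables that Li (2008) includes; the subscripts i (firm) and t (fiscal year) are notation added here for illustration.

```latex
\mathit{Length}_{i,t} = \beta_0 + \beta_1 \, \mathit{Earnings}_{i,t} + \varepsilon_{i,t}
```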


Step-by-step Tutorial

This tutorial will walk you through a simple regression analysis on financial data using Python. We'll use the pandas, numpy, and statsmodels libraries along with my library, jtext.


Step 1: Import Necessary Libraries

These libraries are essential for loading, transforming, and analyzing data.


Step 2: Load the Dataset
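
Loading could look like the sketch below. The column names (gvkey, fyear, data178, data6, text) and the file name in the comment are assumptions for illustration, and a tiny inline CSV stands in for the real file so the snippet runs on its own. Only the name annual_csv comes from this tutorial.

```python
import io

import pandas as pd

# Stand-in for the real file. With a file on disk this would be something like
# pd.read_csv("annual.csv") (hypothetical path).
sample_csv = io.StringIO(
    "gvkey,fyear,data178,data6,text\n"
    "1001,2005,120.0,1500.0,We report strong results this year.\n"
    "1002,2005,-35.5,800.0,Results declined because of higher costs and weaker demand.\n"
)

annual_csv = pd.read_csv(sample_csv)
print(annual_csv.shape)
print(annual_csv.dtypes)
```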

Step 3: Filter Relevant Data

Select columns that are relevant for the analysis:
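
A sketch of the column selection, assuming hypothetical column names. Here data178 is operating earnings as defined above, data6 is assumed to hold book value of assets, and text is assumed to hold the annual report text.

```python
import pandas as pd

# Toy frame with one extra column (sic) that the regression does not need.
annual_csv = pd.DataFrame({
    "gvkey": [1001, 1002],
    "fyear": [2005, 2005],
    "data178": [120.0, -35.5],
    "data6": [1500.0, 800.0],
    "sic": [2834, 7372],
    "text": ["We report strong results.", "Results declined this year."],
})

# Keep only the identifiers, the accounting items, and the report text.
keep = ["gvkey", "fyear", "data178", "data6", "text"]
annual_csv = annual_csv[keep].copy()
print(annual_csv.columns.tolist())
```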


Step 4: Build Variables
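
Following the definition above, Earnings is operating earnings scaled by book value of assets. The sketch below assumes assets are stored in a column named data6 (an assumption, not confirmed by this tutorial) and treats non-positive assets as missing so the ratio never becomes infinite.

```python
import pandas as pd

annual_csv = pd.DataFrame({
    "data178": [120.0, -35.5, 10.0],  # operating earnings
    "data6": [1500.0, 800.0, 0.0],    # book value of assets (assumed column)
})

# Mask non-positive assets so that division yields NaN instead of inf.
assets = annual_csv["data6"].where(annual_csv["data6"] > 0)
annual_csv["Earnings"] = annual_csv["data178"] / assets
print(annual_csv)
```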

Step 5: Process Text Data

For each entry in annual_csv:
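
One way to sketch this loop is to count the words in each report and take the natural logarithm, which gives Length as defined earlier. The count_words helper below is a hypothetical regex-based stand-in for whatever jtext provides, and the text column name is also an assumption.

```python
import re

import numpy as np
import pandas as pd

annual_csv = pd.DataFrame({
    "text": ["We report strong results this year.", "Costs rose.", ""],
})


def count_words(text: str) -> int:
    """Count word-like tokens (hypothetical helper, not the jtext API)."""
    return len(re.findall(r"[A-Za-z0-9']+", text))


lengths = []
for text in annual_csv["text"]:
    n_words = count_words(text)
    # log(0) is undefined, so empty reports get a missing value.
    lengths.append(np.log(n_words) if n_words > 0 else np.nan)

annual_csv["Length"] = lengths
print(annual_csv)
```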


Step 6: Clean the Data
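
Cleaning here is sketched as dropping rows where either regression variable is missing or infinite. That is an assumption about what this step covers; the original may also trim or winsorize outliers.

```python
import numpy as np
import pandas as pd

annual_csv = pd.DataFrame({
    "Earnings": [0.08, np.inf, -0.04, np.nan],
    "Length": [9.2, 8.7, np.nan, 9.9],
})

clean = (
    annual_csv
    .replace([np.inf, -np.inf], np.nan)     # treat infinities as missing
    .dropna(subset=["Earnings", "Length"])  # keep complete rows only
    .reset_index(drop=True)
)
print(clean)
```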


Step 7: Prepare Data for Regression


Step 8: Fit the Regression Model