Download financial data for investing with Python

This blog post showcases one way to download firm excess returns and characteristics from the BIGFI server and uses the data in a simple investing exercise.

Requirements

  • Jupyter Notebook + Python 3
  • environment.txt file with credentials (provided by BIGFI or CBS Library)
  • Code must be run at CBS or via a VPN to access the server

Download excess returns and trading signals

The code below downloads data from a BIGFI server and saves the data to data.csv in your current working directory.

The provided code also performs the following steps to prepare the data:

  • Download data on excess returns and 3 characteristics for the US. The characteristics are: be_me, ret_12_1, market_equity. For more information on these, see JKP Docs.
  • Include data between 2017-12-31 and 2022-12-31
  • Exclude observations with missing market equity in month t or a missing return in month t+1.
  • Standardize the characteristics by subtracting the cross-sectional mean and dividing
    by the standard deviation at each time.
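  • Handle missing characteristics by replacing them with the cross-sectional (i.e. within
    month) median, which is close to zero after the previous step when the characteristic
    is roughly symmetrically distributed (only the mean is exactly zero).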
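
The preparation steps listed above can be sketched with pandas. This is a minimal illustration on a synthetic panel, not the code that talks to the BIGFI server: the column names id and eom (end of month), the helper prepare and all the numbers are assumptions made for the example.

```python
import numpy as np
import pandas as pd

chars = ["be_me", "ret_12_1", "market_equity"]

# Synthetic stand-in for the downloaded panel (all values are made up).
rng = np.random.default_rng(0)
dates = pd.to_datetime(
    ["2017-11-30", "2017-12-31", "2018-01-31", "2022-12-31", "2023-01-31"]
)
n_ids = 50
n_rows = n_ids * len(dates)
panel = pd.DataFrame({
    "id": np.tile(np.arange(n_ids), len(dates)),    # assumed stock identifier
    "eom": dates.repeat(n_ids),                     # assumed end-of-month column
    "ret_exc_lead_1m": rng.normal(0.0, 0.1, n_rows),
    "be_me": rng.lognormal(0.0, 1.0, n_rows),
    "ret_12_1": rng.normal(0.1, 0.4, n_rows),
    "market_equity": rng.lognormal(6.0, 2.0, n_rows),
})
# Knock out roughly 10% of each column to mimic missing data.
for col in chars + ["ret_exc_lead_1m"]:
    panel.loc[rng.random(n_rows) < 0.1, col] = np.nan


def prepare(df, chars, start="2017-12-31", end="2022-12-31"):
    """Hypothetical helper mirroring the preparation steps in the list above."""
    # Keep observations inside the sample window.
    in_window = (df["eom"] >= pd.Timestamp(start)) & (df["eom"] <= pd.Timestamp(end))
    out = df[in_window].copy()
    # Drop rows lacking market equity (month t) or the lead return (month t+1).
    out = out.dropna(subset=["market_equity", "ret_exc_lead_1m"])
    # Standardize each characteristic within month: subtract mean, divide by std.
    out[chars] = out.groupby("eom")[chars].transform(
        lambda x: (x - x.mean()) / x.std()
    )
    # Replace remaining missing characteristics with the within-month median.
    medians = out.groupby("eom")[chars].transform("median")
    out[chars] = out[chars].fillna(medians)
    return out


prepared = prepare(panel, chars)
print(prepared.head())
```

Standardizing before imputing means the filled-in values sit near the middle of each month's cross-section, so a stock with a missing characteristic is treated as roughly neutral on that signal.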

Let’s preview the data. We downloaded a firm-month panel with 218,298 rows and 9 columns, containing each stock’s excess return next month, its current book-to-market, its return over the past year and its current market equity.

Now let’s plot the number of stocks we observe each month:

As the plot shows, we start with about 4,200 stocks at the beginning of 2018, and the number grows to about 5,200 by January 2022.
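
The monthly counts behind such a plot come from a single groupby. Here is a small sketch on a toy panel; the column names id and eom are assumptions, and the real panel would take the place of toy.

```python
import pandas as pd

# Toy firm-month panel: 3 stocks in January, 2 in February, 4 in March.
toy = pd.DataFrame({
    "id": [1, 2, 3, 1, 2, 1, 2, 3, 4],
    "eom": pd.to_datetime(
        ["2018-01-31"] * 3 + ["2018-02-28"] * 2 + ["2018-03-31"] * 4
    ),
})

# Number of distinct stocks observed in each month.
n_stocks = toy.groupby("eom")["id"].nunique()
print(n_stocks)

# With matplotlib installed, n_stocks.plot() draws the counts over time.
```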

A simple model for investing

Let’s do a simple exercise: we fit an OLS model that uses the 3 characteristics to predict next month’s excess returns:

ret_exc_lead_1m = a + b * be_me + c * ret_12_1 + d * market_equity + error
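
As a rough illustration of what fitting this pooled regression involves, the sketch below estimates a, b, c and d by least squares with NumPy. The data are simulated and the "true" coefficients are made-up values chosen for the example, so the output says nothing about the actual panel.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 5000

# Simulated standardized characteristics: be_me, ret_12_1, market_equity.
X = rng.standard_normal((n, 3))

# Made-up "true" parameters a, b, c, d used only to generate the toy returns.
true_params = np.array([0.01, 0.002, 0.003, 0.001])

# Design matrix with a leading column of ones for the intercept a.
design = np.column_stack([np.ones(n), X])
ret_exc_lead_1m = design @ true_params + rng.normal(0.0, 0.01, n)

# Pooled OLS: minimize the sum of squared errors over all firm-months.
coefs, *_ = np.linalg.lstsq(design, ret_exc_lead_1m, rcond=None)

for name, value in zip(["a", "b", "c", "d"], coefs):
    print(f"{name} = {value:.5f}")
```

A regression library such as statsmodels would additionally report standard errors and t-statistics; plain least squares is used here only to keep the example short.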

Running that on the entire panel yields the following results:

What trading strategy does this model imply? It is essentially a mix of value, momentum and size tilts. Since the coefficients on all 3 characteristics are positive, the strategy goes long cheap firms (high book-to-market), long firms with high returns over the past year, and long firms with large market equity. The first two tilts match the classic Value and Momentum strategies. The third favors large firms, which is the opposite direction of the classic Size strategy that buys small firms.
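
To make the link from coefficients to positions concrete, here is one possible way to turn predicted returns into portfolio weights for a single month. The coefficients, the five-stock cross-section and the weighting rule (demeaned predictions scaled to unit gross exposure) are all assumptions for illustration, not the construction used in this post.

```python
import numpy as np

# Hypothetical slope coefficients b, c, d (all positive, as in the text).
slopes = np.array([0.002, 0.003, 0.001])

# One month's standardized characteristics for five made-up stocks.
# Columns: be_me, ret_12_1, market_equity.
chars = np.array([
    [ 1.2,  0.5, -0.3],
    [-0.8,  1.0,  0.9],
    [ 0.1, -1.4,  0.2],
    [-1.0, -0.6, -1.1],
    [ 0.5,  0.5,  0.3],
])

# Predicted excess return per stock. The intercept a is omitted because it
# is the same for every stock and cancels when the signal is demeaned.
pred = chars @ slopes

# Long stocks with above-average predictions, short the rest.
signal = pred - pred.mean()

# Scale so that the absolute weights sum to one.
weights = signal / np.abs(signal).sum()
print(weights)
```

Under this rule, a stock's weight increases in each characteristic with a positive coefficient, which is the sense in which the model mixes the three tilts.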