Replication Crisis?
This page is about how you can download the ‘characteristics’ used to calculate the “factors” in Is There a Replication Crisis in Finance? by Jensen, Kelly and Pedersen (2023).
- To download the factors themself, check this web-page.
- For a documentation of the ‘characteristics’, see the documentation-link at the web-page above.
Downloading the Characteristics from WRDS
Fortunately, the authors (“JKP” = Theis Ingerslev Jensen, Bryan Kelly, and Lasse Heje Pedersen) provide the characteristics via WRDS. They are provided in two ways:
- Using a web-interface for generating queries
- An API-option, using various computer languages (SAS, Python, R, MATLAB, and Stata).
The web-interface possibility should be familiar to WRDS-users and will not be dealt with here.
For the API-option, WRDS offer an integrated cloud-based solution for SAS (SAS-Studio), Python (Jupyter), and R (R-Studio). In addition, WRDS offer a possibility to query the WRDS-database from a local computer using the API. At the web-page mentioned above, the authors provide a description of how to use a local R-installation to download data. Below this possibility is illustrated using Python (Jupyter-style editor) instead.
Connecting to WRDS
1 2 3 4 5 6 7 8 9 10 11 12 |
# Remember: pip install wrds import wrds # First time conn = wrds.Connection() # Prompted for usn and pw conn.create_pgpass_file() # Saves password to harddisk # Next time you want to connect: conn = wrds.Connection(wrds_username='prfi') # NB: "prfi" is my private username matching my password # You need to supply your username matching your saved password :) |
Looking around at WRDS
1 2 3 4 5 6 7 8 9 |
# Shows you the databases at WRDS conn.list_libraries() # The jkp-data from two databases at WRDS: conn.list_tables(library='contrib_global_factor') # Exclusive possibiliy :) conn.list_tables(library='contrib') # Tables from all 'contributers' # Get data even without knowing SQL, but ... conn.get_table(library = 'contrib', table= 'global_factor', obs = 5) |
SQL 101
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 |
# Simplest possible query (with LIMIT 10) conn.raw_sql('SELECT * FROM contrib.global_factor LIMIT 10') # SQL don't care much about casing, linebreaks, spaces (or forgetting ";"!) conn.raw_sql(""" seLEct * fRoM contrib.gLoBaL_factor LIMIT 10 """) # Selecting Columns # -Note: output from conn is a dataframe! output = conn.raw_sql(""" SELECT id, eom, excntry FROM contrib.global_factor LIMIT 10 """) # Selecting rows # -Note the citation marks for strings output = conn.raw_sql(""" SELECT id, eom, excntry FROM contrib.global_factor WHERE excntry = 'DNK' """) # Multiple requirements # - Note: don't need to be on "selected" cols) # - Note the citation marks (and format) for dates output = conn.raw_sql(""" SELECT id, excntry FROM contrib.global_factor WHERE excntry = 'DNK' AND '20010101'>eom AND eom >= '20000101' """) |
Getting the Characteristics
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 |
# Observations for Denmark = 69500 pt. conn.raw_sql(""" SELECT count(*) FROM contrib.global_factor WHERE excntry = 'DNK' """) # Obs. used in the JKP-2022 paper: 59110 # (Formatted string literals, or just f-strings) regObs = """ obs_main = 1 AND primary_sec = 1 AND common = 1 AND exch_main = 1 """ dateRestr = """eom <= '20221231'""" conn.raw_sql(f""" SELECT count(*) FROM contrib.global_factor WHERE excntry = 'DNK' AND {regObs} AND {dateRestr} """) # Many cols cols = """permno, permco, gvkey, iid, id, date, excntry, eom""" output = conn.raw_sql(f""" SELECT {cols} FROM contrib.global_factor WHERE excntry = 'DNK' AND {regObs} AND {dateRestr} """) # Many countries, Many columns # - Note the parenthesis (AND is "stronger" than OR) countryStr = """ excntry = 'ARE' OR excntry = 'ARG' OR excntry = 'AUT' """ conn.raw_sql(f""" SELECT {cols} FROM contrib.global_factor WHERE ({countryStr}) AND {regObs} AND {dateRestr} """) # Keeping memory requirements down # - Note the citation marks! countries = ['ARE', 'ARG','AUT'] for country in countries: countryDF = conn.raw_sql(f""" SELECT {cols} FROM contrib.global_factor WHERE excntry = '{country}' AND {regObs} AND {dateRestr} """) print(countryDF.shape) |
Copy-ready lists
1 2 |
# Columns in contrib.global_factor: permno, permco, gvkey, iid, id, date, excntry, eom, obs_main, exch_main, primary_sec, common, size_grp, me, me_company, ret_exc_lead1m, ret_exc, ret, ret_local, ret_lag_dif, prc, prc_local, prc_high, prc_low, bidask, curcd, fx, gics, naics, sic, ff49, dolvol, shares, tvol, adjfct, comp_tpci, crsp_shrcd, comp_exchg, crsp_exchcd, source_crsp, market_equity, enterprise_value, book_equity, assets, sales, net_income, div1m_me, div3m_me, div6m_me, div12m_me, divspc1m_me, divspc12m_me, chcsho_1m, chcsho_3m, chcsho_6m, chcsho_12m, eqnpo_1m, eqnpo_3m, eqnpo_6m, eqnpo_12m, ret_1_0, ret_2_0, ret_3_0, ret_3_1, ret_6_0, ret_6_1, ret_9_0, ret_9_1, ret_12_0, ret_12_1, ret_12_7, ret_18_1, ret_24_1, ret_24_12, ret_36_1, ret_36_12, ret_48_12, ret_48_1, ret_60_1, ret_60_12, ret_60_36, seas_1_1an, seas_1_1na, seas_2_5an, seas_2_5na, seas_6_10an, seas_6_10na, seas_11_15an, seas_11_15na, seas_16_20an, seas_16_20na, at_gr1, ca_gr1, nca_gr1, lt_gr1, cl_gr1, ncl_gr1, be_gr1, pstk_gr1, debt_gr1, sale_gr1, cogs_gr1, sga_gr1, opex_gr1, capx_gr1, inv_gr1, at_gr3, ca_gr3, nca_gr3, lt_gr3, cl_gr3, ncl_gr3, be_gr3, pstk_gr3, debt_gr3, sale_gr3, cogs_gr3, sga_gr3, opex_gr3, capx_gr3, cash_gr1a, inv_gr1a, rec_gr1a, ppeg_gr1a, lti_gr1a, sti_gr1a, intan_gr1a, debtst_gr1a, ap_gr1a, txp_gr1a, debtlt_gr1a, txditc_gr1a, coa_gr1a, col_gr1a, cowc_gr1a, ncoa_gr1a, ncol_gr1a, nncoa_gr1a, oa_gr1a, ol_gr1a, fna_gr1a, fnl_gr1a, nfna_gr1a, gp_gr1a, ebitda_gr1a, ebit_gr1a, ope_gr1a, ni_gr1a, nix_gr1a, dp_gr1a, fincf_gr1a, ocf_gr1a, fcf_gr1a, nwc_gr1a, eqnetis_gr1a, dltnetis_gr1a, dstnetis_gr1a, dbnetis_gr1a, netis_gr1a, eqnpo_gr1a, tax_gr1a, eqbb_gr1a, eqis_gr1a, div_gr1a, eqpo_gr1a, capx_gr1a, be_gr1a, cash_gr3a, inv_gr3a, rec_gr3a, ppeg_gr3a, lti_gr3a, intan_gr3a, debtst_gr3a, ap_gr3a, txp_gr3a, debtlt_gr3a, txditc_gr3a, coa_gr3a, col_gr3a, cowc_gr3a, ncoa_gr3a, ncol_gr3a, nncoa_gr3a, oa_gr3a, ol_gr3a, fna_gr3a, fnl_gr3a, nfna_gr3a, gp_gr3a, ebitda_gr3a, ebit_gr3a, ope_gr3a, ni_gr3a, nix_gr3a, dp_gr3a, fincf_gr3a, ocf_gr3a, fcf_gr3a, nwc_gr3a, eqnetis_gr3a, dltnetis_gr3a, dstnetis_gr3a, dbnetis_gr3a, netis_gr3a, eqnpo_gr3a, tax_gr3a, eqbb_gr3a, eqis_gr3a, div_gr3a, eqpo_gr3a, capx_gr3a, capx_at, rd_at, spi_at, xido_at, nri_at, gp_sale, ebitda_sale, ebit_sale, pi_sale, ni_sale, nix_sale, ocf_sale, fcf_sale, gp_at, ebitda_at, ebit_at, fi_at, cop_at, ni_at, ope_be, ni_be, nix_be, ocf_be, fcf_be, gp_bev, ebitda_bev, ebit_bev, fi_bev, cop_bev, gp_ppen, ebitda_ppen, fcf_ppen, fincf_at, netis_at, eqnetis_at, eqis_at, dbnetis_at, dltnetis_at, dstnetis_at, eqnpo_at, eqbb_at, div_at, oaccruals_at, oaccruals_ni, taccruals_at, taccruals_ni, noa_at, be_bev, debt_bev, cash_bev, pstk_bev, debtlt_bev, debtst_bev, int_debt, int_debtlt, ebitda_debt, profit_cl, ocf_cl, ocf_debt, cash_lt, inv_act, rec_act, debtst_debt, cl_lt, debtlt_debt, lt_ppen, debtlt_be, opex_at, nwc_at, fcf_ocf, debt_at, debt_be, ebit_int, inv_days, rec_days, ap_days, cash_conversion, cash_cl, caliq_cl, ca_cl, inv_turnover, at_turnover, rec_turnover, ap_turnover, adv_sale, staff_sale, sale_bev, rd_sale, sale_be, div_ni, sale_nwc, tax_pi, cash_at, ni_emp, sale_emp, sale_emp_gr1, emp_gr1, ni_inc8q, noa_gr1a, ppeinv_gr1a, lnoa_gr1a, capx_gr2, saleq_gr1, niq_be, niq_at, niq_be_chg1, niq_at_chg1, rd5_at, dsale_dinv, dsale_drec, dgp_dsale, dsale_dsga, saleq_su, niq_su, capex_abn, op_atl1, gp_atl1, ope_bel1, cop_atl1, pi_nix, ocf_at, op_at, ocf_at_chg1, at_be, ocfq_saleq_std, niq_saleq_std, roeq_be_std, roe_be_std, tangibility, earnings_variability, aliq_at, f_score, o_score, z_score, intrinsic_value, kz_index, gpoa_ch5, roe_ch5, roa_ch5, cfoa_ch5, gmar_ch5, ni_ar1, ni_ivol, at_me, be_me, debt_me, netdebt_me, cash_me, sale_me, gp_me, ebitda_me, ebit_me, ope_me, ni_me, nix_me, cop_me, ocf_me, fcf_me, div_me, eqbb_me, eqis_me, eqpo_me, eqnpo_me, eqnetis_me, rd_me, ival_me, at_mev, bev_mev, ppen_mev, be_mev, cash_mev, sale_mev, gp_mev, ebitda_mev, ebit_mev, cop_mev, ocf_mev, fcf_mev, debt_mev, pstk_mev, debtlt_mev, debtst_mev, dltnetis_mev, dstnetis_mev, dbnetis_mev, netis_mev, fincf_mev, aliq_mat, eq_dur, beta_60m, ivol_capm_60m, resff3_12_1, resff3_6_1, mispricing_mgmt, mispricing_perf, beta_21d, ivol_capm_21d, iskew_capm_21d, coskew_21d, beta_dimson_21d, ivol_ff3_21d, iskew_ff3_21d, ivol_hxz4_21d, iskew_hxz4_21d, rmax5_21d, rmax1_21d, rvol_21d, rskew_21d, zero_trades_21d, dolvol_126d, dolvol_var_126d, turnover_126d, turnover_var_126d, zero_trades_126d, zero_trades_252d, ami_126d, beta_252d, ivol_capm_252d, prc_highprc_252d, rvol_252d, betadown_252d, bidaskhl_21d, rvolhl_21d, corr_1260d, betabab_1260d, rmax5_rvol_21d, age, qmj, qmj_prof, qmj_growth, qmj_safety |
1 2 |
# The 153 factors used in the JKP-paper: niq_su, ret_6_1, ret_12_1, saleq_su, tax_gr1a, ni_inc8q, prc_highprc_252d, resff3_6_1, resff3_12_1, be_me, debt_me, at_me, ret_60_12, ni_me, fcf_me, div12m_me, eqpo_me, eqnpo_me, sale_gr3, sale_gr1, ebitda_mev, sale_me, ocf_me, ival_me, bev_mev, netdebt_me, eq_dur, capex_abn, at_gr1, ppeinv_gr1a, noa_at, noa_gr1a, lnoa_gr1a, capx_gr1, capx_gr2, capx_gr3, chcsho_12m, eqnpo_12m, debt_gr3, inv_gr1, inv_gr1a, oaccruals_at, taccruals_at, cowc_gr1a, coa_gr1a, col_gr1a, nncoa_gr1a, ncoa_gr1a, ncol_gr1a, nfna_gr1a, sti_gr1a, lti_gr1a, fnl_gr1a, be_gr1a, oaccruals_ni, taccruals_ni, netis_at, eqnetis_at, dbnetis_at, niq_be, niq_be_chg1, niq_at, niq_at_chg1, ebit_bev, ebit_sale, sale_bev, at_turnover, gp_at, gp_atl1, ope_be, ope_bel1, op_at, op_atl1, cop_at, cop_atl1, f_score, o_score, z_score, pi_nix, at_be, saleq_gr1, rd_me, rd_sale, opex_at, emp_gr1, rd5_at, age, dsale_dinv, dsale_drec, dgp_dsale, dsale_dsga, sale_emp_gr1, tangibility, kz_index, ocfq_saleq_std, cash_at, ni_ar1, ni_ivol, earnings_variability, aliq_at, aliq_mat, seas_1_1an, seas_1_1na, seas_2_5an, seas_2_5na, seas_6_10an, seas_6_10na, seas_11_15an, seas_11_15na, seas_16_20an, seas_16_20na, market_equity, ivol_ff3_21d, ivol_capm_252d, ivol_capm_21d, ivol_hxz4_21d, rvol_21d, beta_60m, betabab_1260d, beta_dimson_21d, turnover_126d, turnover_var_126d, dolvol_126d, dolvol_var_126d, prc, ami_126d, zero_trades_21d, zero_trades_126d, zero_trades_252d, rmax1_21d, rskew_21d, iskew_capm_21d, iskew_ff3_21d, iskew_hxz4_21d, coskew_21d, ret_1_0, betadown_252d, bidaskhl_21d, ret_3_1, ret_9_1, ret_12_7, corr_1260d, rmax5_21d, rmax5_rvol_21d, ni_be, ocf_at, ocf_at_chg1, mispricing_perf, mispricing_mgmt, qmj, qmj_prof, qmj_growth, qmj_safety |
1 2 |
# Countries in contrib.global_factors (incl 'mul'): 'ARE', 'ARG', 'AUS', 'AUT', 'BEL', 'BGD', 'BGR', 'BHR', 'BMU', 'BRA', 'BWA', 'CAN', 'CHE', 'CHL', 'CHN', 'CIV', 'COL', 'CYP', 'CZE', 'DEU', 'DNK', 'ECU', 'EGY', 'ESP', 'EST', 'FIN', 'FRA', 'GBR', 'GGY', 'GHA', 'GRC', 'HKG', 'HRV', 'HUN', 'IDN', 'IND', 'IRL', 'IRN', 'ISL', 'ISR', 'ITA', 'JAM', 'JOR', 'JPN', 'KAZ', 'KEN', 'KOR', 'KWT', 'LBN', 'LKA', 'LTU', 'LUX', 'LVA', 'MAR', 'MEX', 'MLT', 'mul', 'MUS', 'MWI', 'MYS', 'NAM', 'NGA', 'NLD', 'NOR', 'NZL', 'OMN', 'PAK', 'PER', 'PHL', 'POL', 'PRT', 'PSE', 'QAT', 'ROU', 'RUS', 'SAU', 'SGP', 'SRB', 'SVK', 'SVN', 'SWE', 'THA', 'TTO', 'TUN', 'TUR', 'TWN', 'TZA', 'UGA', 'UKR', 'URY', 'USA', 'VEN', 'VNM', 'ZAF', 'ZMB', 'ZWE' |