Over the last few decades, the asset pricing literature has uncovered numerous equity factors, such as low risk, momentum and value, that explain cross-sectional differences in stock returns. The empirical evidence presented in support of these findings has largely relied on the Center for Research in Security Prices (CRSP) database, which houses US stock data, including returns, dating all the way back to 1926.
This sample period has been so intensively analyzed that many experts have warned that studies on factors could be plagued by data dredging or p-hacking effects.1 In other words, many of the factors that seem important in-sample could lose explanatory power, or fail to hold up altogether, out-of-sample. This issue can be addressed with a truly independent and sufficiently large dataset that can be used for out-of-sample testing.
Constructing a novel database
To that end, Guido Baltussen, Bart van Vliet and Pim van Vliet (from our Quantitative Investing team), in collaboration with Erasmus University, have constructed a novel US stock database for the period 1866 to 1926, containing stock prices, dividend yields and market capitalization values. This huge effort, spanning several years, entailed hand-collecting market capitalization data, double-checking all inputs, cleaning the data, and adjusting for stock delistings and stock splits using digitized financial journals. The team then merged this information with data for the same period from an external provider, Global Financial Data.
This ‘pre-CRSP’ sample period (61 years) is of similar length to the one used in existing CRSP-based studies, and covers an economically important period that is independent of prevailing datasets. This era was characterized by strong economic growth and rapid industrial development, laying the foundations for the preeminence of the US economy. Meanwhile, the US stock market played a pivotal role in economic growth and the financing of key innovations during this phase.
The novel database provides new ground for independent tests that can help us better understand return drivers and stock prices. The authors used the data to examine the cross-section of US stock returns over the pre-CRSP period in their research.2 The analysis focused on well-documented stock characteristics, namely beta, momentum (12-1 month price momentum), short-term reversal (1-month), size and value (dividend yield).
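To make the two return-based characteristics concrete, here is a minimal Python sketch of how 12-1 month momentum and 1-month reversal signals might be computed from a list of monthly returns. The function names and the exact window convention (compound the past 12 months while skipping the most recent one) are assumptions for illustration; the paper's precise definitions may differ.

```python
from math import prod

def cumulative_return(returns):
    """Compound simple monthly returns into one total return."""
    return prod(1.0 + r for r in returns) - 1.0

def momentum_12_1(monthly_returns, t):
    """12-1 momentum at the end of month t (0-based index).

    Assumed convention: compound the 11 returns of months t-11 .. t-1,
    skipping the most recent month t. Returns None without enough history.
    """
    if t < 11:
        return None
    return cumulative_return(monthly_returns[t - 11:t])

def short_term_reversal(monthly_returns, t):
    """1-month reversal signal: the return of the most recent month t."""
    return monthly_returns[t]

# Hypothetical return history: eleven months of +1%, then one month of +5%.
history = [0.01] * 11 + [0.05]
mom = momentum_12_1(history, 11)       # compounds only the eleven 1% months
rev = short_term_reversal(history, 11)  # the most recent month's return
```

Skipping the latest month keeps the momentum signal from being contaminated by the short-term reversal effect, which is why the two characteristics are measured over non-overlapping windows in this sketch.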
Evidence of equity factor premiums pre-1926
The analysis started with Fama-MacBeth regressions3 and univariate portfolio sorts on the dataset. The authors found that market beta was not priced and the capital asset pricing model (CAPM) largely failed to explain asset prices, as low-beta stocks generated positive alpha and high-beta stocks delivered negative alpha. Furthermore, momentum and value exhibited significant premiums and return spreads. By contrast, size failed to do so on both counts, while short-term reversal displayed a significant premium but yielded an insignificant return spread.
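The two-step logic of a Fama-MacBeth regression can be sketched as follows. This is a simplified illustration with a single characteristic and made-up data, not the authors' specification: each month, returns are regressed on the characteristic across stocks, and the monthly slopes are then averaged and tested against zero. The names `ols_slope`, `fama_macbeth` and `panel` are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def ols_slope(x, y):
    """Slope of a one-regressor OLS fit of y on x (with intercept)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx

def fama_macbeth(panel):
    """panel: one (characteristics, next_month_returns) pair per month.

    Step 1: cross-sectional regression in each month.
    Step 2: average the monthly slopes; the t-statistic uses the
    time-series standard error of those slopes.
    """
    slopes = [ols_slope(x, y) for x, y in panel]
    premium = mean(slopes)
    t_stat = premium / (stdev(slopes) / sqrt(len(slopes)))
    return premium, t_stat

# Hypothetical three-month panel with three stocks per month.
panel = [
    ([0.0, 1.0, 2.0], [0.000, 0.010, 0.020]),  # monthly slope 0.01
    ([0.0, 1.0, 2.0], [0.005, 0.025, 0.045]),  # monthly slope 0.02
    ([0.0, 1.0, 2.0], [0.000, 0.030, 0.060]),  # monthly slope 0.03
]
premium, t_stat = fama_macbeth(panel)
```

In practice the cross-sectional regressions would include several characteristics at once, and the standard errors are often adjusted for autocorrelation; both refinements are omitted here for brevity.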
The authors then built market-neutral and size-neutral factor portfolios, by double-sorting on size and a specific factor characteristic. They observed economically substantial and statistically significant premiums and CAPM alphas for low-risk (beta), momentum and value (dividend yield), while the size premium was again insignificant for both measures. In terms of short-term reversal, they saw significant premiums but insignificant CAPM alphas. The main results are summarized in Figures 1 and 2.
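A 2x3 double sort of the kind described above might look like the following sketch for a single month. The median size split, the 30th and 70th percentile characteristic breakpoints, and the equal weighting within buckets are assumptions borrowed from common practice, not details confirmed by the article; `factor_return_2x3` and the sample data are hypothetical.

```python
from statistics import mean, median, quantiles

def factor_return_2x3(stocks):
    """Top-minus-bottom factor return from 2x3 size-characteristic sorts.

    stocks: list of dicts with keys 'size', 'char' and 'ret' for one month.
    """
    size_cut = median(s["size"] for s in stocks)
    deciles = quantiles([s["char"] for s in stocks], n=10)
    low_cut, high_cut = deciles[2], deciles[6]  # 30th and 70th percentiles

    def bucket_return(is_small, is_high):
        rets = [s["ret"] for s in stocks
                if (s["size"] <= size_cut) == is_small
                and (s["char"] >= high_cut if is_high else s["char"] <= low_cut)]
        return mean(rets)  # raises StatisticsError if a bucket is empty

    # Averaging across the small and big buckets makes the factor size-neutral.
    high = 0.5 * (bucket_return(True, True) + bucket_return(False, True))
    low = 0.5 * (bucket_return(True, False) + bucket_return(False, False))
    return high - low

# Hypothetical cross-section: 12 stocks, return rising with the characteristic.
stocks = [{"char": c, "size": 10.0 if c % 2 else 100.0, "ret": 0.001 * c}
          for c in range(1, 13)]
spread = factor_return_2x3(stocks)
```

For a characteristic where low values are expected to outperform, such as beta in a low-risk factor, the sign of the spread would simply be flipped.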
Figure 1 | Return spread (%), for the periods 1866 to 1926 and 1927 to 2019
Source: Robeco Quantitative Research. The figure shows the average annualized CAPM alphas for the size, value, momentum, short-term reversal and beta factors for the pre-CRSP and CRSP samples. Factors are constructed from top-bottom portfolios from 2x3 size-characteristic-based portfolios. The pre-CRSP sample starts in January 1866 and ends December 1926. The CRSP sample runs between January 1927 and December 2019. Performance is measured on a monthly frequency.
Figure 2 | CAPM alpha (%), for the periods 1866 to 1926 and 1927 to 2019
Overall, there was no material out-of-sample decay in factor premiums, as they were broadly similar in the pre-1926 and post-1926 eras. The authors also confirmed that these results were generally robust over time, to different testing choices, and across industries and exchanges. Moreover, factor spanning tests revealed that low-risk, momentum, short-term reversal and value are non-redundant asset pricing factors, while size is subsumed by other factors. This indicates that low-risk, momentum and value are durable asset pricing factors.
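Both the CAPM alpha and a factor spanning test come down to the intercept of a time-series regression: a factor's returns are regressed on the market alone (CAPM alpha) or on the market plus the other factors (spanning test), and a factor is non-redundant if the intercept remains reliably different from zero. The sketch below is an assumption-laden illustration in plain Python, with a hypothetical `ols` helper and made-up monthly returns; it omits the standard errors needed to judge significance.

```python
def ols(y, regressors):
    """OLS coefficients [intercept, slope_1, ...] for y on the regressors.

    regressors: list of return series, each the same length as y.
    Solves the normal equations by Gauss-Jordan elimination.
    """
    n = len(y)
    cols = [[1.0] * n] + [list(c) for c in regressors]
    k = len(cols)
    # Augmented matrix [Z'Z | Z'y], where Z stacks the constant and regressors.
    A = [[sum(ci[t] * cj[t] for t in range(n)) for cj in cols]
         + [sum(ci[t] * y[t] for t in range(n))] for ci in cols]
    for i in range(k):
        p = max(range(i, k), key=lambda r: abs(A[r][i]))  # partial pivoting
        A[i], A[p] = A[p], A[i]
        pivot = A[i][i]
        A[i] = [v / pivot for v in A[i]]
        for r in range(k):
            if r != i:
                f = A[r][i]
                A[r] = [vr - f * vi for vr, vi in zip(A[r], A[i])]
    return [A[i][k] for i in range(k)]

# Hypothetical monthly returns for the market and a value factor.
market = [0.01, -0.02, 0.03, 0.00, 0.02]
value = [0.00, 0.01, -0.01, 0.02, 0.01]
# A test factor built to have a monthly intercept of 0.2% by construction.
test_factor = [0.002 + 0.5 * m + 0.3 * v for m, v in zip(market, value)]

capm_alpha = ols(test_factor, [market])[0]             # market only
spanning_alpha = ols(test_factor, [market, value])[0]  # market plus value
annualized_alpha = 12 * spanning_alpha                 # simple annualization
```

With real data, the intercept would be accompanied by a t-statistic, and a factor such as size would count as subsumed if its intercept became insignificant once the other factors were included.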
Machine learning techniques offer valuable insight on stock returns
The authors also conducted an out-of-sample test of machine learning (ML) methods that had previously been applied successfully in the asset pricing literature. For example, some researchers4 have argued that cross-sectional regressions and portfolio sorts can miss important dynamics and interactions between variables, such as return volatility and price momentum. These researchers found that ML models (random forests and neural networks that allow for nonlinear predictor interactions) could predict cross-sectional differences in stock returns over the period 1957 to 2016.
However, this sample period coincides with the CRSP era. Like traditional factor tests, ML models ultimately require out-of-sample testing in independent samples. The authors therefore applied the most promising ML techniques (random forest and neural network models) to the new 61-year sample period. They noted that the ML methods also worked in the pre-CRSP period, as both models delivered significant CAPM alphas. As such, the research suggests that ML tools offer valuable information for understanding the cross-section of stock returns.
In conclusion, this deep historical research underlines that factor premiums are not very dependent on specific market regimes, nor specific market structures. Instead, they are probably an ‘eternal’ feature of financial markets.
Footnotes
1 See: Harvey, C. R., July 2017, “Presidential address: the scientific outlook in financial economics”, Journal of Finance.
2 See: Baltussen, G., Van Vliet, B. P., and Van Vliet, P., November 2021, “The cross-section of stock returns before 1926 (and beyond)”, working paper.
3 See: Fama, E. F., and MacBeth, J. D., June 1973, “Risk, return, and equilibrium: empirical tests”, Journal of Political Economy.
4 See: Gu, S., Kelly, B., and Xiu, D., February 2020, “Empirical asset pricing via machine learning”, The Review of Financial Studies.