Divide and Contrast: A Text-Based Method for Firm Market Risk Prediction
Abstract
Forecasting the market risk for publicly traded companies is a critical task for market participants. Financial economics research demonstrates that the textual information contained in corporate disclosures, such as earnings conference call transcripts, can effectively predict a firm’s future risk. This finding has inspired a growing body of research focused specifically on transcript-based approaches to risk forecasting. However, earnings transcripts are typically long documents with thousands of words. Prior transcript-based risk forecasting studies that represent the entire transcript as one text sequence often fail to capture risk-relevant information and fall short in risk forecasting. In this work, we propose a novel divide-and-contrast machine learning method for predicting risks from earnings conference call transcripts. We exploit the semistructured nature of an earnings transcript and decompose it into several semantically coherent conversation units, ranging from the finest grained question–answer pair level to the coarsest grained transcript level. We then propose contrastive learning objectives as an auxiliary task to the risk forecasting objective, facilitating the learning of risk-relevant information from the earnings transcripts. We conduct experiments on a data set of U.S. market earnings call transcripts. The experimental results show that our proposed divide-and-contrast method substantially outperforms state-of-the-art methods by significantly reducing errors in risk forecasting. This paper sheds light on extracting informative insights from lengthy financial documents to support informed decision making.
History: Accepted by Ram Ramesh, Area Editor for Data Science and Machine Learning.
Supplemental Material: The software that supports the findings of this study is available within the paper and its Supplemental Information (https://pubsonline.informs.org/doi/suppl/10.1287/ijoc.2023.0195) as well as from the IJOC GitHub software repository (https://github.com/INFORMSJoC/2023.0195). The complete IJOC Software and Data Repository is available at https://informsjoc.github.io/.
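As a rough illustration of the two ideas named in the abstract (dividing a transcript into question–answer units, and adding a contrastive term to the risk forecasting loss), the following is a minimal sketch only. The abstract does not specify the authors' unit-splitting rules, encoders, or loss definitions, so everything here is an assumption: the function names `split_into_qa_pairs`, `info_nce`, and `joint_loss` are hypothetical, the "analyst"/"management" role labels are invented for the example, and the InfoNCE-style term is a commonly used contrastive objective swapped in for illustration, not the paper's confirmed formulation. The authors' actual implementation is in the repository linked above.

```python
import math


def split_into_qa_pairs(turns):
    """Group (role, text) turns into question-answer units.

    Assumption: each "analyst" turn opens a new unit and the following
    non-analyst turns are its answers. Turns before the first question
    (e.g. prepared remarks) are skipped in this toy version.
    """
    units = []
    for role, text in turns:
        if role == "analyst":
            units.append({"question": text, "answers": []})
        elif units:
            units[-1]["answers"].append(text)
    return units


def cosine(u, v):
    """Cosine similarity of two nonzero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss: low when the anchor is closer to the positive
    than to the negatives. The temperature value is an arbitrary choice."""
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


def joint_loss(predicted_risk, true_risk, contrastive, weight=0.5):
    """Squared forecasting error plus a weighted auxiliary contrastive term.
    The weight is a hypothetical hyperparameter."""
    return (predicted_risk - true_risk) ** 2 + weight * contrastive


if __name__ == "__main__":
    turns = [
        ("management", "Welcome to the call."),
        ("analyst", "How are margins trending?"),
        ("management", "Margins improved this quarter."),
        ("analyst", "Any update on guidance?"),
        ("management", "We are maintaining guidance."),
    ]
    print(len(split_into_qa_pairs(turns)), "question-answer units")
    aux = info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
    print("joint loss:", joint_loss(0.30, 0.25, aux))
```

In a real model the toy vectors would be replaced by learned embeddings of each conversation unit, and how positives and negatives are chosen is a design decision that the abstract leaves open.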
Authors (4)
Yi He
Yi Yang
Defu Lian
Kunpeng Zhang
Quick Access
- Publication Year
- 2025
- Language
- en
- Source Database
- Semantic Scholar
- DOI
- 10.1287/ijoc.2023.0195
- Access
- Open Access ✓