Financial Report Entity-Relation Extraction Combining BiLSTM-CRF and FGM Adversarial Training

Authors

  • Robert J. Miller, Department of Computer Science, the University of Tokyo, Tokyo 113-8654, Japan
  • Sarah E. Hamilton, Department of Computer Science, the University of Tokyo, Tokyo 113-8654, Japan

DOI:

https://doi.org/10.71465/fbf693

Keywords:

Financial Text Mining, Entity Extraction, BiLSTM-CRF, Adversarial Training, Fast Gradient Method

Abstract

The rapid digitization of financial markets has produced an explosion of unstructured financial data, primarily textual reports, earnings calls, and regulatory filings. Extracting structured knowledge from these documents is critical for automated risk assessment, quantitative investment strategies, and market surveillance. However, financial texts differ significantly from general-domain corpora: they have high terminological density, complex nested sentence structures, and subtle semantic dependencies. Traditional named entity recognition and relation extraction models often overfit when trained on limited labeled financial datasets, and therefore generalize poorly to unseen data. This paper proposes a robust entity-relation extraction framework that integrates a Bidirectional Long Short-Term Memory (BiLSTM) network with a Conditional Random Field (CRF) layer, augmented by Fast Gradient Method (FGM) adversarial training. The BiLSTM layer captures long-range semantic dependencies, while the CRF layer ensures the validity of the predicted tag sequences. Crucially, FGM adversarial training perturbs the embedding layer during training, which regularizes the model and improves its robustness to noise and data sparsity. Experimental results show that the proposed model outperforms baseline methods in precision, recall, and F1-score, particularly in identifying complex financial entities and their interrelations.
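
The abstract does not state the perturbation formula, so the following is only a minimal sketch of the standard FGM step, in which the embedding is shifted along the L2-normalized loss gradient, r_adv = ε · g / ‖g‖₂. The toy logistic model, the function names, and the value of ε are illustrative assumptions and stand in for the BiLSTM-CRF tagger; they are not the authors' implementation.

```python
import math

def fgm_perturbation(grad, epsilon=1.0):
    """Standard FGM step: r_adv = epsilon * g / ||g||_2.
    Returns a zero vector when the gradient vanishes."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm == 0.0:
        return [0.0 for _ in grad]
    return [epsilon * g / norm for g in grad]

def logistic_loss_and_grad(embedding, weights, label):
    """Hypothetical stand-in for the tagger: binary logistic loss on a
    linear score. Returns the loss and its gradient w.r.t. the embedding."""
    score = sum(e * w for e, w in zip(embedding, weights))
    prob = 1.0 / (1.0 + math.exp(-score))
    loss = -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))
    grad_embedding = [(prob - label) * w for w in weights]
    return loss, grad_embedding

# Illustrative values only (assumptions, not data from the paper).
embedding = [0.5, -1.0, 2.0]
weights = [1.0, 0.5, -0.25]
label = 1

# 1) Forward/backward pass on the clean embedding.
clean_loss, grad = logistic_loss_and_grad(embedding, weights, label)

# 2) Perturb the embedding in the direction that increases the loss.
r_adv = fgm_perturbation(grad, epsilon=0.5)
perturbed = [e + r for e, r in zip(embedding, r_adv)]

# 3) Second pass on the perturbed embedding; both losses are summed
#    for the update, after which the clean embedding is restored.
adv_loss, _ = logistic_loss_and_grad(perturbed, weights, label)
total_loss = clean_loss + adv_loss
```

In a full model the same three steps would wrap each training batch, with the gradient taken with respect to the embedding matrix rather than a single vector. Because the perturbation has fixed norm ε, the model is pushed to keep its predictions stable within a small neighborhood of each input embedding, which is the regularizing effect the abstract describes.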

Published

2026-02-20