Building and Researching a Machine Learning Model for Identifying Corporate Tax Avoidance

Authors

  • Zhipeng Wang, College of Computer Science and Engineering, Chengdu University of Technology, Chengdu 610059, China
  • Yufei Chen, College of Computer Science and Engineering, Chengdu University of Technology, Chengdu 610059, China

DOI:

https://doi.org/10.71465/fbf399

Keywords:

Tax Avoidance Detection, Machine Learning, Corporate Governance, Predictive Modeling

Abstract

Corporate tax avoidance poses significant challenges to public finance and economic equity, yet it remains difficult to detect because the relevant signals are spread across both financial and non-financial corporate data. This study develops and evaluates a machine learning model for identifying corporate tax avoidance behavior from a multi-dimensional dataset. The methodology uses financial ratios, ownership structures, and industry characteristics as predictive features, and employs gradient boosting algorithms to classify firms by their likelihood of engaging in tax avoidance. The model was trained and validated on a global dataset of more than 10,000 publicly listed companies covering 2000 to 2020. The model achieves an F1-score of 0.87, significantly outperforming traditional logistic regression benchmarks. Feature importance analysis further shows that profitability metrics, subsidiary networks, and jurisdictional attributes are among the most influential predictors. These results underscore the potential of machine learning to strengthen regulatory oversight and inform policy-making by enabling early and precise identification of tax avoidance practices.
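
For readers less familiar with the headline metric, the F1-score is the harmonic mean of precision and recall for the positive class. The following is a minimal sketch of that calculation in plain Python. The function name `f1_score` and the labels are hypothetical illustrations and are not taken from the study's data or code; the encoding 1 = firm flagged as tax-avoiding, 0 = not flagged is likewise an assumption.

```python
def f1_score(y_true, y_pred):
    """Binary F1 for the positive class (label 1).

    precision = TP / (TP + FP), recall = TP / (TP + FN),
    F1 = 2 * precision * recall / (precision + recall).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    if tp == 0:
        # No correct positive predictions: F1 is conventionally 0 here.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels for ten firms (illustrative only, not study data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
score = f1_score(y_true, y_pred)
print(f"F1 = {score:.2f}")
```

Because F1 ignores true negatives, it is a common choice when the positive class is comparatively rare, since a classifier cannot score well simply by predicting the majority class.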

Published

2025-10-23