ITERATIVE FEATURE GENERATION METHOD BASED ON ARTIFICIAL INTELLIGENCE FOR CHURN PREDICTION

DOI: 10.31673/2786-8362.2025.024380

Authors

  • М. С. Шаш (Shash M. S.), State University of Information and Communication Technologies, Kyiv
  • О. С. Звенигородський (Zvenyhorodskyi O. S.), State University of Information and Communication Technologies, Kyiv

Abstract

This article proposes an iterative feature generation method that uses large language models
(LLMs) to improve customer churn prediction on SaaS platforms. The approach pairs an LLM generator
with an LLM critic in a closed feedback loop, so that candidate features are refined and selected
over multiple steps according to model performance metrics. In experiments on streaming platform
data, the method achieved a higher F1-score than the baseline approach. These results support the
iterative use of LLMs for automating feature creation and improving the accuracy of churn
prediction models.
Keywords: churn prediction, large language models, feature generation, machine learning, artificial
intelligence
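
The generator-critic loop summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the LLM calls are replaced by deterministic placeholders (generator_stub, critic_stub), the nearest-centroid classifier and the column names (logins, watch_hours, user_id, churn) are assumptions chosen for brevity, and the eight rows are synthetic toy data, not the dataset used in the article. The sketch only shows the control flow: propose a feature, score it by F1 on a validation split, record the critic's feedback, and keep the feature only if the score improves.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, float]
Feature = Callable[[Row], float]


def f1_score(y_true: List[int], y_pred: List[int]) -> float:
    """F1 for the positive (churn = 1) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0


def evaluate(features: Dict[str, Feature], train: List[Row], valid: List[Row]) -> float:
    """Fit a nearest-centroid classifier on `train` and return F1 on `valid`."""
    def vec(row: Row) -> List[float]:
        return [f(row) for f in features.values()]

    def centroid(rows: List[Row]) -> List[float]:
        vectors = [vec(r) for r in rows]
        return [sum(col) / len(vectors) for col in zip(*vectors)]

    def dist(a: List[float], b: List[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    c1 = centroid([r for r in train if r["churn"] == 1])
    c0 = centroid([r for r in train if r["churn"] == 0])
    preds = [1 if dist(vec(r), c1) < dist(vec(r), c0) else 0 for r in valid]
    return f1_score([int(r["churn"]) for r in valid], preds)


def generator_stub(iteration: int, feedback: List[str]) -> Dict[str, Feature]:
    """Hypothetical stand-in for the LLM generator. A real generator would be
    prompted with `feedback`; this stub ignores it and cycles a fixed pool."""
    pool: List[Dict[str, Feature]] = [
        {"noise": lambda r: r["user_id"] % 2},
        {"hours_per_login": lambda r: r["watch_hours"] / max(r["logins"], 1)},
    ]
    return pool[iteration % len(pool)]


def critic_stub(name: str, old_f1: float, new_f1: float) -> str:
    """Hypothetical stand-in for the LLM critic: turns the metric change into text."""
    verdict = "keep" if new_f1 > old_f1 else "drop"
    return f"{name}: F1 {old_f1:.3f} -> {new_f1:.3f}, {verdict}"


def iterate_features(
    base: Dict[str, Feature], train: List[Row], valid: List[Row], n_iter: int = 2
) -> Tuple[Dict[str, Feature], float, List[str]]:
    """Closed loop: propose, score, critique, keep only features that raise F1."""
    selected = dict(base)
    best = evaluate(selected, train, valid)
    feedback: List[str] = []
    for i in range(n_iter):
        for name, fn in generator_stub(i, feedback).items():
            score = evaluate({**selected, name: fn}, train, valid)
            feedback.append(critic_stub(name, best, score))
            if score > best:
                selected[name] = fn
                best = score
    return selected, best, feedback


# Synthetic toy rows (assumption): churners have few watch hours per login.
train = [
    {"user_id": 1, "logins": 10, "watch_hours": 20.0, "churn": 0},
    {"user_id": 2, "logins": 4, "watch_hours": 8.0, "churn": 0},
    {"user_id": 3, "logins": 10, "watch_hours": 2.0, "churn": 1},
    {"user_id": 4, "logins": 4, "watch_hours": 1.0, "churn": 1},
]
valid = [
    {"user_id": 5, "logins": 8, "watch_hours": 16.0, "churn": 0},
    {"user_id": 6, "logins": 6, "watch_hours": 1.2, "churn": 1},
    {"user_id": 7, "logins": 5, "watch_hours": 10.0, "churn": 0},
    {"user_id": 8, "logins": 9, "watch_hours": 1.8, "churn": 1},
]

base = {"logins": lambda r: r["logins"]}
selected, best_f1, feedback = iterate_features(base, train, valid)
for line in feedback:
    print(line)
```

In this toy run the uninformative candidate is rejected and the ratio feature is kept, which mirrors the selection-by-metric step described above. In practice the validation split used for selection should be kept separate from the final test set, and the stubs would be replaced by actual LLM prompts.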

Published

2026-01-19
