Reinforcement Learning Based Demand State Modeling and Dynamic Personalization for Digital Retail

Authors

  • Ying Lin, Northern Arizona University, Flagstaff, Arizona, USA
  • Yao Yao, Capinfo Cloud Tech Company Limited, Beijing, China
  • Chao He, Data Science Department, Amazon China, Beijing, China

DOI:

https://doi.org/10.71465/fbf654

Keywords:

Reinforcement Learning, Sequential Decision Making, Demand State Modeling, Dynamic Personalization, Marketing Fatigue, Customer Lifetime Value, Offline Policy Learning

Abstract

Personalization in digital retail involves repeated interventions whose effects unfold over time, with delayed purchase responses and diminishing returns caused by excessive contact. To address this setting, personalized strategy selection is formulated as a Markov decision process in which customer demand evolves across daily decision epochs. Multi-modal behavioral sequences (browsing, add-to-cart, purchase, return, and responses to marketing touchpoints) are encoded into latent demand states using representation learning, which yields compact state vectors from high-dimensional logs. Strategic actions are defined over promotion intensity, content type, and contact frequency, and the learning objective combines conversion revenue with explicit penalties for promotion cost and fatigue-related negative responses. Experiments use a 12-month dataset containing 960,418,327 behavioral records from 2,104,673 customers; policies are trained offline and evaluated through online playback under identical operational constraints across baselines. Relative to rule-based and supervised alternatives, the learned policy improves conversion from 3.72% to 3.88% (+4.3% relative) and increases 30-day LTV from $12.41 to $13.25 (+6.8% relative). Ineffective outreach is reduced, with the share of contacted instances without attributed purchase decreasing from 78.6% to 74.1%. The results support sequence-level optimization of interventions when cost and fatigue must be controlled alongside revenue.
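
The objective described in the abstract (conversion revenue minus penalties for promotion cost and fatigue, accumulated over daily decision epochs) can be illustrated with a minimal sketch. The function names, the linear form of the penalty, the weights `cost_weight` and `fatigue_weight`, and the discount factor `gamma` are all assumptions made for illustration; the abstract does not specify them. The `relative_lift` helper simply reproduces the arithmetic behind the reported relative improvements.

```python
from typing import Sequence


def step_reward(
    revenue: float,
    promo_cost: float,
    fatigue_events: int,
    cost_weight: float = 1.0,      # hypothetical weight, not from the paper
    fatigue_weight: float = 0.5,   # hypothetical weight, not from the paper
) -> float:
    """Reward for one daily epoch: revenue minus cost and fatigue penalties."""
    return revenue - cost_weight * promo_cost - fatigue_weight * fatigue_events


def discounted_return(rewards: Sequence[float], gamma: float = 0.99) -> float:
    """Sum of per-epoch rewards, discounted by gamma per day (assumed value)."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))


def relative_lift(baseline: float, treated: float) -> float:
    """Relative change of a metric with respect to its baseline."""
    return (treated - baseline) / baseline


if __name__ == "__main__":
    # Relative improvements implied by the figures quoted in the abstract.
    print(f"conversion lift: {relative_lift(3.72, 3.88):.1%}")
    print(f"30-day LTV lift: {relative_lift(12.41, 13.25):.1%}")
```

Under this kind of objective, a contact that yields no purchase still incurs the cost and fatigue penalties, so a policy maximizing the discounted return has an incentive to withhold low-value outreach.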

Published

2026-02-26