Noam Zvi and Gal A. Kaminka. Learning User Boredom Constraints in Sequential Recommender Systems. In Proceedings of the AAMAS Workshop on Adaptive and Learning Agents (ALA), 2026.
We consider sequential recommender systems that work in multiple sessions, with a fixed catalog. Each session opens with a single recommendation. Acceptance leads to another recommendation. The session ends upon first rejection. The goal is to maximize session length. Myopic exploitation of previously-successful recommendations quickly leads to user boredom. We introduce novel bandit algorithms that improve recommendation variety by learning and enforcing per-user, per-item boredom thresholds. This allows repeated recommendations, appropriately spaced in time, with a high acceptance rate. Learning takes place in two stages: (i) item-specific boredom thresholds are determined; (ii) once the thresholds are known, preference for the item is learned via a standard bandit algorithm. Evaluation using user data from a commercial system demonstrates clear improvements in session length.
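The abstract's second stage (learning item preference with a standard bandit once boredom thresholds are known) can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes each threshold can be modeled as a fixed cooldown, meaning a minimum number of recommendation steps before an item may be shown again, and it uses UCB1 as a stand-in for the "standard bandit algorithm". The class name `CooldownUCB` and all of its methods and parameters are hypothetical, and the threshold-learning stage (i) is taken as already done.

```python
import math


class CooldownUCB:
    """UCB1 restricted to items whose assumed cooldown has elapsed.

    Illustrative only: the cooldown model of boredom is an assumption,
    not the method described in the paper.
    """

    def __init__(self, cooldowns):
        # cooldowns[i]: hypothetical minimum number of steps between two
        # recommendations of item i (stage (i) is assumed to be complete).
        self.cooldowns = list(cooldowns)
        n = len(self.cooldowns)
        self.pulls = [0] * n          # times each item was recommended
        self.accepts = [0] * n        # times each item was accepted
        self.last_shown = [None] * n  # step at which each item was last shown
        self.t = 0                    # global recommendation step counter

    def eligible(self):
        """Items that were never shown or whose cooldown has elapsed."""
        return [
            i for i, last in enumerate(self.last_shown)
            if last is None or self.t - last >= self.cooldowns[i]
        ]

    def select(self):
        """Return the eligible item with the highest UCB1 index, or None."""
        candidates = self.eligible()
        if not candidates:
            return None  # every item is still cooling down
        # Try each eligible item once before comparing indices.
        for i in candidates:
            if self.pulls[i] == 0:
                return i
        total = sum(self.pulls)

        def ucb(i):
            mean = self.accepts[i] / self.pulls[i]
            return mean + math.sqrt(2 * math.log(total) / self.pulls[i])

        return max(candidates, key=ucb)

    def update(self, item, accepted):
        """Record the user's response and advance the step counter."""
        self.pulls[item] += 1
        self.accepts[item] += int(accepted)
        self.last_shown[item] = self.t
        self.t += 1
```

In this sketch a session would call `select()` and `update()` repeatedly and stop at the first rejection. If every cooldown is smaller than the catalog size, at least one item is always eligible, so `select()` never returns `None`.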
@inproceedings{ala26ws-noam,
title = {Learning User Boredom Constraints in Sequential Recommender Systems},
author = {Noam Zvi and Gal A. Kaminka},
booktitle = {Proceedings of the {AAMAS} Workshop on Adaptive and Learning Agents ({ALA})},
year = {2026},
abstract = { We consider sequential recommender systems that work in multiple sessions, with a fixed catalog. Each session opens with a single recommendation. Acceptance leads to another recommendation. The session ends upon first rejection. The goal is to maximize session length. Myopic exploitation of previously-successful recommendations quickly leads to user boredom. We introduce novel bandit algorithms that improve recommendation variety by learning and enforcing per-user, per-item boredom thresholds. This allows repeated recommendations, appropriately spaced in time, with a high acceptance rate. Learning takes place in two stages: (i) item-specific boredom thresholds are determined; (ii) once the thresholds are known, preference for the item is learned via a standard bandit algorithm. Evaluation using user data from a commercial system demonstrates clear improvements in session length.
},
}