Gal A. Kaminka: Publications


Learning User Boredom Constraints in Sequential Recommender Systems

Noam Zvi and Gal A. Kaminka. Learning User Boredom Constraints in Sequential Recommender Systems. In Proceedings of the AAMAS Workshop on Adaptive and Learning Agents (ALA), 2026.


Abstract

We consider sequential recommender systems that work in multiple sessions, with a fixed catalog. Each session opens with a single recommendation. Acceptance leads to another recommendation. The session ends upon first rejection. The goal is to maximize session length. Myopic exploitation of previously-successful recommendations quickly leads to user boredom. We introduce novel bandit algorithms that improve recommendation variety by learning and enforcing per-user, per-item boredom thresholds. This allows repeated recommendations, appropriately spaced in time, with a high acceptance rate. Learning takes place in two stages: (i) item-specific boredom thresholds are determined; (ii) once the thresholds are known, preference for the item is learned via a standard bandit algorithm. Evaluation using user data from a commercial system demonstrates clear improvements in session length.
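The session model and the idea of enforcing spacing between repeated recommendations can be illustrated with a minimal sketch. This is not the authors' algorithm: the class `SpacedUCB`, the function `run_session`, the interpretation of a boredom threshold as a minimum number of steps between repeats of an item, the fallback to the full catalog when no item is eligible, and the use of UCB1 as the "standard bandit algorithm" are all assumptions made for illustration. The sketch also takes the thresholds as given, so it covers only the second stage described above.

```python
import math


class SpacedUCB:
    """UCB1 restricted to items whose assumed cooldown has elapsed (illustrative only)."""

    def __init__(self, thresholds):
        # thresholds[i]: hypothetical minimum number of steps between repeats of item i
        self.thresholds = list(thresholds)
        n = len(self.thresholds)
        self.pulls = [0] * n          # times each item was recommended
        self.accepts = [0] * n        # times each item was accepted
        self.last_shown = [None] * n  # step at which each item was last recommended
        self.t = 0                    # global step counter, shared across sessions

    def eligible(self):
        # An item is eligible if never shown, or if its cooldown has elapsed.
        return [i for i, last in enumerate(self.last_shown)
                if last is None or self.t - last >= self.thresholds[i]]

    def select(self):
        # Assumption: if every item is still cooling down, fall back to the full catalog.
        candidates = self.eligible() or list(range(len(self.thresholds)))
        for i in candidates:
            if self.pulls[i] == 0:
                return i  # try each eligible item once before using UCB scores
        total = sum(self.pulls)

        def score(i):
            mean = self.accepts[i] / self.pulls[i]
            return mean + math.sqrt(2 * math.log(total) / self.pulls[i])

        return max(candidates, key=score)

    def update(self, item, accepted):
        self.pulls[item] += 1
        self.accepts[item] += int(accepted)
        self.last_shown[item] = self.t
        self.t += 1


def run_session(policy, accepts, max_steps=100):
    """Recommend until the first rejection; return the number of accepted items."""
    length = 0
    for _ in range(max_steps):
        item = policy.select()
        ok = accepts(item)
        policy.update(item, ok)
        if not ok:
            break  # the session ends upon first rejection
        length += 1
    return length
```

Here `accepts` is a hypothetical callback standing in for the user's response; in an evaluation it would be replaced by logged or simulated user behavior.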

BibTeX

@inproceedings{ala26ws-noam,
  title = {Learning User Boredom Constraints in Sequential Recommender Systems},
  author = {Noam Zvi and Gal A. Kaminka},
  booktitle = {Proceedings of the {AAMAS} Workshop on Adaptive and Learning Agents ({ALA})},
  year = {2026},
  abstract = {We consider sequential recommender systems that work in multiple sessions, with a fixed catalog. Each session opens with a single recommendation. Acceptance leads to another recommendation. The session ends upon first rejection. The goal is to maximize session length. Myopic exploitation of previously-successful recommendations quickly leads to user boredom. We introduce novel bandit algorithms that improve recommendation variety by learning and enforcing per-user, per-item boredom thresholds. This allows repeated recommendations, appropriately spaced in time, with a high acceptance rate. Learning takes place in two stages: (i) item-specific boredom thresholds are determined; (ii) once the thresholds are known, preference for the item is learned via a standard bandit algorithm. Evaluation using user data from a commercial system demonstrates clear improvements in session length.},
}
