publications | Victor Thuot

2025

ALT
Clustering with bandit feedback: breaking down the computation/information gap

Victor Thuot, Alexandra Carpentier, Christophe Giraud, and Nicolas Verzelen

In Proceedings of The 36th International Conference on Algorithmic Learning Theory, Feb 2025


We investigate the Clustering with Bandit feedback Problem (CBP). A learner interacts with an N-armed stochastic bandit with d-dimensional sub-Gaussian feedback. There exists a hidden partition of the arms into K groups, such that arms within the same group share the same mean vector. The learner's task is to uncover this hidden partition with the smallest budget (i.e., the smallest number of observations) and with a probability of error smaller than a prescribed constant δ. In this paper, (i) we derive a non-asymptotic lower bound for the budget, and (ii) we introduce the computationally efficient ACB algorithm, whose budget matches the lower bound in most regimes. We improve on the performance of a uniform sampling strategy. Importantly, contrary to the batch setting, we establish that there is no computation-information gap in the bandit setting.
@inproceedings{pmlr-v272-thuot25a,
  title = {Clustering with bandit feedback: breaking down the computation/information gap},
  author = {Thuot, Victor and Carpentier, Alexandra and Giraud, Christophe and Verzelen, Nicolas},
  booktitle = {Proceedings of The 36th International Conference on Algorithmic Learning Theory},
  pages = {1221--1284},
  year = {2025},
  editor = {Kamath, Gautam and Loh, Po-Ling},
  volume = {272},
  series = {Proceedings of Machine Learning Research},
  month = feb,
  publisher = {PMLR},
  url = {https://proceedings.mlr.press/v272/thuot25a.html},
}
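For readers who want to experiment with the setting of this abstract, here is a minimal Python sketch of the uniform sampling baseline that the paper improves upon. It is not the ACB algorithm. The function name `uniform_sampling_clustering`, the Gaussian noise model, and the farthest-point clustering step are all assumptions made for illustration; the abstract does not specify how the baseline clusters the empirical means.

```python
import random


def uniform_sampling_clustering(means, K, pulls_per_arm, sigma=1.0, seed=0):
    """Illustrative baseline (hypothetical helper, not from the paper).

    Pulls every arm the same number of times, averages the d-dimensional
    Gaussian feedback, then groups arms around K farthest-point centres.
    """
    rng = random.Random(seed)
    d = len(means[0])

    # Empirical mean vector of each arm after `pulls_per_arm` observations.
    estimates = []
    for mu in means:
        sums = [0.0] * d
        for _ in range(pulls_per_arm):
            for j in range(d):
                sums[j] += rng.gauss(mu[j], sigma)
        estimates.append([s / pulls_per_arm for s in sums])

    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Farthest-point initialisation: start from arm 0, then repeatedly add
    # the estimate that is farthest from the centres chosen so far.
    centres = [estimates[0]]
    while len(centres) < K:
        farthest = max(estimates, key=lambda e: min(dist2(e, c) for c in centres))
        centres.append(farthest)

    # Assign each arm to its nearest centre.
    return [min(range(K), key=lambda k: dist2(e, centres[k])) for e in estimates]


# Toy instance: N = 5 arms, d = 2, K = 2 hidden groups.
toy_means = [[0.0, 0.0], [0.0, 0.0], [5.0, 5.0], [5.0, 5.0], [0.0, 0.0]]
labels = uniform_sampling_clustering(toy_means, K=2, pulls_per_arm=200)
print(labels)
```

The total budget of this baseline is N times `pulls_per_arm`, spent regardless of how easy each arm is to classify; an adaptive strategy can reallocate observations, which is the kind of saving the abstract refers to.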
ICML
Clustering Items through Bandit Feedback: Finding the Right Feature out of Many

Maximilian Graf, Victor Thuot, and Nicolas Verzelen

In Proceedings of The forty-second International Conference on Machine Learning, Jul 2025


We study the problem of clustering a set of items based on bandit feedback. Each of the n items is characterized by a feature vector, with a possibly large dimension d. The items are partitioned into two unknown groups, such that items within the same group share the same feature vector. We consider a sequential and adaptive setting in which, at each round, the learner selects one item and one feature, then observes a noisy evaluation of the item’s feature. The learner’s objective is to recover the correct partition of the items, while keeping the number of observations as small as possible. We provide an algorithm which relies on finding a relevant feature for the clustering task, leveraging the Sequential Halving algorithm. With probability at least 1 − δ, we obtain an accurate recovery of the partition and derive an upper bound on the budget required. Furthermore, we obtain an instance-dependent lower bound, which is tight in some relevant cases.
@inproceedings{graf2025clustering,
  title = {{Clustering Items through Bandit Feedback: Finding the Right Feature out of Many}},
  author = {Graf, Maximilian and Thuot, Victor and Verzelen, Nicolas},
  url = {https://openreview.net/forum?id=99zsyZpUqp},
  year = {2025},
  month = jul,
  booktitle = {Proceedings of The forty-second International Conference on Machine Learning},
}
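The abstract above mentions Sequential Halving as the tool for finding a relevant feature. The following Python sketch shows the standard Sequential Halving procedure applied to a simplified version of that subtask. It is not the paper's algorithm: it assumes two items already known to belong to different groups, Gaussian noise, and a scoring rule (the product of two independent noisy differences, an unbiased estimate of the squared gap) chosen here for illustration. The names `sequential_halving` and `find_separating_feature` are hypothetical.

```python
import math
import random


def sequential_halving(num_arms, pull, budget):
    """Standard Sequential Halving: split the budget evenly over about
    log2(num_arms) rounds and discard the worse half of the surviving
    arms after each round."""
    survivors = list(range(num_arms))
    rounds = max(1, math.ceil(math.log2(num_arms)))
    for _ in range(rounds):
        if len(survivors) == 1:
            break
        pulls = max(1, budget // (len(survivors) * rounds))
        # Score each survivor using only the samples from this round.
        scores = {
            arm: sum(pull(arm) for _ in range(pulls)) / pulls
            for arm in survivors
        }
        survivors.sort(key=lambda arm: scores[arm], reverse=True)
        survivors = survivors[: math.ceil(len(survivors) / 2)]
    return survivors[0]


def find_separating_feature(mu_a, mu_b, budget, sigma=1.0, seed=0):
    """Assumption: items a and b are known to lie in different groups.
    Each 'pull' of feature j uses four noisy observations."""
    rng = random.Random(seed)

    def noisy_diff(j):
        return rng.gauss(mu_a[j], sigma) - rng.gauss(mu_b[j], sigma)

    def pull(j):
        # Product of two independent noisy differences: its expectation is
        # the squared gap (mu_a[j] - mu_b[j]) ** 2.
        return noisy_diff(j) * noisy_diff(j)

    return sequential_halving(len(mu_a), pull, budget)


# Toy instance: d = 8 features, only feature 5 separates the two items.
mu_a = [0.0] * 8
mu_b = [0.0] * 8
mu_b[5] = 3.0
print(find_separating_feature(mu_a, mu_b, budget=8000))
```

In this sketch `budget` counts pulls rather than raw observations, so the observation count is four times larger. The design point it illustrates is that uninformative features are eliminated early, so most observations go to the few features that can actually separate the groups.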