%0 Journal Article
%T Estimating the joint distribution of independent categorical variables via model selection
%A C. Durot
%A E. Lebarbier
%A A.-S. Tocquet
%J Mathematics
%D 2009
%I arXiv
%R 10.3150/08-BEJ155
%X Assume one observes independent categorical variables or, equivalently, one observes the corresponding multinomial variables. Estimating the distribution of the observed sequence amounts to estimating the expectation of the multinomial sequence. A new estimator for this mean is proposed that is nonparametric, non-asymptotic and implementable even for large sequences. It is a penalized least-squares estimator based on wavelets, with a penalization term inspired by papers of Birgé and Massart. The estimator is proved to satisfy an oracle inequality and to be adaptive in the minimax sense over a class of Besov bodies. The method is embedded in a general framework which allows us to recover also an existing method for segmentation. Beyond theoretical results, a simulation study is reported and an application on real data is provided.
%U http://arxiv.org/abs/0906.2275v1