October 5, 2010, 14:00–15:30
Room MF 323
Random forests are a scheme proposed by Leo Breiman in the 2000s for building a predictor ensemble from a set of decision trees grown in randomly selected subspaces of the data. Despite growing interest and an emphasis on practical guidelines, there have been few explorations of the statistical properties of random forests, and little is known about the mathematical forces driving the algorithm. In this talk, we offer an in-depth analysis of a random forests model suggested by Breiman in 2004, which is very close to the original algorithm. We show in particular that the procedure is consistent and adapts to sparsity, in the sense that its rate of convergence depends only on the number of strong features and not on the number of noise variables present.
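To fix ideas, here is a minimal sketch of the random-subspace principle the abstract describes: an ensemble of trees (reduced to depth-one stumps for brevity), each grown on a bootstrap sample using only a randomly selected subset of the features, combined by majority vote. This is an illustrative toy, not Breiman's full algorithm or the model analyzed in the talk; all function names and parameters are assumptions of this sketch.

```python
import random
from collections import Counter

def fit_stump(X, y, feature_subset):
    """Fit a depth-one tree restricted to the given feature subset.

    Returns (feature, threshold, left_label, right_label) for the split
    with the best training accuracy, or None if no valid split exists.
    """
    best, best_acc = None, -1.0
    for j in feature_subset:
        values = sorted(set(row[j] for row in X))
        for i in range(len(values) - 1):
            thr = (values[i] + values[i + 1]) / 2
            left = [yi for row, yi in zip(X, y) if row[j] <= thr]
            right = [yi for row, yi in zip(X, y) if row[j] > thr]
            if not left or not right:
                continue
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0]
            acc = (sum(l == left_label for l in left)
                   + sum(r == right_label for r in right)) / len(y)
            if acc > best_acc:
                best_acc, best = acc, (j, thr, left_label, right_label)
    return best

def predict_stump(stump, row):
    j, thr, left_label, right_label = stump
    return left_label if row[j] <= thr else right_label

def fit_forest(X, y, n_trees=25, subspace_size=2, seed=0):
    """Grow n_trees stumps, each in a randomly selected feature subspace."""
    rng = random.Random(seed)
    d = len(X[0])
    forest = []
    for _ in range(n_trees):
        # Each tree only sees a random subspace of the input features ...
        subset = rng.sample(range(d), subspace_size)
        # ... and a bootstrap resample of the training points.
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        Xb, yb = [X[i] for i in idx], [y[i] for i in idx]
        stump = fit_stump(Xb, yb, subset)
        if stump is not None:
            forest.append(stump)
    return forest

def predict_forest(forest, row):
    # Aggregate the individual trees by majority vote.
    votes = Counter(predict_stump(s, row) for s in forest)
    return votes.most_common(1)[0][0]
```

The sparsity phenomenon in the abstract can be seen even in this toy: if only one "strong" feature determines the label and the rest are noise, the trees that happen to draw the strong feature into their subspace vote correctly, and the ensemble vote washes out the noise trees.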