BEGIN:VCALENDAR
VERSION:2.0
PRODID:researchseminars.org
CALSCALE:GREGORIAN
X-WR-CALNAME:researchseminars.org
BEGIN:VEVENT
SUMMARY:Alexandre d'Aspremont (ENS\, CNRS)
DTSTART:20200501T150000Z
DTEND:20200501T160000Z
DTSTAMP:20260404T132229Z
UID:sss/3
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/sss/3
 /">Naive feature selection: Sparsity in naive Bayes</a>\nby Alexandre d'As
 premont (ENS\, CNRS) as part of Stochastics and Statistics Seminar Series\
 n\n\nAbstract\nDue to its linear complexity\, naive Bayes classification r
 emains an attractive supervised learning method\, especially in very large
 -scale settings. We propose a sparse version of naive Bayes\, which can be
  used for feature selection. This leads to a combinatorial maximum-likelih
 ood problem\, for which we provide an exact solution in the case of binary
  data\, or a bound in the multinomial case. We prove that our bound become
 s tight as the marginal contribution of additional features decreases. Bot
 h binary and multinomial sparse models are solvable in time almost linear 
 in problem size\, representing a very small extra relative cost compared t
 o the classical naive Bayes. Numerical experiments on text data show that 
 the naive Bayes feature selection method is as statistically effective as 
 state-of-the-art feature selection methods such as recursive feature elimi
 nation\, l1-penalized logistic regression and LASSO\, while being orders o
 f magnitude faster. For a large data set\, having more than 1.6 milli
 on training points and about 12 million features\, and with a non-optimize
 d CPU implementation\, our sparse naive Bayes model can be trained in less
 than 15 seconds. Authors: A. Askari\, A. d'Aspremont\, L. El Ghaoui.\n
LOCATION:https://stable.researchseminars.org/talk/sss/3/
END:VEVENT
END:VCALENDAR
