BEGIN:VCALENDAR
VERSION:2.0
PRODID:researchseminars.org
CALSCALE:GREGORIAN
X-WR-CALNAME:researchseminars.org
BEGIN:VEVENT
SUMMARY:Ery Arias-Castro (UC San Diego)
DTSTART:20200417T150000Z
DTEND:20200417T161200Z
DTSTAMP:20260404T094554Z
UID:sss/1
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/sss/1
 /">On using graph distances to estimate Euclidean and related distances</a
 >\nby Ery Arias-Castro (UC San Diego) as part of Stochastics and Statistic
 s Seminar Series\n\n\nAbstract\nGraph distances have proven quite useful i
 n machine learning/statistics\, particularly in the estimation of Euclidea
 n or geodesic distances. The talk will include a partial review of the lit
 erature\, and then present more recent developments on the estimation of c
 urvature-constrained distances on a surface\, as well as on the estimation
  of Euclidean distances based on an unweighted and noisy neighborhood grap
 h.\n
LOCATION:https://stable.researchseminars.org/talk/sss/1/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Sébastien Bubeck (Microsoft Research)
DTSTART:20200424T150000Z
DTEND:20200424T160000Z
DTSTAMP:20260404T094554Z
UID:sss/2
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/sss/2
 /">How to Trap a Gradient Flow</a>\nby Sébastien Bubeck (Microsoft Resear
 ch) as part of Stochastics and Statistics Seminar Series\n\n\nAbstract\nIn
  1993\, Stephen A. Vavasis proved that in any finite dimension\, there exi
 sts a faster method than gradient descent to find stationary points of smo
 oth non-convex functions. In dimension 2 he proved that 1/eps gradient que
 ries are enough\, and that 1/sqrt(eps) queries are necessary. We close thi
 s gap by providing an algorithm based on a new local-to-global phenomenon 
 for smooth non-convex functions. Some higher dimensional results will also
  be discussed. I will also present an extension of the 1/sqrt(eps) lower b
 ound to randomized algorithms\, mainly as an excuse to discuss some beauti
 ful topics such as Aldous’ 1983 paper on local minimization on the cube\
 , and Benjamini-Pemantle-Peres’ 1998 construction of unpredictable walks
 .\n\nJoint work with Dan Mikulincer\n
LOCATION:https://stable.researchseminars.org/talk/sss/2/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Alexandre d'Aspremont (ENS\, CNRS)
DTSTART:20200501T150000Z
DTEND:20200501T160000Z
DTSTAMP:20260404T094554Z
UID:sss/3
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/sss/3
 /">Naive feature selection: Sparsity in naive Bayes</a>\nby Alexandre d'As
 premont (ENS\, CNRS) as part of Stochastics and Statistics Seminar Series\
 n\n\nAbstract\nDue to its linear complexity\, naive Bayes classification r
 emains an attractive supervised learning method\, especially in very large
 -scale settings. We propose a sparse version of naive Bayes\, which can be
  used for feature selection. This leads to a combinatorial maximum-likelih
 ood problem\, for which we provide an exact solution in the case of binary
  data\, or a bound in the multinomial case. We prove that our bound become
 s tight as the marginal contribution of additional features decreases. Bot
 h binary and multinomial sparse models are solvable in time almost linear 
 in problem size\, representing a very small extra relative cost compared t
 o the classical naive Bayes. Numerical experiments on text data show that 
 the naive Bayes feature selection method is as statistically effective as 
 state-of-the-art feature selection methods such as recursive feature elimi
 nation\, l1-penalized logistic regression and LASSO\, while being orders o
 f magnitude faster. For a large data set with more than 1.6 million train
 ing points and about 12 million features\, and with a non-optimize
 d CPU implementation\, our sparse naive Bayes model can be trained in less
  than 15 seconds.  Authors: A. Askari\, A. d’Aspremont\, L. El Ghaoui.\n
LOCATION:https://stable.researchseminars.org/talk/sss/3/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Gesine Reinert (University of Oxford)
DTSTART:20200911T150000Z
DTEND:20200911T160000Z
DTSTAMP:20260404T094554Z
UID:sss/4
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/sss/4
 /">Stein’s method for multivariate continuous distributions and applicat
 ions</a>\nby Gesine Reinert (University of Oxford) as part of Stochastics 
 and Statistics Seminar Series\n\n\nAbstract\nStein’s method is a key met
 hod for assessing distributional distance\, mainly for one-dimensional dis
 tributions. In this talk we provide a general approach to Stein’s method
  for multivariate continuous distributions. Among the applications we cons
 ider is the Wasserstein distance between two continuous probability distri
 butions under the assumption of existence of a Poincare constant.\n\nThis 
 is joint work with Guillaume Mijoule (INRIA Paris) and Yvik Swan (Liege).\
 n
LOCATION:https://stable.researchseminars.org/talk/sss/4/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Caroline Uhler (MIT)
DTSTART:20200918T150500Z
DTEND:20200918T160500Z
DTSTAMP:20260404T094554Z
UID:sss/5
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/sss/5
 /">Causal Inference and Overparameterized Autoencoders in the Light of Dru
 g Repurposing for SARS-CoV-2</a>\nby Caroline Uhler (MIT) as part of Stoch
 astics and Statistics Seminar Series\n\n\nAbstract\nMassive data collectio
 n holds the promise of a better understanding of complex phenomena and ult
 imately\, of better decisions. An exciting opportunity in this regard stem
 s from the growing availability of perturbation / intervention data (drugs
 \, knockouts\, overexpression\, etc.) in biology. In order to obtain mecha
 nistic insights from such data\, a major challenge is the development of a
  framework that integrates observational and interventional data and allow
 s predicting the effect of yet unseen interventions or transporting the ef
 fect of interventions observed in one context to another. I will present a
  framework for causal inference based on such data and particularly highli
 ght the role of overparameterized autoencoders. We end by demonstrating ho
 w these ideas can be applied for drug repurposing in the current SARS-CoV-
 2 crisis.\n
LOCATION:https://stable.researchseminars.org/talk/sss/5/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dylan Foster (MIT)
DTSTART:20200925T150500Z
DTEND:20200925T160500Z
DTSTAMP:20260404T094554Z
UID:sss/6
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/sss/6
 /">Separating Estimation from Decision Making in Contextual Bandits</a>\nb
 y Dylan Foster (MIT) as part of Stochastics and Statistics Seminar Series\
 n\n\nAbstract\nThe contextual bandit is a sequential decision making probl
 em in which a learner repeatedly selects an action (e.g.\, a news article 
 to display) in response to a context (e.g.\, a user’s profile) and recei
 ves a reward\, but only for the action they selected. Beyond the classic e
 xplore-exploit tradeoff\, a fundamental challenge in contextual bandits is
  to develop algorithms that can leverage flexible function approximation t
 o model similarity between contexts\, yet have computational requirements 
 comparable to classical supervised learning tasks such as classification a
 nd regression. To this end\, we provide the first universal and optimal re
 duction from contextual bandits to online regression. We show how to trans
 form any oracle for online regression with a given value function class in
 to an algorithm for contextual bandits with the induced policy class\, wit
 h no overhead in runtime or memory requirements. Conceptually\, our result
 s show that it is possible to provably separate estimation and decision ma
 king into distinct algorithmic building blocks\, and that this can be effe
 ctive both in theory and in practice. Time permitting\, I will discuss ext
 ensions of these techniques to more challenging reinforcement learning pro
 blems.\n
LOCATION:https://stable.researchseminars.org/talk/sss/6/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Richard Nickl (University of Cambridge)
DTSTART:20201002T150500Z
DTEND:20201002T160500Z
DTSTAMP:20260404T094554Z
UID:sss/7
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/sss/7
 /">Bayesian inverse problems\, Gaussian processes\, and partial differenti
 al equations</a>\nby Richard Nickl (University of Cambridge) as part of St
 ochastics and Statistics Seminar Series\n\nAbstract: TBA\n
LOCATION:https://stable.researchseminars.org/talk/sss/7/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Gábor Lugosi (Pompeu Fabra University)
DTSTART:20201009T150500Z
DTEND:20201009T160500Z
DTSTAMP:20260404T094554Z
UID:sss/8
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/sss/8
 /">On Estimating the Mean of a Random Vector</a>\nby Gábor Lugosi (Pompeu
  Fabra University) as part of Stochastics and Statistics Seminar Series\n\
 nAbstract: TBA\n
LOCATION:https://stable.researchseminars.org/talk/sss/8/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Carola-Bibiane Schönlieb (University of Cambridge)
DTSTART:20201016T150500Z
DTEND:20201016T160500Z
DTSTAMP:20260404T094554Z
UID:sss/9
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/sss/9
 /">Data driven variational models for solving inverse problems</a>\nby Car
 ola-Bibiane Schönlieb (University of Cambridge) as part of Stochastics an
 d Statistics Seminar Series\n\nAbstract: TBA\n
LOCATION:https://stable.researchseminars.org/talk/sss/9/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Jose Blanchet (Stanford University)
DTSTART:20201023T150500Z
DTEND:20201023T160500Z
DTSTAMP:20260404T094554Z
UID:sss/10
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/sss/1
 0/">Statistical Aspects of Wasserstein Distributionally Robust Optimizatio
 n Estimators</a>\nby Jose Blanchet (Stanford University) as part of Stocha
 stics and Statistics Seminar Series\n\n\nAbstract\nWasserstein-based dist
 ributionally robust optimization problems are formulated as min-max
  games in which a statistician chooses a parameter to minimize an expected
  loss against an adversary (say nature) which wishes to maximize the loss 
 by choosing an appropriate probability model within a certain non-parametr
 ic class. Recently\, these formulations have been studied in the context i
 n which the non-parametric class chosen by nature is defined as a Wasserst
 ein-distance neighborhood around the empirical measure. It turns out that 
 by appropriately choosing the loss and the geometry of the Wasserstein dis
 tance one can recover a wide range of classical statistical estimators (in
 cluding Lasso\, Graphical Lasso\, SVM\, group Lasso\, among many others). 
 This talk studies a wide range of rich statistical quantities associated w
 ith these problems\; for example\, the optimal (in a certain sense) choice
  of the adversarial perturbation\, weak convergence of natural confidence 
 regions associated with these formulations\, and asymptotic normality of t
 he DRO estimators. (This talk is based on joint work with Y. Kang\, K. Mur
 thy\, and N. Si.)\n
LOCATION:https://stable.researchseminars.org/talk/sss/10/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Daniela Witten (University of Washington)
DTSTART:20201106T160500Z
DTEND:20201106T170500Z
DTSTAMP:20260404T094554Z
UID:sss/12
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/sss/1
 2/">Valid hypothesis testing after hierarchical clustering</a>\nby Daniela
  Witten (University of Washington) as part of Stochastics and Statistics S
 eminar Series\n\n\nAbstract\nAs datasets continue to grow in size\, in man
 y settings the focus of data collection has shifted away from testing pre-
 specified hypotheses\, and towards hypothesis generation. Researchers are 
 often interested in performing an exploratory data analysis in order to ge
 nerate hypotheses\, and then testing those hypotheses on the same data\; I
  will refer to this as ‘double dipping’. Unfortunately\, double dippin
 g can lead to highly inflated Type 1 errors. In this talk\, I will conside
 r the special case of hierarchical clustering. First\, I will show that sa
 mple-splitting does not solve the ‘double dipping’ problem for clust
 ering. Then\, I will propose a test for a difference in means between esti
 mated clusters that accounts for the cluster estimation process\, using a 
 selective inference framework. I will also show an application of this app
 roach to single-cell RNA-sequencing data. This is joint work with Lucy Gao
  (University of Waterloo) and Jacob Bien (University of Southern Californi
 a).\n
LOCATION:https://stable.researchseminars.org/talk/sss/12/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Mary Wootters (Stanford University)
DTSTART:20201113T160500Z
DTEND:20201113T170500Z
DTSTAMP:20260404T094554Z
UID:sss/13
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/sss/1
 3/">Sharp Thresholds for Random Subspaces\, and Applications</a>\nby Mary 
 Wootters (Stanford University) as part of Stochastics and Statistics Semin
 ar Series\n\n\nAbstract\nWhat combinatorial properties are likel
 y to be satisfied by a random subspace over a finite field? For example\, 
 is it likely that not too many points lie in any Hamming ball? What about 
 any cube?  We show that there is a sharp threshold on the dimension of the
  subspace at which the answers to these questions change from “extremely
  likely” to “extremely unlikely\,” and moreover we give a simple cha
 racterization of this threshold for different properties. Our motivation c
 omes from error correcting codes\, and we use this characterization to mak
 e progress on the questions of list-decoding and list-recovery for random 
 linear codes\, and also to establish the list-decodability of random Low D
 ensity Parity-Check (LDPC) codes.\n\nThis talk is based on joint work wit
 h Venkatesan Guruswami\, Ray Li\, Jonathan Mosheiff\, Nicolas Resch\, Nog
 a Ron-Zewi\, and Shashwat Silas.\n
LOCATION:https://stable.researchseminars.org/talk/sss/13/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Arnaud Doucet (University of Oxford)
DTSTART:20201120T160500Z
DTEND:20201120T170500Z
DTSTAMP:20260404T094554Z
UID:sss/14
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/sss/1
 4/">Perfect Simulation for Feynman-Kac Models using Ensemble Rejection Sam
 pling</a>\nby Arnaud Doucet (University of Oxford) as part of Stochastics 
 and Statistics Seminar Series\n\nAbstract: TBA\n
LOCATION:https://stable.researchseminars.org/talk/sss/14/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Rong Ge (Duke University)
DTSTART:20201204T160500Z
DTEND:20201204T170500Z
DTSTAMP:20260404T094554Z
UID:sss/15
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/sss/1
 5/">A Local Convergence Theory for Mildly Over-Parameterized Two-Layer Neu
 ral Net</a>\nby Rong Ge (Duke University) as part of Stochastics and Stati
 stics Seminar Series\n\n\nAbstract\nThe training of neural networks optimi
 zes complex non-convex objective functions\, yet in practice simple algori
 thms achieve great performance. Recent works suggest that over-parametriz
 ation could be a key ingredient in explaining this discrepancy. However
 \, current theories could not fully explain the role of over-parameteriz
 ation. In particular\, they either work in a regime where neurons don't m
 ove much\, or require a large number of neurons. In this paper we devel
 op a local convergence theory for mildly over-parameterized two-layer ne
 ural networks. We show that as long as the loss is already lower than a t
 hreshold (polynomial in relevant parameters)\, all student neurons in a
 n over-parametrized two-layer neural network will converge to one of th
 e teacher neurons\, and the loss will go to 0. Our result holds for any n
 umber of student neurons as long as it's at least as large as the numb
 er of teacher neurons\, and gives explicit bounds on convergence rates t
 hat are independent of the number of student neurons. Based on joint wor
 k with Mo Zhou and Chi Jin.\n
LOCATION:https://stable.researchseminars.org/talk/sss/15/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Jerry Li (Microsoft Research)
DTSTART:20210219T160000Z
DTEND:20210219T171200Z
DTSTAMP:20260404T094554Z
UID:sss/16
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/sss/1
 6/">Faster and Simpler Algorithms for List Learning</a>\nby Jerri Li (Micr
 osoft Research) as part of Stochastics and Statistics Seminar Series\n\n\n
 Abstract\nThe goal of list learning is to understand how to learn basic st
 atistics of a dataset when it has been corrupted by an overwhelming fracti
 on of outliers. More formally\, one is given a set of points $S$\, of whic
 h an $\\alpha$-fraction $T$ are promised to be well-behaved. The goal is t
 hen to output an $O(1 / \\alpha)$ sized list of candidate means\, so that 
 one of these candidates is close to the true mean of the points in $T$. In
  many ways\, list learning can be thought of as the natural robust general
 ization of clustering mixture models. This formulation of the problem was 
 first proposed in Charikar-Steinhardt-Valiant STOC’17\, which gave the f
 irst polynomial-time algorithm which achieved nearly-optimal error guarant
 ees. More recently\, exciting work of Cherapanamjeri-Mohanty-Yau FOCS’20
  gave an algorithm which ran in time $\\widetilde{O} (n d \\mathrm{poly} (
 1 / \\alpha))$. In particular\, this runtime is nearly linear in the input
  size for $1/\\alpha = O(1)$\; however\, the runtime quickly becomes impra
 ctical for reasonably small $1/\\alpha$. Moreover\, both of these algorith
 ms are quite complicated.\n\nIn our work\, we have two main contributions.
  First\, we give a polynomial time algorithm for this problem which achiev
 es optimal error\, which is considerably simpler than the previously known
  algorithms. Second\, we then build off of these insights to develop a mor
 e sophisticated algorithm based on lazy mirror descent which runs in time 
 $\\widetilde{O}(n d / \\alpha + 1/\\alpha^6)$\, and which also achieves op
 timal error. Our algorithm improves upon the runtime of previous work fo
 r all $1/\\alpha = O(\\sqrt{d})$. The goal of this talk is to give a mor
 e or less self-contained proof of the first result\, and then explain a
 t a high level h
 ow to use these ideas to develop our faster algorithm.\n\nJoint work with 
 Ilias Diakonikolas\, Daniel Kane\, Daniel Kongsgaard\, and Kevin Tian\n
LOCATION:https://stable.researchseminars.org/talk/sss/16/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Yury Polyanskiy (MIT)
DTSTART:20210226T160000Z
DTEND:20210226T171200Z
DTSTAMP:20260404T094554Z
UID:sss/17
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/sss/1
 7/">Self-regularizing Property of Nonparametric Maximum Likelihood Estimat
 or in Mixture Models</a>\nby Yury Polyanskiy (MIT) as part of Stochastics 
 and Statistics Seminar Series\n\n\nAbstract\nIntroduced by Kiefer and Wolf
 owitz 1956\, the nonparametric maximum likelihood estimator (NPMLE) is a w
 idely used methodology for learning mixture models and empirical Bayes est
 imation. Sidestepping the non-convexity in mixture likelihood\, the NPMLE 
 estimates the mixing distribution by maximizing the total likelihood over 
 the space of probability measures\, which can be viewed as an extreme form
  of overparameterization.\n\nIn this work we discover a surprising proper
 ty of the NPMLE solution. Consider\, for example\, a Gaussian mixture mode
 l on the real line with a subgaussian mixing distribution. Leveraging comp
 lex-analytic techniques\, we show that with high probability the NPMLE bas
 ed on a sample of size n has O(\\log n) atoms (mass points)\, significantl
 y improving the deterministic upper bound of n due to Lindsay (1983). Nota
 bly\, any such Gaussian mixture is statistically indistinguishable from a 
 finite one with O(\\log n) components (and this is tight for certain mixtu
 res). Thus\, absent any explicit form of model selection\, NPMLE automatic
 ally chooses the right model complexity\, a property we term self-regulari
 zation. Extensions to other exponential families are given. As a statistic
 al application\, we show that this structural property can be harnessed to
  bootstrap existing Hellinger risk bound of the (parametric) MLE for finit
 e Gaussian mixtures to the NPMLE for general Gaussian mixtures\, recoverin
 g a result of Zhang (2009). Time permitting\, we will discuss connections 
 to approaching the optimal regret in empirical Bayes. This is based on joi
 nt work with Yihong Wu (Yale).\n
LOCATION:https://stable.researchseminars.org/talk/sss/17/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Bhaswar B. Bhattacharya (University of Pennsylvania)
DTSTART:20210305T160000Z
DTEND:20210305T171200Z
DTSTAMP:20260404T094554Z
UID:sss/18
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/sss/1
 8/">Detection Thresholds for Distribution-Free Non-Parametric Tests: The C
 urious Case of Dimension 8</a>\nby Bhaswar B. Bhattacharya (University o
 f Pennsylvania) as part of Stochastics and Statistics Seminar Series\n\nA
 bstr
 act: TBA\n
LOCATION:https://stable.researchseminars.org/talk/sss/18/
END:VEVENT
BEGIN:VEVENT
SUMMARY:James Robins (Harvard)
DTSTART:20210312T160000Z
DTEND:20210312T171200Z
DTSTAMP:20260404T094554Z
UID:sss/19
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/sss/1
 9/">On nearly assumption-free tests of nominal confidence interval coverag
 e for causal parameters estimated by machine learning</a>\nby James Robins
  (Harvard) as part of Stochastics and Statistics Seminar Series\n\nAbstrac
 t: TBA\n
LOCATION:https://stable.researchseminars.org/talk/sss/19/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Daniel Roy (University of Toronto)
DTSTART:20210319T150000Z
DTEND:20210319T161200Z
DTSTAMP:20260404T094554Z
UID:sss/20
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/sss/2
 0/">Relaxing the I.I.D. Assumption: Adaptively Minimax Optimal Regret via 
 Root-Entropic Regularization</a>\nby Daniel Roy (University of Toronto) as
  part of Stochastics and Statistics Seminar Series\n\nAbstract: TBA\n
LOCATION:https://stable.researchseminars.org/talk/sss/20/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Vladimir Vovk (Royal Holloway\, University of London)
DTSTART:20210326T150000Z
DTEND:20210326T161200Z
DTSTAMP:20260404T094554Z
UID:sss/21
DESCRIPTION:by Vladimir Vovk (Royal Holloway\, University of London) as pa
 rt of Stochastics and Statistics Seminar Series\n\nAbstract: TBA\n
LOCATION:https://stable.researchseminars.org/talk/sss/21/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Thibaut Le Gouic (MIT)
DTSTART:20210402T150000Z
DTEND:20210402T161200Z
DTSTAMP:20260404T094554Z
UID:sss/22
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/sss/2
 2/">Sampler for the Wasserstein barycenter</a>\nby Thibaut Le Gouic (MIT) 
 as part of Stochastics and Statistics Seminar Series\n\nAbstract: TBA\n
LOCATION:https://stable.researchseminars.org/talk/sss/22/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Suriya Gunasekar (Microsoft Research)
DTSTART:20210409T150000Z
DTEND:20210409T161200Z
DTSTAMP:20260404T094554Z
UID:sss/23
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/sss/2
 3/">Functions space view of linear multi-channel convolution networks with
  bounded weight norm</a>\nby Suriya Gunasekar (Microsoft Research) as part
  of Stochastics and Statistics Seminar Series\n\n\nAbstract\nThe magnitude
  of the weights of a neural network is a fundamental measure of complexity
  that plays a crucial role in the study of implicit and explicit regulariz
 ation. For example\, recent work shows that gradient descent updates in o
 verparameterized models asymptotically lead to solutions that implicitl
 y minimize
  the ell_2 norm of the parameters of the model\, resulting in an inductive
  bias that is highly architecture dependent. To investigate the properties
  of learned functions\, it is natural to consider a function space view gi
 ven by the minimum ell_2 norm of weights required to realize a given funct
 ion with a given network. We call this the “induced regularizer” of th
 e network. Building on a line of recent work\, we study the induced regula
 rizer of linear convolutional neural networks with a focus on the role of 
 kernel size and the number of channels. We introduce an SDP relaxation of 
  the induced regularizer\, which we show is tight for networks with a single
  input channel. Using this SDP formulation\, we show that the induced regu
 larizer is independent of the number of output channels for single-inp
 ut channel networks\, and for multi-input channel networks\, we show indep
 endence given sufficiently many output channels. Moreover\, we show that a
 s the kernel size increases\, the induced regularizer interpolates between
  a basis-invariant norm and a basis-dependent norm that promotes sparse st
 ructures in Fourier space.\n\nBased on joint work with Meena Jagadeesan an
 d Ilya Razenshteyn.\n
LOCATION:https://stable.researchseminars.org/talk/sss/23/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Eric Laber (Duke University)
DTSTART:20210416T150000Z
DTEND:20210416T161200Z
DTSTAMP:20260404T094554Z
UID:sss/24
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/sss/2
 4/">Sample size considerations in precision medicine</a>\nby Eric Laber (D
 uke University) as part of Stochastics and Statistics Seminar Series\n\nAb
 stract: TBA\n
LOCATION:https://stable.researchseminars.org/talk/sss/24/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Hilary Finucane (Broad Institute)
DTSTART:20210423T150000Z
DTEND:20210423T161200Z
DTSTAMP:20260404T094554Z
UID:sss/25
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/sss/2
 5/">Prioritizing genes from genome-wide association studies</a>\nby Hilary
  Finucane (Broad Institute) as part of Stochastics and Statistics Seminar 
 Series\n\nAbstract: TBA\n
LOCATION:https://stable.researchseminars.org/talk/sss/25/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Ann Lee (Carnegie Mellon University)
DTSTART:20210514T150000Z
DTEND:20210514T161200Z
DTSTAMP:20260404T094554Z
UID:sss/26
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/sss/2
 6/">Likelihood-Free Frequentist Inference</a>\nby Ann Lee (Carnegie Mellon
  University) as part of Stochastics and Statistics Seminar Series\n\nAbstr
 act: TBA\n
LOCATION:https://stable.researchseminars.org/talk/sss/26/
END:VEVENT
END:VCALENDAR
