BEGIN:VCALENDAR
VERSION:2.0
PRODID:researchseminars.org
CALSCALE:GREGORIAN
X-WR-CALNAME:researchseminars.org
BEGIN:VEVENT
SUMMARY:Maxim Raginsky (University of Illinois Urbana-Champaign)
DTSTART:20200519T160000Z
DTEND:20200519T173000Z
DTSTAMP:20260404T094426Z
UID:IASML/1
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/IASML
 /1/">Neural SDEs: deep generative models in the diffusion limit</a>\nby Ma
 xim Raginsky (University of Illinois Urbana-Champaign) as part of IAS Semi
 nar Series on Theoretical Machine Learning\n\n\nAbstract\nIn deep generati
 ve models\, the latent variable is generated by a time-inhomogeneous Marko
 v chain\, where at each time step we pass the current state through a para
 metric nonlinear map\, such as a feedforward neural net\, and add a small 
 independent Gaussian perturbation. In this talk\, based on joint work with
  Belinda Tzen\, I will discuss the diffusion limit of such models\, where 
 we increase the number of layers while sending the step size and the noise
  variance to zero. I will first provide a unified viewpoint on both sampli
 ng and variational inference in such generative models through the lens of
  stochastic control. Then I will show how we can quantify the expressivene
 ss of diffusion-based generative models. Specifically\, I will prove that 
 one can efficiently sample from a wide class of terminal target distributi
 ons by choosing the drift of the latent diffusion from the class of multil
 ayer feedforward neural nets\, with the accuracy of sampling measured by t
 he Kullback-Leibler divergence to the target distribution. Finally\, I wil
 l briefly discuss a scheme for unbiased\, finite-variance simulation in su
 ch models. This scheme can be implemented as a deep generative model with 
 a random number of layers.\n
LOCATION:https://stable.researchseminars.org/talk/IASML/1/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Roni Rosenfeld (Carnegie Mellon University)
DTSTART:20200521T190000Z
DTEND:20200521T203000Z
DTSTAMP:20260404T094426Z
UID:IASML/2
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/IASML
 /2/">Forecasting epidemics and pandemics</a>\nby Roni Rosenfeld (Carnegie 
 Mellon University) as part of IAS Seminar Series on Theoretical Machine Le
 arning\n\n\nAbstract\nEpidemiological forecasting is critically needed for
  decision making by national and local governments\, public health officia
 ls\, healthcare institutions and the general public. The Delphi group at C
 arnegie Mellon University was founded in 2012 to advance the theory and te
 chnological capability of epidemiological forecasting\, and to promote its
  role in decision making\, both public and private. Our long term vision i
 s to make epidemiological forecasting as useful and universally accepted a
 s weather forecasting is today. I will describe some of the methods we dev
 eloped over the past eight years for forecasting flu\, dengue and other e
 pidemics\, and the challenges we faced in adapting these methods to the C
 OVID pandemic in the past few months.\n
LOCATION:https://stable.researchseminars.org/talk/IASML/2/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Aleksander Madry (MIT)
DTSTART:20200609T162000Z
DTEND:20200609T175000Z
DTSTAMP:20260404T094426Z
UID:IASML/4
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/IASML
 /4/">What do our models learn?</a>\nby Aleksander Madry (MIT) as part of I
 AS Seminar Series on Theoretical Machine Learning\n\n\nAbstract\nLarge-sca
 le vision benchmarks have driven---and often even defined---progress in ma
 chine learning. However\, these benchmarks are merely proxies for the real
 -world tasks we actually care about. How well do our benchmarks capture su
 ch tasks?\n\nIn this talk\, I will discuss the alignment between our bench
 mark-driven ML paradigm and the real-world use cases that motivate it. Fi
 rst\, we will explore examples of biases in the ImageNet dataset\, and how
  state-of-the-art models exploit them. We will then demonstrate how these 
 biases arise as a result of design choices in the data collection and cura
 tion processes.\n\nBased on joint works with Logan Engstrom\, Andrew Ilyas
 \, Shibani Santurkar\, Jacob Steinhardt\, Dimitris Tsipras and Kai Xiao.\n
LOCATION:https://stable.researchseminars.org/talk/IASML/4/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Michael I. Jordan (UC Berkeley)
DTSTART:20200611T190000Z
DTEND:20200611T203000Z
DTSTAMP:20260404T094426Z
UID:IASML/5
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/IASML
 /5/">On Langevin Dynamics in Machine Learning</a>\nby Michael I. Jordan (U
 C Berkeley) as part of IAS Seminar Series on Theoretical Machine Learning\
 n\n\nAbstract\nLangevin diffusions are continuous-time stochastic processe
 s that are based on the gradient of a potential function. As such they hav
 e many connections---some known and many still to be explored---to gradien
 t-based machine learning. I'll discuss several recent results in this vein
 : (1) the use of Langevin-based algorithms in bandit problems\; (2) the ac
 celeration of Langevin diffusions\; (3) how to use Langevin Monte Carlo wi
 thout making smoothness assumptions. I'll present these results in the con
 text of a general argument about the virtues of continuous-time perspectiv
 es in the analysis of discrete-time optimization and Monte Carlo algorithm
 s.\n
LOCATION:https://stable.researchseminars.org/talk/IASML/5/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Avrim Blum (Toyota Technological Institute at Chicago)
DTSTART:20200616T190000Z
DTEND:20200616T203000Z
DTSTAMP:20260404T094426Z
UID:IASML/6
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/IASML
 /6/">On learning in the presence of biased data and strategic behavior</a>
 \nby Avrim Blum (Toyota Technological Institute at Chicago) as part of IAS
  Seminar Series on Theoretical Machine Learning\n\n\nAbstract\nIn this tal
 k I will discuss two lines of work involving learning in the presence of b
 iased data and strategic behavior.  In the first\, we ask whether fairness
  constraints on learning algorithms can actually improve the accuracy of t
 he classifier produced\, when training data is unrepresentative or corrupt
 ed due to bias.  Typically\, fairness constraints are analyzed as a tradeo
 ff with classical objectives such as accuracy.  Our results here show ther
 e are natural scenarios where they can be a win-win\, helping to improve o
 verall accuracy.  In the second line of work we consider strategic classif
 ication: settings where the entities being measured and classified wish to
  be classified as positive (e.g.\, college admissions) and will try to mod
 ify their observable features if possible to make that happen.  We conside
 r this in the online setting where a particular challenge is that updates 
 made by the learning algorithm will change how the inputs behave as well.\
 n
LOCATION:https://stable.researchseminars.org/talk/IASML/6/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Csaba Szepesvári (University of Alberta)
DTSTART:20200618T190000Z
DTEND:20200618T203000Z
DTSTAMP:20260404T094426Z
UID:IASML/7
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/IASML
 /7/">The challenges of model-based reinforcement learning and how to overc
 ome them</a>\nby Csaba Szepesvári (University of Alberta) as part of IAS 
 Seminar Series on Theoretical Machine Learning\n\n\nAbstract\nSome believe
  that truly effective and efficient reinforcement learning algorithms must
  explicitly construct and explicitly reason with models that capture the c
 ausal structure of the world. In short\, model-based reinforcement learnin
 g is not optional. As this is not a new belief\, it may be surprising that
  empirically\, at least as far as the current state of the art is concern
 ed\, the majority of the top-performing algorithms are model-free. In this talk
 \, I will define three major challenges that need to be overcome for model
 -based methods to take their place above\, or before\, the model-free one
 s: (1) planning with large models\; (2) models are never well-specified\;
  (3) models need to focus on task-relevant aspects and ignore others. For each
  of the challenges\, I will describe recent results that address them and 
 I will also take a tally of the most interesting (and challenging) remaini
 ng open problems.\n
LOCATION:https://stable.researchseminars.org/talk/IASML/7/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Sanjeev Arora (Princeton University and IAS)
DTSTART:20200625T190000Z
DTEND:20200625T203000Z
DTSTAMP:20260404T094426Z
UID:IASML/8
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/IASML
 /8/">Instance-Hiding Schemes for Private Distributed Learning</a>\nby Sanj
 eev Arora (Princeton University and IAS) as part of IAS Seminar Series on 
 Theoretical Machine Learning\n\n\nAbstract\nAn important problem today is 
 how to allow multiple distributed entities to train a shared neural networ
 k on their private data while protecting data privacy. Federated learning 
 is a standard framework for distributed deep learning\, and one would lik
 e to assure full privacy in that framework. Proposed methods\, such as ho
 momorphic encryption and differential privacy\, come with drawbacks such
  as large computational overhead or a large drop in accu
 racy. This work introduces a new and simple encryption of training data\, 
 which hides the information in it and allows its use in the usual deep lea
 rning pipeline. The encryption is inspired by the classic notion of instan
 ce-hiding in cryptography. Experiments show that it allows training with
  a fairly small effect on final accuracy.\n\nWe also give some theoretical analysi
 s of privacy guarantees for this encryption\, showing that violating priva
 cy requires attackers to solve a difficult computational problem.\n\nJoint
  work with Yangsibo Huang\, Zhao Song\, and Kai Li. To appear at ICML 2020
 .\n
LOCATION:https://stable.researchseminars.org/talk/IASML/8/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Jennifer Listgarten (UC Berkeley)
DTSTART:20200707T163000Z
DTEND:20200707T174500Z
DTSTAMP:20260404T094426Z
UID:IASML/9
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/IASML
 /9/">Machine learning-based design (of proteins\, small molecules and beyo
 nd)</a>\nby Jennifer Listgarten (UC Berkeley) as part of IAS Seminar Serie
 s on Theoretical Machine Learning\n\n\nAbstract\nData-driven design is mak
 ing headway into a number of application areas\, including protein\, small
 -molecule\, and materials engineering. The design goal is to construct an 
 object with desired properties\, such as a protein that binds to a target 
 more tightly than previously observed. To that end\, costly experimental m
 easurements are being replaced with calls to a high-capacity regression mo
 del trained on labeled data\, which can be leveraged in an in silico searc
 h for promising design candidates. The aim then is to discover designs tha
 t are better than the best design in the observed data. This goal puts mac
 hine-learning based design in a much more difficult spot than traditional 
 applications of predictive modelling\, since successful design requires\, 
 by definition\, some degree of extrapolation---a pushing of the predictive
  models to their unknown limits\, in parts of the design space that are a pr
 iori unknown. In this talk\, I will anchor this overall problem in protein
  engineering\, and discuss our emerging approaches to tackle it.\n
LOCATION:https://stable.researchseminars.org/talk/IASML/9/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Anima Anandkumar (Caltech)
DTSTART:20200709T190000Z
DTEND:20200709T203000Z
DTSTAMP:20260404T094426Z
UID:IASML/10
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/IASML
 /10/">Role of Interaction in Competitive Optimization</a>\nby Anima Anandk
 umar (Caltech) as part of IAS Seminar Series on Theoretical Machine Learni
 ng\n\n\nAbstract\nCompetitive optimization is needed for many ML problems 
 such as training GANs\, robust reinforcement learning\, and adversarial le
 arning. Standard approaches to competitive optimization involve each agent
  independently optimizing their objective functions using SGD or other gra
 dient-based approaches. However\, they suffer from oscillations and instab
 ility\, since the optimization does not account for interaction among the 
 players. We introduce competitive gradient descent (CGD) that explicitly i
 ncorporates interaction by solving for Nash equilibrium of a local game. W
 e extend CGD to competitive mirror descent (CMD) for solving conically con
 strained competitive problems by using the dual geometry induced by a Breg
 man divergence.\n\nWe demonstrate the effectiveness of our approach for tr
 aining GANs and solving constrained reinforcement learning (RL) problems. 
 We also derive a competitive policy optimization method to train RL agents
  in competitive games. Finally\, we provide a novel perspective on trainin
 g GANs by pointing out the "GAN-dilemma"\, a fundamental flaw of the diverge
 nce-minimization perspective on GANs. Instead\, we argue that an implicit 
 competitive regularization due to simultaneous training methods\, such as 
 CGD\, is a crucial mechanism behind GAN performance.\n
LOCATION:https://stable.researchseminars.org/talk/IASML/10/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Max Welling (University of Amsterdam)
DTSTART:20200721T163000Z
DTEND:20200721T174500Z
DTSTAMP:20260404T094426Z
UID:IASML/11
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/IASML
 /11/">Graph Nets: The Next Generation</a>\nby Max Welling (University of A
 msterdam) as part of IAS Seminar Series on Theoretical Machine Learning\n\
 n\nAbstract\nIn this talk I will introduce our next generation of graph ne
 ural networks. GNNs have the property that they are invariant to permutati
 ons of the nodes in the graph and to rotations of the graph as a whole. We
  claim this is unnecessarily restrictive and in this talk we will explore 
 extensions of these GNNs to more flexible equivariant constructions. In pa
 rticular\, Natural Graph Networks for general graphs are globally equivari
 ant under permutations of the nodes but can still be executed through loca
 l message passing protocols. Our mesh-CNNs on manifolds are equivariant un
 der SO(2) gauge transformations and as such\, unlike regular GNNs\, entert
 ain non-isotropic kernels. And finally our SE(3)-transformers are local me
 ssage passing GNNs\, invariant to permutations but equivariant to global S
 E(3) transformations. These developments clearly emphasize the importance 
 of geometry and symmetries as design principles for graph (or other) neura
 l networks.\n\nJoint with: Pim de Haan and Taco Cohen (Natural Graph Netwo
 rks)\; Pim de Haan\, Maurice Weiler and Taco Cohen (Mesh-CNNs)\; Fabian Fu
 chs and Daniel Worrall (SE(3)-Transformers)\n
LOCATION:https://stable.researchseminars.org/talk/IASML/11/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Yoshua Bengio (Université de Montréal)
DTSTART:20200723T190000Z
DTEND:20200723T203000Z
DTSTAMP:20260404T094426Z
UID:IASML/12
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/IASML
 /12/">Priors for Semantic Variables</a>\nby Yoshua Bengio (Université de 
 Montréal) as part of IAS Seminar Series on Theoretical Machine Learning\n
 \n\nAbstract\nSome of the aspects of the world around us are captured in n
 atural language and refer to semantic high-level variables\, which often h
 ave a causal role (referring to agents\, objects\, and actions or intentio
 ns). These high-level variables also seem to satisfy very peculiar charact
 eristics which low-level data (like images or sounds) do not share\, and i
 t would be good to clarify these characteristics in the form of priors whi
 ch can guide the design of machine learning systems benefitting from these
  assumptions. Since these priors are not just about the joint distribution
  between the semantic variables (e.g. it has a sparse factor graph corresp
 onding to a modular decomposition of knowledge) but also about how the dis
 tribution changes (typically by causal interventions)\, this analysis may 
 also help to build machine learning systems which can generalize better ou
 t-of-distribution. Introducing such assumptions is necessary to even start
  having a theory about generalizing out-of-distribution. There are also fa
 scinating connections between these priors and what is hypothesized about 
 conscious processing in the brain\, with conscious processing allowing us 
 to reason (i.e.\, perform chains of inferences about the past and the futu
 re\, as well as credit assignment) at the level of these high-level variab
 les. This involves attention mechanisms and short-term memory to form a bo
 ttleneck of information being broadcast around the brain between different
  parts of it\, as we focus on different high-level variables and some of t
 heir interactions. The presentation summarizes a few recent results using 
 some of these ideas for discovering causal structure and modularizing recu
 rrent neural networks with attention mechanisms in order to obtain better 
 out-of-distribution generalization and move deep learning towards capturin
 g some of the functions associated with conscious processing over high-lev
 el semantic variables.\n
LOCATION:https://stable.researchseminars.org/talk/IASML/12/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Jeffrey Negrea (University of Toronto)
DTSTART:20200714T163000Z
DTEND:20200714T174500Z
DTSTAMP:20260404T094426Z
UID:IASML/13
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/IASML
 /13/">Relaxing the I.I.D. assumption: Adaptive mnimax optimal sequential p
 rediction with expert advice</a>\nby Jeffrey Negrea (University of Toronto
 ) as part of IAS Seminar Series on Theoretical Machine Learning\n\n\nAbstr
 act\nWe consider sequential prediction with expert advice when the data ar
 e generated stochastically\, but the distributions generating the data may
  vary arbitrarily among some constraint set. We quantify relaxations of th
 e classical I.I.D. assumption in terms of possible constraint sets\, with 
 I.I.D. at one extreme\, and an adversarial mechanism at the other. The Hed
 ge algorithm\, long known to be minimax optimal in the adversarial reg
 ime\, has recently been shown to also be minimax optimal in the I.I.D. set
 ting. We show that Hedge is sub-optimal between these extremes\, and prese
 nt a new algorithm that is adaptively minimax optimal with respect to our 
 relaxations of the I.I.D. assumption\, without knowledge of which setting 
 prevails.\n
LOCATION:https://stable.researchseminars.org/talk/IASML/13/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Arthur Gretton (University College London)
DTSTART:20200728T163000Z
DTEND:20200728T174500Z
DTSTAMP:20260404T094426Z
UID:IASML/14
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/IASML
 /14/">Generalized Energy-Based Models</a>\nby Arthur Gretton (University C
 ollege London) as part of IAS Seminar Series on Theoretical Machine Learni
 ng\n\n\nAbstract\nI will introduce Generalized Energy Based Models (GEBM) 
 for generative modelling. These models combine two trained components: a b
 ase distribution (generally an implicit model)\, which can learn the suppo
 rt of data with low intrinsic dimension in a high dimensional space\; and 
 an energy function\, to refine the probability mass on the learned support
 . Both the energy function and base jointly constitute the final model\, u
 nlike GANs\, which retain only the base distribution (the "generator"). In
  particular\, while the energy function is analogous to the GAN critic fun
 ction\, it is not discarded after training.\nGEBMs are trained by alternat
 ing between learning the energy and the base. Both training stages are wel
 l-defined: the energy is learned by maximising a generalized likelihood\, 
 and the resulting energy-based loss provides informative gradients for lea
 rning the base. Samples from the posterior on the latent space of the trai
 ned model can be obtained via MCMC\, thus finding regions in this space th
 at produce better quality samples. Empirically\, the GEBM samples on image
 -generation tasks are of much better quality than those from the learned g
 enerator alone\, indicating that\, all else being equal\, the GEBM will outp
 erform a GAN of the same complexity. GEBMs also return state-of-the-art pe
 rformance on density modelling tasks\, when using base measures with an e
 xplicit form.\n
LOCATION:https://stable.researchseminars.org/talk/IASML/14/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Peter Stone (University of Texas at Austin)
DTSTART:20200730T190000Z
DTEND:20200730T203000Z
DTSTAMP:20260404T094426Z
UID:IASML/15
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/IASML
 /15/">Efficient Robot Skill Learning via Grounded Simulation Learning\, Im
 itation Learning from Observation\, and Off-Policy Reinforcement Learning<
 /a>\nby Peter Stone (University of Texas at Austin) as part of IAS Seminar
  Series on Theoretical Machine Learning\n\n\nAbstract\nFor autonomous robo
 ts to operate in the open\, dynamically changing world\, they will need to
  be able to learn a robust set of skills from relatively little experience
 . This talk begins by introducing Grounded Simulation Learning as a way to
  bridge the so-called reality gap between simulators and the real world in
  order to enable transfer learning from simulation to a real robot. It the
 n introduces two new algorithms for imitation learning from observation th
 at enable a robot to mimic demonstrated skills from state-only trajectorie
 s\, without any knowledge of the actions selected by the demonstrator. Con
 nections to theoretical advances in off-policy reinforcement learning will
  be highlighted throughout.\n\nGrounded Simulation Learning has led to the
  fastest known stable walk on a widely used humanoid robot\, and imitation
  learning from observation opens the possibility of robots learning from t
 he vast trove of videos available online.\n
LOCATION:https://stable.researchseminars.org/talk/IASML/15/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Aapo Hyvärinen (University of Helsinki)
DTSTART:20200804T163000Z
DTEND:20200804T174500Z
DTSTAMP:20260404T094426Z
UID:IASML/16
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/IASML
 /16/">Nonlinear independent component analysis</a>\nby Aapo Hyvärinen (Un
 iversity of Helsinki) as part of IAS Seminar Series on Theoretical Machine
  Learning\n\n\nAbstract\nUnsupervised learning\, in particular learning ge
 neral nonlinear representations\, is one of the deepest problems in machin
 e learning. Estimating latent quantities in a generative model provides a 
 principled framework\, and has been successfully used in the linear case\,
  e.g. with independent component analysis (ICA) and sparse coding. However
 \, extending ICA to the nonlinear case has proven to be extremely difficul
 t: A straightforward extension is unidentifiable\, i.e. it is not possibl
 e to recover those latent components that actually generated the data. Her
 e\, we show that this problem can be solved by using additional informatio
 n either in the form of temporal structure or an additional observed varia
 ble. We start by formulating two generative models in which the data is an
  arbitrary but invertible nonlinear transformation of time series (compone
 nts) which are statistically independent of each other. Drawing from the t
 heory of linear ICA\, we formulate two distinct classes of temporal struct
 ure of the components which enable identification\, i.e. recovery of the o
 riginal independent components. We further generalize the framework to the
  case where instead of temporal structure\, an additional "auxiliary" vari
 able is observed and used by means of conditioning (e.g. audio in addition
  to video). Our methods are closely related to "self-supervised" methods h
 euristically proposed in computer vision\, and also provide a theoretical 
 foundation for such methods in terms of estimating a latent-variable model
 . Likewise\, we show how variants of deep latent-variable models such as V
 AEs can be seen as nonlinear ICA\, and made identifiable by suitable cond
 itioning.\n
LOCATION:https://stable.researchseminars.org/talk/IASML/16/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Eric Xing (Carnegie Mellon University)
DTSTART:20200806T190000Z
DTEND:20200806T203000Z
DTSTAMP:20260404T094426Z
UID:IASML/17
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/IASML
 /17/">A Blueprint of Standardized and Composable Machine Learning</a>\nby 
 Eric Xing (Carnegie Mellon University) as part of IAS Seminar Series on Th
 eoretical Machine Learning\n\n\nAbstract\nIn handling a wide range of exp
 eriences\, from data instances\, knowledge\, constraints\, and rewards to
  adversaries and lifelong interplay in an ever-growing spectrum of tasks
 \, contemporary ML/AI research has produced thousands of models\, learnin
 g paradigms\, and optimization algorithms\, not to mention countless appr
 oximation heuristics\, tuning tricks\, and black-box oracles\, plus combi
 nations of all of the above. While pushing the field forward rapidly\, th
 ese results also make a comprehensive grasp of existing ML techniques mor
 e and more difficult\, and make standardized\, reusable\, repeatable\, re
 liable\, and explainable practice and further development of ML/AI produ
 cts quite costly\, if possible at all. In this talk\, we present a simple
  and systematic blueprint of ML\, from the aspects of losses\, optimizati
 on solvers\, and model architectures\, that provides a unified mathematic
 al formulation for learning with all experiences and tasks. The blueprint
  offers a holistic understanding of diverse ML algorithms\, guidance for
  operationalizing ML to create problem solutions in a composable and mec
 hanical manner\, and a unified framework for theoretical analysis.\n
LOCATION:https://stable.researchseminars.org/talk/IASML/17/
END:VEVENT
BEGIN:VEVENT
SUMMARY:John Shawe-Taylor (University College London)
DTSTART:20200811T163000Z
DTEND:20200811T174500Z
DTSTAMP:20260404T094426Z
UID:IASML/18
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/IASML
 /18/">Statistical Learning Theory for Modern Machine Learning</a>\nby John
  Shawe-Taylor (University College London) as part of IAS Seminar Series on
  Theoretical Machine Learning\n\n\nAbstract\nProbably Approximately Correc
 t (PAC) learning has attempted to analyse the generalisation of learning s
 ystems within the statistical learning framework. It has been referred to 
 as a ‘worst case’ analysis\, but the tools have been extended to analy
 se cases where benign distributions mean we can still generalise even if w
 orst case bounds suggest we cannot. The talk will cover the PAC-Bayes appr
 oach to analysing generalisation that is inspired by Bayesian inference\, 
 but leads to a different role for the prior and posterior distributions. W
 e will discuss its application to Support Vector Machines and Deep Neural 
 Networks\, including the use of distribution-defined priors.\n
LOCATION:https://stable.researchseminars.org/talk/IASML/18/
END:VEVENT
BEGIN:VEVENT
SUMMARY:John Langford (Microsoft Research)
DTSTART:20200813T190000Z
DTEND:20200813T203000Z
DTSTAMP:20260404T094426Z
UID:IASML/19
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/IASML
 /19/">Latent State Discovery in Reinforcement Learning</a>\nby John Langfo
 rd (Microsoft Research) as part of IAS Seminar Series on Theoretical Machi
 ne Learning\n\n\nAbstract\nThere are three core orthogonal problems in re
 inforcement learning: (1) crediting actions\; (2) generalizing across ric
 h observations\; (3) exploring to discover the information necessary for
  learning. Good solutions to pairs of these problems are fairly well know
 n at this point\, but solutions for all three are just now being discover
 ed. I’ll discuss several such results and dive into details on a few of
  them.\n
LOCATION:https://stable.researchseminars.org/talk/IASML/19/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Li Deng (Citadel)
DTSTART:20200818T163000Z
DTEND:20200818T174500Z
DTSTAMP:20260404T094426Z
UID:IASML/20
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/IASML
 /20/">From Speech AI to Finance AI and Back</a>\nby Li Deng (Citadel) as p
 art of IAS Seminar Series on Theoretical Machine Learning\n\n\nAbstract\nA
  brief review will be provided first on how deep learning has disrupted
  the speech recognition and language processing industries since 2009. Then conne
 ctions will be drawn between the techniques (deep learning or otherwise) f
 or modeling speech and language and those for financial markets. Similarit
 ies and differences of these two fields will be explored. In particular\, 
 three unique technical challenges to financial investment are addressed: e
 xtremely low signal-to-noise ratio\, extremely strong nonstationarity (wi
 th an adversarial nature)\, and heterogeneous big data. Finally\, how the pote
 ntial solutions to these challenges can come back to benefit and further a
 dvance speech recognition and language processing technology will be discu
 ssed.\n
LOCATION:https://stable.researchseminars.org/talk/IASML/20/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Jason Eisner (Johns Hopkins University)
DTSTART:20200820T190000Z
DTEND:20200820T203000Z
DTSTAMP:20260404T094426Z
UID:IASML/21
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/IASML
 /21/">Event Sequence Modeling with the Neural Hawkes Process</a>\nby Jason
  Eisner (Johns Hopkins University) as part of IAS Seminar Series on Theore
 tical Machine Learning\n\n\nAbstract\nSuppose you are monitoring discrete 
 events in real time.  Can you predict what events will happen in the futur
 e\, and when?  Can you fill in past events that you may have missed?  A pr
 obability model that supports such reasoning is the neural Hawkes process 
 (NHP)\, in which the Poisson intensities of K event types at time t depend
  on the history of past events.  This autoregressive architecture can capt
 ure complex dependencies.  It resembles an LSTM language model over K word
  types\, but allows the LSTM state to evolve in continuous time.  \n\nThis
  talk will present the NHP model along with methods for estimating paramet
 ers (MLE and NCE)\, sampling predictions of the future (thinning)\, and im
 puting missing events (particle smoothing).  I'll then show how to scale t
 he NHP or the LSTM language model to large K\, beginning with a temporal d
 eductive database for a real-world domain\, which can track how possible e
 vent types and other facts change over time.  We take the system state to 
 be a collection of vector-space embeddings of these facts\, and derive a d
 eep recurrent architecture from the temporal Datalog program that specifie
 s the database.  We call this method "neural Datalog through time."\n\nThi
 s work was done with Hongyuan Mei and other collaborators including Guangh
 ui Qin\, Minjie Xu\, and Tom Wan.\n
LOCATION:https://stable.researchseminars.org/talk/IASML/21/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Piotr Indyk (Massachusetts Institute of Technology)
DTSTART:20200825T163000Z
DTEND:20200825T174500Z
DTSTAMP:20260404T094426Z
UID:IASML/22
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/IASML
 /22/">Learning-Based Sketching Algorithms</a>\nby Piotr Indyk (Massachuset
 ts Institute of Technology) as part of IAS Seminar Series on Theoretical M
 achine Learning\n\n\nAbstract\nClassical algorithms typically provide "one
  size fits all" performance\, and do not leverage properties or patterns i
 n their inputs. A recent line of work aims to address this issue by develo
 ping algorithms that use machine learning predictions to improve their per
 formance. In this talk I will present two examples of this type\, in the c
 ontext of streaming and sketching algorithms. In particular\, I will show 
 how to use machine learning predictions to improve the performance of (a) 
 low-memory streaming algorithms for frequency estimation\, and (b) generat
 ing space partitions for nearest neighbor search.\n\nThe talk will cover m
 aterial from papers co-authored with Y Dong\, CY Hsu\, D Katabi\, I Razens
 hteyn\, T Wagner and A Vakilian.\n
LOCATION:https://stable.researchseminars.org/talk/IASML/22/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Inderjit Dhillon (University of Texas at Austin)
DTSTART:20200827T190000Z
DTEND:20200827T203000Z
DTSTAMP:20260404T094426Z
UID:IASML/23
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/IASML
 /23/">Multi-Output Prediction: Theory and Practice</a>\nby Inderjit Dhillo
 n (University of Texas at Austin) as part of IAS Seminar Series on Theoret
 ical Machine Learning\n\n\nAbstract\nMany challenging problems in modern a
 pplications amount to finding relevant results from an enormous output spa
 ce of potential candidates\, for example\, finding the best matching produ
 ct from a large catalog or suggesting related search phrases on a search e
 ngine. The size of the output space for these problems can be in the milli
 ons to billions. Moreover\, observational or training data is often limite
 d for many of the so-called “long-tail” items in the output space. Gi
 ven the inherent paucity of training data for most of the items in the o
 utput space\, developing machine learned models that perform well for spac
 es of this size is challenging. Fortunately\, items in the output space ar
 e often correlated thereby presenting an opportunity to alleviate the data
  sparsity issue. In this talk\, I will first discuss the challenges in mod
 ern multi-output prediction\, including missing values\, features associat
 ed with outputs\, absence of explicit negative examples\, and the need to 
 scale up to enormous data sets. Bilinear methods\, such as Inductive Matri
 x Completion (IMC)\, enable us to handle missing values and output feature
 s in practice\, while coming with theoretical guarantees. Nonlinear method
 s such as nonlinear IMC and DSSM (Deep Semantic Similarity Model) enable m
 ore powerful models that are used in practice in real-life applications. H
 owever\, inference in these models scales linearly with the size of the ou
 tput space. In order to scale up\, I will present the Prediction for Enorm
 ous and Correlated Output Spaces (PECOS) framework\, which performs predict
 ion in three phases: (i) in the first phase\, the output space is organize
 d using a semantic indexing scheme\, (ii) in the second phase\, the indexi
 ng is used to narrow down the output space by orders of magnitude using a 
 machine learned matching scheme\, and (iii) in the third phase\, the match
 ed items are ranked by a final ranking scheme. The versatility and modular
 ity of PECOS allow for easy plug-and-play of various choices for the inde
 xing\, matching\, and ranking phases\, and it is possible to ensemble vari
 ous models\, each arising from a particular choice for the three phases.\n
LOCATION:https://stable.researchseminars.org/talk/IASML/23/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Soheil Feizi (University of Maryland College Park)
DTSTART:20200623T163000Z
DTEND:20200623T174500Z
DTSTAMP:20260404T094426Z
UID:IASML/24
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/IASML
 /24/">Generalizable Adversarial Robustness to Unforeseen Attacks</a>\nby S
 oheil Feizi (University of Maryland College Park) as part of IAS Seminar S
 eries on Theoretical Machine Learning\n\n\nAbstract\nIn the last couple of
  years\, a lot of progress has been made to enhance the robustness of models a
 gainst adversarial attacks. However\, two major shortcomings still remain:
  (i) practical defenses are often vulnerable against strong “adaptive”
  attack algorithms\, and (ii) current defenses have poor generalization to
  “unforeseen” attack threat models (the ones not used in training).\n\
 nIn this talk\, I will present our recent results to tackle these issues. 
 I will first discuss generalizability of a class of provable defenses base
 d on randomized smoothing to various Lp and non-Lp attack models. Then\, I
  will present adversarial attacks and defenses for a novel “perceptual
 ” adversarial threat model. Remarkably\, the defense against the perceptu
 al threat model generalizes well against many types of unforeseen Lp and non-
 Lp adversarial attacks.\n\nThis talk is based on joint works with Alex Lev
 ine\, Sahil Singla\, Cassidy Laidlaw\, Aounon Kumar and Tom Goldstein.\n
LOCATION:https://stable.researchseminars.org/talk/IASML/24/
END:VEVENT
END:VCALENDAR
