Mathematics and Deep Learning (MDL)
MDL Collective Spring 2024 @ISU
The MDL collective meets weekly in Carver 305 from 1-4pm for a combination of seminars and research discussions, raising issues and exchanging ideas on topics of current interest in deep learning and relevant mathematical advances.
- 3/20/2024, 2:00-2:50pm Zoom link: https://iastate.zoom.us/j/96348125718 and password: 223039
Title: Deep JKO: time-implicit particle methods for general nonlinear gradient flows
Wonjun Lee, University of Minnesota
Abstract: We develop novel neural network-based implicit particle methods to compute high-dimensional Wasserstein-type gradient flows with linear and nonlinear mobility functions. The main idea is to use the Lagrangian formulation in the Jordan--Kinderlehrer--Otto (JKO) framework, where the velocity field is approximated using a neural network. We leverage the formulations from the neural ordinary differential equation (neural ODE) in the context of continuous normalizing flow for efficient density computation. Additionally, we make use of an explicit recurrence relation for computing derivatives, which greatly streamlines the backpropagation process. Our methodology demonstrates versatility in handling a wide range of gradient flows, accommodating various potential functions and nonlinear mobility scenarios. Extensive experiments demonstrate the efficacy of our approach, including an illustrative example from Bayesian inverse problems. This underscores that our scheme provides a viable alternative solver for the Kalman-Wasserstein gradient flow.
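For readers new to the JKO framework (standard background, not part of the speaker's results): a Wasserstein gradient flow \(\partial_t\rho = \nabla\cdot\bigl(\rho\,\nabla \tfrac{\delta E}{\delta\rho}\bigr)\) is advanced by the time-implicit minimizing-movement step
\[
\rho^{k+1} \in \operatorname*{arg\,min}_{\rho}\; \frac{1}{2\tau}\, W_2^2(\rho,\rho^k) + E(\rho),
\]
where \(\tau\) is the time step and \(W_2\) the 2-Wasserstein distance; the Lagrangian formulation in the talk parameterizes the map transporting \(\rho^k\) to \(\rho^{k+1}\) through a neural-network velocity field.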
- 3/18/2024, 2:15--3:05pm [CAM Seminar] Zoom link: https://iastate.zoom.us/join and enter meeting ID: 924 6308 2326 and password: 123456
Title: Diffusion Models: Theory and Applications (in PDEs)
Yulong Lu, University of Minnesota
Abstract: Diffusion models, particularly score-based generative models (SGMs), have emerged as powerful tools in diverse machine learning applications, spanning from computer vision to modern language processing. In the first part of this talk, we delve into the generalization theory of SGMs, exploring their capacity for learning high-dimensional distributions. Our analysis establishes a groundbreaking result: SGMs achieve a dimension-free generation error bound when applied to a class of sub-Gaussian distributions characterized by low-complexity structures. This theoretical underpinning sheds light on the robust capabilities of SGMs in learning and sampling complex distributions.
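As standard background on score-based generative models (generic formulation, not the speaker's specific setting): data are perturbed by a forward diffusion such as the Ornstein-Uhlenbeck process, and samples are generated by simulating its time reversal with a learned score \(\nabla\log p_t\),
\[
dX_t = -X_t\,dt + \sqrt{2}\,dW_t, \qquad
dY_s = \bigl(Y_s + 2\nabla\log p_{T-s}(Y_s)\bigr)\,ds + \sqrt{2}\,dB_s,
\]
where \(p_t\) is the law of \(X_t\); the generation error studied in the talk quantifies how close the law of the samples produced with an estimated score is to the data distribution.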
List of topics for Spring 2024:
01/24 Monte Carlo methods and sampling techniques
01/31 Rademacher complexity and generalization errors
02/07 Universal approximation (classical work by Cybenko (1989))
02/14 Approximation bounds I: Barron's 1993 paper
02/21 Approximation bounds II: on the Barron space and related properties
02/28 Approximation bounds III: on network approximations for Sobolev functions
03/06 Generative models: Variational Auto-Encoder
03/13 Spring break
03/20 MDL talk by Wonjun Lee on Deep JKO.
03/27 Diffusion models
04/03 Optimization for deep learning
04/10
04/17
04/24
05/01
05/08
***********************************************************************************************************************************************
Mathematics and Deep Learning (MDL)
MDL Collective Spring 2022 @ISU
In Spring 2022, the MDL seminar is joint with the CAM seminar, held on Mondays 4:10-5:00pm. MDL talks are highlighted here:
- February 21, 4:10-5pm Zoom link: https://iastate.zoom.us/j/92463082326
Title: Parametric Fokker-Planck equation and its Wasserstein error estimates
Haomin Zhou, Georgia Institute of Technology
Abstract: In this presentation, I will introduce a system of ODEs that are constructed by using generative neural networks to spatially approximate the Fokker-Planck equation. We call this system Parametric Fokker-Planck equation. We design a semi-implicit time discretized scheme to compute its solution by only using random samples in space. The resulting algorithm allows us to approximate the solution of Fokker-Planck equation in high dimensions. Furthermore, we provide error bounds, in terms of Wasserstein metric, for the semi-discrete and fully discrete approximations. This presentation is based on a recent joint work with Wuchen Li (South Carolina), Shu Liu (Math GT) and Hongyuan Zha (CUHK-Shenzhen).
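For reference (standard notation, possibly different from the speaker's): the Fokker-Planck equation
\[
\partial_t \rho = \nabla\cdot(\rho\,\nabla V) + \beta\,\Delta\rho
\]
is the Wasserstein gradient flow of the free energy \(E(\rho)=\int V\rho\,dx + \beta\int \rho\log\rho\,dx\); the parametric version discussed in the talk constrains \(\rho\) to pushforwards of a reference distribution through a generative network, which turns the PDE into an ODE system for the network parameters.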
- February 28, 4:10-5pm Zoom link: https://iastate.zoom.us/j/92463082326
Title: Efficient natural gradient method for large-scale optimization problems
Levon Nurbekyan, UCLA
Abstract: We propose an efficient numerical method for computing natural gradient descent directions with respect to a generic metric in the state space. Our technique relies on representing the natural gradient direction as a solution to a standard least-squares problem. Hence, instead of calculating, storing, or inverting the information matrix directly, we apply efficient methods from numerical linear algebra to solve this least-squares problem. We treat both scenarios where the derivative of the state variable with respect to the parameter is either explicitly known or implicitly given through constraints. We apply the QR decomposition to solve the least-squares problem in the former case and utilize the adjoint-state method to compute the natural gradient descent direction in the latter case.
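A minimal numerical sketch of the least-squares viewpoint in the abstract (illustrative only; the Gauss-Newton-type metric, the small regularizer, and the toy objective are assumptions, not the speaker's formulation): when the metric is induced by a Jacobian J of the state map, the natural gradient direction d solves (J^T J) d = grad f, which can be computed from a QR factorization of J without forming or inverting the information matrix.

import numpy as np

def natural_gradient_direction(J, grad_f, reg=1e-8):
    """Illustrative sketch (not the speaker's implementation).
    Solve (J^T J + reg*I) d = grad_f via QR, avoiding explicit formation
    of the information matrix J^T J."""
    n = J.shape[1]
    # Augment J so the normal equations include the small regularizer reg*I.
    J_aug = np.vstack([J, np.sqrt(reg) * np.eye(n)])
    Q, R = np.linalg.qr(J_aug)              # J_aug = Q R, R upper triangular
    # (J^T J + reg*I) d = grad_f  <=>  R^T R d = grad_f
    y = np.linalg.solve(R.T, grad_f)
    return np.linalg.solve(R, y)

# Toy usage: least-squares objective f(x) = 0.5*||A x - b||^2
rng = np.random.default_rng(0)
A, b = rng.normal(size=(20, 5)), rng.normal(size=20)
x = np.zeros(5)
for _ in range(50):
    r = A @ x - b
    d = natural_gradient_direction(A, A.T @ r)   # Gauss-Newton-type step
    x -= d
print("residual norm:", np.linalg.norm(A @ x - b))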
Mathematics and Deep Learning (MDL)
MDL Collective Fall 2021 @ISU
Meeting times: Friday(s), 4:00pm--5:15pm
Talk: 4:10--5:00; Q&A: 5:00--5:15
Zoom link: https://iastate.zoom.us/j/7185113738
Email list: mdl20@iastate.edu
Contact: Hailiang Liu at hliu@iastate.edu
"MDL collective was awarded as a new faculty learning community by the Office of the Vice President for Research (OVPR) and the Center for Excellence in the Arts and Humanities (CEAH), Iowa State University -- Aug. 17, 2020"
The MDL (Mathematics and Deep Learning) collective meets weekly/biweekly for a combination of seminars, journal presentations, and research discussions, raising issues and exchanging ideas on topics of current interest in the area of deep learning and relevant mathematical advances. The format consists of a lead presentation of about 40 minutes, followed by questions and in-depth discussions.
Note: The MDL collective will be partially joint with the TrAC seminar series being planned via the "Translational AI Center For Research and Education (TrAC)"-- Sept. 2021.
October 1, Friday Zoom 4-5pm
"A neural network approach for high-dimensional real-time optimal control" slides
Levon Nurbekyan
UCLA
Abstract: Due to their fast evaluation at deployment, neural networks (NNs) are attractive for real-time applications. I will present one possible approach for training NNs to synthesize real-time controls. A key aspect of our method is the combination of the following features:
1. No data generation and fitting
2. Direct optimization of trajectories
3. The correct structural ansatz for approximating the optimal control
With these techniques, we can solve problems in hundreds of dimensions. We also find some unexpected generalization properties. The talk is based on two recent papers https://arxiv.org/abs/2104.03270 and https://arxiv.org/abs/2011.04757.
Brief bio: I am currently an Assistant Adjunct Professor at the Department of Mathematics at UCLA. I obtained my Ph.D. in the framework of UT Austin -- Portugal CoLab under the supervision of Professors Diogo Gomes and Alessio Figalli. My research interests include calculus of variations, optimal control theory, mean-field games, partial differential equations, mathematics applied to machine learning, dynamical systems, and shape optimization problems. I have been a Senior Fellow at the Institute for Pure and Applied Mathematics (IPAM) at UCLA for its Spring 2020 Program on High Dimensional Hamilton-Jacobi PDEs and a Simons CRM Scholar at the Centre de Recherches Mathématiques (CRM) at the University of Montreal for its Spring 2019 Program on Data Assimilation: Theory, Algorithms, and Applications.
*****************************************************************************************************************************************************************************************
Mathematics and Deep Learning (MDL)
MDL Collective Spring 2021 @ISU [Spring semester: Jan 25--May 06, 2021]
Meeting times: Friday(s), 4:00pm--5:15pm
Talk: 4:10--5:00; Q&A: 5:00--5:15
Zoom link: https://iastate.zoom.us/j/7185113738
Email list: mdl20@iastate.edu
Contact: Hailiang Liu at hliu@iastate.edu
ISU MDL_Collective chat room: https://isumdl20.slack.com/
The MDL (Mathematics and Deep Learning) collective meets weekly/biweekly for a combination of seminars, journal presentations, and research discussions, raising issues and exchanging ideas on topics of current interest in the area of deep learning and relevant mathematical advances. The format consists of a lead presentation of about 40 minutes, followed by questions and in-depth discussions.
**********************************************************************************************************************************************
Research topics
* Deep approximation theory
* Deep approximation of functionals, operators
* Deep learning for mean-field games
* Deep learning for inverse problems
* Deep learning and computations of high-dimensional PDEs
**********************************************************************************************************************************************
Feb. 18, Thursday 7:00-8:00pm
"Deep Approximation via Deep Learning" slides records
Zuowei Shen
National University of Singapore
Abstract: The primary task of many applications is approximating/estimating a function through samples drawn from a probability distribution on the input space. Deep approximation approximates a function by compositions of many layers of simple functions, which can be viewed as a series of nested feature extractors. The key idea of a deep learning network is to convert the layers of compositions into layers of tunable parameters that can be adjusted through a learning process, so that a good approximation is achieved with respect to the input data. In this talk, we shall discuss the mathematical theory behind this approach and the approximation rates of deep networks. We will also show how this approach differs from classic approximation theory, and how this new theory can be used to understand and design deep learning networks.
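In the standard notation behind these statements (not specific to the talk), a depth-\(L\) network approximates a target \(f\) by a composition of simple parameterized maps,
\[
f(x) \approx \phi_L\circ\phi_{L-1}\circ\cdots\circ\phi_1(x), \qquad \phi_\ell(z) = \sigma(W_\ell z + b_\ell),
\]
and learning adjusts the tunable parameters \((W_\ell, b_\ell)\) so that the composition fits the sampled data.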
March 05. 4:00-5:15pm on Zoom
"Deep Learning and Neural Networks: The Mathematical View" slides records
Ronald DeVore
Texas A & M University
Abstract: Deep Learning is much publicized and has had great empirical success on challenging problems in learning. Yet there is no quantifiable proof of performance and certified guarantees for these methods. This talk will give an overview of Deep Learning from the viewpoint of mathematics and numerical computation.
Bio Info: The Walter E. Koss Professor and Distinguished Professor of Mathematics
see Homepage (tamu.edu)
March 12 -- George Karniadakis, Brown University records
Title: Approximating functions, functionals, and operators using deep neural networks for diverse applications
Abstract: We will present a new approach to develop a data-driven, learning-based framework for predicting outcomes of physical and biophysical systems and for discovering hidden physics from noisy data. We will introduce a deep learning approach based on neural networks (NNs) and generative adversarial networks (GANs). Unlike other approaches that rely on big data, here we “learn” from small data by exploiting the information provided by the physical conservation laws, which are used to obtain informative priors or regularize the neural networks. We will demonstrate the power of PINNs for several inverse problems, and we will demonstrate how we can use multi-fidelity modeling in monitoring ocean acidification levels in the Massachusetts Bay. We will also introduce new NNs that learn functionals and nonlinear operators from functions and corresponding responses for system identification. The universal approximation theorem of operators is suggestive of the potential of NNs in learning from scattered data any continuous operator or complex system. We first generalize the theorem to deep neural networks, and subsequently we apply it to design a new composite NN with small generalization error, the deep operator network (DeepONet), consisting of a NN for encoding the discrete input function space (branch net) and another NN for encoding the domain of the output functions (trunk net). We demonstrate that DeepONet can learn various explicit operators, e.g., integrals, Laplace transforms and fractional Laplacians, as well as implicit operators that represent deterministic and stochastic differential equations. More generally, DeepOnet can learn multiscale operators spanning across many scales and trained by diverse sources of data simultaneously.
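A minimal sketch of the branch/trunk structure described in the abstract (a plain fully connected realization with tanh activations is assumed; this is not the authors' reference implementation and omits training):

import numpy as np

def mlp(sizes, rng):
    """Small fully connected network with tanh activations on hidden layers."""
    params = [(rng.normal(scale=1.0/np.sqrt(m), size=(m, n)), np.zeros(n))
              for m, n in zip(sizes[:-1], sizes[1:])]
    def forward(x):
        for i, (W, b) in enumerate(params):
            x = x @ W + b
            if i < len(params) - 1:
                x = np.tanh(x)
        return x
    return forward

# Illustrative DeepONet-style surrogate (untrained).
rng = np.random.default_rng(0)
m, p = 50, 32                   # number of input-function sensors, latent width
branch = mlp([m, 64, p], rng)   # encodes the sampled input function u(x_1..x_m)
trunk = mlp([1, 64, p], rng)    # encodes the query location y in the output domain

def deeponet(u_sensors, y):
    """Operator surrogate G(u)(y) ~ <branch(u), trunk(y)>."""
    return float(branch(u_sensors) @ trunk(y))

# Example: evaluate the surrogate for one input function and one query point.
xs = np.linspace(0.0, 1.0, m)
u = np.sin(2 * np.pi * xs)
print(deeponet(u, np.array([0.3])))

In practice the branch and trunk nets are trained jointly on pairs of input functions and samples of the corresponding output functions.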
Bio: Charles Pitts Robinson and John Palmer Barstow Professor of Applied Mathematics Homepage(brown.edu)
George Karniadakis is from Crete. He received his S.M. and Ph.D. from Massachusetts Institute of Technology (1984/87). He was appointed Lecturer in the Department of Mechanical Engineering at MIT and subsequently he joined the Center for Turbulence Research at Stanford / Nasa Ames. He joined Princeton University as Assistant Professor in the Department of Mechanical and Aerospace Engineering and as Associate Faculty in the Program of Applied and Computational Mathematics. He was a Visiting Professor at Caltech in 1993 in the Aeronautics Department and joined Brown University as Associate Professor of Applied Mathematics in the Center for Fluid Mechanics in 1994. After becoming a full professor in 1996, he continued to be a Visiting Professor and Senior Lecturer of Ocean/Mechanical Engineering at MIT. He is an AAAS Fellow (2018-), Fellow of the Society for Industrial and Applied Mathematics (SIAM, 2010-), Fellow of the American Physical Society (APS, 2004-), Fellow of the American Society of Mechanical Engineers (ASME, 2003-) and Associate Fellow of the American Institute of Aeronautics and Astronautics (AIAA, 2006-). He received the SIAM/ACM Prize on Computational Science & Engineering (2021), the Alexander von Humboldt award in 2017, the SIAM Ralf E Kleinman award (2015), the J. Tinsley Oden Medal (2013), and the CFD award (2007) by the US Association in Computational Mechanics. His h-index is 108 and he has been cited over 55,000 times.
March 19 -- Alex Lin, UCLA slides
Title: Alternating the Population and Control Neural Networks to Solve High-Dimensional Stochastic Mean-Field Games
Abstract: We present APAC-Net, an alternating population and agent control neural network for solving stochastic mean field games (MFGs). Our algorithm is geared toward high-dimensional instances of MFGs that are beyond reach with existing solution methods. We achieve this in two steps. First, we take advantage of the underlying variational primal-dual structure that MFGs exhibit and phrase it as a convex-concave saddle point problem. Second, we parameterize the value and density functions by two neural networks, respectively. By phrasing the problem in this manner, solving the MFG can be interpreted as a special case of training a generative adversarial network (GAN). We show the potential of our method on up to 100-dimensional MFG problems.
April 02 -- Xiongtao Dai, [ISU, Stat] slides records
Title: Exploratory Data Analysis for Data Objects on a Metric Space via Tukey's Depth
Abstract: Exploratory data analysis involves looking at the data and understanding what can be done with them. Non-standard data objects such as directions, covariance matrices, trees, functions, and images have become increasingly common in modern practice. Such complex data objects are hard to examine due to the lack of a natural ordering and efficient visualization tools. We develop a novel exploratory tool for data objects lying on a metric space based on data depth, extending the celebrated Tukey's depth for Euclidean data. The proposed metric halfspace depth assigns depth values to data points, characterizing the centrality and outlyingness of these points. This also leads to an interpretable center-outward ranking, which can be used to construct rank tests. I will demonstrate two applications, one to reveal differential brain connectivity patterns in an Alzheimer's disease study, and another to infer the phylogenetic history and outlying phylogenies in 7 pathogenic parasites.
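For reference, the classical Tukey (halfspace) depth of a point \(x\in\mathbb{R}^d\) with respect to a distribution \(P\) is
\[
D(x;P) = \inf\{\,P(H) : H \text{ a closed halfspace containing } x\,\},
\]
so central points have large depth and outlying points have depth near zero; the metric halfspace depth of the talk extends this notion to data objects lying in a general metric space.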
April 16 -- Lexing Ying, Stanford University slides records
Title: Solving Inverse Problems with Deep Learning
Abstract: This talk is about some recent progress on solving inverse problems using deep learning. Compared to traditional machine learning problems, inverse problems are often limited by the size of the training data set. We show how to overcome this issue by incorporating mathematical analysis and physics into the design of neural network architectures.
April 30 -- Siddhartha Mishra, ETH Zurich, Switzerland
Title: Deep Learning and Computations of high-dimensional PDEs
Abstract: Partial Differential Equations (PDEs) with very high-dimensional state and/or parameter spaces arise in a wide variety of contexts ranging from computational chemistry and finance to many-query problems in various areas of science and engineering. In this talk, we will survey recent results on the use of deep neural networks in computing these high-dimensional PDEs. We will focus on two different aspects: the use of supervised deep learning, in the form of both standard deep neural networks and the recently proposed DeepONets, for efficient approximation of many-query PDEs; and the use of physics-informed neural networks (PINNs) for the computation of forward and inverse problems for PDEs with high-dimensional state spaces.
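As generic background on the PINN approach mentioned in the abstract (standard formulation, not the speaker's particular setup): for a PDE \(\mathcal{N}[u]=0\) on \(\Omega\) with boundary/initial data \(u=g\) on \(\partial\Omega\), a network \(u_\theta\) is trained to minimize
\[
\mathcal{L}(\theta) = \frac{1}{N_r}\sum_{i=1}^{N_r}\bigl|\mathcal{N}[u_\theta](x_i)\bigr|^2
+ \frac{\lambda}{N_b}\sum_{j=1}^{N_b}\bigl|u_\theta(x_j)-g(x_j)\bigr|^2,
\]
with collocation points \(x_i\in\Omega\), data points \(x_j\in\partial\Omega\), and a weight \(\lambda\) balancing the PDE residual against the data misfit.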
**************************************************************************************************************************************************************************
Mathematics and Deep Learning (MDL)
MDL Collective Fall 2020 @ISU
Meeting times: Friday(s), 4:00pm--5:00pm
News on Aug 17, 2020:
"MDL collective has been awarded as a new faculty learning community by the Office of the Vice President for Research (OVPR) and the Center for Excellence in the Arts and Humanities (CEAH), Iowa State University."
Presentation Schedule:
8/28 Organization meeting 4:00pm-5:00pm (meeting_minutes)
9/04 "The field of robust deep learning" by Soumik Sarkar (Mech/ENG) slides
9/11 "Controlling Propagation of epidemics via mean-field games" by Wuchen Li, Univ. South Carolina. slides
Abstract: The coronavirus disease 2019 (COVID-19) pandemic is changing and impacting lives on a global scale. In this paper, we introduce a mean-field game model in controlling the propagation of epidemics on a spatial domain. The control variable, the spatial velocity, is first introduced for the classical disease models, such as the SIR model. For this proposed model, we provide fast numerical algorithms based on proximal primal-dual methods. Numerical experiments demonstrate that the proposed model illustrates how to separate infected patients in a spatial domain effectively.
This is based on a joint work with Wonjun Lee, Siting Liu, Hamidou Tembine, Stanley Osher.
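For background, the classical SIR model referenced in the abstract reads
\[
\frac{dS}{dt} = -\beta S I, \qquad
\frac{dI}{dt} = \beta S I - \gamma I, \qquad
\frac{dR}{dt} = \gamma I,
\]
with infection rate \(\beta\) and recovery rate \(\gamma\); the mean-field game model of the talk works with spatial densities of these compartments and introduces a controlled velocity that transports them (the precise coupled system is in the paper and is not reproduced here).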
9/18 "A Geometric Understanding of Deep Learning" by Xianfeng David Gu, Stony Brook University slides
Abstract: This work introduces an optimal transportation (OT) view of generative adversarial networks (GANs). Natural datasets have intrinsic patterns, which can be summarized as the manifold distribution principle: the distribution of a class of data is close to a low-dimensional manifold. GANs mainly accomplish two tasks: manifold learning and probability distribution transformation. The latter can be carried out using the classical OT method. From the OT perspective, the generator computes the OT map, while the discriminator computes the Wasserstein distance between the generated data distribution and the real data distribution; both can be reduced to a convex geometric optimization process. Furthermore, OT theory discovers the intrinsic collaborative—instead of competitive—relation between the generator and the discriminator, and the fundamental reason for mode collapse. We also propose a novel generative model, which uses an autoencoder (AE) for manifold learning and OT map for probability distribution transformation. This AE–OT model improves the theoretical rigor and transparency, as well as the computational stability and efficiency; in particular, it eliminates the mode collapse. The experimental results validate our hypothesis, and demonstrate the advantages of our proposed model.
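Two classical optimal transport facts underlie this viewpoint (standard statements, not results of the talk): the Monge formulation of the quadratic Wasserstein distance, whose optimal map the generator computes, and the Kantorovich-Rubinstein duality behind Wasserstein-GAN discriminators,
\[
W_2^2(\mu,\nu) = \inf_{T:\,T_\#\mu=\nu} \int \lVert x - T(x)\rVert^2\, d\mu(x), \qquad
W_1(\mu,\nu) = \sup_{\mathrm{Lip}(f)\le 1}\ \mathbb{E}_{x\sim\mu} f(x) - \mathbb{E}_{y\sim\nu} f(y).
\]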
9/25 Literature review:
"Interpolation of distributions via optimal transport" by Manas Bhatnagar [Math, ISU] slides
Ref: Measure-valued spline curves: An optimal transport viewpoint, by Chen et al., 2018.
"A review of deep learning methods for solving forward/inverse partial differential equations" by Biswajit Khara [Mech Eng. ISU] slides
Refs:
(1) ANNs for PDEs, Lagaris et al., 1998.
(2) Data-driven discovery of PDEs, Rudy et al, 2017.
(3) Physics-informed neural networks, Raissi et al, 2019.
(4) Encoding invariances in deep generative models, Shah et al, 2019.
10/2. "Adversarial Projections for Inverse Problems" by Samy Wu Fung, UCLA. slides
Abstract: We present a new mechanism, called adversarial projection, that projects a given signal onto the intrinsically low dimensional manifold of true data. This operation can be used for solving inverse problems, which consists of recovering a signal from a collection of noisy measurements. Rather than attempt to encode prior knowledge via an analytic regularizer, we leverage available data to project signals directly onto the (possibly nonlinear) manifold of true data (i.e., regularize via an indicator function of the manifold). Our approach avoids the difficult task of forming a direct representation of the manifold. Instead, we directly learn the projection operator by solving a sequence of unsupervised learning problems, and we prove our method converges in probability to the desired projection. This operator can then be directly incorporated into optimization algorithms in the same manner as Plug and Play methods, but now with added theoretical guarantees. Numerical examples are provided.
10/9 Professor Stanley Osher's online talk
"UCLA AFOSR MURI on Aspects of Mean Field Games"
10/15 "AEGD: Adaptive Gradient Decent with Energy" Hailiang Liu [ISU, Math] slides
Abstract: We propose AEGD, a new algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive updates of quadratic energy. As long as an objective function is bounded from below, AEGD can be applied, and it is shown to be unconditionally energy stable, irrespective of the step size. In addition, AEGD enjoys tight convergence rates, yet allows a large step size. The method is straightforward to implement and requires little tuning of hyper-parameters. Experimental results demonstrate that AEGD works well for various optimization problems: it is robust with respect to initial data, capable of making rapid initial progress, shows comparable and most times better generalization performance than SGD with momentum for deep neural networks.
Within this talk we will also review selection dynamics for deep neural networks, and report preliminary results on the optimal control of Covid-19.
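A minimal sketch of an AEGD-style update, reconstructed from the abstract's description (the specific form of the energy variable and its update below is an assumption; consult the AEGD paper for the exact scheme): each parameter carries an auxiliary energy r that can only decrease, which gives the unconditional energy stability regardless of step size.

import numpy as np

def aegd(f, grad_f, theta0, eta=0.1, c=1.0, n_steps=200):
    """Hypothetical AEGD-style loop: gradient descent driven by a
    per-parameter energy variable r that is monotonically non-increasing."""
    theta = np.array(theta0, dtype=float)
    r = np.full_like(theta, np.sqrt(f(theta) + c))         # initial energy
    for _ in range(n_steps):
        v = grad_f(theta) / (2.0 * np.sqrt(f(theta) + c))  # gradient of sqrt(f+c)
        r = r / (1.0 + 2.0 * eta * v**2)                   # energy never increases
        theta = theta - 2.0 * eta * r * v                  # parameter update
    return theta, r

# Toy usage: minimize f(x) = 0.5*||x||^2 from a point away from the origin.
f = lambda x: 0.5 * float(np.dot(x, x))
grad_f = lambda x: x
theta, r = aegd(f, grad_f, theta0=[3.0, -2.0])
print(theta)   # approaches the minimizer at the origin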
10/23 "Physics aware machine learning" by Baskar Ganapathysubramanian [ISU, Mech/ENG] slides
Some recent work on incorporating physics-based constraints into ML models, with applications to solving PDEs. Joint work with Sarkar, Krishnamurthy and Hegde groups.
10/30 "Non-convex Minmax Optimization for Machine Learning" by Songtao Lu, IBM; slides
Abstract: We live in an era of data explosion. The rapid advances in sensor, communication, and storage technologies have made data acquisition more ubiquitous than at any time in the past. Making sense of data of such a scale is expected to bring ground-breaking advances across many industries and disciplines. However, to effectively handle data of such scale and complexity, and to better extract information from quintillions of bytes of data for inference, learning and decision-making, increasingly complex mathematical models are needed. These models are often highly non-convex, unstructured, and can have millions or even billions of variables, making existing solution methods no longer applicable.
In this talk, I will present a few recent works that design accurate, scalable, and robust algorithms for solving non-convex machine learning problems. My focus will be given to discussing the theoretical and practical properties of a class of gradient-based algorithms for solving a popular family of non-convex min-max problems. I will also showcase the practical performance of these algorithms in applications such as resource allocation in wireless communications, adversarial learning, and distributed/decentralized training. Finally, I will briefly introduce the possible extension of our framework to other areas and the future works.
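As generic background on the gradient-based min-max algorithms mentioned above (a toy sketch of plain gradient descent-ascent on a convex-concave example; the speaker's algorithms and analyses are more sophisticated):

import numpy as np

def gradient_descent_ascent(grad_x, grad_y, x0, y0, lr=0.05, n_steps=500):
    """Alternating updates for min_x max_y f(x, y)."""
    x, y = np.array(x0, dtype=float), np.array(y0, dtype=float)
    for _ in range(n_steps):
        x = x - lr * grad_x(x, y)   # descent step in the minimization variable
        y = y + lr * grad_y(x, y)   # ascent step in the maximization variable
    return x, y

# Toy problem: f(x, y) = 0.5*x^2 + x*y - 0.5*y^2 with saddle point at (0, 0).
gx = lambda x, y: x + y             # df/dx
gy = lambda x, y: x - y             # df/dy
x, y = gradient_descent_ascent(gx, gy, x0=[1.0], y0=[1.0])
print(x, y)                         # both approach 0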
11/06 Literature review
"An overview of deep reinforcement learning methods, challenges and applications" by Xian Yeow Lee [Mech, ISU] slides
Refs:
1. Open AI Spinning Up: https://spinningup.openai.com/en/latest/
2. Human-level control through deep reinforcement learning, Nature, 518: 529--533, 2015. V. Mnih et al 2015
3. Policy Gradient Methods for Reinforcement Learning with Function Approximation, R. S. Sutton et al 2000
4. Deep Reinforcement Learning: Overview, Yuxi Li, 2019
5. Challenges of Real-World Reinforcement Learning, G. Dulac-Arnold et al 2019
11/20 "Non-local regularization of Mean-field Dynamics for Probability Density Stabilization" by Karthik Elamvazhuthi [UCLA] slides
Abstract: In this talk, I will present some recent work on developing particle methods for numerically simulating a class of PDEs that provide strategies for probability density stabilization. These have important applications in problems such as sampling and multi-agent control. We consider a class of nonlinear PDE models whose velocity and reaction parameters are to be designed so that the solution of the PDE converges to a target probability density. Since the parameters of these PDEs depend on the local density, they are not suitable for implementation on a finite number of particles. We construct a particle method by regularizing the local dependence to obtain a non-local PDE. While the nonlocal approximations make numerical implementation easier, their local limits have good analytical properties from the point of view of understanding long-term behavior.
Research topics:
* Robust deep learning, Adversarial attacks
* Optimal transport, GANs, geometry learning
* Reinforcement learning
* Optimal control and prediction
* Applications
-- Prediction and control of Covid-19 pandemic
-- Data-driven and science-informed discovery
-- Solving high-dimensional partial differential equations
***************************************************************************************
MDL Collective Spring 2020 @ISU
Meeting times: Friday(s), 4:10pm--5:00pm
Location: Carver 0204
Email list: mdl20@iastate.edu
Contact: Hailiang Liu at hliu@iastate.edu
MDL Presentations:
1/24 Organization meeting
2/7 "Overview of Machine Learning Topics" by Jin Tian(CS/LAS) slides.
2/14 "Physics aware deep learning: overview and some open problems" by Baskar Ganapathysubramanian (Mech/ENG) slides.
2/27 " Beyond neural networks and dynamic systems" by Chenyu Xu (ECpE/ENG) slides.
3/13 "The field of robust deep learning" by Soumik Sarkar (Mech/ENG) (postponed)
4/17 "Introduction to Deep Double Descent" by Tianxiang Gao (CS/LAS) slides.
4/24 "The Neural Tangent Kernel, with Applications to Autoencoder Learning"(preprint) by Chinmay Hegde, NYU slides
5/01 "Partial Difrential Equation Principled Trustworthy Deep Learning " Bao Wang, UCLA slides
5/08 "Neural transport information computation: Mean field games" (paper) Wuchen Li, University of South Carolina /UCLA slides
SIAM Conference on Mathematics of Data Science (MDS20) Minitutorial talks on
5/19 "A Mathematical Perspective of Machine Learning" Weinan E, Princeton, 9:00-10:00am (CDT)
5/20 " ODE/PDE Neural Networks" Eldad Haber, 9:00-10:00am (CDT)
5/21 "Deep Neural Networks for High-Dimensional Parabolic PDEs" Christoph Ressinger, 9:00-10:00am (CDT)
Registration is at https://www.siam.org/conferences/cm/conference/mds20.
5/22 "APAC-Net for Solving High-Dimensional And Stochastic Mean-Field Games" Alex Lin, UCLA slides
6/17-18 Minisymposium at SIAM Conference on Mathematics of Data Science (MDS20)
"Deep Learning via Optimal Control in Data Space"
Organizer: Wuchen Li (UCLA) and Hailiang Liu (ISU)
Detailed schedule can be found at
"https://docs.google.com/spreadsheets/d/1xmlup_UovoSSmKe7ccclQfn2zAnATEb…"
Research papers by topics:
* Expressibility/Representation
1. Beyond finite layer neural networks: bridging deep architectures and numerical differential equations
by Yiping Lu et al., arXiv:1710.10121 (2017)
* Optimization
1. A Machine Learning Framework for Solving High-Dimensional Mean Field Game and Mean Field Control Problems
by Stanley Osher et al., arXiv:1912.01825 (2019)
* Training dynamics
1. Global convergence of neuron birth-death dynamics
by Grant Rotskoff et al., arXiv:1902.01843 (2019)
* Generalization
1. Deep Double Descent: Where Bigger Models and More Data Hurt
by Preetum Nakkiran et al., arXiv:1912.02292 (Dec. 2019)
2. Two models of double descent for weak features
by Mikhail Belkin et al., arXiv:1903.07571 (March 2019)
3. Surprises in High-Dimensional Ridgeless Least Squares Interpolation
by Trevor Hastie et al., arXiv:1903.08560 (Nov. 2019)
4. Reconciling modern machine learning practice and the bias-variance trade-off
by Mikhail Belkin et al., arXiv:1812.11118 (Sept 2019)