MDL Collective

Mathematics and Deep Learning (MDL)
MDL Collective Spring 2024 @ISU 

The MDL collective meets weekly in Carver 305 from 1-4pm for a combination of seminars and research discussions, raising issues and exchanging ideas on topics of current interest in the area of deep learning and relevant mathematical advances.

 Title:   Deep JKO: time-implicit particle methods for general nonlinear gradient flows
 Wonjun Lee, University of Minnesota
Abstract: We develop novel neural network-based implicit particle methods to compute high-dimensional Wasserstein-type gradient flows with linear and nonlinear mobility functions. The main idea is to use the Lagrangian formulation in the Jordan--Kinderlehrer--Otto (JKO) framework, where the velocity field is approximated using a neural network. We leverage the formulations from the neural ordinary differential equation (neural ODE) in the context of continuous normalizing flow for efficient density computation. Additionally, we make use of an explicit recurrence relation for computing derivatives, which greatly streamlines the backpropagation process. Our methodology demonstrates versatility in handling a wide range of gradient flows, accommodating various potential functions and nonlinear mobility scenarios. Extensive experiments demonstrate the efficacy of our approach, including an illustrative example from Bayesian inverse problems. This underscores that our scheme provides a viable alternative solver for the Kalman-Wasserstein gradient flow.

       Title:  Diffusion Models: Theory and Applications (in PDEs)

       Yulong Lu, University of Minnesota

Abstract:   Diffusion models, particularly score-based generative models (SGMs), have emerged as powerful tools in diverse machine learning applications, spanning from computer vision to modern language processing. In the first part of this talk, we delve into the generalization theory of SGMs, exploring their capacity for learning high-dimensional distributions. Our analysis establishes a groundbreaking result: SGMs achieve a dimension-free generation error bound when applied to a class of sub-Gaussian distributions characterized by low-complexity structures. This theoretical underpinning sheds light on the robust capabilities of SGMs in learning and sampling complex distributions.

List of topics for Spring 2024: 

01/24 Monte Carlo Methods and Sampling techniques

01/31 The Rademacher Complexity and Generalization errors 

02/07 Universal Approximator (classical work by Cybenko (1989))

02/14 Approximation bounds I: Barron's  1993 paper

02/21 Approximation bounds II: on the Barron space and related properties 

02/28 Approximation bounds III:  on network approximations for Sobolev functions

03/06 Generative models: Variational Auto-Encoder 

03/13 Spring break 

03/20 MDL talk by Wonjun Lee on Deep JKO. 

03/27 Diffusion models  

04/03 Optimization for deep learning  

Mathematics and Deep Learning (MDL)
MDL Collective Spring 2022 @ISU  

In Spring 2022, the MDL seminar is held jointly with the CAM seminar on Mondays, 4:10-5:00pm. MDL talks are highlighted here:

       Title: Parametric Fokker-Planck equation and its Wasserstein error estimates      

       Haomin Zhou, Georgia Institute of Technology

Abstract: In this presentation, I will introduce a system of ODEs constructed by using generative neural networks to spatially approximate the Fokker-Planck equation. We call this system the parametric Fokker-Planck equation. We design a semi-implicit time-discretized scheme to compute its solution using only random samples in space. The resulting algorithm allows us to approximate the solution of the Fokker-Planck equation in high dimensions. Furthermore, we provide error bounds, in terms of the Wasserstein metric, for the semi-discrete and fully discrete approximations. This presentation is based on recent joint work with Wuchen Li (South Carolina), Shu Liu (Math GT) and Hongyuan Zha (CUHK-Shenzhen).

February 28, 4:10-5pm Zoom link: 

Title: Efficient natural gradient method for large-scale optimization problems

Levon Nurbekyan, UCLA 

Abstract: We propose an efficient numerical method for computing natural gradient descent directions with respect to a generic metric in the state space. Our technique relies on representing the natural gradient direction as a solution to a standard least-squares problem. Hence, instead of calculating, storing, or inverting the information matrix directly, we apply efficient methods from numerical linear algebra to solve this least-squares problem. We treat both scenarios where the derivative of the state variable with respect to the parameter is either explicitly known or implicitly given through constraints. We apply the QR decomposition to solve the least-squares problem in the former case and utilize the adjoint-state method to compute the natural gradient descent direction in the latter case.
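
The least-squares viewpoint in the abstract above can be illustrated with a minimal sketch. It assumes the simplest setting, which is not stated in the abstract: a Euclidean metric on the state space and an explicitly known Jacobian J of the state with respect to the parameters. Under that assumption the natural gradient direction (J^T J)^{-1} J^T g is the minimizer of ||J eta - g||, so it can be obtained from a thin QR factorization without forming J^T J. The function name and the random test data are hypothetical.

```python
import numpy as np


def natural_gradient_direction(J, g):
    """Return argmin_eta ||J @ eta - g||_2 using a thin QR factorization.

    J : (m, p) Jacobian of the state w.r.t. the parameters (assumed full column rank).
    g : (m,)  gradient of the loss w.r.t. the state.
    """
    Q, R = np.linalg.qr(J)              # reduced QR: Q is (m, p), R is (p, p)
    return np.linalg.solve(R, Q.T @ g)  # solve R @ eta = Q^T g


# Hypothetical example data: 50 state variables, 3 parameters.
rng = np.random.default_rng(0)
J = rng.standard_normal((50, 3))
g = rng.standard_normal(50)

eta = natural_gradient_direction(J, g)

# Reference: invert the information matrix J^T J explicitly (what the QR route avoids).
eta_explicit = np.linalg.solve(J.T @ J, J.T @ g)
```

The two results agree up to rounding, but the QR route never forms or inverts the information matrix, which is the point made in the abstract. The implicit case handled by the adjoint-state method is not covered by this sketch.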


Mathematics and Deep Learning (MDL)
MDL Collective Fall  2021 @ISU 
Meeting times: Friday(s), 4:00pm--5:15pm
Talk: 4:10 --5:00; Q&A: -- 5:15
Zoom link:
Email list:
Contact: Hailiang Liu at

"MDL collective was awarded as a new faculty learning community by the Office of the Vice President for Research (OVPR) and the Center for Excellence in the Arts and Humanities (CEAH), Iowa State University -- Aug. 17, 2020"

The MDL (Mathematics and Deep Learning) collective meets weekly/biweekly for a combination of seminars, journal presentations, and research discussions; raising issues and exchanging ideas on topics of current interest in the area of deep learning and relevant mathematical advances. The format consists of a lead presentation of about 40 minutes, followed by questions and in-depth discussions.

Note: The MDL collective will be partially joint with the TrAC seminar series being planned via the "Translational AI Center For Research and Education (TrAC)"-- Sept. 2021.     

October 1,  Friday  Zoom 4-5pm 

"A neural network approach for high-dimensional real-time optimal control"  slides 
 Levon Nurbekyan

Abstract: Because neural networks (NN) are fast to evaluate at deployment, they are attractive for real-time applications. I will present one possible approach for training NN to synthesize real-time controls. A key aspect of our method is the combination of the following features:

 1.  No data generation and fitting
 2.  Direct optimization of trajectories
 3.  The correct structural ansatz for the approximations of optimal control

With these techniques, we can solve problems in hundreds of dimensions. We also find some unexpected generalization properties. The talk is based on two recent papers and

Brief bio: I am currently an Assistant Adjunct Professor at the Department of Mathematics at UCLA. I obtained my Ph.D. in the framework of UT Austin -- Portugal CoLab under the supervision of Professors Diogo Gomes and Alessio Figalli. My research interests include calculus of variations, optimal control theory, mean-field games, partial differential equations, mathematics applied to machine learning, dynamical systems, and shape optimization problems.  I have been a Senior Fellow at the Institute for Pure and Applied Mathematics (IPAM) at UCLA for its Spring 2020 Program on High Dimensional Hamilton-Jacobi PDEs and a Simons CRM Scholar at the Centre de Recherches Mathématiques (CRM) at the University of Montreal for its Spring 2019 Program on Data Assimilation: Theory, Algorithms, and Applications.





Mathematics and Deep Learning (MDL)
MDL Collective Spring  2021 @ISU [Spring semester: Jan 25--May 06, 2021]
Meeting times: Friday(s), 4:00pm--5:15pm
Talk: 4:10 --5:00; Q&A: -- 5:15
Zoom link:
Email list:
Contact: Hailiang Liu at
ISU MDL_Collective chat room:

The MDL (Mathematics and Deep Learning) collective meets weekly/biweekly for a combination of seminars, journal presentations, and research discussions; raising issues and exchanging ideas on topics of current interest in the area of deep learning and relevant mathematical advances. The format consists of a lead presentation of about 40 minutes, followed by questions and in-depth discussions.

Research  topics 
* Deep approximation theory
* Deep approximation of functionals, operators 
* Deep learning for mean-field games 
* Deep learning for inverse problems 
* Deep learning and computations of high-dimensional PDEs


Feb. 18, Thursday 7:00-8:00pm 

"Deep Approximation via Deep Learning"   slides   records
 Zuowei Shen
National University of Singapore

Abstract: The primary task of many applications is approximating/estimating a function through samples drawn from a probability distribution on the input space. Deep approximation approximates a function by compositions of many layers of simple functions, which can be viewed as a series of nested feature extractors. The key idea of a deep learning network is to convert layers of compositions to layers of tunable parameters that can be adjusted through a learning process, so that it achieves a good approximation with respect to the input data. In this talk, we shall discuss the mathematical theory behind this new approach and the approximation rate of deep networks. We will also show how this new approach differs from classical approximation theory, and how this new theory can be used to understand and design deep learning networks.

March 05. 4:00-5:15pm on Zoom   

"Deep Learning and Neural Networks: The Mathematical View"   slides     records 
Ronald DeVore
Texas A & M University 

Abstract: Deep Learning is much publicized and has had great empirical success on challenging  problems in learning.  Yet there is no quantifiable proof of performance and certified guarantees for these methods.  This talk will give an overview of Deep Learning from the viewpoint of mathematics and numerical computation. 

Bio Info: The Walter E. Koss Professor and Distinguished Professor of Mathematics (see Homepage)

March 12 --  George Karniadakis, Brown University  records 

Title: Approximating functions, functionals, and operators using deep neural networks for diverse applications

Abstract: We will present a new approach to develop a data-driven, learning-based framework for predicting outcomes of physical and biophysical systems and for discovering hidden physics from noisy data. We will introduce a deep learning approach based on neural networks (NNs) and generative adversarial networks (GANs). Unlike other approaches that rely on big data, here we “learn” from small data by exploiting the information provided by the physical conservation laws, which are used to obtain informative priors or regularize the neural networks. We will demonstrate the power of physics-informed neural networks (PINNs) for several inverse problems, and we will demonstrate how we can use multi-fidelity modeling in monitoring ocean acidification levels in the Massachusetts Bay. We will also introduce new NNs that learn functionals and nonlinear operators from functions and corresponding responses for system identification. The universal approximation theorem of operators is suggestive of the potential of NNs in learning from scattered data any continuous operator or complex system. We first generalize the theorem to deep neural networks, and subsequently we apply it to design a new composite NN with small generalization error, the deep operator network (DeepONet), consisting of a NN for encoding the discrete input function space (branch net) and another NN for encoding the domain of the output functions (trunk net). We demonstrate that DeepONet can learn various explicit operators, e.g., integrals, Laplace transforms and fractional Laplacians, as well as implicit operators that represent deterministic and stochastic differential equations. More generally, DeepONet can learn multiscale operators spanning across many scales and trained by diverse sources of data simultaneously.

Bio: Charles Pitts Robinson and John Palmer Barstow Professor of Applied Mathematics (Homepage)

George Karniadakis is from Crete. He received his S.M. and Ph.D. from Massachusetts Institute of Technology (1984/87). He was appointed Lecturer in the Department of Mechanical Engineering at MIT and subsequently he joined the Center for Turbulence Research at Stanford / NASA Ames. He joined Princeton University as Assistant Professor in the Department of Mechanical and Aerospace Engineering and as Associate Faculty in the Program of Applied and Computational Mathematics. He was a Visiting Professor at Caltech in 1993 in the Aeronautics Department and joined Brown University as Associate Professor of Applied Mathematics in the Center for Fluid Mechanics in 1994. After becoming a full professor in 1996, he continued to be a Visiting Professor and Senior Lecturer of Ocean/Mechanical Engineering at MIT. He is an AAAS Fellow (2018-), Fellow of the Society for Industrial and Applied Mathematics (SIAM, 2010-), Fellow of the American Physical Society (APS, 2004-), Fellow of the American Society of Mechanical Engineers (ASME, 2003-) and Associate Fellow of the American Institute of Aeronautics and Astronautics (AIAA, 2006-). He received the SIAM/ACM Prize on Computational Science & Engineering (2021), the Alexander von Humboldt award in 2017, the SIAM Ralph E. Kleinman award (2015), the J. Tinsley Oden Medal (2013), and the CFD award (2007) by the US Association for Computational Mechanics. His h-index is 108 and he has been cited over 55,000 times.

March 19 -- Alex Lin, UCLA  slides

Title: Alternating the Population and Control Neural Networks to Solve High-Dimensional Stochastic Mean-Field Games

Abstract: We present APAC-Net, an alternating population and agent control neural network for solving stochastic mean field games (MFGs). Our algorithm is geared toward high-dimensional instances of MFGs that are beyond reach with existing solution methods. We achieve this in two steps. First, we take advantage of the underlying variational primal-dual structure that MFGs exhibit and phrase it as a convex-concave saddle point problem. Second, we parameterize the value and density functions by two neural networks, respectively. By phrasing the problem in this manner, solving the MFG can be interpreted as a special case of training a generative adversarial network (GAN). We show the potential of our method on up to 100-dimensional MFG problems.

April  02 -- Xiongtao Dai, [ISU, Stat]  slides  records 

Title: Exploratory Data Analysis for Data Objects on a Metric Space via Tukey's Depth 

Abstract: Exploratory data analysis involves looking at the data and understanding what can be done with them. Non-standard data objects such as directions, covariance matrices, trees, functions, and images have become increasingly common in modern practice. Such complex data objects are hard to examine due to the lack of a natural ordering and efficient visualization tools. We develop a novel exploratory tool for data objects lying on a metric space based on data depth, extending the celebrated Tukey's depth for Euclidean data. The proposed metric halfspace depth assigns depth values to data points, characterizing the centrality and outlyingness of these points. This also leads to an interpretable center-outward ranking, which can be used to construct rank tests. I will demonstrate two applications, one to reveal differential brain connectivity patterns in an Alzheimer's disease study, and another to infer the phylogenetic history and outlying phylogenies in 7 pathogenic parasites.

April 16 -- Lexing Ying, Stanford University    slides   records 

Title: Solving Inverse Problems with Deep Learning

Abstract: This talk is about some recent progress on solving inverse problems using deep learning. Compared to traditional machine learning problems, inverse problems are often limited by the size of the training data set. We show how to overcome this issue by incorporating mathematical analysis and physics into the design of neural network architectures.

April 30 -- Siddhartha Mishra, ETH Zurich, Switzerland

Title: Deep Learning and Computations of high-dimensional PDEs

Abstract: Partial Differential Equations (PDEs) with very high-dimensional state and/or parameter spaces arise in a wide variety of contexts ranging from computational chemistry and finance to many-query problems in various areas of science and engineering. In this talk, we will survey recent results on the use of deep neural networks in computing these high-dimensional PDEs. We will focus on two aspects: the use of supervised deep learning, in the form of both standard deep neural networks and the recently proposed DeepONets, for efficient approximation of many-query PDEs, and the use of physics-informed neural networks (PINNs) for the computation of forward and inverse problems for PDEs with high-dimensional state spaces.


Mathematics and Deep Learning (MDL)
MDL Collective Fall 2020 @ISU
Meeting times: Friday(s), 4:00pm--5:00pm

News on Aug 17, 2020:
"MDL collective has been awarded as a new faculty learning community by the Office of the Vice President for Research (OVPR) and the Center for Excellence in the Arts and Humanities (CEAH), Iowa State University."

Presentation Schedule:

8/28 Organization meeting 4:00pm-5:00pm (meeting_minutes)

9/04  "The field of robust deep learning" by Soumik Sarkar (Mech/ENG) slides

9/11 "Controlling Propagation of epidemics via mean-field games" by Wuchen Li, Univ. South Carolina.  slides
Abstract: The coronavirus disease 2019 (COVID-19) pandemic is changing and impacting lives on a global scale. In this paper, we introduce a mean-field game model in controlling the propagation of epidemics on a spatial domain. The control variable, the spatial velocity, is first introduced for the classical disease models, such as the SIR model. For this proposed model, we provide fast numerical algorithms based on proximal primal-dual methods. Numerical experiments demonstrate that the proposed model illustrates how to separate infected patients in a spatial domain effectively.
This is based on a joint work with Wonjun Lee, Siting Liu, Hamidou Tembine, Stanley Osher.

9/18 "A Geometric Understanding of Deep Learning"  by Xianfeng David Gu, Stony Brook University  slides

Abstract: This work introduces an optimal transportation (OT) view of generative adversarial networks (GANs). Natural datasets have intrinsic patterns, which can be summarized as the manifold distribution principle: the distribution of a class of data is close to a low-dimensional manifold. GANs mainly accomplish two tasks: manifold learning and probability distribution transformation. The latter can be carried out using the classical OT method. From the OT perspective, the generator computes the OT map, while the discriminator computes the Wasserstein distance between the generated data distribution and the real data distribution; both can be reduced to a convex geometric optimization process. Furthermore, OT theory discovers the intrinsic collaborative—instead of competitive—relation between the generator and the discriminator, and the fundamental reason for mode collapse. We also propose a novel generative model, which uses an autoencoder (AE) for manifold learning and OT map for probability distribution transformation. This AE–OT model improves the theoretical rigor and transparency, as well as the computational stability and efficiency; in particular, it eliminates the mode collapse. The experimental results validate our hypothesis, and demonstrate the advantages of our proposed model.
9/25 Literature review:

"Interpolation of distributions via optimal transport"  by Manas Bhatnagar [Math, ISU] slides
Ref: Measure-valued spline curves: An optimal transport viewpoint, Chen et al, 2018.

"A review of deep learning methods for solving forward/inverse partial differential equations" by Biswajit Khara [Mech Eng. ISU] slides
(1) ANNs for PDEs, Lagaris et al, 1998.
(2) Data-driven discovery of PDEs, Rudy et al, 2017.
(3) Physics-informed neural networks, Raissi et al, 2019.
(4) Encoding invariances in deep generative models, Shah et al, 2019.

10/2. "Adversarial Projections for Inverse Problems" by Samy Wu Fung, UCLA. slides
Abstract: We present a new mechanism, called adversarial projection, that projects a given signal onto the intrinsically low dimensional manifold of true data. This operation can be used for solving inverse problems, which consists of recovering a signal from a collection of noisy measurements.  Rather than attempt to encode prior knowledge via an analytic regularizer, we leverage available data to project signals directly onto the (possibly nonlinear) manifold of true data (i.e., regularize via an indicator function of the manifold). Our approach avoids the difficult task of forming a direct representation of the manifold. Instead, we directly learn the projection operator by solving a sequence of unsupervised learning problems, and we prove our method converges in probability to the desired projection. This operator can then be directly incorporated into optimization algorithms in the same manner as Plug and Play methods, but now with added theoretical guarantees. Numerical examples are provided.

10/9  Professor Stanley Osher's online talk
"UCLA AFOSR MURI on Aspects of Mean Field Games"

10/15 "AEGD: Adaptive Gradient Descent with Energy"  Hailiang Liu [ISU, Math] slides

Abstract: We propose AEGD, a new algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive updates of quadratic energy. As long as an objective function is bounded from below, AEGD can be applied, and it is shown to be unconditionally energy stable, irrespective of the step size. In addition, AEGD enjoys tight convergence rates, yet allows a large step size. The method is straightforward to implement and requires little tuning of hyper-parameters. Experimental results demonstrate that AEGD works well for various optimization problems: it is robust with respect to initial data, is capable of making rapid initial progress, and shows comparable, and often better, generalization performance than SGD with momentum for deep neural networks.

Within this talk we will also review selection dynamics for deep neural networks, and report preliminary results on the optimal control of Covid-19.
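
As a rough illustration of what an energy-adaptive update can look like, here is a minimal deterministic, scalar sketch. The specific formulas are an assumption and are not stated in the abstract: the objective f is shifted by a constant c so that f + c > 0, an auxiliary energy variable r tracks sqrt(f + c), and each step divides r by a factor that is at least 1. The function name, the test objective f(x) = x^2, and all parameter values are hypothetical.

```python
import math


def aegd_minimize(f, grad, x0, lr=0.1, c=1.0, steps=200):
    """Energy-adaptive gradient descent sketch (scalar, deterministic).

    Assumed update, with g(x) = sqrt(f(x) + c):
        v = grad(x) / (2 * g(x))        # gradient of g
        r = r / (1 + 2 * lr * v**2)     # energy update, never increases r
        x = x - 2 * lr * r * v          # parameter update scaled by the energy
    """
    x = x0
    r = math.sqrt(f(x) + c)             # initial energy
    energies = [r]
    for _ in range(steps):
        v = grad(x) / (2.0 * math.sqrt(f(x) + c))
        r = r / (1.0 + 2.0 * lr * v * v)
        x = x - 2.0 * lr * r * v
        energies.append(r)
    return x, energies


# Hypothetical test problem: f(x) = x^2, minimized at x = 0.
x_final, energies = aegd_minimize(lambda x: x * x, lambda x: 2.0 * x, x0=2.0)
```

Because the divisor 1 + 2 * lr * v**2 is never smaller than 1, the recorded energy sequence is non-increasing for any positive step size, which is the sense in which such a scheme is unconditionally energy stable. The stochastic, per-coordinate version discussed in the talk is not reproduced here.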

10/23  "Physics aware machine learning" by  Baskar Ganapathysubramanian [ISU, Mech/ENG]  slides
Some recent work on incorporating physics-based constraints into ML models, with applications to solving PDEs. Joint work with Sarkar, Krishnamurthy and Hegde groups. 


10/30 "Non-convex Minmax Optimization for Machine Learning" by Songtao Lu, IBM; slides

Abstract: We live in an era of data explosion. The rapid advances in sensor, communication, and storage technologies have made data acquisition more ubiquitous than at any time in the past. Making sense of data of such a scale is expected to bring ground-breaking advances across many industries and disciplines. However, to effectively handle data of such scale and complexity, and to better extract information from quintillions of bytes of data for inference, learning and decision-making, increasingly complex mathematical models are needed. These models are often highly non-convex, unstructured, and can have millions or even billions of variables, making existing solution methods no longer applicable.
In this talk, I will present a few recent works that design accurate, scalable, and robust algorithms for solving non-convex machine learning problems. I will focus on the theoretical and practical properties of a class of gradient-based algorithms for solving a popular family of non-convex min-max problems. I will also showcase the practical performance of these algorithms in applications such as resource allocation in wireless communications, adversarial learning, and distributed/decentralized training. Finally, I will briefly introduce possible extensions of our framework to other areas and future work.

11/06  Literature review

"An overview of deep reinforcement learning methods, challenges and applications" by Xian Yeow Lee [Mech, ISU]  slides

1. Open AI Spinning Up:
2. Human-level control through deep reinforcement learning, Nature, 518: 529--533, 2015. V. Mnih et al 2015
3. Policy Gradient Methods for Reinforcement Learning with Function Approximation, R. S. Sutton et al 2000
4. Deep Reinforcement Learning: Overview, Yuxi Li, 2019
5. Challenges of Real-World Reinforcement Learning,  G. Dulac-Arnold et al 2019

11/20  "Non-local regularization of Mean-field Dynamics for Probability Density Stabilization" by Karthik Elamvazhuthi [UCLA] slides

Abstract: In this talk, I will present some recent work on developing particle methods for numerically simulating a class of PDEs that provide strategies for probability density stabilization. These have important applications in problems such as sampling and multi-agent control. We will consider a class of nonlinear PDE models with the velocity and reaction parameters that are to be designed so that the solution of PDE converges to a target probability density. Since the parameters of these PDEs depend on the local density, they are not suitable for implementation on a finite number of particles. We construct a particle method by regularizing the local dependence to construct a non-local PDE. While the nonlocal approximations make numerical implementation easier, their local limits have good analytical properties from the point of view of understanding long-term

Research  topics:
* Robust deep learning, Adversarial attacks  
* Optimal transport, GANs, geometry learning
* Reinforcement learning
* Optimal control and prediction
* Applications
-- Prediction and control of Covid-19 pandemic
-- Data driven and Science informed discovery 
-- Solving high dimensional partial differential equations


MDL Collective Spring 2020 @ISU
Meeting times: Friday(s), 4:10pm--5:00pm
Location: Carver 0204
Email list:
Contact: Hailiang Liu at

MDL Presentations:

1/24 Organization meeting

2/7 "Overview of Machine Learning Topics" by Jin Tian (CS/LAS) slides.

2/14 "Physics aware deep learning:  overview and some open problems" by Baskar Ganapathysubramanian (Mech/ENG) slides.

2/27 "Beyond neural networks and dynamic systems" by Chenyu Xu (ECpE/ENG) slides.

3/13 "The field of robust deep learning" by Soumik Sarkar (Mech/ENG) (postponed)

4/17 "Introduction to Deep Double Descent" by Tianxiang Gao (CS/LAS) slides.

4/24 "The Neural Tangent Kernel, with Applications to Autoencoder Learning" (preprint) by Chinmay Hegde, NYU slides

5/01 "Partial Differential Equation Principled Trustworthy Deep Learning" Bao Wang, UCLA slides

5/08 "Neural transport information computation: Mean field games" (paper) Wuchen Li, University of South Carolina /UCLA slides

SIAM Conference on Mathematics of Data Science (MDS20) Minitutorial talks on

5/19 "A Mathematical Perspective of Machine Learning" Weinan E, Princeton, 9:00-10:00am (CDT)
5/20 "ODE/PDE Neural Networks" Eldad Haber, 9:00-10:00am (CDT)
5/21 "Deep Neural Networks for High-Dimensional Parabolic PDEs" Christoph Reisinger, 9:00-10:00am (CDT)

Registration is at

5/22 "APAC-Net for Solving High-Dimensional And Stochastic Mean-Field Games" Alex Lin, UCLA slides

6/17-18 Minisymposium at SIAM Conference on Mathematics of Data Science (MDS20)
"Deep Learning via Optimal Control in Data Space" 
Organizers: Wuchen Li (UCLA) and Hailiang Liu (ISU)
Detailed schedule can be found at 

Research papers by topics:

* Expressibility/Representation

1. Beyond finite layer neural networks: bridging deep architectures and numerical differential equations
by Yiping Lu, et al ArXiv: 1710.10121 (2017)

* Optimization

1. A Machine Learning Framework for Solving High-Dimensional Mean Field Game and Mean Field Control Problems
by Stanley Osher et al ArXiv: 1912.01825 (2019)

* Training dynamics

1. Global convergence of neuron birth-death dynamics
by Grant Rotskoff et al ArXiv: 1902.01843 (2019)


by Preetum Nakkiran et al at ArXiv: 1912.02292 (Dec. 2019)

2. Two models of double descent for weak features 
by Mikhail Belkin et al at ArXiv: 1903.07571 (March 2019)

3. Surprises in High-Dimensional Ridgeless Least Squares Interpolation
by Trevor Hastie et al at ArXiv:1903.08560 (Nov. 2019)

4. Reconciling modern machine learning practice and the bias-variance trade-off
by Mikhail Belkin et al at ArXiv: 1812.11118 (Sept 2019)