Research Experience for Undergraduates Summer School on Mathematical Foundation of Data Science


Research Experience for Undergraduates
Summer School on Mathematical Foundation of Data Science
June 6, 2022 - July 15, 2022

Join Virtual Zoom Program
https://us06web.zoom.us/j/84400970067?pwd=R2Rpb2ZnSldESmJGT2NzMW1XMlNpdz09
Meeting ID: 844 0097 0067
Passcode: 718281

Sponsored by
Department of Mathematics, University of South Carolina
National Science Foundation, RTG award DMS 2038080

Organized by
Prof. Linyuan Lu, Prof. Wuchen Li, Prof. Qi Wang, Prof. Zhu Wang

Table of Contents

Section 1: Program Overview
Section 2: Course Modules
    Course Module 1: Linear Algebra
    Course Module 2: Probability Theory and Optimization
    Course Module 3: Introduction to Complex Networks
    Course Module 4: Machine Learning
Section 3: Research Projects
    Research Projects in Data-Driven Reduced Order Modeling
    Research Projects in Complex Graphs
    Research Projects in Transport Information Learning
    Research Projects in Dynamical System Learning Using Time-Series Data
Section 4: Program Calendar
    Week 1 (Week of June 6-10): Short Courses
    Week 2 (Week of June 13-17): Short Courses
    Week 3 (Week of June 20-24): Introduction of Projects, Group Discussions on Research Projects, and Guest Lectures in Data Sciences
    Week 4 (Week of June 27 - July 1): Introduction of Projects, Group Discussions on Research Projects, and Guest Lectures in Data Sciences
    Week 5 (Week of July 4-8): Introduction of Projects, Group Discussions on Research Projects, and Guest Lectures in Data Sciences
    Week 6 (Week of July 11-15): Introduction of Projects, Group Discussions on Research Projects, and Guest Lectures in Data Sciences
Section 5: Participants Contact Information
    Faculty
    Postdoc
    Graduate Assistants
    Undergraduate Students

Section 1: Program Overview

This REU summer program is part of the NSF RTG project "RTG: Mathematical Foundation of Data Science at University of South Carolina", which aims to develop a multi-tier Research Training Program at the University of South Carolina (UofSC) designed to prepare the future workforce in a multidisciplinary paradigm of modern data science. The education and training models will leverage knowledge and experience already existing among the faculty and bring in new talent to foster mathematical data science expertise and research portfolios through a vertical integration of postdoctoral research associates, graduate students, undergraduate students, and advanced high school students. A primary focus of this project is to recruit and train U.S. citizens, women, and underrepresented minority (URM) students among undergraduates, graduate students, and postdocs through research-led training in data science.

For more information on the NSF RTG project, please visit:
https://sc.edu/study/colleges_schools/artsandsciences/mathematics/my_mathematics/rtg/index.php

This year's REU summer program runs virtually from June 6 to July 15. In the first two weeks, we teach four short course modules on the Mathematical Foundation of Data Science to prepare undergraduate students for entry-level research projects. Starting from the third week, students are divided into four groups to work on research projects. Guest speakers are invited to give talks on the latest developments in the Mathematical Foundation of Data Science. On the last day of the program, students present their research findings.

Section 2: Course Modules

Course Module 1: Linear Algebra
Instructor: Zhu Wang
Total hours: 10
Course contents: Understand fundamental concepts in linear algebra, such as subspaces, projections, least squares, eigenvalue decomposition, and singular value decomposition. Apply these concepts to the central problems of linear algebra: the n-by-n linear system Ax = b; the m-by-n linear system Ax = b; the n-by-n eigenvalue problem Ax = λx; and the singular value problem Av = σu for an m-by-n matrix. The connection of linear algebra with many applications will be discussed as well. (A short least-squares sketch appears at the end of this section.)

Course Module 2: Probability Theory and Optimization
Instructor: Wuchen Li
Total hours: 10
Course contents: Study basic concepts in probability, statistics, and optimization: probability distributions, cumulative distributions, moments, mean, variance, covariance, the Gaussian distribution, samples, the Fisher information matrix, optimality conditions, convexity, gradient descent, Newton's method, Lagrange multipliers, and the KKT conditions. (A gradient descent sketch appears at the end of this section.)

Course Module 3: Introduction to Complex Networks
Instructor: Linyuan Lu
Total hours: 10
Course contents: Graphs, trees, subgraphs, graph isomorphisms, paths, walks, cycles, graph products, planar graphs, Euler's formula, Kuratowski's theorem, the adjacency matrix, spectra of special graphs, the combinatorial Laplacian, the matrix tree theorem, the normalized Laplacian, power law graphs, random graphs, Erdős–Rényi random graphs, random graph models for power law graphs, spectra of random graphs, transportation distance, Ricci curvature of graphs, and concentration of Lipschitz functions over positively curved graphs.

Course Module 4: Machine Learning
Instructor: Qi Wang
Total hours: 10
Course contents: Introduce the basic concepts of machine learning; in particular, distinguish machine learning from the optimization of an objective or loss function. Discuss how to define the loss function using maximum likelihood estimation and Bayesian estimation. Introduce basic machine learning algorithms such as logistic regression, k-means clustering, k-nearest neighbors, support vector machines, and decision trees. Introduce neural networks and deep learning: machine learning using deep neural networks, including fully connected, convolutional, and recurrent neural networks. Discuss deep learning methods for learning the dynamical systems underlying given time series.
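The central least-squares problem of Course Module 1 can be illustrated in a few lines. Below is a minimal NumPy sketch, not course material: the matrix sizes, the noise level, and the variable names are illustrative assumptions.

```python
import numpy as np

# Minimal sketch: solve an overdetermined m-by-n system A x ≈ b by least
# squares using the SVD, one of the central problems of Course Module 1.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 3))                   # m = 20 equations, n = 3 unknowns
x_true = np.array([1.0, -2.0, 0.5])
b = A @ x_true + 0.01 * rng.standard_normal(20)    # noisy right-hand side

U, s, Vt = np.linalg.svd(A, full_matrices=False)   # A = U diag(s) V^T
x_ls = Vt.T @ ((U.T @ b) / s)                      # pseudoinverse solution

print(x_ls)                                        # close to x_true
print(np.linalg.lstsq(A, b, rcond=None)[0])        # cross-check with lstsq
```

Dividing by the singular values implements the pseudoinverse; for a rank-deficient A one would truncate the small singular values first.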
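Similarly, for the optimization topics of Course Module 2, here is a minimal sketch contrasting gradient descent with Newton's method on a convex quadratic; the matrix Q, the step size, and the iteration count are illustrative assumptions.

```python
import numpy as np

# Minimal sketch: minimize the convex quadratic f(x) = 0.5 x^T Q x - c^T x
# with gradient descent and with Newton's method (Course Module 2 topics).
Q = np.array([[3.0, 1.0], [1.0, 2.0]])    # symmetric positive definite
c = np.array([1.0, 1.0])

def grad(x):
    return Q @ x - c                      # gradient of f

x = np.zeros(2)
for _ in range(100):                      # gradient descent, fixed step size
    x = x - 0.1 * grad(x)
print("gradient descent:", x)

x = np.zeros(2)
x = x - np.linalg.solve(Q, grad(x))       # Newton: one step is exact for quadratics
print("Newton:", x)
print("exact:", np.linalg.solve(Q, c))
```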

Section 3: Research Projects

Research projects in data-driven reduced order modeling
1. Dimensionality reduction in the parameter space. Study classic linear dimensionality reduction approaches such as principal component analysis (PCA) and active subspaces (AS), and recently developed deep learning methods for reducing the parameter space such as the nonlinear level-set learning (NLL) method. Compare their performance on high-dimensional function approximation problems and on numerical simulations of differential equations.
2. Data-driven reduced order modeling. Study traditional model reduction approaches such as proper orthogonal decomposition (POD) and the reduced basis method (RBM), and the latest developments in deep learning-based nonlinear model reduction for overcoming the Kolmogorov barrier, such as those based on autoencoders. Compare their performance when simulating convection-dominated phenomena. (A POD sketch appears after the complex-graphs projects below.)

Research projects in complex graphs
1. A graph G is k-existentially closed (k-e.c.) if each k-set of vertices can be extended in all of the possible 2^k ways. Let m_ec(k) be the minimum integer n such that a k-e.c. graph on n vertices exists. It is known that m_ec(1) = 4, m_ec(2) = 9, and 24 ≤ m_ec(3) ≤ 28. Improve the bounds on m_ec(3). (A brute-force checker is sketched below.)
2. For each integer d, let F(d) (respectively f(d)) be the maximum integer n such that there exists a connected graph on n vertices with positive curvature and maximum degree d (respectively, a connected d-regular graph with positive curvature). It is known that c_1 d ≤ f(d) ≤ F(d) ≤ d^(c_2 d). Determine the magnitudes of F(d) and f(d).
3. Classify all planar d-regular graphs with positive curvature.
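As a starting point for project 1 on k-e.c. graphs, here is a minimal brute-force sketch of the defining property; the function name, the dict-of-neighbour-sets graph representation, and the small test graphs are illustrative assumptions, and the exhaustive search is only feasible for small n.

```python
from itertools import combinations

# Minimal brute-force sketch: a graph is k-existentially closed (k-e.c.) if
# for every k-set S of vertices and every subset T of S, some vertex z outside
# S is adjacent to all of T and to none of S \ T; that is, S can be extended
# in all 2^k possible ways.
def is_kec(adj, k):
    """adj: dict mapping each vertex to its set of neighbours."""
    vertices = set(adj)
    for S in combinations(sorted(vertices), k):
        S_set = set(S)
        for r in range(k + 1):
            for T in combinations(S, r):
                T_set = set(T)
                if not any(adj[z] & S_set == T_set for z in vertices - S_set):
                    return False
    return True

P4 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}   # path on 4 vertices
K3 = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}        # triangle
print(is_kec(P4, 1))   # True: witnesses the known value m_ec(1) = 4
print(is_kec(K3, 1))   # False: no vertex has a non-neighbour
```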
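For the data-driven reduced-order-modeling projects above, here is a minimal sketch of how a POD basis is extracted from solution snapshots via the SVD; the synthetic snapshot data and the choice of two modes are illustrative assumptions.

```python
import numpy as np

# Minimal sketch of proper orthogonal decomposition (POD): extract a reduced
# basis from snapshots of a solution field. The decaying two-mode snapshot
# data below is synthetic and only for illustration.
x = np.linspace(0, 1, 200)
t = np.linspace(0, 1, 50)
snapshots = np.array([np.sin(np.pi * x) * np.exp(-tt) +
                      0.1 * np.sin(3 * np.pi * x) * np.exp(-9 * tt)
                      for tt in t]).T                 # columns are snapshots

U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
r = 2                                                 # reduced dimension
basis = U[:, :r]                                      # POD basis
energy = s[:r].sum() / s.sum()
print(f"first {r} modes capture {energy:.4%} of the singular-value energy")

# Project one snapshot onto the reduced basis and measure the error.
u = snapshots[:, 10]
u_r = basis @ (basis.T @ u)
print("relative projection error:", np.linalg.norm(u - u_r) / np.linalg.norm(u))
```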

Research projects in transport information learning
Study and understand natural gradient methods from information geometry and optimal transport. Implement natural gradient algorithms for supervised and unsupervised learning problems. (A one-dimensional Gaussian sketch appears at the end of this section, after the dynamical-systems projects.)
1. In one-dimensional space, compute and implement the Fisher and Wasserstein information matrices for Gaussian and exponential distributions. Then implement natural gradient methods to learn the parameters.
2. In discrete graphical models, compute and implement Wasserstein natural gradient methods for learning parameters in Boltzmann machines.
3. In two-layer neural network models, compute and implement the Wasserstein information matrix and its induced natural gradient dynamics.

Research projects in dynamical system learning using time-series data
Physical laws and mechanisms in most real-world systems are formulated as time-evolutionary equations known as dynamical systems, given either as discrete or continuous systems. Measurements or outputs of the systems are customarily given as time series. Both solving a dynamical system for given initial data and learning a dynamical system from measured data (solutions) are important data science and machine learning problems. Here are some simplified projects related to machine learning of dynamical systems.
1. Survey machine learning methods for solving dynamical systems, and then develop more efficient machine learning algorithms for solving simple dynamical systems by exploiting the fundamental structure and properties of the underlying dynamical systems.
2. Survey model learning using deep neural networks and develop dynamical system models for given time-series data. Examples: (1) learning patient-specific metabolic panel dynamics for lung cancer patients from data on 10 patients; (2) designing diagnostic models for septic patients based on patients' time-series data.
3. Explore the power of dimension reduction in deep learning of dynamical systems. Use order reduction methods, such as an encoder/decoder, to transform time-series data into a low-dimensional latent space, and then develop approximate models in the latent space.

[Figure: an LSTM unit for discrete dynamical systems.]
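As a companion to the LSTM figure referenced above, here is a minimal NumPy sketch of a single LSTM unit's forward pass rolled along a scalar time series; the random weights, the dimensions, and the sine-wave input are illustrative assumptions (a trained model would learn the weights and map the hidden state back to a prediction of the next state).

```python
import numpy as np

# Minimal sketch of a single LSTM unit (forward pass only), the building block
# shown in the figure above for learning discrete dynamical systems.
def lstm_step(x, h, c, W, b):
    """One LSTM update: gates from the concatenated [h, x], then state update."""
    z = W @ np.concatenate([h, x]) + b        # all four gates at once
    n = h.size
    sig = lambda u: 1.0 / (1.0 + np.exp(-u))
    i, f, o = sig(z[:n]), sig(z[n:2*n]), sig(z[2*n:3*n])   # input/forget/output
    g = np.tanh(z[3*n:])                      # candidate cell state
    c_new = f * c + i * g
    h_new = o * np.tanh(c_new)
    return h_new, c_new

rng = np.random.default_rng(0)
d_in, d_hid = 1, 8
W = 0.5 * rng.standard_normal((4 * d_hid, d_hid + d_in))  # placeholder weights
b = np.zeros(4 * d_hid)

# Roll the unit along a scalar time series x_t = sin(t); the hidden state h_t
# is what a trained model would decode into a prediction of x_{t+1}.
h, c = np.zeros(d_hid), np.zeros(d_hid)
for t in np.linspace(0, 2 * np.pi, 50):
    h, c = lstm_step(np.array([np.sin(t)]), h, c, W, b)
print(h[:3])
```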
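For project 1 of the transport information learning list above, here is a minimal sketch of Fisher natural gradient ascent for fitting a one-dimensional Gaussian; the synthetic data, the step size, and the iteration count are illustrative assumptions. For parameters (mu, sigma), the Fisher information matrix of N(mu, sigma^2) is diag(1/sigma^2, 2/sigma^2), so the natural gradient only rescales the ordinary gradient.

```python
import numpy as np

# Minimal sketch: Fisher natural gradient ascent on the average log-likelihood
# of a 1-D Gaussian N(mu, sigma^2). Preconditioning the gradient by the inverse
# Fisher matrix diag(1/sigma^2, 2/sigma^2) gives the natural gradient.
rng = np.random.default_rng(0)
data = rng.normal(2.0, 0.5, size=2000)        # synthetic samples

mu, sigma, eta = 0.0, 1.0, 0.1
for _ in range(200):
    g_mu = np.mean(data - mu) / sigma**2                     # d/d mu
    g_sigma = np.mean((data - mu)**2) / sigma**3 - 1.0 / sigma  # d/d sigma
    mu += eta * sigma**2 * g_mu                # F^{-1} g: Fisher preconditioning
    sigma += eta * (sigma**2 / 2.0) * g_sigma
print(mu, sigma)   # approaches the sample mean and standard deviation
```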

Section 4: Program Calendar

Week 1 (Week of June 6-10): Short Courses

Monday, June 6: Welcome and orientation (Lu); Linear Algebra (Z. Wang); Lunch Break; Probability Theory and Optimization (Li); Recitation (Tom?)
Tuesday, June 7: Linear Algebra (Z. Wang); Math Programming Lab; Lunch Break; Introduction to Complex Networks (Lu); Recitation
Wednesday, June 8: Linear Algebra and Deep Learning (Z. Wang); Math Programming Lab; Lunch Break; Probability Theory and Optimization (Li); Recitation (Tom?)
Thursday, June 9: Linear Algebra (Z. Wang); Math Programming Lab; Lunch Break; Introduction to Complex Networks (Lu); Recitation (Brooks)
Friday, June 10: Linear Algebra (Z. Wang); Math Programming Lab (Brooks); Lunch Break; Probability Theory and Optimization (Li); Social Activity Hour (Megan)

Week 2 (Week of June 13-17): Short Courses

Monday, June 13: Deep Learning (Q. Wang); Math Programming Lab; Lunch Break; Introduction to Complex Networks (Lu); Recitation (Thompson)
Tuesday, June 14: Deep Learning (Q. Wang); Math Programming Lab; Lunch Break; Probability Theory and Optimization (Li); Recitation (Tom?)
Wednesday, June 15: Deep Learning (Q. Wang); Math Programming Lab; Lunch Break; Introduction to Complex Networks (Lu); Recitation (Brooks)
Thursday, June 16: Deep Learning (Q. Wang); Math Programming Lab; Lunch Break; Probability Theory and Optimization (Li); Recitation (McKenzie)
Friday, June 17: Deep Learning (Q. Wang); Math Programming Lab (Brooks); Lunch Break; Introduction to Complex Networks (Lu); Social Activity Hour (Megan)

Week 3 (Week of June 20-24): Introduction of projects, group discussions on research projects, and guest lectures in data sciences

Monday, June 20: 9:00-12:00 Project introduction (Professors); 3:00-5:00 Grouping students into parallel discussions
Tuesday, June 21: 9:00-10:00 Parallel research sessions (GAs); 10:00-12:00 Self-research time; 2:00-3:00 Mentors' office hours (Professors); 3:00-5:00 Self-research time
Wednesday, June 22: 9:00-10:00 Parallel research sessions; 10:00-12:00 Self-research time; 2:00-3:00 Guest lecture; 3:00-5:00 Parallel research sessions
Thursday, June 23: 9:00-10:00 Parallel research sessions; 10:00-12:00 Self-research time; 2:00-3:00 Mentors' office hours (Professors); 3:00-5:00 Self-research time
Friday, June 24: 9:00-12:00 Parallel and joint research sessions (All); 2:00-4:00 Social Activity Hour (McKay)

Week 4 (Week of June 27 - July 1): Introduction of projects, group discussions on research projects, and guest lectures in data sciences

Monday, June 27: 9:00-10:00 Parallel research sessions; 10:00-12:00 Self-research time; 2:00-3:00 Guest lecture; 3:00-5:00 Parallel research sessions
Tuesday, June 28: 9:00-10:00 Parallel research sessions; 10:00-12:00 Self-research time; 2:00-3:00 Mentors' office hours; 3:00-5:00 Self-research time
Wednesday, June 29: 9:00-10:00 Parallel research sessions; 10:00-12:00 Self-research time; 2:00-3:00 Guest lecture; 3:00-5:00 Parallel research sessions
Thursday, June 30: 9:00-10:00 Parallel research sessions; 10:00-12:00 Self-research time; 2:00-3:00 Mentors' office hours; 3:00-5:00 Self-research time
Friday, July 1: 9:00-12:00 Parallel and joint research sessions; 2:00-4:00 Social Activity Hour

Week 5 (Week of July 4-8): Introduction of projects, group discussions on research projects, and guest lectures in data sciences

Monday, July 4: No activity
Tuesday, July 5: 9:00-10:00 Parallel research sessions; 10:00-12:00 Self-research time; 2:00-3:00 Mentors' office hours; 3:00-5:00 Parallel research sessions
Wednesday, July 6: 9:00-10:00 Parallel research sessions; 10:00-12:00 Self-research time; 2:00-3:00 Guest lecture; 3:00-5:00 Parallel research sessions
Thursday, July 7: 9:00-10:00 Parallel research sessions; 10:00-12:00 Self-research time; 2:00-3:00 Mentors' office hours; 3:00-5:00 Parallel research sessions
Friday, July 8: 9:00-12:00 Parallel and joint research sessions; 2:00-4:00 Social Activity Hour (McKay)

Week 6 (Week of July 11-15): Introduction of projects, group discussions on research projects, and guest lectures in data sciences

Monday, July 11: 9:00-10:00 Parallel research sessions; 10:00-12:00 Self-research time; 2:00-3:00 Guest lecture; 3:00-5:00 Parallel research sessions
Tuesday, July 12: 9:00-10:00 Parallel research sessions; 10:00-12:00 Self-research time; 2:00-3:00 Mentors' office hours; 3:00-5:00 Self-research time
Wednesday, July 13: 9:00-10:00 Parallel research sessions; 10:00-12:00 Self-research time; 2:00-3:00 Guest lecture; 3:00-5:00 Parallel research sessions
Thursday, July 14: 9:00-10:00 Parallel research sessions; 10:00-12:00 Self-research time; 2:00-3:00 Mentors' office hours; 3:00-5:00 Self-research time
Friday, July 15: 10:00-11:00 Plenary lecture; 11:00-12:00 Plenary lecture; 1:00-3:00 Group reporting and presentations; 3:00-4:00 Assessment

Confirmed speakers:
Stanley Osher, UCLA, sjo@math.ucla.edu, https://www.math.ucla.edu/~sjo/
Yunan Yang, ETH/Cornell, yyn0410@gmail.com
Peng Chen, UT Austin, https://users.oden.utexas.edu/~peng/

Alex
Siting Liu, UCLA, siting6@math.ucla.edu, https://sites.google.com/view/siting6ucla/home
Levon Nurbekyan, UCLA, levonnurbekian@gmail.com, https://www.math.ucla.edu/~lnurbek
Samy Wu Fung, Colorado School of Mines, swufung@mines.edu, https://swufung.github.io/

Section 5: Participants Contact Information

Faculty
Linyuan Lu
Wuchen Li
Qi Wang
Zhu Wang

Postdoc
William Linz, wlinz2@illinois.edu

Graduate Assistants
George Brooks
Alec Helm
Thomas Hamori
Megan McKay
Black McKenzie
Joshua (joshuact@email.sc.edu)

Undergraduate Students
Bryson Boast, Berry College
Cade Stanley, University of South Carolina
David Liu, dliu@umich.edu, University of Michigan
Dezmon, dpatten@email.sc.edu, University of South Carolina
Jackson Ginn, University of South Carolina
Jacob Rottenberg, University of Massachusetts
Jasdeep Singh, University of South Carolina
Jillian Garzarella, University of South Carolina
John Ryan, University of South Carolina
Leah Mangano, lmangano@email.sc.edu, University of South Carolina
Luke Hammer, Brown University
Malcolm Gaynor, Kenyon College
Peter Luo, Harvard University
Rohit, roswain2002@gmail.com, University of South Carolina

Sabrina Barrat, Saginaw Valley State University
Tobin Otto, Oberlin College
Xingcheng Ren, University of South Carolina
Zelong Li, lizelong831@ucla.edu, University of California, Los Angeles
Zhiyuan, zfl5150@psu.edu, Pennsylvania State University
