# DATA SCIENCE FOR DYNAMICAL SYSTEMS - 2023/4

Module code: MATM062

## Module Overview

The module focuses on data-driven methods for the analysis of dynamical systems and time-series data, and on related machine learning problems such as dimensionality reduction, manifold learning, regression, and classification.

Example topics include:
Learning governing equations from data; Transfer operators; Dynamic mode decomposition; Reproducing kernel Hilbert spaces; Support vector machines; Kernel ridge regression.

Python or Matlab will be used to implement data-driven methods. The methods will then be applied to typical benchmark problems such as chaotic dynamical systems, metastable stochastic systems, and fluid dynamics problems, but also, for instance, to image classification problems to highlight similarities with classical supervised learning applications.

### Module provider

Mathematics & Physics

VYTNOVA Polina (Maths & Phys)

### Module cap (Maximum number of students): 20

Independent Learning Hours: 106

Lecture Hours: 22

Laboratory Hours: 11

Captured Content: 11

Semester 2

N/A

## Module content

Indicative content includes:

General dimensionality reduction, classification, and clustering:
We will first introduce methods such as Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), Canonical Correlation Analysis (CCA), k-means, and Spectral Clustering.
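As an illustration of the kind of method listed above (not taken from the module materials), the following is a minimal sketch of PCA computed from the singular value decomposition of the centred data matrix. The function name `pca` and the synthetic data set are assumptions made for this example only.

```python
import numpy as np

def pca(X, n_components):
    """Project the rows of X onto the leading principal components."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing singular value.
    U, s, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    # Variance captured by each retained direction.
    explained_variance = s[:n_components] ** 2 / (X.shape[0] - 1)
    return X_centered @ components.T, components, explained_variance

rng = np.random.default_rng(0)
# Hypothetical data: points close to the line spanned by (1, 2), plus small noise.
t = rng.normal(size=(200, 1))
X = t @ np.array([[1.0, 2.0]]) + 0.01 * rng.normal(size=(200, 2))

Z, components, variance = pca(X, n_components=1)
```

Here the first principal direction should be close to the normalised vector (1, 2), up to sign, since almost all of the variance lies along that line.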

Kernel-based methods and statistical learning theory:
Reproducing kernel Hilbert space theory, Support Vector Machines, kernel-based variants of methods such as PCA and CCA, Kernel Ridge Regression, manifold learning.
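To give a flavour of the kernel-based methods named above, here is a minimal sketch of kernel ridge regression with a Gaussian kernel. The helper names (`gaussian_kernel`, `krr_fit`, `krr_predict`), the bandwidth `sigma`, the regularisation parameter `reg`, and the sine-curve data are all assumptions chosen for illustration, not part of the module specification.

```python
import numpy as np

def gaussian_kernel(A, B, sigma=0.5):
    """Gram matrix k(a, b) = exp(-||a - b||^2 / (2 sigma^2)) for rows of A and B."""
    sq_dists = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    # Clip tiny negative values caused by round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * sigma**2))

def krr_fit(X, y, reg=1e-3, sigma=0.5):
    """Solve (K + reg * I) alpha = y for the expansion coefficients alpha."""
    K = gaussian_kernel(X, X, sigma)
    return np.linalg.solve(K + reg * np.eye(len(X)), y)

def krr_predict(X_train, alpha, X_new, sigma=0.5):
    """Evaluate f(x) = sum_i alpha_i k(x, x_i) at the rows of X_new."""
    return gaussian_kernel(X_new, X_train, sigma) @ alpha

# Hypothetical training data: noise-free samples of sin(x) on [0, 2*pi].
X_train = np.linspace(0.0, 2.0 * np.pi, 40)[:, None]
y_train = np.sin(X_train[:, 0])
alpha = krr_fit(X_train, y_train)

X_test = np.linspace(0.5, 5.5, 7)[:, None]
y_pred = krr_predict(X_train, alpha, X_test)
```

The fitted function is a linear combination of kernel functions centred at the training points, which is the form guaranteed by the representer theorem in a reproducing kernel Hilbert space.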

Transfer operators and applications:
Koopman and Perron-Frobenius operators and their generators (for the detection of metastable sets and the global analysis of dynamical systems), Markov State Models, Ulam's method, Dynamic Mode Decomposition (and extensions), Sparse Identification of Nonlinear Dynamics (learning governing equations from data), kernel-based methods for transfer operators.
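As one example of the methods in this block, the following is a minimal sketch of (exact) Dynamic Mode Decomposition applied to snapshot pairs of a linear system. The function name `dmd`, the test matrix `A`, and the trajectory length are assumptions made for this illustration; the variants and extensions mentioned above are not shown.

```python
import numpy as np

def dmd(X, Y, rank):
    """Exact DMD: approximate eigenpairs of the linear map A with Y ≈ A X."""
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    U, s, Vh = U[:, :rank], s[:rank], Vh[:rank]
    # Y V S^{-1}: dividing column j by s[j] is right-multiplication by diag(1/s).
    Y_V_Sinv = (Y @ Vh.conj().T) / s
    # Projection of A onto the leading left singular vectors of X.
    A_tilde = U.conj().T @ Y_V_Sinv
    eigvals, W = np.linalg.eig(A_tilde)
    modes = Y_V_Sinv @ W
    return eigvals, modes

# Hypothetical test system: a damped rotation with eigenvalues 0.9 ± 0.2i.
A = np.array([[0.9, -0.2],
              [0.2, 0.9]])
snapshots = np.empty((2, 21))
snapshots[:, 0] = [1.0, 0.0]
for k in range(20):
    snapshots[:, k + 1] = A @ snapshots[:, k]

X, Y = snapshots[:, :-1], snapshots[:, 1:]
eigvals, modes = dmd(X, Y, rank=2)
```

For data generated by a linear system with full-rank snapshots, the DMD eigenvalues coincide with the eigenvalues of the underlying matrix up to round-off; for nonlinear systems they approximate eigenvalues of the Koopman operator restricted to linear observables.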

## Assessment pattern

| Assessment type | Unit of assessment | Weighting (%) |
|---|---|---|
| Coursework | Coursework 1 | 20 |
| Coursework | Coursework 2 | 20 |
| Coursework | Coursework 3 | 20 |
| Coursework | Coursework 4 | 40 |

N/A

## Assessment Strategy

The assessment strategy is designed to provide students with the opportunity to demonstrate:

• Analytical ability by solution of unseen problems in the exam.
• Subject knowledge through the recall of key definitions, theorems, and their proofs.
• An understanding of practical considerations when completing the coursework.
• The ability to analyse data, to select suitable methods, and to interpret the numerical results.

Thus, the summative assessment for this module consists of:

• Coursework 1 (20%)

• Coursework 2 (20%)

• Coursework 3 (20%)

• Coursework 4 (40%)

Formative assessment and feedback:

Students receive written feedback on a number of marked, unassessed coursework assignments over an 11-week period. Formative guidance is also given on the assessed coursework.

## Module aims

• Give an overview of machine learning problems related to dynamical systems and equip students with a solid understanding of the mathematical tools required to analyse high-dimensional data sets.
• Provide students with hands-on experience in data science and time series analysis.
• Give students the opportunity to apply state-of-the-art machine learning tools to real-world problems.

## Learning outcomes

| Ref | Learning outcome | Attributes Developed |
|---|---|---|
| 001 | Demonstrate systematic understanding of key aspects of selected topics within dynamical systems theory, data science, and statistical learning theory. | KC |
| 002 | Demonstrate the capability to apply data-driven methods to simulation and measurement data and to interpret the results. | KCPT |
| 003 | Ability to understand and assess related machine learning methods and their applications and limitations. | KCPT |
| 004 | Capability to implement machine learning algorithms in Matlab/Python and to use established machine learning libraries. | KPT |

Attributes Developed

C - Cognitive/analytical

K - Subject knowledge

T - Transferable skills

P - Professional/Practical skills

## Methods of Teaching / Learning

The learning and teaching strategy is designed to provide:

i) A comprehensive treatment of principles and methods required for model reduction, system identification, and regression or classification problems.

ii) Experience in problem solving and machine learning in general.

iii) Practical experience in data science and time-series analysis.

The learning and teaching methods include:

i) Lectures and computer lab sessions. Students will gain experience in using Matlab/Python to analyse benchmark problems from different application areas such as molecular dynamics, fluid dynamics, or biology.

ii) Assessed coursework to give students practical experience in implementing techniques covered in lectures and lab sessions.

iii) Several unassessed exercise sheets to give students experience in using techniques introduced in the module and to receive feedback.

Indicated Lecture Hours (which may also include seminars, tutorials, workshops and other contact time) are approximate and may include in-class tests where one or more of these are an assessment on the module. In-class tests are scheduled and organised separately from taught content and, where they apply to the modules a student is taking, will be published on the student's personal timetable as soon as they are finalised by central administration. This will usually be after the initial publication of the teaching timetable for the relevant semester.