DATA DRIVEN METHODS - 2024/5

Module code: MATM071

Module Overview

The module focuses on data-driven methods for the analysis of dynamical systems and time-series data, and on related machine learning problems such as dimensionality reduction, manifold learning, regression, and classification.

Python will be used to implement data-driven methods. The methods will then be applied to typical benchmark problems such as chaotic dynamical systems, metastable stochastic systems, and fluid dynamics problems, but also, for instance, to image classification problems to highlight similarities with classical supervised learning applications.

Module provider

Mathematics & Physics

Module Leader

GODOLPHIN Janet (Maths & Phys)

Number of Credits: 15

ECTS Credits: 7.5

Framework: FHEQ Level 7

Module cap (Maximum number of students): N/A

Overall student workload

Independent Learning Hours: 76

Lecture Hours: 22

Laboratory Hours: 11

Guided Learning: 30

Captured Content: 11

Module Availability

Semester 2

Prerequisites / Co-requisites

None

Module content

Indicative content includes: 


  • General dimensionality reduction, classification, and clustering:
    We will first introduce methods such as Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), Canonical Correlation Analysis (CCA), k-means, and Spectral Clustering.

  • Kernel-based methods and statistical learning theory:
    Reproducing kernel Hilbert space theory, Support Vector Machines, kernel-based variants of methods such as PCA and CCA, Kernel Ridge Regression, manifold learning.

  • Transfer operators and applications:
    Koopman and Perron-Frobenius operators and their generators (for the detection of metastable sets and the global analysis of dynamical systems), Markov State Models, Ulam's method, Dynamic Mode Decomposition (and extensions), Sparse Identification of Nonlinear Dynamics (learning governing equations from data), kernel-based methods for transfer operators.
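
As an illustration of the kind of method listed above, the following is a minimal sketch of Dynamic Mode Decomposition in Python with NumPy. It is not taken from the module materials: the function name `dmd`, the toy linear system `A`, and the snapshot layout (one state per column) are assumptions chosen for the example. The sketch estimates the eigenvalues and modes of a linear map from consecutive snapshot pairs using a rank-truncated singular value decomposition.

```python
import numpy as np

def dmd(X, Y, rank):
    """Exact DMD sketch: find eigenpairs of the best-fit linear map with Y ≈ A X.

    X, Y : arrays of shape (n_features, n_snapshots); column k of Y is the
           state one time step after column k of X (assumed layout).
    rank : number of singular values kept in the truncated SVD.
    """
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    U, s, Vh = U[:, :rank], s[:rank], Vh[:rank]
    V = Vh.conj().T
    # Projection of the linear map onto the leading left singular vectors:
    # A_tilde = U^* Y V S^{-1} (dividing by s scales the columns).
    A_tilde = U.conj().T @ Y @ V / s
    eigvals, W = np.linalg.eig(A_tilde)
    # Exact DMD modes: Phi = Y V S^{-1} W.
    modes = (Y @ V / s) @ W
    return eigvals, modes

# Toy benchmark (hypothetical): a damped rotation in the plane.
theta = 0.3
A = 0.9 * np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]])

# Generate a trajectory x_{k+1} = A x_k and stack the states as columns.
snapshots = np.empty((2, 20))
snapshots[:, 0] = [1.0, 0.0]
for k in range(19):
    snapshots[:, k + 1] = A @ snapshots[:, k]

X, Y = snapshots[:, :-1], snapshots[:, 1:]
eigvals, modes = dmd(X, Y, rank=2)
print(np.abs(eigvals))  # moduli of the estimated eigenvalues
```

For this noise-free linear system the estimated eigenvalues agree with those of `A` up to rounding; for nonlinear or noisy data the result is only a finite-dimensional approximation, and the choice of `rank` matters.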


Assessment pattern

Assessment type | Unit of assessment    | Weighting (%)
Coursework      | Assessed Coursework 1 | 20
Coursework      | Assessed Coursework 2 | 20
Coursework      | Assessed Coursework 3 | 20
Coursework      | Assessed Coursework 4 | 40

Alternative Assessment

N/A

Assessment Strategy

The assessment strategy is designed to provide students with the opportunity to demonstrate: 


  • Analytical ability through the detailed expositions of the methods used in the coursework assignments.

  • Subject knowledge through the use of key definitions, theorems, and their proofs as appropriate.

  • An understanding of practical considerations when completing the coursework.

  • The ability to analyse data, to select suitable methods, and to interpret the numerical results.



Thus, the summative assessment for this module consists of:


  • 20% Coursework 1 – corresponding to Learning Outcomes 1 & 2.

  • 20% Coursework 2 – corresponding to Learning Outcomes 1 & 3.

  • 20% Coursework 3 – corresponding to Learning Outcomes 1 & 4.

  • 40% Coursework 4 – corresponding to Learning Outcomes 1, 2, 3, & 4.



Formative assessment and Feedback

Students receive written feedback via the marked coursework assignments over an 11-week period. Formative guidance is given on the coursework.

Module aims

  • Give students an overview of machine learning problems related to dynamical systems and equip students with a solid understanding of the mathematical tools required to analyse high-dimensional data sets.
  • Provide students with hands-on experience in data science and time series analysis.
  • Give students the opportunity to apply state-of-the-art machine learning tools to real-world problems.

Learning outcomes

Ref | Learning outcome | Attributes Developed
001 | Students will demonstrate systematic understanding of key aspects of selected topics within dynamical systems theory, data science, and statistical learning theory. | KC
002 | Students will demonstrate the capability to choose appropriate methods to analyse given data and to interpret the results. | KCPT
003 | Students will be able to demonstrate understanding of selected machine learning algorithms and capability to implement them from scratch, adapting to specific settings. | KCPT
004 | Students will have the capability to implement machine learning algorithms in Python and to use established machine learning libraries. | KPT

Attributes Developed

C - Cognitive/analytical

K - Subject knowledge

T - Transferable skills

P - Professional/Practical skills

Methods of Teaching / Learning

The learning and teaching strategy is designed to provide:


  • A comprehensive treatment of principles and methods required for model reduction, system identification, and regression or classification problems.

  • Experience in problem solving and machine learning in general.

  • Practical experience in data science and time-series analysis.



The learning and teaching methods include:


  • 1 x 1 hour laboratory per week x 11 weeks. The lab sessions are used to reinforce material covered in lectures and to give students the opportunity to apply methods through practical programming exercises using Python. Students will gain experience in using Python to analyse benchmark problems from different application areas such as molecular dynamics, fluid dynamics, or biology.

  • 2 x 1 hour lectures per week x 11 weeks. The lectures provide a structured learning environment with opportunities for students to ask questions and to practise the methods taught.

  • Assessed coursework to give students practical experience of implementing techniques covered in lectures and lab sessions in an extended piece of work.

  • Several pieces of unassessed coursework to give students experience of using techniques introduced in the module and to receive formative feedback.



Lectures may be recorded. Lecture recordings are intended to give students the opportunity to review parts of the session that they might not have understood fully and should not be seen as an alternative to attendance at lectures.

Indicated Lecture Hours (which may also include seminars, tutorials, workshops and other contact time) are approximate and may include in-class tests where one or more of these are an assessment on the module. In-class tests are scheduled/organised separately to taught content and will be published on to student personal timetables, where they apply to taken modules, as soon as they are finalised by central administration. This will usually be after the initial publication of the teaching timetable for the relevant semester.

Reading list

https://readinglists.surrey.ac.uk
Upon accessing the reading list, please search for the module using the module code: MATM071

Other information

The School of Mathematics and Physics is committed to developing graduates with strengths in Digital Capabilities, Employability, Global and Cultural Capabilities, Resourcefulness and Resilience, and Sustainability. This module is designed to allow students to develop knowledge, skills, and capabilities in the following areas:

Digital Capabilities: This is one of the foundational modules in data analysis. It teaches students methods for dealing numerically with data of different origins using the high-level programming language Python. This involves learning to program and learning to apply these skills to solve technical problems.

Employability: The ability to draw meaning from large data sets is currently in high demand in industry. This module teaches the mathematical foundations that allow students to work with large and complex real-world data sets. These skills are highly valuable to employers.

Global and Cultural Capabilities: Mathematics is a global language and the tools and languages used on this module can be used internationally. This module allows students to develop skills that will allow them to reason about and develop applications with global reach and collaborate with their peers around the world.

Resourcefulness and Resilience: This module involves practical problem-solving skills that teach a student how to work with complex and unstructured data sets. The foundational maths taught in this module can be applied to a wide range of different scenarios, giving students new techniques for solving problems.

Sustainability: The data analysis skills learned in MATM071 equip students to analyse data on resource consumption, emissions, and environmental impact, facilitating the development of sustainable practices. Thus, this module plays a role in creating a more sustainable future.

Programmes this module appears in

Programme | Semester | Classification | Qualifying conditions
Mathematics with Statistics MMath | 2 | Optional | A weighted aggregate mark of 50% and pass 80% of stations within the clinical skills portfolio is required to pass the module. The pass mark of the stations will use regression standard setting.
Mathematics with Statistics MMath | 2 | Optional | A weighted aggregate mark of 50% is required to pass the module
Financial Data Science MSc | 2 | Compulsory | A weighted aggregate mark of 50% is required to pass the module
Mathematics MMath | 2 | Optional | A weighted aggregate mark of 50% is required to pass the module
Mathematics MSc | 2 | Optional | A weighted aggregate mark of 50% is required to pass the module

Please note that the information detailed within this record is accurate at the time of publishing and may be subject to change. This record contains information for the most up-to-date version of the programme / module for the 2024/5 academic year.