MACHINE LEARNING METHODS AND BIG DATA - 2026/7

Module code: ECOM075

Module Overview

The module equips students with the current machine learning and statistical learning toolkit for empirical economics and quantitative finance. It is practical in orientation and rigorous in execution, developing the professional competence to implement state-of-the-art methods, interpret their output critically, and communicate findings to both technical and non-technical audiences - skills of direct relevance to careers in asset management, central banking, economic consultancy, and policy institutions.

The core techniques are introduced in sequence, each grounded in statistical theory and connected to professional practice. For instance, students will study dimension reduction and regularisation, encompassing ridge regression, the LASSO, the elastic net, and factor models. They will learn how these tools are used in asset management to select pricing factors and in central banks and forecasting institutions to handle large macroeconomic datasets. Students will study tree-based ensemble methods, including random forests and gradient boosting, and learn how they are applied in credit risk departments, quantitative hedge funds, and policy research to capture non-linear relationships that standard econometric models miss. Students will also study kernel and non-parametric methods, including support vector machines and generalised additive models, and learn how these are used wherever an economic relationship requires a flexible, interpretable specification that does not impose an arbitrary functional form. Finally, students will study rigorous model evaluation - cross-validation, multiple-testing corrections, and model confidence sets - and understand why these procedures are a professional requirement in any setting where analytical findings inform decisions.

Throughout, statistical foundations are developed carefully alongside applications, so that students understand not only how to use each method but when it will fail and at what cost.

Module provider

Economics

Module Leader

SESSINOU Rosnel (Economics)

Number of Credits: 15

ECTS Credits: 7.5

Framework: FHEQ Level 7

Module cap (Maximum number of students): N/A

Overall student workload

Workshop Hours: 11

Independent Learning Hours: 65

Lecture Hours: 22

Guided Learning: 30

Captured Content: 22

Module Availability

Semester 2

Prerequisites / Co-requisites

NA

Module content

The module is organised around interconnected methodological pillars, each developed from statistical first principles and motivated by substantive economic and financial applications. Coverage and sequencing may be adapted to reflect recent developments and the interests of the cohort:

  1. Dimension Reduction and Regularisation
  2. Tree-Based Methods and Ensemble Learning
  3. Kernel Methods and Non-Parametric Estimation
  4. Model Assessment and Statistical Inference

Assessment pattern

Assessment type                           Unit of assessment   Weighting (%)
Project (Group/Individual/Dissertation)   Coursework           50
Project (Group/Individual/Dissertation)   Coursework           50

Alternative Assessment

NA

Assessment Strategy

The assessment strategy is designed to provide students with the opportunity to demonstrate that they have achieved the module's learning outcomes.

Thus, the summative assessment for this module consists of two projects, each addressing an empirical question that involves the use of big data. For each project, students formalise a hypothesis of interest, select an appropriate data reduction and machine learning method, and write their own code in a suitable programming environment with GenAI assistance. Either project may include some theoretical work, allowing students to demonstrate a comprehensive understanding of machine learning methods and the ability to evaluate them critically.

Feedback

Individual feedback will be provided on students' work during the weekly seminars and when coursework marks are released.

Module aims

  • Provide a rigorous grounding in the statistical theory underpinning leading data-science and machine-learning methods, enabling students to use these tools critically and with informed awareness of their assumptions and limitations.
  • Develop practical competence in applying regularisation, dimension-reduction, ensemble, and non-parametric methods to high-dimensional economic and financial datasets.
  • Cultivate principled habits of empirical practice - encompassing reproducibility, rigorous out-of-sample evaluation, transparent reporting, and honest quantification of uncertainty.
  • Guide students through the complete cycle of an applied research project: from question formulation and data curation through modelling, evaluation, and the written presentation of findings on an approved topic.
  • Encourage critical reflection on the limitations and ethical dimensions of data-driven approaches - including overfitting, distributional instability, interpretability constraints, and algorithmic discrimination - equipping students to deploy these tools responsibly.

Learning outcomes

Attributes Developed
001 Demonstrate a systematic understanding of the statistical foundations of leading data-science and machine-learning methods, including their assumptions, properties, and known failure modes, and articulate how these bear on their suitability for economic and financial applications. KC
002 Select and competently apply appropriate methods - drawn from regularisation, dimension reduction, ensemble learning, kernel frameworks, and artificial intelligence techniques such as neural networks - to high-dimensional economic and financial datasets, demonstrating sound judgement in method choice and parameterisation. KCPT
003 Implement analytical workflows proficiently using suitable computational tools, producing well-documented, reproducible analyses that meet professional standards of clarity and rigour. PT
004 Evaluate empirical models rigorously through principled out-of-sample validation, identify and address multiple-testing concerns, and quantify uncertainty in a manner that supports sound econometric inference. KCP
005 Formulate a research question, select appropriate data and methods, and produce a written research report on an approved topic to a standard consistent with professional and academic practice. KCPT
006 Critically reflect on the limitations and ethical dimensions of data-driven modelling - including overfitting, interpretability, distributional instability, and algorithmic bias - and demonstrate awareness of the professional responsibilities these entail. CT

Attributes Developed

C - Cognitive/analytical

K - Subject knowledge

T - Transferable skills

P - Professional/Practical skills

Methods of Teaching / Learning

The learning and teaching strategy is designed to enable students to achieve the module's learning outcomes.

Lectures and workshops.

Problem sets based on the methodological topics taught during lectures and computer-based exercises will be reviewed during workshops.

Students are expected to work on an assignment and actively participate in the workshops.

Indicated Lecture Hours (which may also include seminars, tutorials, workshops and other contact time) are approximate and may include in-class tests where one or more of these are an assessment on the module. In-class tests are scheduled and organised separately from taught content and will be published on students' personal timetables, where they apply to the modules taken, as soon as they are finalised by central administration. This will usually be after the initial publication of the teaching timetable for the relevant semester.

Reading list

https://readinglists.surrey.ac.uk
Upon accessing the reading list, please search for the module using the module code: ECOM075

Other information

Surrey's Curriculum Framework is committed to developing graduates with strengths in Employability, Digital Capabilities, Global and Cultural Capabilities, Sustainability, and Resourcefulness and Resilience. This module is designed to allow students to develop knowledge, skills, and capabilities in the following areas:

Resourcefulness and Resilience: Through the ability to solve problems under time constraints and write an applied assignment.

Digital Capabilities: Using suitable software to code.

Programmes this module appears in

Programme                                   Semester   Classification   Qualifying conditions
Financial Data Science MSc                  2          Optional         A weighted aggregate mark of 50% is required to pass the module
Economics (Econometrics and Big Data) MSc   2          Compulsory       A weighted aggregate mark of 50% is required to pass the module
Economics MSc                               2          Optional         A weighted aggregate mark of 50% is required to pass the module
Economics and Finance MSc                   2          Optional         A weighted aggregate mark of 50% is required to pass the module

Please note that the information detailed within this record is accurate at the time of publishing and may be subject to change. This record contains information for the most up-to-date version of the programme / module for the 2026/7 academic year.