
MACHINE LEARNING METHODS AND BIG DATA - 2022/3

Module code: ECOM075

Module Overview

The availability of high-dimensional data sets has raised new challenges. Often, for a cross section of n individuals, we observe p individual characteristics (covariates) with p > n; i.e., the number of covariates exceeds the sample size. In this situation, standard econometric techniques such as ordinary least squares break down, because the coefficients are no longer uniquely determined by the data. The key point is that, typically, most of the observed covariates have little or no predictive power, so we want to eliminate them. Data reduction is performed via regularised methods. Machine learning provides tools for data reduction and for out-of-sample prediction with high-dimensional data, while imposing very little structure on the data. Throughout the course, we survey the most popular machine learning methods, such as ridge regression, LASSO (Least Absolute Shrinkage and Selection Operator), regression trees, random forests, boosting and bagging.
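As a rough illustration of the p > n setting described above, the following sketch simulates 30 observations on 100 covariates, of which only three matter, and fits a LASSO by coordinate descent. It is written in plain Python for transparency and is not taken from the module materials: the function names (soft_threshold, lasso_cd, simulate), the simulated design, and the penalty value lam=0.3 are all assumptions chosen for the example.

```python
import random


def soft_threshold(z, lam):
    """Shrink z toward zero by lam; return 0.0 when |z| <= lam."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate descent for (1/(2n)) * ||y - X b||^2 + lam * ||b||_1.

    X is a list of n rows, each a list of p floats; y is a list of n floats.
    Illustrative only: no intercept, no standardisation, fixed iteration count.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X beta, with beta starting at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # skip an all-zero column
            # Covariance of column j with the residual that excludes j's own fit
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta


def simulate(n=30, p=100, seed=0):
    """Hypothetical sparse design: only the first three covariates matter."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    beta_true = [2.0, -1.5, 1.0] + [0.0] * (p - 3)
    y = [sum(x * b for x, b in zip(row, beta_true)) + rng.gauss(0, 0.5) for row in X]
    return X, y, beta_true


X, y, beta_true = simulate()
beta_hat = lasso_cd(X, y, lam=0.3)  # lam is an arbitrary choice for the sketch
selected = [j for j, b in enumerate(beta_hat) if b != 0.0]
print(f"covariates kept by LASSO: {len(selected)} of {len(beta_hat)}")
print("indices kept:", selected)
```

The L1 penalty sets many coefficients exactly to zero, which is what makes LASSO a data reduction method as well as an estimator. In applied work the penalty would normally be chosen by cross-validation rather than fixed in advance.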

Module provider

Economics

Module Leader

MANDILARAS Alexandros (Economics)

Number of Credits: 15

ECTS Credits: 7.5

Framework: FHEQ Level 7

JACs code:

Module cap (Maximum number of students): N/A

Overall student workload

Independent Learning Hours: 95

Lecture Hours: 22

Seminar Hours: 11

Captured Content: 22

Module Availability

Semester 2

Prerequisites / Co-requisites

NA

Module content

An Overview of Model Selection and Data Reduction Techniques
Ridge Regression
LASSO, Adaptive LASSO and Post-LASSO
Regression Tree
Random Forest
Boosting and Bagging
Out-of-Sample Prediction

Assessment pattern

Assessment type    Unit of assessment          Weighting (%)
Coursework         Coursework                  30
Examination        Online Final Examination    70

Alternative Assessment

NA

Assessment Strategy

The assessment strategy is designed to provide students with the opportunity to demonstrate that they have achieved the module's learning outcomes.

Thus, the summative assessment for this module consists of:

A coursework assignment that allows students to address an empirical question that involves the use of big data by formalising a hypothesis of interest, selecting an appropriate data reduction and econometric estimation method, and writing their own code using a suitable coding program

An examination that allows students to demonstrate a comprehensive understanding of and ability to evaluate critically machine learning methods

Feedback

Individual feedback will be provided on students' work during the weekly seminars and when coursework marks are released.

Module aims

  • Provide students with practical knowledge to conduct empirical research that involves big data
  • Facilitate a comprehensive understanding of the various machine learning methods
  • Enable students to evaluate machine learning methods
  • Equip students with the skills to apply each method with real data

Learning outcomes

Attributes Developed
001 Demonstrate a comprehensive understanding of and ability to evaluate critically machine learning methods CK
002 Address empirical questions involving the use of big data CKPT
003 Formalise a hypothesis of interest and select appropriate data reduction and econometric estimation methods CK
004 Write own computer code using a suitable software program CKPT
005 Use newly acquired knowledge and skills to write an MSc Dissertation on a topic related to the econometrics of big data CKPT

Attributes Developed

C - Cognitive/analytical

K - Subject knowledge

T - Transferable skills

P - Professional/Practical skills

Methods of Teaching / Learning

The learning and teaching strategy is designed to enable students to achieve the module's learning outcomes. There will be two hours of lectures and one hour of seminar every week. Problem sets based on the methodological topics taught during lectures and computer-based exercises will be reviewed during seminars/tutorials. Students are expected to work on an assignment and actively participate in the seminar hour.

Indicated Lecture Hours (which may also include seminars, tutorials, workshops and other contact time) are approximate and may include in-class tests where one or more of these are an assessment on the module. In-class tests are scheduled and organised separately from taught content and will be published on students' personal timetables, where they apply to the modules taken, as soon as they are finalised by central administration. This will usually be after the initial publication of the teaching timetable for the relevant semester.

Reading list

https://readinglists.surrey.ac.uk
Upon accessing the reading list, please search for the module using the module code: ECOM075

Other information

Surrey's Curriculum Framework is committed to developing graduates with strengths in Employability, Digital Capabilities, Global and Cultural Capabilities, Sustainability, and Resourcefulness and Resilience. This module is designed to allow students to develop knowledge, skills, and capabilities in the following areas:

Resourcefulness and Resilience: through the ability to solve problems under time constraints and write an applied assignment.

Digital Capabilities: using suitable software to code.

Please note that the information detailed within this record is accurate at the time of publishing and may be subject to change. This record contains information for the most up-to-date version of the programme / module for the 2022/3 academic year.