MACHINE LEARNING METHODS AND BIG DATA - 2022/3
Module code: ECOM075
Module Overview
The availability of high-dimensional data sets has raised new challenges. Often, for a cross section of n individuals we observe p individual characteristics (covariates) with p > n; i.e., the number of covariates exceeds the sample size. In this situation, standard econometric techniques such as ordinary least squares break down, because the regression coefficients are no longer uniquely determined. The key premise is that most of the observed covariates have no predictive power, so we want to eliminate them. This data reduction is performed via regularised methods. Machine learning provides tools for data reduction and for out-of-sample prediction with high-dimensional data, while imposing very little structure on the data. Throughout the course, we survey the most popular machine learning methods, such as ridge regression, LASSO (Least Absolute Shrinkage and Selection Operator), regression trees, random forests, boosting and bagging.
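As an illustration of the p > n setting described above, here is a minimal Python sketch of LASSO fitted by cyclic coordinate descent on simulated data. It assumes the objective (1/(2n))·||y − Xb||² + lam·||b||₁. The sample sizes, the penalty value, the simulated coefficients and all function names (`soft_threshold`, `lasso_coordinate_descent`, `simulate`) are hypothetical choices made for this example; the module itself does not prescribe a language or an implementation.

```python
import random


def soft_threshold(rho, lam):
    """Soft-thresholding operator: shrink rho towards zero by lam."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, sweeps=100):
    """Minimise (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent.

    X is a list of n rows (each a list of p floats); y is a list of n floats.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X beta; equals y because beta starts at zero
    # Mean square of each column, the scaling factor in the coordinate update.
    col_ms = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(sweeps):
        for j in range(p):
            if col_ms[j] == 0.0:
                continue
            # Covariance of column j with the partial residual (column j added back).
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new_bj = soft_threshold(rho, lam) / col_ms[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta


def simulate(n=30, p=60, seed=0):
    """Simulated data with p > n in which only the first three covariates matter."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    true_beta = [3.0, -3.0, 3.0] + [0.0] * (p - 3)
    y = [sum(x * b for x, b in zip(row, true_beta)) + rng.gauss(0.0, 0.5) for row in X]
    return X, y


X, y = simulate()
beta_hat = lasso_coordinate_descent(X, y, lam=0.5)  # lam=0.5 is an arbitrary choice
selected = [j for j, b in enumerate(beta_hat) if b != 0.0]
print(f"{len(selected)} of {len(beta_hat)} covariates kept:", selected)
```

The L1 penalty sets some coefficients exactly to zero, which is what makes LASSO a data reduction method; a sufficiently large `lam` removes every covariate. In applied work the penalty is usually chosen by cross-validation rather than fixed in advance.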
Module provider
Economics
Module Leader
MANDILARAS Alexandros (Economics)
Number of Credits: 15
ECTS Credits: 7.5
Framework: FHEQ Level 7
Module cap (Maximum number of students): N/A
Overall student workload
Independent Learning Hours: 95
Lecture Hours: 22
Seminar Hours: 11
Captured Content: 22
Module Availability
Semester 2
Prerequisites / Co-requisites
NA
Module content
- An Overview of Model Selection and Data Reduction Techniques
- Ridge Regression
- LASSO, Adaptive LASSO and Post-LASSO
- Regression Tree
- Random Forest
- Boosting and Bagging
- Out-of-Sample Prediction
Assessment pattern
| Assessment type | Unit of assessment | Weighting (%) |
|---|---|---|
| Coursework | Coursework | 30 |
| Examination Online | Final Examination | 70 |
Alternative Assessment
NA
Assessment Strategy
The assessment strategy is designed to provide students with the opportunity to demonstrate that they have achieved the module's learning outcomes.
Thus, the summative assessment for this module consists of:
- A coursework assignment that allows students to address an empirical question involving big data by formalising a hypothesis of interest, selecting an appropriate data reduction and econometric estimation method, and writing their own code in a suitable software program.
- An examination that allows students to demonstrate a comprehensive understanding of machine learning methods and the ability to evaluate them critically.
Feedback: Individual feedback on students' work will be provided during the weekly seminars and when coursework marks are released.
Module aims
- Provide students with practical knowledge to conduct empirical research that involves big data
- Facilitate a comprehensive understanding of the various machine learning methods
- Enable students to evaluate machine learning methods
- Equip students with the skills to apply each method with real data
Learning outcomes
| Ref | Learning outcome | Attributes Developed |
|---|---|---|
| 001 | Demonstrate a comprehensive understanding of and ability to evaluate critically machine learning methods | CK |
| 002 | Address empirical questions involving the use of big data | CKPT |
| 003 | Formalize a hypothesis of interest and select appropriate data reduction and econometric estimation methods | CK |
| 004 | Write own computer code using a suitable software program | CKPT |
| 005 | Use newly acquired knowledge and skills to write an MSc Dissertation on a topic related to the econometrics of big data | CKPT |
Attributes Developed
C - Cognitive/analytical
K - Subject knowledge
T - Transferable skills
P - Professional/Practical skills
Methods of Teaching / Learning
The learning and teaching strategy is designed to enable students to achieve the module's learning outcomes. There will be two hours of lectures and one hour of seminar every week. Problem sets based on the methodological topics taught during lectures and computer-based exercises will be reviewed during seminars/tutorials. Students are expected to work on an assignment and actively participate in the seminar hour.
Indicated Lecture Hours (which may also include seminars, tutorials, workshops and other contact time) are approximate and may include in-class tests where one or more of these are an assessment on the module. In-class tests are scheduled and organised separately from taught content and will be published on students' personal timetables, where they apply to taken modules, as soon as they are finalised by central administration. This will usually be after the initial publication of the teaching timetable for the relevant semester.
Reading list
https://readinglists.surrey.ac.uk
Upon accessing the reading list, please search for the module using the module code: ECOM075
Other information
Surrey's Curriculum Framework is committed to developing graduates with strengths in Employability, Digital Capabilities, Global and Cultural Capabilities, Sustainability and Resourcefulness and Resilience. This module is designed to allow students to develop knowledge, skills, and capabilities in the following areas:
- Resourcefulness and Resilience: through the ability to solve problems under time constraints and write an applied assignment.
- Digital Capabilities: using suitable software to write code.
Please note that the information detailed within this record is accurate at the time of publishing and may be subject to change. This record contains information for the most up-to-date version of the programme / module for the 2022/3 academic year.