DATA SCIENCE PRINCIPLES AND PRACTICES - 2021/2
Module code: COMM054
In light of the Covid-19 pandemic the University has revised its courses to incorporate the ‘Hybrid Learning Experience’ in a departure from previous academic years and previously published information. The University has changed the delivery (and in some cases the content) of its programmes. Further information on the general principles of hybrid learning can be found at: Hybrid learning experience | University of Surrey.
We have updated key module information regarding the pattern of assessment and overall student workload to inform student module choices. We are currently working on bringing remaining published information up to date to reflect current practice during the academic year 2021/22.
This means that some information within the programme and module catalogue will be subject to change. Current students are invited to contact their Programme Leader or Academic Hive with any questions relating to the information available.
The module covers a variety of statistical methods, including descriptive statistics, the validation of formulated hypotheses, and predictive analytics. It also covers the computational foundations and methods important to data science, along with relevant supporting software and tools and the need for data science governance.
THORNE Tom (Computer Sci)
Number of Credits: 15
ECTS Credits: 7.5
Framework: FHEQ Level 7
JACs code: I100
Module cap (Maximum number of students): N/A
Overall student workload
Independent Learning Hours: 90
Tutorial Hours: 11
Laboratory Hours: 11
Guided Learning: 16
Captured Content: 22
Prerequisites / Co-requisites
Co-requisite: COMM055, Machine Learning and Data Mining
The module includes:
- An introduction to machine learning, covering the basic concepts of Machine Learning and Data Science, and supervised and unsupervised learning problems.
- Edison descriptors KU1.02.01, KU1.02.02, KU1.02.03, KU1.02.05
- The basics of probability theory necessary for data science, covering discrete and continuous random variables and common probability distributions used.
- Edison descriptors KU1.01.01, KU1.01.05
- Fundamentals of hypothesis testing, and consideration of multiple testing.
- Edison descriptors KU1.01.01, KU1.01.02, KU1.01.04
- Statistical inference for statistical models, applying maximum likelihood approaches.
- Edison descriptors KU1.01.01-05
- Bayesian statistics, Bayesian inference, and example applications.
- Edison descriptors KU1.01.01-05
- Linear regression and validation, including the bootstrap and cross-validation.
- Edison descriptors KU1.01.01-05, KU1.05.04
- Python programming and Python libraries for data science.
- Edison skills DSDALANG02, DSVIZ01
|Assessment type||Unit of assessment||Weighting|
|Examination Online||24 HOUR ONLINE EXAM||70|
The assessment strategy is designed to provide students with the opportunity to demonstrate the ability to critically appreciate and apply statistical methods and (predictive) analytics. Thus, the summative assessment for this module consists of:
• A mid-term class test, addressing LOs 1-4 with respect to module content covered up to a week prior to the test.
• An examination, evaluating all LOs with respect to both the principles and practices of data science.
Formative assessment
Students will be guided to work on weekly tasks through lab exercises. The solutions to these exercises provide feedback on understanding and practice, which feeds forward into the class test and the exam.
- This module aims to introduce students to the necessary background material in statistics and probability that underlies modern data science and machine learning, with applications to real-world problems, and to provide students with practical experience in working with these tools.
|001||Understand the different classes of data science and machine learning problems, and be able to explain which class a problem belongs to. Edison skill SDSDA01||KCPT|
|002||Understand and apply probability theory as it relates to data science, including classes of random variables, independence, conditional probability and common probability distributions. Edison skill SDSDA02||KCPT|
|003||Choose and execute standard methods from existing Python statistical libraries to analyse data and visualise results. Edison skills DSDALANG02, SDSDA10, DSVIZ01||CPT|
|004||Be able to explain the difference between Bayesian and frequentist approaches, and interpret the output of a Bayesian data analysis. Edison skill SDSDA02||CPT|
|005||Apply linear regression and classification with appropriate validation. Edison skills SDSDA04, SDSDA08, SDSDA09||CPT|
C - Cognitive/analytical
K - Subject knowledge
T - Transferable skills
P - Professional/Practical skills
Methods of Teaching / Learning
The learning and teaching strategy is designed to provide students with the knowledge, skills, and practical experience covering the module aims and learning outcomes.
The learning and teaching methods include: 11 teaching weeks with each week comprising:
2 hours of lectures, to convey and discuss the key concepts and principles
1 hour tutorial, to discuss the material covered in lectures
1 hour lab session, to put key concepts and principles into practice.
Indicated Lecture Hours (which may also include seminars, tutorials, workshops and other contact time) are approximate and may include in-class tests where one or more of these are an assessment on the module. In-class tests are scheduled and organised separately from taught content and will be published to students' personal timetables, where they apply to the modules taken, as soon as they are finalised by central administration. This will usually be after the initial publication of the teaching timetable for the relevant semester.
Upon accessing the reading list, please search for the module using the module code: COMM054
Programmes this module appears in
|Data Science MSc||1||Compulsory||A weighted aggregate mark of 50% is required to pass the module|
Please note that the information detailed within this record is accurate at the time of publishing and may be subject to change. This record contains information for the most up to date version of the programme / module for the 2021/2 academic year.