DATA SCIENCE PRINCIPLES AND PRACTICES - 2020/1
Module code: COMM054
In light of the Covid-19 pandemic, and in a departure from previous academic years and previously published information, the University has had to change the delivery (and in some cases the content) of its programmes, together with certain University services and facilities for the academic year 2020/21.
These changes include the implementation of a hybrid teaching approach during 2020/21. Detailed information on all changes is available at: https://www.surrey.ac.uk/coronavirus/course-changes. This webpage sets out information relating to general University changes, and will also direct you to consider additional specific information relating to your chosen programme.
Prior to registering online, you must read this general information and all relevant additional programme specific information. By completing online registration, you acknowledge that you have read such content, and accept all such changes.
The module covers a variety of statistical methods, including descriptive statistics, the validation of formulated hypotheses, and predictive analytics. It also covers the computational foundations and methods important to data science, the relevant supporting software and tools, and the need for data science governance.
THORNE Thomas (Computer Sci)
Number of Credits: 15
ECTS Credits: 7.5
Framework: FHEQ Level 7
JACS code: I100
Module cap (Maximum number of students): N/A
Prerequisites / Co-requisites
Co-requisite: COMM055, Machine Learning and Data Mining
Indicative content includes:
Mathematical & Statistical foundations and methods
EDISON descriptors: KU1.01.01-KU1.01.05, KU1.01.08, KU1.01.09, KU1.01.12, KU1.01.13
• Logic, Probability & Statistics
• Logical and Probabilistic representations and reasoning (causal networks, Bayesian analysis, Markov nets)
• Statistical methods (regression, time series, dimensionality, clusters, frequentist and Bayesian statistics)
• Stochastic methods (Markov models, Markov networks, Gaussian models)
• Hybrid methods (Markov logic networks, Stochastic logic, Probabilistic logic)
• Statistical tests & performance analysis
Computational foundations and methods
EDISON descriptors: KU1.01.11, KU1.01.12, KU1.03.07, KU1.04.01, KU1.04.07, KU1.05.03
• Information theory (entropy, compression, etc.)
• Data curation, data modeling and data management
• Machine learning (qualitative, descriptive and predictive analytics)
• Scaling-up issues (Big data and streams)
• Graph analytics
• Unstructured data, text mining and sentiment analysis
Software and tools
EDISON descriptors: KU1.01.14
• A Data Science Ecosystem
• Programming Languages & Libraries (e.g. R, Python)
• Visual & interactive front-ends (e.g. SPSS Modeler)
• Mathematical software
Governance of Data Science
• Human factors and human-centred Data Science
• Safety and security
• Privacy and Ethics
Domain knowledge and applications, for example Business Data Science, Medical & Healthcare Data Science, Scientific Data Science & Knowledge Discovery
|Assessment type||Unit of assessment||Weighting (%)|
|School-timetabled exam/test||CLASS TEST (1.5 HOURS)||20|
|Examination||EXAMINATION (2 HOURS)||80|
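As an illustration of how the two weightings above combine, the following Python sketch computes a weighted aggregate mark. It assumes that each unit is marked as a percentage and that the aggregate is a plain weighted mean; the helper name and the sample marks are hypothetical, and any rounding or capping rules the University applies are not covered here. The 50% threshold is the one stated for the Data Science MSc later in this record.

```python
# Weightings taken from the assessment table (percent of the module mark).
WEIGHTS = {"class_test": 20, "examination": 80}
PASS_MARK = 50  # weighted aggregate required to pass, as stated in this record


def weighted_aggregate(marks):
    """Return the weighted mean of percentage marks (hypothetical helper)."""
    total_weight = sum(WEIGHTS.values())
    return sum(WEIGHTS[unit] * marks[unit] for unit in WEIGHTS) / total_weight


# Made-up marks, for illustration only.
example_marks = {"class_test": 62, "examination": 48}
aggregate = weighted_aggregate(example_marks)
print(f"aggregate = {aggregate:.1f}, pass = {aggregate >= PASS_MARK}")
# prints: aggregate = 50.8, pass = True
```

In this made-up case a strong class test lifts an examination mark below 50 to a passing aggregate, which shows how much more the 80% examination weighting counts than the 20% class test.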
The assessment strategy is designed to provide students with the opportunity to demonstrate the ability to critically appreciate and apply statistical methods and (predictive) analytics. Thus, the summative assessment for this module consists of:
• A mid-term class test, addressing LOs 1-4 with respect to module content covered up to a week before the test.
• An examination, evaluating all LOs with respect to both the principles and the practices of data science.
Formative assessment
Students will be guided to work on weekly tasks through lab exercises. The solutions to these exercises provide feedback on understanding and practice, which feeds forward into the class test and the exam.
This module aims to elaborate, demonstrate, and apply (statistical) principles and approaches to data, and to establish methods and tools that provide for fundamental, and appropriately governed, treatment of such data.
|001||Apply designated quantitative techniques, such as statistics, time series analysis, optimization, and simulation to deploy appropriate models for analysis and prediction (DSDA02 [refined])||CKPT|
|002||Understand and use different performance and accuracy metrics for model validation in analytics projects and hypothesis testing (DSDA04 [refined])||CKPT|
|003||Choose and execute standard methods from existing statistical libraries to provide overview (LODA.02 L1)||CPT|
|004||Select the most appropriate statistical techniques and model available data to deliver insights (LODA.02 L2)||CPT|
|005||Compare and choose performance and accuracy metrics (LODA.04 L2)||CPT|
C - Cognitive/analytical
K - Subject knowledge
T - Transferable skills
P - Professional/Practical skills
Overall student workload
Independent Study Hours: 106
Lecture Hours: 33
Laboratory Hours: 11
Methods of Teaching / Learning
The learning and teaching strategy is designed to provide students with the knowledge, skills, and practical experience covering the module aims and learning outcomes.
The learning and teaching methods include 11 teaching weeks, each comprising:
• 3 hours of lectures, to convey and discuss the key concepts and principles
• A 1-hour lab session, to put key concepts and principles into practice
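The weekly pattern above can be checked against the workload figures listed under "Overall student workload". The short Python sketch below is only an arithmetic check using numbers from this record; the hours-per-credit figure it derives is a calculated value, not a rule stated in the source.

```python
# Figures taken from this module record.
TEACHING_WEEKS = 11
LECTURE_HOURS_PER_WEEK = 3
LAB_HOURS_PER_WEEK = 1
INDEPENDENT_STUDY_HOURS = 106
CREDITS = 15

lecture_hours = TEACHING_WEEKS * LECTURE_HOURS_PER_WEEK  # 33, matches "Lecture Hours"
lab_hours = TEACHING_WEEKS * LAB_HOURS_PER_WEEK          # 11, matches "Laboratory Hours"
total_hours = lecture_hours + lab_hours + INDEPENDENT_STUDY_HOURS

# Derived ratio only; the record does not state an hours-per-credit rule.
hours_per_credit = total_hours / CREDITS

print(lecture_hours, lab_hours, total_hours, hours_per_credit)
# prints: 33 11 150 10.0
```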
Indicated Lecture Hours (which may also include seminars, tutorials, workshops and other contact time) are approximate and may include in-class tests where one or more of these are an assessment on the module. In-class tests are scheduled and organised separately from taught content and will be published to students' personal timetables, where they apply to the modules taken, as soon as they are finalised by central administration. This will usually be after the initial publication of the teaching timetable for the relevant semester.
Upon accessing the reading list, please search for the module using the module code: COMM054
Programmes this module appears in
|Data Science MSc||1||Compulsory||A weighted aggregate mark of 50% is required to pass the module|
Please note that the information detailed within this record is accurate at the time of publishing and may be subject to change. This record contains information for the most up-to-date version of the programme / module for the 2020/1 academic year.