PRINCIPLES OF DATA SCIENCE - 2024/5
Module code: MATM063
Module Overview
This module introduces programming in Python for data science, with a focus on data pre-processing, data mining and analysis, machine learning and deep learning. Besides practical hands-on experience with writing code, the module also covers the theoretical background of different data analysis techniques and machine learning approaches. The goal is to develop an understanding of how information can be extracted from data and used to make predictions and, importantly, how this is done in practice by writing clear and transparent source code. Using real-world data sets and illustrative examples, the module helps students develop a theoretical understanding of data science as well as practical experience in developing useful software tools. Many of the techniques acquired through this module are likely to be useful in the dissertation project.
Module provider
Mathematics & Physics
Module Leader
BAUER Werner (Maths & Phys)
Number of Credits: 15
ECTS Credits: 7.5
Framework: FHEQ Level 7
Module cap (Maximum number of students): N/A
Overall student workload
Independent Learning Hours: 77
Laboratory Hours: 22
Guided Learning: 45
Captured Content: 6
Module Availability
Semester 2
Prerequisites / Co-requisites
None.
Module content
Introduction to Python
Indicative contents include: insights into the structure of Python, such as objects, instances, attributes, functions, classes and definitions; the use of packages useful for data science and machine learning; scientific code structuring; loading of Python packages; loading and storing of data files; plotting and debugging code; and working on the Titanic data set to conduct data pre-processing and data analysis.
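As an illustration of the kind of pre-processing step mentioned above, the following is a minimal sketch and not taken from the module materials. It uses only the Python standard library so that it stays self-contained. The rows are made-up toy values in the style of the Titanic data set, and the names `RAW_CSV`, `load_rows` and `preprocess` are hypothetical. It assumes that missing ages are filled with the median age, which is one common choice among several.

```python
import csv
import io
from statistics import median

# Hypothetical toy rows in the style of the Titanic data set (not real records).
RAW_CSV = """name,sex,age,survived
A,female,29,1
B,male,,0
C,male,40,0
D,female,35,1
"""


def load_rows(text):
    """Parse CSV text into a list of dictionaries, one per passenger."""
    return list(csv.DictReader(io.StringIO(text)))


def preprocess(rows):
    """Fill missing ages with the median age and encode sex as 0/1."""
    known_ages = [float(r["age"]) for r in rows if r["age"]]
    fill_age = median(known_ages)  # assumption: median imputation
    cleaned = []
    for r in rows:
        cleaned.append({
            "name": r["name"],
            "is_female": 1 if r["sex"] == "female" else 0,
            "age": float(r["age"]) if r["age"] else fill_age,
            "survived": int(r["survived"]),
        })
    return cleaned


cleaned_rows = preprocess(load_rows(RAW_CSV))
```

In practice the same steps would more likely be written with a data analysis package such as pandas. The plain-Python version is used here only to keep each step visible.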
Machine learning, neural networks, and decision trees
Indicative contents include: machine learning methods, neural networks, decision trees, regression modelling, self-organizing maps and deep learning; practical applications of machine learning, such as using pandas for data analysis; applying PyTorch and TensorFlow to train neural networks; and studying how to predict outcomes for the Titanic data set and other data sets using machine learning methods.
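To make the idea of a decision tree concrete, the sketch below fits a one-level tree (a "decision stump") in plain Python. It is an illustrative assumption rather than the module's own method: the data are made-up toy values, not real Titanic records, and the names `FEATURES`, `LABELS`, `majority`, `fit_stump` and `predict` are hypothetical. The stump tries every feature and threshold and keeps the split with the fewest training errors.

```python
# Hypothetical toy data: (is_female, age) -> survived. Not real Titanic records.
FEATURES = [(1, 29.0), (0, 35.0), (0, 40.0), (1, 35.0), (0, 8.0), (1, 50.0)]
LABELS = [1, 0, 0, 1, 1, 1]


def majority(labels):
    """Return the most common label in a 0/1 list; ties resolve to 1."""
    return 1 if 2 * sum(labels) >= len(labels) else 0


def fit_stump(features, labels):
    """Find the single feature/threshold split with the fewest training errors."""
    best = None
    for j in range(len(features[0])):
        for t in sorted({row[j] for row in features}):
            left = [y for row, y in zip(features, labels) if row[j] <= t]
            right = [y for row, y in zip(features, labels) if row[j] > t]
            if not left or not right:
                continue  # a split must put samples on both sides
            left_pred, right_pred = majority(left), majority(right)
            errors = (sum(y != left_pred for y in left)
                      + sum(y != right_pred for y in right))
            if best is None or errors < best["errors"]:
                best = {"feature": j, "threshold": t, "errors": errors,
                        "left_pred": left_pred, "right_pred": right_pred}
    return best


def predict(stump, row):
    """Predict a label by checking which side of the split the row falls on."""
    if row[stump["feature"]] <= stump["threshold"]:
        return stump["left_pred"]
    return stump["right_pred"]


stump = fit_stump(FEATURES, LABELS)
```

A full decision tree applies the same split search recursively to each side. Established libraries provide tested implementations of this, which is why the sketch stops at a single level.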
Assessment pattern
Assessment type | Unit of assessment | Weighting (%) |
---|---|---|
Coursework | Mid-term coursework | 40 |
Coursework | End-of-term coursework | 60 |
Alternative Assessment
N/A
Assessment Strategy
The assessment strategy is designed to provide students with the opportunity to demonstrate:
- Their understanding of the use of Python as a scientific language to solve problems in data science.
- Their ability to extract valuable information from the results of their data analysis.
- Their skills in using machine learning methods to predict outcomes.
- Their ability to write well-structured, reusable functional code.
Thus, the summative assessment for this module consists of:
- One shorter coursework in the middle of the semester, weighted at 40% of the module mark and covering learning outcomes 1-4.
- One substantial coursework to be submitted towards the end of the semester, weighted at 60% of the module mark and covering learning outcomes 1-4.
Formative assessment and feedback
Students receive written feedback on a number of marked but unassessed coursework assignments over an 11-week period. Formative guidance is also given on the assessed coursework.
Module aims
- To equip students with the skills to program in Python for data science, data analysis, machine learning, and other data-related applications.
- To equip students with the skills to extract information from large data sets.
- To equip students with the skills to make predictions of certain events by using machine learning and artificial intelligence.
Learning outcomes
Ref | Learning outcome | Attributes Developed |
---|---|---|
001 | Students will be able to demonstrate the ability to use Python for scientific computing, data analysis, machine learning, and data assimilation. | KPT |
002 | Students will be able to demonstrate the capability to apply data analysis tools, to interpret the results, and to show a systematic understanding of key aspects of selected topics within data science and statistical learning theory. | KCPT |
003 | Students will be able to demonstrate the ability to understand and assess related data science and machine learning methods, their applications, and their limitations. | KCPT |
004 | Students will be able to demonstrate the capability to implement machine learning algorithms and to use established libraries. | PT |
Attributes Developed
C - Cognitive/analytical
K - Subject knowledge
T - Transferable skills
P - Professional/Practical skills
Methods of Teaching / Learning
The learning and teaching strategy is designed to provide:
- A comprehensive introduction to Python, with a focus on giving students experience in implementing Python code for problem solving in mathematics.
- Experience in programming in Python to tackle problems in data science, to enhance students’ digital capabilities and employability.
- Practical experience in analysing data to extract valuable information, to enhance student resourcefulness and resilience.
The learning and teaching methods include:
- A flipped classroom approach, with videos prepared in advance that cover the theoretical background of the topics. This allows opportunities for lively discussion of how theoretical ideas can be implemented in practical situations.
- Two 1-hour computer laboratories per week for 11 weeks. These include discussion of the video lecture content and its application through practical programming exercises.
- Assessed coursework to give students practical experience of implementing techniques covered in lectures and lab sessions in an extended piece of work.
- Several pieces of unassessed coursework to give students experience of using techniques introduced in the module and to receive formative feedback.
- Laboratories may be recorded. Laboratory recordings are intended to give students the opportunity to review parts of the session that they might not have understood fully and should not be seen as an alternative to attendance at lab sessions.
Indicated Lecture Hours (which may also include seminars, tutorials, workshops and other contact time) are approximate and may include in-class tests where one or more of these are an assessment on the module. In-class tests are scheduled and organised separately from taught content and will be published on students' personal timetables, where they apply to taken modules, as soon as they are finalised by central administration. This will usually be after the initial publication of the teaching timetable for the relevant semester.
Reading list
https://readinglists.surrey.ac.uk
Upon accessing the reading list, please search for the module using the module code: MATM063
Other information
The School of Mathematics and Physics is committed to developing graduates with strengths in Digital Capabilities, Employability, Global and Cultural Capabilities, Resourcefulness and Resilience, and Sustainability. This module is designed to allow students to develop knowledge, skills, and capabilities in the following areas:
Digital Capabilities: This foundational maths for data science module teaches students to use computers to solve mathematical problems. This involves learning to program and learning to apply these skills to solve technical problems. Students also gain experience in programming with Python.
Employability: The ability to draw meaning from large data sets is currently an area that is in high demand in industry. This module teaches the mathematical foundations to allow students to work with large and complex real-world datasets. These skills are highly valuable to employers.
Global and Cultural Capabilities: Mathematics is a global language and the tools and languages used on this module can be used internationally. This module allows students to develop skills that will allow them to reason about and develop applications with global reach and collaborate with their peers around the world.
Resourcefulness and Resilience: This module involves practical problem-solving skills that teach students how to work with complex and unstructured data sets. The foundational maths taught in this module can be applied to a wide range of scenarios, giving students new techniques for solving problems.
Sustainability: The data analysis skills learned in MATM063 equip students with the skills to analyse data on resource consumption, emissions, and environmental impact, facilitating the development of sustainable practices. Thus, this module plays a role in creating a more sustainable future.
Please note that the information detailed within this record is accurate at the time of publishing and may be subject to change. This record contains information for the most up to date version of the programme / module for the 2024/5 academic year.