STRATIFIED MEDICINE AND BIOMEDICAL DATA ANALYSIS - 2023/4

Module code: BMSM037

Module Overview

The data produced in biomedical research can originate from many different approaches and analyses, for example DNA sequencing, measurement of messenger RNA levels, or measurement of protein levels. The modern techniques available for biomedical research mean that the data acquired is huge and can be challenging to manipulate and analyse. Within this module we consider the sources of this data, means of manipulating the data, and its relevance to understanding health and disease, thus enhancing the students’ data science toolbox and digital capabilities. Precision medicine has been defined as the tailoring of medical treatment to the individual characteristics of each patient, based on the ability to classify individuals into subpopulations that differ in their susceptibility to a particular disease or their response to a specific treatment. The data derived using the methods described above enables patient stratification and precision medicine to be applied. The module will contextualise concepts in Big Data (previously covered by the Introduction to Big Data in Biomedicine and Health module) within the precision medicine landscape. The course will include lectures and practical computational sessions covering the following topics, which address the learning outcomes described below. These include (but are not limited to):

Whole genome SNP (Single Nucleotide Polymorphism) arrays and whole genome sequencing

Techniques in transcriptomics, such as RNA-seq, and the use of such data

Mass Spectrometry (MS), to measure proteins and small molecules

Bioinformatics tools

Public repositories to browse, search, submit and retrieve biomedical data

Standards in bioinformatics

Biomedical data available in UK Biobank

Students will be working with huge datasets of the kind also found in banks, supermarkets and manufacturers. In attending this module, they will learn about data manipulation and management. This module’s use of UK Biobank and the biomedical data available within it dovetails with and complements other modules taught across the MSc, which also incorporate analysis of UK Biobank data. Thus, there is a designed coherence across the course, and students will benefit from carrying detail from one module to the next, deepening their understanding of Big Data analysis and leveraging the many modalities of data in health and biomedicine.

Module provider

School of Health Sciences

Module Leader

WHETTON Tony (Biosciences)

Number of Credits: 15

ECTS Credits: 7.5

Framework: FHEQ Level 7

Module cap (Maximum number of students): 35

Overall student workload

Workshop Hours: 18

Independent Learning Hours: 80

Lecture Hours: 18

Seminar Hours: 8

Tutorial Hours: 6

Guided Learning: 15

Captured Content: 5

Module Availability

Semester 2

Prerequisites / Co-requisites

N/A

Module content

The module first introduces biological facts around the machinery of life (DNA makes RNA makes protein) and various sources of Big Data in this subject area. Students do not necessarily have a working knowledge of the biology and basic processes found in a living cell, including concepts such as the DNA triplet codon, exons and introns, messenger RNA, transcription, and translation into protein. They will therefore be provided with reference material covering basic biology, including articles and online videos, to digest as needed. As elsewhere in this course, we will continue to use UK Biobank as an exemplar. The module then discusses the ways to generate large datasets on biomolecules for biomedical research, followed by data extraction and information generation from these sources. The relevance of these measures to patient stratification and precision medicine is then discussed. These concepts will then form the basis for pre-processing, analysing and evaluating real-world omics data using various techniques. The learned concepts will be reinforced through lab sessions in omics data handling.

This module will cover the following topics and subjects:

1. Introduction to new approaches in precision medicine and biomarkers

a. Moving away from one-size-fits-all medicine, with big data to help us

b. What kind of data is useful for patient stratification (with examples)

c. Disease examples and clinical questions in respect of patient stratification

2. The central paradigm in living systems

a. DNA: its role, polymorphism, mutation and disease, treatments targeting DNA

b. DNA makes RNA: how and why

c. RNA encodes protein amino acid sequence

d. Protein amino acid sequence, form and function

e. Protein post-translational modification

f. Drug targets

3. Methods to measure key bio-macromolecules

a. DNA sequencing

b. RNA sequencing

c. Mass spectrometry (e.g. proteomics and metabolomics)

d. Protein sequencing

e. Clinical laboratory protein measurement

f. Issues in large cohorts

4. Data manipulation tools for genomics, transcriptomics, proteomics, and small molecules

a. Data sources and aggregation

b. DNA sequencing, genome-wide association studies

c. Single nucleotide polymorphisms

d. RNA sequencing and RNA splicing data assessment and manipulation

e. Protein sequencing data set analyses

f. Protein relative quantification

5. Clinical relevance and application in precision medicine

a. Application of omics in healthcare

b. Precision medicine and cancer

c. Precision medicine and inflammatory disease

6. Open resources, value and usage

Workshops and practicals (with guided learning content prefacing each session):

1. Loading various types of omics data and understanding their differences

2. DNA sequence data manipulation

3. RNA data manipulation

4. Protein data manipulation

5. Multi-omics

Assessment pattern

Assessment type	Unit of assessment	Weighting
Coursework	Online quiz 1	10
Coursework	Online quiz 2	10
Coursework	Online quiz 3	10
Coursework	PowerPoint presentation	15
Project (Group/Individual/Dissertation)	Hands-on analysis assignment	15
Project (Group/Individual/Dissertation)	Essay (2000 words) from a list of essay titles covering course content	40

Alternative Assessment

To address the need for inclusive approaches that take into account students’ familiarity with English, and any fear of public speaking, we will use PowerPoint slide shows submitted with a pre-recorded voiceover (15%). Unlike live presentations, this format gives students the opportunity to correct their language. Students will also be able to give their presentation to classmates as a formative assessment.

Assessment Strategy

The assessment strategy is designed to allow students to take the knowledge acquired in lectures and seminars, plus all taught materials, to address specific questions in a multiple-choice question (MCQ) format and in an essay. The practicals focus on the use, implementation and evaluation of a set of tools for real-world omics and health/disease data, with a focus on the selection of appropriate online tools for data analysis.

Thus, the summative assessment for this module consists of online quizzes, a PowerPoint presentation, a hands-on analysis assignment and an essay.

For the module, students will receive formative assessment/feedback in the following ways:

During lectures by question-and-answer sessions

Reflection on research seminars guided by lecturers

By means of workshop problem sheets

During supervised practical sessions

Via feedback comments on assessed coursework and on abstracts of assessed coursework or draft PowerPoint presentations

Summative assessments are given below; these are linked through the common strand of the kind of data and the specific data used for the student's work.

Three online quizzes (each accounting for 10% of the final grade): Students will complete three short online quizzes over the course duration. These will consist of multiple-choice, true/false, and matching questions. They will give students the opportunity to demonstrate their understanding of the topics and retention of knowledge and, through the feedback provided, to gauge how they are progressing. They will be preceded by in-class MCQs as formative assessment (addressing learning outcomes 1, 2, 3, 4, 5, and 6).

Students will submit a PowerPoint presentation (with voiceover, 15%, addressing learning outcomes 3, 4, 5, 6, 7).

Students will work together in small groups to analyse an online dataset (15%; addressing learning outcomes 1, 2, 3, 5).

Literature review: Students will choose from a range of innovations within precision medicine and write a 2000-word literature review using original peer-reviewed journal articles, which will be submitted and assessed online (40% of final grade, addressing learning outcomes 3, 4, 5, 6, and 7). A formative assessment will be the submission of the journal articles to be used and 200 words of text describing the main focus of the review.

Module aims

The aim of this module is for students to understand ‘state-of-the-art’ omics approaches that generate data for bioinformatics and health informatics usage. A further aim is to demonstrate how omics technologies can be applied to address questions in precision medicine.

Students will gain theoretical knowledge of various omics approaches and technologies, including genomics, transcriptomics and proteomics. Through research-led seminars from leading experts they will develop an understanding of how omics technologies are applied to specific areas of research. Students will develop the practical skills required to analyse omics data in a series of hands-on computer-based workshops. This module will provide the students with knowledge and experience of the multiple steps (e.g. experimental step, data analysis, and interpretation) involved in using omics platforms in healthcare and biomedical research.

Learning outcomes

		Attributes Developed
001	Use and understand bioinformatics tools to analyse DNA, RNA, and proteomics data, involving identification and quantification approaches, as well as downstream analyses such as functional annotation	KCP
002	Browse, search, submit and retrieve DNA, RNA and proteomics data from widely used public repositories	KCPT
003	Discuss standards in genomic, transcriptomic and proteomic bioinformatics and recognise their importance	KCPT
004	Evaluate the strengths and weaknesses of several experimental and omics bioinformatics analysis approaches	KCPT
005	Relate specific diseases to DNA-, RNA- and protein-based molecular pathologies inclusive of evaluation of mutations and splice variants in DNA, RNA and protein	CP
006	Discuss precision medicine approaches and omics support for developments in this area.	KCPT
007	Communicate effectively within the discipline, demonstrated through an ability to produce written scientific summaries, critically appraise relevant literature, create presentation materials, and give oral presentations	CT

Attributes Developed

C - Cognitive/analytical

K - Subject knowledge

T - Transferable skills

P - Professional/Practical skills

Methods of Teaching / Learning

The learning and teaching strategy is designed to inform students of the major opportunities that lie in the analysis of huge epidemiological studies such as All of Us and UK Biobank, while also providing hands-on experience in using such data. These huge studies are complemented by many case-control studies and smaller epidemiological studies, which contain rich sets of omics data that offer major opportunities to understand human health and disease. Thus, this module aims to make students aware of the opportunities available through analysing acquired and available datasets, and then to give them hands-on training in the tools available to access and use those datasets (be they genomic, proteomic, transcriptomic or other data). These skills are highly sought after both within academia and in industry, thus opening a range of employment opportunities for our students. Further, the link this module makes between biomedical data and health-related research and applications is one that will equip students with a unique perspective. Students will then demonstrate their understanding through written reports on analysis and findings. To help them understand these complex datasets, students will be provided with case studies and examples. RNA data is different to DNA data, so different teachers will be employed for each type of dataset and omic type. Students will then bring the use of these datasets together via common statistical and data manipulation requirements, practised in the practical sessions. Multiple-choice questions will be used to enable students to gauge their comprehension of teaching delivered by a number of academic staff in a number of subdiscipline areas.

Formative assessment goes hand in hand with summative assessment throughout this process.

Indicated Lecture Hours (which may also include seminars, tutorials, workshops and other contact time) are approximate and may include in-class tests where one or more of these are an assessment on the module. In-class tests are scheduled/organised separately to taught content and will be published on to student personal timetables, where they apply to taken modules, as soon as they are finalised by central administration. This will usually be after the initial publication of the teaching timetable for the relevant semester.

Reading list

https://readinglists.surrey.ac.uk
Upon accessing the reading list, please search for the module using the module code: BMSM037

Other information

The MSc Health and Biomedical Informatics programme is committed to developing graduates with strengths in Employability, Digital Capabilities, Global and Cultural Capabilities, Sustainability, and Resourcefulness and Resilience. This module is designed to allow students to develop knowledge, skills, and capabilities in the following areas:

Sustainability: The module described here introduces students to modern-day biomedical resources that help deliver new approaches to health and wellbeing. These resources are helping to make health and biomedical research more sustainable by allowing for effective data gathering and usage without re-performing data gathering exercises unnecessarily. Re-use and appropriate use of datasets decreases the costs and carbon footprint associated with new experiments using DNA sequencers, mass spectrometers and other equipment.
Digital Capabilities: This module will provide students with new digital capabilities through the development of data manipulation capabilities and data presentation skills in biomedicine. These digital capabilities build on, and are complementary to, those obtained through other modules in this course, as they specifically address the handling, manipulation and analysis of omic data.
Resourcefulness and Resilience: The module will also provide students with new experiences and abilities in critical thinking and appraisal, applied to another dimension of health and biomedical informatics, that will enhance their confidence. By developing data analysis skills in hands-on computer-based workshops, students will also enhance their resourcefulness through a gained ability to tackle a range of data modalities. Teamworking is essential in this field, and transferable skills are developed as students progress through the work described.
Employability: The knowledge gained through this module, as well as the skillsets gained through workshops and assessments, will be of value when students progress to the next stage of their careers in data sciences and informatics. These are skills that are highly sought after by a range of employers, from academic institutions to pharmaceutical companies and other biomedical and high-tech industries.

Please note that the information detailed within this record is accurate at the time of publishing and may be subject to change. This record contains information for the most up to date version of the programme / module for the 2023/4 academic year.