BIG DATA IN BIOMEDICINE AND HEALTH - 2025/6

Module code: BMSM034

Module Overview

The availability of large-scale public biomedical, health, and clinical data from a variety of open sources can be leveraged for advancements in research, innovation, and improvement of healthcare. For example, the power of Artificial Intelligence lies in Big Data to a great extent, and this is very much the case in the health and biomedical domains. While there has been significant growth in available large-scale patient-level and molecular data, as well as social and environmental information, these resources are rarely integrated and thoroughly mined to maximise the knowledge gained. Developing the skillsets needed to bring together data from disparate sources will help our students become leading data scientists and informaticians, by enabling integrative, large-scale analyses that can produce novel insights and high-impact outputs.

 

This introductory module will provide students with an overview of the key concepts related to Big Data: what it is, how it is generated in biomedicine, how it is stored, handled and linked, the issues that come with analysing it, and the benefits that it may bring to research, discovery and innovation in health and biomedicine. The module will further introduce students to a range of open Big Data resources in the field, including how to access them and their potential uses.

 

Primarily, the module will focus on an introduction to the UK Biobank as one such Big Data resource. This will provide the background, knowledge and understanding that will feed into all other modules in the MSc course.

 

One of the most extensive resources in health, wellness, and lifestyle is the UK Biobank. UK Biobank recruited 503,000 volunteers aged 40-69 in the years 2006-10. These volunteers completed (and continue to complete) detailed questionnaires on their health and lifestyle (e.g. food preferences). At the same time, physical measurement data were collected and a number of biological samples were obtained (e.g. plasma, urine, saliva). All of the participants have been genotyped, and whole-genome sequencing for all of them was scheduled to become available in 2023. Extensive biochemical analysis of blood and other samples will be an ongoing process over the coming years. A further development is the acquisition of multi-modal imaging involving brain, cardiac and body magnetic resonance imaging (MRI), dual-energy x-ray absorptiometry (DXA) and carotid ultrasound, in approximately 100,000 participants. These imaging data add to longitudinal follow-up data from web-based questionnaires and access to Hospital Episode Statistics (HES) data, cancer registries, and primary health care records of UK Biobank volunteers. UK Biobank is also acquiring data on the plasma metabolome and proteome. This huge volume of clinical, genomic, genotypic, proteomic, metabolomic and questionnaire data is a resource from which much can be learned about Big Data, and how it can be used for research and innovation in disease risk, environment and health, biomarkers for wellness, diet and disease, and many other areas.

Module provider

School of Health Sciences

Module Leader

GEIFMAN Nophar (Health Sci.)

Number of Credits: 15

ECTS Credits: 7.5

Framework: FHEQ Level 7

Module cap (Maximum number of students): 35

Overall student workload

Workshop Hours: 12

Independent Learning Hours: 100

Lecture Hours: 18

Tutorial Hours: 6

Guided Learning: 10

Captured Content: 4

Module Availability

Semester 1

Prerequisites / Co-requisites

None

Module content

This module (and the MSc course as a whole) is structured around existing large data resources, giving our students the unique opportunity to build a portfolio of experiences in real world data handling, analysis, and interrogation.

 

Using data effectively requires an understanding of where it came from. Thus, within this module we will teach the nature of Big Data in health and biomedicine, such as that generated by ‘omics techniques and electronic health records, followed by detail on how the data can be safely accessed and used correctly and appropriately. This will feed directly into learning in subsequent modules, including practical analysis sessions and module assessments on these data, as well as the student dissertations.

 

Through a series of lectures and workshops, the module will cover the following topics:


  • What is Big Data?

  • What are the origins of Big Data in biomedicine?

  • The benefits of Big Data in biology and medicine

  • Challenges of working with Big Data

  • Existing open resources, and how to access these

  • Introduction to UK Biobank

  • How the UK Biobank is being used – key case studies



 

The UK Biobank Research Analysis Platform (UK-RAP) enables researchers working with UK Biobank's large-scale health and biomedical database and research resource to access it in the cloud from anywhere in the world. As part of this module and through a dedicated workshop, students will gain experience in using the RAP, which will help with any future use of UK Biobank data. Other workshops will cover how to produce an impactful infographic for effective dissemination of information.

Assessment pattern

Assessment type                           Unit of assessment   Weighting (%)
Coursework                                Infographic          30
Project (Group/Individual/Dissertation)   Critical appraisal   70

Alternative Assessment

Students will be able to present their work in class, or alternatively submit a video of their presentation together with written reports.

Assessment Strategy

The assessment strategy is designed to allow students to demonstrate their gained ability to critically evaluate a Big Data resource, by analysing the strengths and limitations of the resource, as well as the potential uses and insight that such a resource can enable. Further, through the assessment, students will gain experience in generating a graphical summary of their analysis through developing their own infographic.

 

Thus, the summative assessment for this module consists of:


  • Developing an infographic depicting the core elements of a chosen Big Data resource

  • A critical appraisal of the selected data resource; this should briefly describe the resource and how it fits within the wider health and biomedical data landscape. The critical appraisal will include a SWOT analysis of the resource, addressing the key strengths and limitations, as well as suggesting potential future developments for the resource.



 

The infographic, which will be presented at a face-to-face workshop as well as submitted online, will form 30% of the final grade (addressing learning outcomes 1, 2, 5, 6 and 7). The critical appraisal will form 70% of the final grade (addressing learning outcomes 1, 2, 3, 4, 5, 6 and 7).

 

Students will be given verbal feedback on their infographic, both from the lecturer and from peers, through group discussion as part of the workshop. Prior to submission of their critical appraisal, students will submit a table-format SWOT outline as formative assessment. This will direct the writing of the critical appraisal by identifying the key themes to be covered. Feedback on this outline will help ensure students are on the right track for writing the final assessment.

 

Written feedback will be given on the final written assessment (the critical appraisal).

Module aims

  • Introduce the key concepts and issues around Big Data
  • Explore the health data landscape, and the role of Big Data in this
  • Examine the role that stakeholders, including patients, play in creation and usage of Big Data in health and biomedicine
  • Introduce existing and successful open data resources for health and biomedical research
  • Introduce the UK Biobank as a unique, cutting-edge and world-leading data resource
  • Demonstrate practical and research uses for UK Biobank and other big data resources

Learning outcomes

Attributes Developed
001 Demonstrate familiarisation with a range of health and biomedical data resources K
002 Understand the benefits and limitations of existing big data resources K
003 Discuss the health and biomedical data landscape and the pathway from data collection through interpretation, analysis and visualization to decision-making KC
004 Discuss and understand concepts in database construction and modelling KT
005 Critically review a data resource and make suggestions for improvement CP
006 Produce visual and written reports to communicate the design and use of a data resource CPT
007 Work through the problem-solving cycle CPT

Attributes Developed

C - Cognitive/analytical

K - Subject knowledge

T - Transferable skills

P - Professional/Practical skills

Methods of Teaching / Learning

The learning and teaching strategy is designed to provide students with an overview of the current Big Data landscape in the field, and an understanding of the issues concerning this. This module will be delivered in a blended format: online learning preparation material will impart basic and core knowledge whilst the face-to-face lectures and workshops will formalise the content and encourage attendees to draw upon their own reading and experience.

 

This module will introduce students to global resources that help shape how we deliver a beneficial impact to society. Through lectures and workshops, students will become acquainted with leading Big Data resources and how these are being used. These resources are helping to make health and biomedical research more sustainable, by allowing for effective data gathering and usage. However, ethical issues, such as bias in data and its implications, also need to be considered. Throughout the module students will be asked to reflect on these aspects and take them into account in their own analyses and assessments.

The module will provide students with new experiences and abilities in critical thinking and appraisal that will enhance their confidence and resourcefulness. The knowledge gained through this module, as well as the skillsets gained through workshops and assessments, will be of value when students progress to the next stage of their careers in data sciences and informatics. Further, the module will include talks by guest speakers who will also touch on potential career avenues in the areas of Big Data, biomedicine and health.

Indicated Lecture Hours (which may also include seminars, tutorials, workshops and other contact time) are approximate and may include in-class tests where one or more of these are an assessment on the module. In-class tests are scheduled/organised separately to taught content and will be published on to student personal timetables, where they apply to taken modules, as soon as they are finalised by central administration. This will usually be after the initial publication of the teaching timetable for the relevant semester.

Reading list

https://readinglists.surrey.ac.uk
Upon accessing the reading list, please search for the module using the module code: BMSM034

Other information

The MSc Health and Biomedical Informatics is committed to developing graduates with strengths in Employability, Digital Capabilities, Global and Cultural Capabilities, Sustainability, and Resourcefulness and Resilience. This module is designed to allow students to develop knowledge, skills, and capabilities in the following areas:

  • Digital Capabilities: This module will further develop the students’ digital capabilities through a gained understanding of how data is generated, stored, and handled, to support research and efficient systems. Students will also be introduced to a number of Big Data resources in health/biomedicine, with hands-on experience in accessing them through data portals. Further, digital capabilities will be enhanced through the development of a graphical (digital) presentation of a knowledge summary, as part of the assessment.
  • Resourcefulness and Resilience: Small group work within structured workshops will provide students with the opportunity to work collaboratively through the problem-solving cycle, as well as provide peer-to-peer feedback. This, together with new skills gained in graphical and written presentation of ideas and analyses, will enhance students’ resourcefulness, confidence and self-assurance.
  • Global and Cultural Capabilities: While Big Data has many advantages and potential for benefit to society, there are also many pitfalls that need consideration. Data may be biased, and could therefore lead to bias in any results obtained from the analysis of these data. This module will introduce students to global resources that help shape how we deliver a beneficial impact to society. Students will reflect on cultural and ethical aspects that relate to the collection and use of Big Data, and how this is being applied in health and biomedicine.
  • Employability: The knowledge gained through this module, as well as the skillsets gained through workshops and assessments, in particular in scientific presentation, will provide students with experience that will directly benefit their future careers. Familiarisation with existing Big Data resources, and a well-founded understanding of how these can and should be used, are essential for students to take the next steps in the areas of health and biomedical data sciences and informatics. Guest lectures will also introduce students to relevant areas for career development and progression.
  • Sustainability: The Big Data resources that will be covered throughout the module are helping to make health and biomedical research more sustainable, by allowing for effective data gathering and usage. Big Data resources, and in particular those that are publicly available, allow for sustainable re-use of data for a range of purposes, reducing the need for further animal experimentation, lab work, and data collection.

Please note that the information detailed within this record is accurate at the time of publishing and may be subject to change. This record contains information for the most up to date version of the programme / module for the 2025/6 academic year.