NATURAL LANGUAGE PROCESSING - 2024/5

Module code: COM3029

Module Overview

This module will demonstrate fundamental concepts from the field of Natural Language Processing (NLP) and Computational Linguistics. It will also discuss some of the latest advances in NLP and Generative Artificial Intelligence, with a focus on Language Models like BERT, T5, and GPT, and bring students up to speed with current research. It will provide the necessary skills to enable students to build computational models for solving a range of problems, such as text classification, sequence classification, machine translation and building conversational agents. Students will learn how to build NLP pipelines for preparing training data and how to choose appropriate algorithms and techniques to build such models. The module also focuses on aspects of ethical and trustworthy artificial intelligence, with discussion of rigorous model evaluation and ethical considerations for computational modeling. Although traditional linguistic approaches will be mentioned, the main emphasis will be on state-of-the-art Deep Learning algorithms and Transfer Learning methods for building efficient and trustworthy NLP solutions.

Module provider

Computer Science and Electronic Eng

Module Leader

KANOJIA Diptesh (CS & EE)

Number of Credits: 15

ECTS Credits: 7.5

Framework: FHEQ Level 6

Module cap (Maximum number of students): N/A

Overall student workload

Independent Learning Hours: 86

Lecture Hours: 24

Laboratory Hours: 20

Guided Learning: 10

Captured Content: 10

Module Availability

Semester 2

Prerequisites / Co-requisites

None

Module content

Indicative content includes:


  • Introduction to NLP

  • Traditional linguistic processes and features

  • Data pre-processing and text analytics

  • Character/Sub-word/Word/Sentence Embeddings as features 

  • Neural Networks for NLP

    • RNN

    • LSTM/GRU 

    • CNN



  • Transformers architecture and its applications to Language Modeling

  • Language Models and NLP tasks such as question answering, machine translation, and building conversational agents

  • Model Training and NLP Pipelines

  • From Zero-shot to Supervised Fine Tuning for Transfer Learning

  • Model Evaluation and Deployment

  • Ethical Considerations for NLP


Assessment pattern

Assessment type    Unit of assessment              Weighting (%)
Coursework         GROUP PROJECT                   50
Examination        2 HR INVIGILATED EXAMINATION    50

Alternative Assessment

Individual coursework covering the same learning objectives

Assessment Strategy

The assessment strategy is designed to provide students with the opportunity to demonstrate:

• the ability to demonstrate learning of the theoretical fundamentals of language processing.

• the ability to critically evaluate and apply appropriate techniques to build computational models, taking account of specific dataset characteristics in order to derive suitable and defensible results.

• the ability to understand the needs of, and build appropriate problem-solving skills for, a range of NLP problems.

Thus, the summative assessment for this module consists of:


  • GROUP PROJECT: A group coursework in which the students will bring together their experiments from the first coursework and will build a proof of concept that will need to demonstrate both technical understanding and good practice. The coursework will be marked based on the approach and breadth of experimentation, rather than on the performance of the computational models. This addresses LOs 3, 4, 5, and 6. 

  • INVIGILATED EXAMINATION: An exam testing the understanding of theoretical concepts taught in the module. Analytical and critical thinking shall be assessed via this component. This addresses LOs 1, 2, 4, and 6. 



Formative assessment and feedback

Students will be guided to work on weekly tasks through lab exercises, the solutions to which will provide feedback on understanding and practice.

Verbal feedback on student efforts, alongside help in executing the lab tasks, is provided to students.

Labs and feedback will then support the coursework. 

Module aims

  • Provide an overview of the fundamentals and concepts that support the development of NLP models.
  • Familiarise students with NLP applications and the approaches that can be adopted to experiment with and build such applications.
  • Demonstrate how NLP processing pipelines can be formed to perform necessary transformations towards building computational models.
  • Bring students up to a sufficient level of development skill to be able to develop NLP models that solve specific tasks.

Learning outcomes

Attributes Developed
001 Be able to describe the NLP process lifecycle and theoretical fundamentals of NLP. KC
002 Be able to demonstrate the ability to build such processes for solving specific NLP problems. CP
003 Build appropriate NLP transformation pipelines for training computational models. KCPT
004 Be able to describe and deploy experiments, comparing the appropriate techniques and selecting appropriate algorithms for training NLP models. KCPT
005 Build experiment scripts using an appropriate programming language, and produce NLP models. CPT
006 Deploy NLP models as Web service inference endpoints and test the endpoints to consume those services. KCPT

Attributes Developed

C - Cognitive/analytical

K - Subject knowledge

T - Transferable skills

P - Professional/Practical skills

Methods of Teaching / Learning

The learning and teaching strategy is designed to:

• Develop a critical understanding of best practice in developing state-of-the-art NLP solutions through directed learning and facilitated self-directed learning. The skills learned in this module will be transferable to other data science and AI modules in the programme and the wider data science profession.

The learning and teaching methods include:

• Lectures with class discussion

• Lab sessions with formative feedback

• A learning component which focuses on a specific aspect of NLP and/or AI 

• Use of an online forum for facilitated discussion

Indicated Lecture Hours (which may also include seminars, tutorials, workshops and other contact time) are approximate and may include in-class tests where one or more of these are an assessment on the module. In-class tests are scheduled/organised separately from taught content and will be published on students' personal timetables, where they apply to taken modules, as soon as they are finalised by central administration. This will usually be after the initial publication of the teaching timetable for the relevant semester.

Reading list

https://readinglists.surrey.ac.uk
Upon accessing the reading list, please search for the module using the module code: COM3029

Other information

Digital Capabilities
The advanced AI skills taught in this module provide students with digital skills that are fundamental to solving many computer science problems today. The module teaches students techniques for using computers to identify patterns in large datasets and to deploy solutions that solve these problems in a practical way. These skills are highly valued in industry.

Employability
This module provides advanced AI and computational model building skills that are important in solving many real-life problems today. Students are equipped with practical problem-solving skills, theoretical skills, and mathematical and statistical skills, all of which are highly valuable to employers. Students learn to deploy their solutions using industry-standard tools for Generative Artificial Intelligence, gaining practical experience as well as the theoretical underpinnings.

Global and Cultural Skills
Computer Science is a global language and the tools and languages used on this module can be used internationally. This module allows students to develop skills that will allow them to reason about and develop applications with global reach and collaborate with their peers around the world.

Resourcefulness and Resilience
This module involves practical problem-solving skills that teach a student how to reason about and solve new unseen problems through combining the theory taught with practical technologies for systems that are in everyday use. Students learn to develop and deploy a practical solution to a complex problem.

Ethical Considerations for Computational Modeling
This module emphasizes ethical considerations for computational modeling and aspects of reliable and trustworthy output from computational models. Students are encouraged to use and develop open-source, transparent and trustworthy NLP models with a focus on rigorous model evaluation.

Programmes this module appears in

Programme                                          Semester    Classification    Qualifying conditions
Computing and Information Technology BSc (Hons)    2           Optional          A weighted aggregate mark of 40% is required to pass the module
Computer Science BSc (Hons)                        2           Optional          A weighted aggregate mark of 40% is required to pass the module
Computer Science MEng                              2           Optional          A weighted aggregate mark of 40% is required to pass the module

Please note that the information detailed within this record is accurate at the time of publishing and may be subject to change. This record contains information for the most up to date version of the programme / module for the 2024/5 academic year.