Advanced Natural Language Processing
Full course description
For decades, teaching computers natural language processing (NLP) was a dream of humankind. Tasks such as machine translation, summarization, question answering, speech recognition and chatting remained a challenge for computer programs. Around 2020, major improvements were made, starting with machine translation and culminating in late 2022 with ChatGPT. Why were these large language models suddenly so good? How did we get here? What can we do to improve these new algorithms even more?
This course will provide the skills and knowledge to understand and develop state-of-the-art (SOTA) solutions for these NLP tasks. After a short introduction to traditional generative grammars and statistical approaches to NLP, the course will focus on deep learning techniques. We will discuss Transformers and variations on their architecture (including BERT and GPT) in depth: which models work best for which tasks, their capabilities and limitations, and how to optimize them.
Although we now have algorithms whose handling of natural language can no longer be distinguished from that of humans, some major problems remain to be addressed: (i) We do not fully understand what these algorithms know and what they do not know, so there is a strong need for eXplainable AI (XAI) in NLP. (ii) Training deep-learning large language models costs too much energy, so we need to develop models that are less computationally (and thus energy) intensive. (iii) Now that these algorithms operate at human-level quality, several ethical problems arise related to computer-generated fake news, fake profiles, bias and other abuse, alongside broader ethical, legal, regulatory and privacy challenges. These important topics will also be discussed in this course.
This course is closely related to the course Information Retrieval and Text-Mining (IRTM). This course focuses on advanced methods and architectures for complex natural language tasks. The IRTM course focuses more on building search engines and text analytics, but also uses a number of the architectures that are discussed in more depth here. The overlap between the two courses is kept to a minimum, and they do not need to be followed in a specific order.
Prerequisites
None.
Recommended reading
Papers published in top international conferences and journals in the field of machine learning.