update chapter 02
MikeySaw committed Apr 23, 2024
1 parent e5f528f commit d46dc12
Showing 6 changed files with 32 additions and 7 deletions.
2 changes: 1 addition & 1 deletion content/chapters/01_introduction/_index.md
@@ -1,6 +1,6 @@
---
title: "Chapter 1: Introduction to the course"
---
In this chapter, you'll dive into the fundamental principles of Deep Learning for Natural Language Processing (NLP). Explore key concepts including learning paradigms, various tasks within NLP, the neural probabilistic language model, and the significance of embeddings.
In this chapter, you will dive into the fundamental principles of Deep Learning for Natural Language Processing (NLP). Explore key concepts including learning paradigms, various tasks within NLP, the neural probabilistic language model, and the significance of embeddings.

<!--more-->
6 changes: 5 additions & 1 deletion content/chapters/02_dl_basics/02_01_rnn.md
@@ -3,11 +3,15 @@ title: "Chapter 02.01: Recurrent Neural Networks"
weight: 2001

---
This chapter introduces Recurrent Neural Networks in the context of Language Modelling and discusses different types of RNNs, such as LSTMs and Bidirectional RNNs.
Conventional feed-forward neural networks cannot process sequential data of arbitrary length, which is why we need Recurrent Neural Networks (RNNs) to handle text. In this chapter we also get to know models that help us overcome the limits of simple RNNs: LSTMs and Bidirectional RNNs. LSTMs incorporate gates that control the flow of information through the network and allow us to model long-term dependencies. Bidirectional RNNs process the sequence in both directions, so the model can learn not only from the left-side but also from the right-side context of each token.
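To make the bidirectional idea concrete, here is a minimal PyTorch sketch (not part of the lecture materials); the vocabulary size, embedding dimension, and hidden dimension are arbitrary assumptions.

```python
# Minimal sketch of a bidirectional LSTM encoder in PyTorch (illustrative only;
# vocabulary size, embedding and hidden dimensions are made-up assumptions).
import torch
import torch.nn as nn

vocab_size, embed_dim, hidden_dim = 10_000, 128, 256

embedding = nn.Embedding(vocab_size, embed_dim)
bilstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True, bidirectional=True)

token_ids = torch.randint(0, vocab_size, (1, 12))  # a batch with one 12-token sequence
outputs, (h_n, c_n) = bilstm(embedding(token_ids))

# For each position, the forward and backward hidden states are concatenated,
# so every token representation sees both its left and its right context.
print(outputs.shape)  # torch.Size([1, 12, 512])
```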

<!--more-->

### Lecture slides

{{< pdfjs file="https://github.com/slds-lmu/lecture_dl4nlp/blob/main/slides/chapter02-deeplearningbasics/slides-21-rnn.pdf" >}}

### Additional Resources

- [Video explaining LSTM](https://www.youtube.com/watch?v=YCzL96nL7j0)

8 changes: 6 additions & 2 deletions content/chapters/02_dl_basics/02_02_attention.md
@@ -3,10 +3,14 @@ title: "Chapter 02.02 Attention"
weight: 2002
---

This chapter provides a first introduction to the Attention mechanism as a way to model long range dependencies.
This chapter gives you a first introduction to the concept of attention, as introduced in [1]. Attention mechanisms allow a neural network to focus on specific parts of the input sequence by assigning varying degrees of importance to its elements: the network dynamically weights the parts of the input during computation, attends to the relevant information, and can thereby process inputs of varying lengths. This is especially useful in tasks where long-range dependencies are crucial, and it overcomes a limitation of LSTMs and vanilla bidirectional RNNs, which struggle to retain information across long sequences and to capture relationships between distant elements.
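As a rough illustration of the weighting step, the sketch below computes simple dot-product attention over a set of encoder states; it simplifies the additive scoring function of [1], and all tensor shapes are made-up assumptions.

```python
# Sketch of dot-product attention over encoder hidden states; this simplifies
# the additive scoring of Bahdanau et al. [1] and uses arbitrary dimensions.
import torch
import torch.nn.functional as F

seq_len, hidden_dim = 10, 256
encoder_states = torch.randn(seq_len, hidden_dim)  # one hidden state per input token
query = torch.randn(hidden_dim)                    # e.g. the current decoder state

scores = encoder_states @ query            # one relevance score per input position
weights = F.softmax(scores, dim=0)         # normalized importance of each position
context = weights @ encoder_states         # weighted sum: the context vector

print(weights.shape, context.shape)        # torch.Size([10]) torch.Size([256])
```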

<!--more-->

### Lecture slides

{{< pdfjs file="https://github.com/slds-lmu/lecture_dl4nlp/blob/main/slides/chapter02-deeplearningbasics/slides-22-attention.pdf" >}}

### References

- [1] [Bahdanau et al., 2014](https://arxiv.org/abs/1409.0473)
10 changes: 9 additions & 1 deletion content/chapters/02_dl_basics/02_03_elmo.md
@@ -2,11 +2,19 @@
title: "Chapter 02.03: ELMo"
weight: 2003
---
In this chapter we introduce ELMo, a modelling approach, that enables us to contextualize word embeddings.
Here you will learn about ELMo (Embeddings from Language Models) [1], a deep contextualized word representation model that generates word embeddings by considering the entire input sentence, capturing complex linguistic features and contextual nuances. It accomplishes this with a bidirectional LSTM (Long Short-Term Memory) network, so that each word's embedding is dynamically influenced by its surrounding context. This enables ELMo embeddings to capture polysemy, syntactic variation, and semantic nuances that static word embeddings such as word2vec or fastText miss.
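The toy sketch below (not the actual ELMo implementation, just an assumed miniature setup with a made-up vocabulary) illustrates the core idea: encoding sentences with a bidirectional LSTM yields a different vector for the same word depending on its context.

```python
# Toy illustration of contextual embeddings (not the real ELMo model): the same
# word receives different vectors depending on the sentence it appears in.
import torch
import torch.nn as nn

torch.manual_seed(0)
vocab = {"<unk>": 0, "the": 1, "bank": 2, "river": 3, "opened": 4, "an": 5, "account": 6}
embedding = nn.Embedding(len(vocab), 32)
bilstm = nn.LSTM(32, 64, batch_first=True, bidirectional=True)

def contextual_vectors(tokens):
    ids = torch.tensor([[vocab.get(t, 0) for t in tokens]])
    out, _ = bilstm(embedding(ids))
    return out[0]  # one contextualized vector per token

v1 = contextual_vectors(["the", "river", "bank"])[2]                   # "bank" near "river"
v2 = contextual_vectors(["the", "bank", "opened", "an", "account"])[1]  # financial "bank"
print(torch.allclose(v1, v2))  # False (in general): the vectors differ with context
```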

<!--more-->

### Lecture slides

{{< pdfjs file="https://github.com/slds-lmu/lecture_dl4nlp/blob/main/slides/chapter02-deeplearningbasics/slides-23-elmo.pdf" >}}

### References

- [1] [Peters et al., 2018](https://arxiv.org/abs/1802.05365)

### Additional Resources

- [ELMo Blogpost](https://sh-tsang.medium.com/review-elmo-deep-contextualized-word-representations-8eb1e58cd25c)

11 changes: 10 additions & 1 deletion content/chapters/02_dl_basics/02_04_tokenization.md
@@ -2,10 +2,19 @@
title: "Chapter 02.04 Revisiting words: Tokenization"
weight: 2004
---
In order to feed text data into a model we have to tokenize it first. This chapter discusses various types of text tokenization.
This chapter is about tokenization, the process of breaking a sequence of text into smaller, meaningful units, such as words or subwords, so that it can be fed into a model. Various tokenization methods exist, including Byte Pair Encoding (BPE) [1] and WordPiece [2], each with its own approach to dividing text into tokens. Both are subword tokenization techniques that iteratively merge frequent character sequences into larger units, capturing common words as single tokens while still representing rare words and morphological variations.
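The sketch below condenses the BPE merge loop described in [1]; the toy corpus and the number of merges are arbitrary assumptions.

```python
# Condensed sketch of the BPE merge loop, following the pseudocode in Sennrich
# et al. [1]; the toy corpus and number of merges are arbitrary assumptions.
import re
from collections import Counter

def get_stats(vocab):
    """Count how often each adjacent symbol pair occurs in the vocabulary."""
    pairs = Counter()
    for word, freq in vocab.items():
        symbols = word.split()
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs

def merge_vocab(pair, vocab):
    """Merge the most frequent pair into a single symbol everywhere it occurs."""
    bigram = re.escape(" ".join(pair))
    pattern = re.compile(r"(?<!\S)" + bigram + r"(?!\S)")
    return {pattern.sub("".join(pair), word): freq for word, freq in vocab.items()}

# Words are written as space-separated symbols with an end-of-word marker "</w>".
vocab = {"l o w </w>": 5, "l o w e r </w>": 2, "n e w e s t </w>": 6, "w i d e s t </w>": 3}
for _ in range(5):
    best = get_stats(vocab).most_common(1)[0][0]
    vocab = merge_vocab(best, vocab)
    print(best)
```

In this toy corpus the first merges join ("e", "s"), ("es", "t"), and ("est", "</w>") into progressively larger subword units, mirroring the worked example in [1].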

<!--more-->

### Lecture slides

{{< pdfjs file="https://github.com/slds-lmu/lecture_dl4nlp/blob/main/slides/chapter02-deeplearningbasics/slides-24-tokenization.pdf" >}}

### References

- [1] [Sennrich et al., 2015](https://arxiv.org/abs/1508.07909)
- [2] [Wu et al., 2016](https://arxiv.org/pdf/1609.08144v2.pdf)

### Additional Resources

- [Huggingface WordPiece Tutorial](https://huggingface.co/learn/nlp-course/chapter6/6)
2 changes: 1 addition & 1 deletion content/chapters/02_dl_basics/_index.md
@@ -1,6 +1,6 @@
---
title: "Chapter 2: Deep Learning Basics"
---
This chapter gives a quick introduction to the basic concepts of deep learning in the context of NLP, such as RNN, attention, ELMo and tokenization.
In this chapter we explore fundamental concepts like Recurrent Neural Networks (RNNs), the attention mechanism, ELMo embeddings, and tokenization. Each concept serves as a building block in understanding how neural networks can comprehend and generate human language.

<!--more-->
