Showing 7 changed files with 27 additions and 20 deletions.
@@ -1,10 +1,15 @@
---
title: "Chapter 3: Transformer"
---
This chapter introduces the Transformer architecture as proposed in [1]. We explore the main parts of the Transformer model (the Encoder and the Decoder) and discuss ways to improve the architecture, such as Transformer-XL and Efficient Transformers.
The Transformer, introduced in [1], is a deep learning architecture designed for sequence-to-sequence tasks in natural language processing. It reshaped NLP by replacing recurrent layers with self-attention, which lets the model process entire sequences in parallel and removes the sequential-processing bottleneck of traditional RNN-based models such as LSTMs. The architecture has become the foundation for state-of-the-art models in NLP tasks such as machine translation, text summarization, and language understanding. In this chapter we first introduce the Transformer, explore its main parts (the Encoder and the Decoder), and finally discuss ways to improve the architecture, such as Transformer-XL and Efficient Transformers.

<!--more-->
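
Below is a minimal sketch of the scaled dot-product self-attention computation described above, written in plain NumPy. The function and weight names (`self_attention`, `W_q`, `W_k`, `W_v`) and the toy dimensions are illustrative assumptions, not code from [1] or from this chapter; multi-head attention, masking, and positional encodings are omitted.

```python
# Minimal single-head scaled dot-product self-attention (illustrative sketch,
# not the reference implementation from Vaswani et al., 2017).
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)   # subtract row max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, W_q, W_k, W_v):
    """X: (seq_len, d_model) token embeddings; W_q, W_k, W_v: (d_model, d_k) projections."""
    Q, K, V = X @ W_q, X @ W_k, X @ W_v       # project every token to query/key/value
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)           # (seq_len, seq_len) pairwise similarities
    weights = softmax(scores, axis=-1)        # each row: attention over all positions
    return weights @ V                        # weighted sum of value vectors

# Toy usage: 5 tokens, embedding width 16, attention width 8
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 16))
W_q, W_k, W_v = [rng.normal(size=(16, 8)) for _ in range(3)]
out = self_attention(X, W_q, W_k, W_v)
print(out.shape)                              # (5, 8): all positions computed in parallel
```

Stacking several such attention heads, adding positional information, and interleaving feed-forward layers yields the Encoder and Decoder blocks discussed in the rest of the chapter.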

### References

- [1] [Vaswani et al., 2017](https://arxiv.org/abs/1706.03762)

### Additional Resources

- [Very good video explaining the Transformer and Attention](https://www.youtube.com/watch?v=bCz4OMemCcA&t)
- [3Blue1Brown video series about the Transformer](https://www.youtube.com/watch?v=wjZofJX0v4M&t)