From 01e955c4afa29521148ac8c0b5b11bc7c0495155 Mon Sep 17 00:00:00 2001
From: Ivan Ukhov
Date: Mon, 15 Jan 2024 16:10:02 +0100
Subject: [PATCH] Start an article on relative positional embedding

---
 ...024-02-01-relative-positional-embedding.md | 27 +++++++++++++++++++
 1 file changed, 27 insertions(+)
 create mode 100644 _drafts/2024-02-01-relative-positional-embedding.md

diff --git a/_drafts/2024-02-01-relative-positional-embedding.md b/_drafts/2024-02-01-relative-positional-embedding.md
new file mode 100644
index 0000000..cc1b467
--- /dev/null
+++ b/_drafts/2024-02-01-relative-positional-embedding.md
@@ -0,0 +1,27 @@
+---
+layout: post
+title: Relative positional embedding for any attention mechanism
+date: 2024-02-01T08:00:00+01:00
+math: true
+keywords:
+  - large language models
+  - machine learning
+  - positional embedding
+  - transformers
+---
+
+In [Shaw et al. (2018)], the authors introduce relative positional embedding for
+self-attention in transformer models, and in [Huang et al. (2018)], the authors
+present an efficient way of calculating this embedding in decoder blocks, in
+which the self-attention is causal. In this article, the approach is generalized
+to any attention mechanism, be it self- or cross-attention, full or causal.
+
+# References
+
+* Huang et al., “[Music transformer: Generating music with long-term
+  structure][Huang et al. (2018)],” Google Brain, 2018.
+* Shaw et al., “[Self-attention with relative position representations][Shaw et
+  al. (2018)],” Google Brain, 2018.
+
+[Huang et al. (2018)]: https://arxiv.org/abs/1809.04281
+[Shaw et al. (2018)]: https://arxiv.org/abs/1803.02155
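
The introduction added by this patch names two building blocks without spelling them out: the relative positional embedding of [Shaw et al. (2018)] and its efficient computation in causal decoders by [Huang et al. (2018)]. For orientation, below is a minimal NumPy sketch of the Shaw-style formulation, in which a learned embedding of the clipped query-key distance is added to the attention logits, with an optional mask for the causal decoder setting of Huang et al. All names, shapes, and the clipping parameter are illustrative assumptions, not the notation of the draft or the generalization it promises.

```python
# A minimal sketch of attention with relative positional embedding in the
# spirit of Shaw et al. (2018). Names, shapes, and the clipping distance are
# illustrative assumptions, not the notation of the draft above.
import numpy as np


def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def relative_attention(queries, keys, values, relative_keys, clip, causal=False):
    """Scaled dot-product attention with a relative term added to the logits.

    queries: (m, d); keys, values: (n, d);
    relative_keys: (2 * clip + 1, d), one learned vector per clipped distance.
    """
    m, d = queries.shape
    n, _ = keys.shape
    # Content term: the usual dot product between queries and keys.
    logits = queries @ keys.T
    # Relative term: the dot product between each query and the embedding of
    # the (clipped) signed distance to each key position.
    distances = np.clip(np.arange(n)[None, :] - np.arange(m)[:, None], -clip, clip)
    logits += np.einsum("id,ijd->ij", queries, relative_keys[distances + clip])
    logits /= np.sqrt(d)
    if causal:
        # The decoder setting of Huang et al. (2018): mask out future positions.
        logits = np.where(np.arange(n)[None, :] > np.arange(m)[:, None], -np.inf, logits)
    return softmax(logits) @ values


# Toy usage with differing query and key lengths, as in cross-attention.
rng = np.random.default_rng(0)
m, n, d, clip = 4, 6, 8, 3
output = relative_attention(
    rng.normal(size=(m, d)),
    rng.normal(size=(n, d)),
    rng.normal(size=(n, d)),
    rng.normal(size=(2 * clip + 1, d)),
    clip,
)
print(output.shape)  # (4, 8)
```

Because the query and key sequences are allowed to differ in length, the same sketch already accepts cross-attention-shaped inputs, which is the direction in which the draft intends to generalize the technique.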