From 01e955c4afa29521148ac8c0b5b11bc7c0495155 Mon Sep 17 00:00:00 2001
From: Ivan Ukhov
Date: Mon, 15 Jan 2024 16:10:02 +0100
Subject: [PATCH] Start an article on relative positional embedding

---
 ...024-02-01-relative-positional-embedding.md | 27 +++++++++++++++++++
 1 file changed, 27 insertions(+)
 create mode 100644 _drafts/2024-02-01-relative-positional-embedding.md

diff --git a/_drafts/2024-02-01-relative-positional-embedding.md b/_drafts/2024-02-01-relative-positional-embedding.md
new file mode 100644
index 0000000..cc1b467
--- /dev/null
+++ b/_drafts/2024-02-01-relative-positional-embedding.md
@@ -0,0 +1,27 @@
+---
+layout: post
+title: Relative positional embedding for any attention mechanism
+date: 2024-02-01T08:00:00+01:00
+math: true
+keywords:
+  - large language models
+  - machine learning
+  - positional embedding
+  - transformers
+---
+
+In [Shaw et al. (2018)], the authors introduce relative positional embedding for
+self-attention in transformer models, and in [Huang et al. (2018)], the authors
+present an efficient way of calculating this embedding in decoder blocks, in
+which the self-attention is causal. In this article, the approach is generalized
+to any attention mechanism, be it self- or cross-attention, full or causal.
+
+# References
+
+* Huang et al., “[Music transformer: Generating music with long-term
+  structure][Huang et al. (2018)],” Google Brain, 2018.
+* Shaw et al., “[Self-attention with relative position representations][Shaw et
+  al. (2018)],” Google Brain, 2018.
+
+[Huang et al. (2018)]: https://arxiv.org/abs/1809.04281
+[Shaw et al. (2018)]: https://arxiv.org/abs/1803.02155
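
The introduction added by this patch names two building blocks without spelling them out: the relative positional embedding of [Shaw et al. (2018)] and its efficient computation in causal decoders by [Huang et al. (2018)]. For orientation, below is a minimal NumPy sketch of the Shaw-style formulation, in which a learned embedding of the clipped query-key distance is added to the attention logits, with an optional mask for the causal decoder setting of Huang et al. All names, shapes, and the clipping parameter are illustrative assumptions, not the notation of the draft or the generalization it promises.

```python
# A minimal sketch of attention with relative positional embedding in the
# spirit of Shaw et al. (2018). Names, shapes, and the clipping distance are
# illustrative assumptions, not the notation of the draft above.
import numpy as np


def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def relative_attention(queries, keys, values, relative_keys, clip, causal=False):
    """Scaled dot-product attention with a relative term added to the logits.

    queries: (m, d); keys, values: (n, d);
    relative_keys: (2 * clip + 1, d), one learned vector per clipped distance.
    """
    m, d = queries.shape
    n, _ = keys.shape
    # Content term: the usual dot product between queries and keys.
    logits = queries @ keys.T
    # Relative term: the dot product between each query and the embedding of
    # the (clipped) signed distance to each key position.
    distances = np.clip(np.arange(n)[None, :] - np.arange(m)[:, None], -clip, clip)
    logits += np.einsum("id,ijd->ij", queries, relative_keys[distances + clip])
    logits /= np.sqrt(d)
    if causal:
        # The decoder setting of Huang et al. (2018): mask out future positions.
        logits = np.where(np.arange(n)[None, :] > np.arange(m)[:, None], -np.inf, logits)
    return softmax(logits) @ values


# Toy usage with differing query and key lengths, as in cross-attention.
rng = np.random.default_rng(0)
m, n, d, clip = 4, 6, 8, 3
output = relative_attention(
    rng.normal(size=(m, d)),
    rng.normal(size=(n, d)),
    rng.normal(size=(n, d)),
    rng.normal(size=(2 * clip + 1, d)),
    clip,
)
print(output.shape)  # (4, 8)
```

Because the query and key sequences are allowed to differ in length, the same sketch already accepts cross-attention-shaped inputs, which is the direction in which the draft intends to generalize the technique.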