Relative Positional Embedding is a type of Positional Embedding that encodes the relative distance between tokens rather than their absolute positions in the sequence.
Key Concepts
- Mechanism: The model learns relationships based on how far apart tokens are (e.g., "5 tokens away") instead of their absolute index (e.g., "position 5").
- Advantages:
  - Generalization: Allows the model to generalize better to sequence lengths not seen during training. Even if the training sequences were short, the model can handle longer sequences because it relies only on relative distances.
- Usage: Particularly useful for tasks involving very long sequences or where local context structure repeats (see the sketch below).
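
As a concrete illustration, here is a minimal PyTorch sketch of one common form of relative positional embedding: a learned per-head bias added to the attention logits, indexed by the clipped distance between query and key positions. The class name `RelativePositionBias`, the `max_distance` clipping value, and the toy tensor shapes are illustrative assumptions rather than a specific published implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class RelativePositionBias(nn.Module):
    """Learned bias added to attention logits based on relative distance.

    Distances are clipped to [-max_distance, max_distance], so the same
    embedding is reused for any pair of tokens that are k positions apart,
    regardless of their absolute positions in the sequence.
    """

    def __init__(self, num_heads: int, max_distance: int = 128):
        super().__init__()
        self.max_distance = max_distance
        # One learned scalar bias per head for each clipped relative distance.
        self.bias = nn.Embedding(2 * max_distance + 1, num_heads)

    def forward(self, seq_len: int) -> torch.Tensor:
        positions = torch.arange(seq_len)
        # rel[i, j] = j - i, i.e. how far token j is from token i.
        rel = positions[None, :] - positions[:, None]
        rel = rel.clamp(-self.max_distance, self.max_distance) + self.max_distance
        # (seq_len, seq_len, num_heads) -> (num_heads, seq_len, seq_len)
        return self.bias(rel).permute(2, 0, 1)


# Usage: add the bias to the raw attention scores before the softmax.
num_heads, seq_len, head_dim = 4, 10, 16
q = torch.randn(1, num_heads, seq_len, head_dim)
k = torch.randn(1, num_heads, seq_len, head_dim)
scores = q @ k.transpose(-2, -1) / head_dim ** 0.5
bias = RelativePositionBias(num_heads)(seq_len)        # (heads, L, L)
attn = F.softmax(scores + bias.unsqueeze(0), dim=-1)   # broadcast over batch
```

Because the bias depends only on the distance `j - i`, the same learned parameters apply at inference time to sequences longer than any seen during training, which is the generalization property noted above.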
