Context Vector

A Context Vector is the output of the Attention Mechanism. Unlike an input embedding vector, which only captures the semantic meaning of a single token in isolation, a context vector is “enriched” because it incorporates information about how that token relates to all other tokens in the input sequence.

Calculation

The context vector ($Z$) is calculated as the weighted sum of the Value vectors ($V$), weighted by the Attention Weights ($A$).

$$Z = A \times V$$
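
Read row by row, the matrix product gives each token its own weighted sum. As a sketch, assume $n$ tokens, where $a_{ij}$ is the attention weight that token $i$ places on token $j$ and $v_j$ is the value vector of token $j$ (the index names are chosen here for illustration):

$$z_i = \sum_{j=1}^{n} a_{ij} \, v_j, \qquad \sum_{j=1}^{n} a_{ij} = 1$$

The second condition holds when the weights in each row come from a softmax, as in standard attention.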

For example, in the sentence “Your journey starts with one step,” the context vector for “journey” is the weighted sum of the value vectors for all tokens in the sequence (“Your”, “journey”, “starts”, “with”, “one”, “step”), including “journey” itself.

If “journey” pays high attention to “starts” and “step,” their value vectors will contribute significantly to the final context vector $Z$, enriching “journey” with the specific context of beginning a process.
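
The weighted sum for “journey” can be sketched in a few lines of plain Python. Everything numeric here is an assumption made for illustration: the 3-dimensional value vectors, the attention weights, and the names `V`, `a_journey`, and `context_vector` are hypothetical and do not come from a trained model. The weights are chosen so that “starts” and “step” dominate, mirroring the example above.

```python
# Illustrative values only: the weights and value vectors below are made up,
# not taken from a trained model.
tokens = ["Your", "journey", "starts", "with", "one", "step"]

# Hypothetical 3-dimensional value vectors, one per token (the rows of V).
V = [
    [0.1, 0.0, 0.2],  # Your
    [0.4, 0.3, 0.1],  # journey
    [0.9, 0.1, 0.5],  # starts
    [0.0, 0.2, 0.1],  # with
    [0.2, 0.1, 0.0],  # one
    [0.8, 0.6, 0.4],  # step
]

# Hypothetical attention weights for the query token "journey" (one row of A).
# They sum to 1, as softmax-normalized weights would.
a_journey = [0.05, 0.15, 0.35, 0.05, 0.05, 0.35]


def context_vector(weights, values):
    """Return the weighted sum of the value vectors: sum_j weights[j] * values[j]."""
    dim = len(values[0])
    z = [0.0] * dim
    for w, v in zip(weights, values):
        for k in range(dim):
            z[k] += w * v[k]
    return z


z_journey = context_vector(a_journey, V)
print([round(x, 3) for x in z_journey])
```

Because “starts” and “step” each carry a weight of 0.35, their value vectors account for most of the result, while the low-weight tokens barely shift it.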
