Attention Mechanism
A key component of transformers that allows the model to focus on the most relevant parts of the input. In self-attention, each token 'looks at' every other token in the sequence and computes a weighted combination of their representations, with the weights reflecting how relevant each token is to the current one.
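
A minimal sketch of scaled dot-product self-attention in NumPy may make the definition concrete. This is an illustrative implementation, not any specific library's API; the function name, matrix shapes, and weight matrices (`W_q`, `W_k`, `W_v`) are hypothetical stand-ins for learned parameters.

```python
import numpy as np

def self_attention(X, W_q, W_k, W_v):
    """Scaled dot-product self-attention over a sequence of token vectors.

    X:             (seq_len, d_model) input token embeddings
    W_q, W_k, W_v: (d_model, d_k) learned projection matrices
    """
    Q = X @ W_q                      # queries: what each token is looking for
    K = X @ W_k                      # keys: what each token offers to others
    V = X @ W_v                      # values: the content to be mixed
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # (seq_len, seq_len) pairwise relevance
    # Softmax over each row: every token's weights across the sequence sum to 1
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V               # each output is a weighted mix of values

# Example: 5 tokens with 16-dim embeddings, projected to 8-dim heads
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 16))
W_q, W_k, W_v = (rng.standard_normal((16, 8)) for _ in range(3))
out = self_attention(X, W_q, W_k, W_v)   # shape: (5, 8)
```

Each row of `weights` shows how strongly one token attends to every other token, which is exactly the 'look at all other tokens' behavior described above.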