Next Word Prediction is the fundamental task on which LLMs are trained: given the preceding context, the model predicts the most likely next token in the sequence.
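A minimal sketch of this objective, assuming a tiny made-up vocabulary and hand-written model scores (logits): the training loss at each step is the negative log-probability the model assigns to the token that actually came next.

```python
import numpy as np

# Hypothetical 5-word vocabulary for illustration.
vocab = ["the", "cat", "sat", "on", "mat"]

def next_token_loss(logits, target_id):
    """Cross-entropy for one prediction step: softmax the scores,
    then take the negative log-probability of the true next token."""
    probs = np.exp(logits - logits.max())  # subtract max for numerical stability
    probs /= probs.sum()
    return -np.log(probs[target_id])

# Suppose the context is "the cat" and the true next token is "sat" (id 2).
logits = np.array([0.1, 0.2, 3.0, 0.5, 0.3])  # made-up model output
loss = next_token_loss(logits, target_id=2)
```

A confident, correct prediction (high score on the true token) yields a small loss; scoring the wrong token highly yields a larger one, which is what drives learning.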
Key Concepts
- Training Objective: LLMs are trained primarily to predict the next token in a sequence, given all the tokens before it.
- Emergent Properties: Despite being trained on this simple objective, models develop emergent behaviors such as translation, summarization, and reasoning.
- Data Batching: For training efficiency, the data is sliced into fixed-length sequences and grouped into batches; at every position in each sequence, the model learns to predict the next token from the preceding context.
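The batching idea above can be sketched as follows, assuming a toy stream of token IDs and an arbitrary context length: the targets are simply the inputs shifted one position to the left, so every position supplies a next-token prediction example.

```python
import numpy as np

def make_batch(tokens, block_size):
    """Slice a token stream into (input, target) pairs for training.
    Targets are the inputs shifted left by one token."""
    n = (len(tokens) - 1) // block_size  # number of full blocks available
    xs = np.stack([tokens[i * block_size:(i + 1) * block_size] for i in range(n)])
    ys = np.stack([tokens[i * block_size + 1:(i + 1) * block_size + 1] for i in range(n)])
    return xs, ys

# Toy corpus already encoded as token IDs (made-up values).
tokens = np.array([5, 12, 7, 3, 9, 1, 4, 8])
xs, ys = make_batch(tokens, block_size=4)
# xs[0] is [5, 12, 7, 3]; ys[0] is [12, 7, 3, 9] — each target is the
# token that follows the corresponding input position.
```

In a real training loop the model's loss is computed at all positions of the batch in parallel, which is what makes this formulation efficient.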
