Next Word Prediction is the fundamental task on which LLMs are trained: given the preceding context, the model predicts the most likely next token in the sequence.
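A minimal sketch of this objective, assuming a tiny made-up vocabulary and hand-written model scores (logits): the training loss at each step is the negative log-probability the model assigns to the token that actually came next.

```python
import numpy as np

# Hypothetical 5-word vocabulary for illustration.
vocab = ["the", "cat", "sat", "on", "mat"]

def next_token_loss(logits, target_id):
    """Cross-entropy for one prediction step: softmax the scores,
    then take the negative log-probability of the true next token."""
    probs = np.exp(logits - logits.max())  # subtract max for numerical stability
    probs /= probs.sum()
    return -np.log(probs[target_id])

# Suppose the context is "the cat" and the true next token is "sat" (id 2).
logits = np.array([0.1, 0.2, 3.0, 0.5, 0.3])  # made-up model output
loss = next_token_loss(logits, target_id=2)
```

A confident, correct prediction (high score on the true token) yields a small loss; scoring the wrong token highly yields a larger one, which is what drives learning.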
Key Concepts
- Training Objective: LLMs are trained primarily to predict the next token in a sequence, given all the tokens before it.
- Emergent Properties: Despite being trained on this simple objective, models develop emergent behaviors such as translation, summarization, and reasoning.
- Data Batching: For training efficiency, the data is sliced into fixed-length sequences and grouped into batches; at every position in each sequence, the model learns to predict the next token from the preceding context.
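The batching idea above can be sketched as follows, assuming a toy stream of token IDs and an arbitrary context length: the targets are simply the inputs shifted one position to the left, so every position supplies a next-token prediction example.

```python
import numpy as np

def make_batch(tokens, block_size):
    """Slice a token stream into (input, target) pairs for training.
    Targets are the inputs shifted left by one token."""
    n = (len(tokens) - 1) // block_size  # number of full blocks available
    xs = np.stack([tokens[i * block_size:(i + 1) * block_size] for i in range(n)])
    ys = np.stack([tokens[i * block_size + 1:(i + 1) * block_size + 1] for i in range(n)])
    return xs, ys

# Toy corpus already encoded as token IDs (made-up values).
tokens = np.array([5, 12, 7, 3, 9, 1, 4, 8])
xs, ys = make_batch(tokens, block_size=4)
# xs[0] is [5, 12, 7, 3]; ys[0] is [12, 7, 3, 9] — each target is the
# token that follows the corresponding input position.
```

In a real training loop the model's loss is computed at all positions of the batch in parallel, which is what makes this formulation efficient.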
