Parameter-Efficient Fine-Tuning (PEFT)

Parameter-Efficient Fine-Tuning (PEFT) is a family of techniques used to fine-tune large pre-trained models by updating only a small subset of parameters (or adding a small number of new trainable parameters), while keeping the vast majority of the original pre-trained weights frozen.

Core Problem Solved

Full fine-tuning of large models (e.g., 70B parameters) is extremely expensive:

  1. Memory: gradients and optimizer states (e.g., Adam's moment estimates) must be held for every parameter, multiplying GPU memory requirements several times over the weights alone.
  2. Storage: each fine-tuned task yields a complete copy of the model, so serving many tasks means keeping many full-size checkpoints.
  3. Hardware: the resulting memory footprint typically demands large multi-GPU clusters that are out of reach for most practitioners.

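A back-of-envelope sketch of the memory pressure (illustrative numbers, assuming mixed-precision training with fp16 weights and fp32 Adam optimizer states; the exact accounting varies by framework):

```python
# Rough memory estimate: full fine-tuning vs. a PEFT setup.
# Assumptions (illustrative): fp16 weights (2 B/param); every TRAINABLE
# parameter additionally needs fp16 gradients (2 B) and fp32 Adam state
# (master weights + two moments = 12 B), i.e. 16 B per trainable param.

PARAMS = 70e9          # 70B-parameter model
BYTES_FROZEN = 2       # fp16 weights only
BYTES_TRAINABLE = 16   # weights + grads + optimizer states

# Full fine-tuning: every parameter is trainable.
full_bytes = PARAMS * BYTES_TRAINABLE

# PEFT: the base model stays frozen; only ~0.1% of parameters train.
trainable_fraction = 0.001
peft_bytes = PARAMS * BYTES_FROZEN + PARAMS * trainable_fraction * BYTES_TRAINABLE

print(f"full fine-tuning:      {full_bytes / 1e9:.0f} GB")   # ~1120 GB
print(f"PEFT (0.1% trainable): {peft_bytes / 1e9:.0f} GB")   # ~141 GB
```

Even with simplified assumptions, the gap is roughly an order of magnitude, which is what makes fine-tuning feasible on modest hardware.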
Key Techniques

  1. Low-Rank Adaptation (LoRA): Injects small, trainable rank-decomposition matrices into linear layers.
  2. Adapters: Inserts small trainable neural network layers between existing frozen layers.
  3. Prompt Tuning: Adds trainable “virtual tokens” to the input prompt, leaving the model weights entirely untouched.
  4. Quantized Low-Rank Adaptation (QLoRA): Combines LoRA with aggressive quantization (4-bit) to further reduce memory usage.
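A minimal NumPy sketch of the LoRA idea (hypothetical names and shapes; real implementations such as Hugging Face's `peft` wrap framework modules): the frozen weight W is augmented with a low-rank update scaled by alpha/r, where only the small matrices A and B are trained, and B is zero-initialized so training starts from the base model's exact behavior.

```python
import numpy as np

rng = np.random.default_rng(0)

d_in, d_out, r, alpha = 64, 64, 8, 16     # rank r << d; illustrative sizes

W = rng.standard_normal((d_out, d_in))    # frozen pre-trained weight
A = rng.standard_normal((r, d_in)) * 0.01 # trainable down-projection
B = np.zeros((d_out, r))                  # trainable up-projection, zero init

def lora_forward(x):
    # Base path plus low-rank update: W x + (alpha / r) * B A x
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in)

# With B = 0, the adapted layer matches the frozen layer exactly.
assert np.allclose(lora_forward(x), W @ x)

# Only A and B would receive gradients: 2 * 8 * 64 = 1024 trainable
# parameters versus 64 * 64 = 4096 frozen ones in this toy layer.
print(A.size + B.size, W.size)
```

After training, the update (alpha / r) * B @ A can be merged into W, so inference adds no latency; only A and B need to be saved per task.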

Benefits

  1. Drastically lower GPU memory cost: only a small fraction of parameters (often under 1%) need gradients and optimizer states.
  2. Small, portable checkpoints: each task is captured by megabytes of adapter weights instead of a full model copy.
  3. Reduced risk of catastrophic forgetting, since the pre-trained weights stay frozen.
  4. Downstream performance that is often close to full fine-tuning.