Resources on Parameter-efficient Fine-Tuning (PEFT) of Language Models
Blogs & review papers
- https://lightning.ai/pages/community/article/understanding-llama-adapters/ (I highly recommend this blog post; it explains the PEFT methods with clear graphics and code)
- Scaling Down to Scale Up: A Guide to Parameter-Efficient Fine-Tuning
Papers, grouped by type of method:
- Soft prompting – prepend learnable continuous embeddings ("virtual tokens") to the input, or in some variants to hidden states at certain layers (a minimal sketch follows this list):
- Prompt Tuning, Lester et al., Google Research
- Prefix-Tuning, Li and Liang, Stanford. Prompt Tuning and Prefix Tuning are highly similar.
- P-tuning: more complex than Prefix Tuning or Prompt Tuning; soft prompts can be inserted at arbitrary positions in the prompt template, not only as a prefix.
- P-tuning v2: adds soft prompts at every layer, instead of only at the input as in Prompt Tuning and P-tuning.
- Reparametrization – similar in effect to adding dimensions at certain layers, but the weight update is expressed in a reparametrized, typically low-rank form, so far fewer parameters are trained; LoRA is the best-known example (see the sketch after this list).
- Adding layers
- Adapters: Parameter-Efficient Transfer Learning for NLP (Houlsby et al.). Inserts small bottleneck fully-connected layers into each transformer block, and only these are trained (see the sketch after this list).
- Mixed:
- LLaMA-Adapter: combines prefix tuning with added attention parameters. The key idea is zero-init attention: learnable adaptation prompts are concatenated with the input tokens in the top layers, the attention over these prompts is computed separately, and its output is scaled by a learnable gating factor initialized to zero before being added back, so training starts from the unmodified frozen model (a gating sketch appears after this list).
- LLaMA-Adapter v2: additionally fine-tunes the bias parameters.
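To make the soft-prompting idea concrete, here is a minimal prompt-tuning sketch in PyTorch: a small matrix of learnable "virtual token" embeddings is prepended to the frozen model's input embeddings, and only that matrix is trained. The class and parameter names (`SoftPromptEmbedding`, `n_virtual_tokens`) are illustrative, not from any particular library.

```python
import torch
import torch.nn as nn

class SoftPromptEmbedding(nn.Module):
    """Prompt tuning (Lester et al.): learn n_virtual_tokens embeddings
    and prepend them to the input embeddings of a frozen LM."""

    def __init__(self, n_virtual_tokens: int, embed_dim: int):
        super().__init__()
        # The only trainable parameters: one embedding per virtual token.
        self.soft_prompt = nn.Parameter(torch.randn(n_virtual_tokens, embed_dim) * 0.02)

    def forward(self, input_embeds: torch.Tensor) -> torch.Tensor:
        # input_embeds: (batch, seq_len, embed_dim) from the frozen embedding layer.
        batch = input_embeds.size(0)
        prompt = self.soft_prompt.unsqueeze(0).expand(batch, -1, -1)
        # Prepend the learned prompt; the rest of the model stays frozen and unchanged.
        return torch.cat([prompt, input_embeds], dim=1)

# Usage sketch: freeze the LM, train only the soft prompt.
# for p in model.parameters():
#     p.requires_grad_(False)
# soft_prompt = SoftPromptEmbedding(n_virtual_tokens=20, embed_dim=model.config.hidden_size)
```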
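The reparametrization family can be illustrated with a LoRA-style layer (LoRA is not listed explicitly above; this is a generic sketch of the idea, not any library's implementation): the pretrained weight is frozen, and the update is factored into two small rank-r matrices, so only r * (d_in + d_out) parameters are trained per layer.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """LoRA-style reparametrization: y = W x + (alpha / r) * B A x,
    where W is frozen and only the rank-r factors A, B are trained."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)  # freeze the pretrained weight
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        # B is zero-initialized, so the layer starts as the unmodified pretrained layer.
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scaling * (x @ self.lora_A.T @ self.lora_B.T)

# Example: wrap a frozen projection layer with a rank-8 update.
# layer = LoRALinear(nn.Linear(768, 768), r=8, alpha=16)
```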
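For the adapter approach, a minimal sketch of the bottleneck module described above: a down-projection, nonlinearity, and up-projection with a residual connection, inserted after a sublayer of each transformer block; only the adapters are trained. Dimensions and names are illustrative.

```python
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Bottleneck adapter (Houlsby et al.): down-project, nonlinearity,
    up-project, plus a residual connection back to the input."""

    def __init__(self, hidden_dim: int, bottleneck_dim: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_dim, bottleneck_dim)
        self.up = nn.Linear(bottleneck_dim, hidden_dim)
        self.act = nn.GELU()
        # Zero-init the up-projection so the adapter starts as an identity mapping.
        nn.init.zeros_(self.up.weight)
        nn.init.zeros_(self.up.bias)

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        return hidden + self.up(self.act(self.down(hidden)))

# Inserted after an attention or feed-forward sublayer:
# hidden = adapter(sublayer_output)
```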
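Finally, a simplified gating sketch of LLaMA-Adapter's zero-init attention (this compresses the paper's formulation; function and variable names are my own): attention over the adaptation prompt is softmaxed separately and scaled by a learnable gate initialized to zero, so the prompt contributes nothing at the start of training.

```python
import torch
import torch.nn.functional as F

def zero_init_gated_attention(q, k_prompt, v_prompt, k_tokens, v_tokens, gate):
    """Simplified zero-init attention in the spirit of LLaMA-Adapter.
    q, k_*, v_*: (batch, heads, len, head_dim); gate: learnable per-head
    scalar, e.g. nn.Parameter(torch.zeros(n_heads, 1, 1))."""
    d = q.size(-1)
    # Ordinary attention over the original tokens (frozen behaviour).
    scores_tok = q @ k_tokens.transpose(-2, -1) / d ** 0.5
    out_tok = F.softmax(scores_tok, dim=-1) @ v_tokens
    # Attention over the adaptation prompt, softmaxed separately and
    # scaled by the zero-initialized gate.
    scores_p = q @ k_prompt.transpose(-2, -1) / d ** 0.5
    out_prompt = torch.tanh(gate) * (F.softmax(scores_p, dim=-1) @ v_prompt)
    # With gate = 0 the layer behaves exactly like the frozen model.
    return out_tok + out_prompt
```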
Library tutorials:
- HuggingFace's peft
- Lightning AI's lit-gpt

However, I would not recommend lit-gpt; peft seems more mature and modularized.
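For peft, a minimal LoRA fine-tuning setup might look like the sketch below. The model name and target module names are placeholders (they depend on the architecture and on Hub access), and the library's API evolves, so check the current peft documentation.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, TaskType, get_peft_model

# Any causal LM from the Hub works here; this name is just an example
# (and is gated, so substitute a model you have access to).
model_name = "meta-llama/Llama-2-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Wrap the frozen base model with LoRA adapters on the attention projections.
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                    # rank of the low-rank update
    lora_alpha=16,          # scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # module names depend on the architecture
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # reports trainable vs. total parameters

# From here, train with the usual transformers Trainer or a custom loop.
```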