Resources on Parameter-efficient Fine-Tuning (PEFT) of Language Models
Blogs & review papers
- https://lightning.ai/pages/community/article/understanding-llama-adapters/ (I highly recommend this blog post; it explains the PEFT methods with clear graphics and code)
- Scaling Down to Scale Up: A Guide to Parameter-Efficient Fine-Tuning
Papers, grouped by type of method:
- Soft prompting – prepend learnable continuous embeddings ("virtual tokens") to the input, or in some variants to hidden states at certain layers (a minimal sketch follows this list):
- Prompt Tuning, Lester et al., Google Research
- Prefix-Tuning, Li and Liang, Stanford. Prompt Tuning and Prefix Tuning are highly similar.
- P-tuning: more complex than Prefix Tuning or Prompt Tuning; soft prompts can be inserted at arbitrary positions in the prompt template, not only as a prefix.
- P-tuning v2: adds soft prompts at every layer, instead of only at the input as in Prompt Tuning and P-tuning.
- Reparametrization – similar in effect to adding dimensions at certain layers, but the weight update is expressed in a reparametrized, typically low-rank form, so far fewer parameters are trained; LoRA is the best-known example (see the sketch after this list).
- Adding layers
- Adapters: Parameter-Efficient Transfer Learning for NLP (Houlsby et al.). Inserts small bottleneck fully-connected layers into each transformer block, and only these are trained (see the sketch after this list).
- Mixed:
- LLaMA-Adapter: combines prefix tuning with added attention parameters. The key idea is zero-init attention: learnable adaptation prompts are concatenated with the input tokens in the top layers, the attention over these prompts is computed separately, and its output is scaled by a learnable gating factor initialized to zero before being added back, so training starts from the unmodified frozen model (a gating sketch appears after this list).
- LLaMA-Adapter v2: additionally fine-tunes the bias parameters.
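To make the soft-prompting idea concrete, here is a minimal prompt-tuning sketch in PyTorch: a small matrix of learnable "virtual token" embeddings is prepended to the frozen model's input embeddings, and only that matrix is trained. The class and parameter names (`SoftPromptEmbedding`, `n_virtual_tokens`) are illustrative, not from any particular library.

```python
import torch
import torch.nn as nn

class SoftPromptEmbedding(nn.Module):
    """Prompt tuning (Lester et al.): learn n_virtual_tokens embeddings
    and prepend them to the input embeddings of a frozen LM."""

    def __init__(self, n_virtual_tokens: int, embed_dim: int):
        super().__init__()
        # The only trainable parameters: one embedding per virtual token.
        self.soft_prompt = nn.Parameter(torch.randn(n_virtual_tokens, embed_dim) * 0.02)

    def forward(self, input_embeds: torch.Tensor) -> torch.Tensor:
        # input_embeds: (batch, seq_len, embed_dim) from the frozen embedding layer.
        batch = input_embeds.size(0)
        prompt = self.soft_prompt.unsqueeze(0).expand(batch, -1, -1)
        # Prepend the learned prompt; the rest of the model stays frozen and unchanged.
        return torch.cat([prompt, input_embeds], dim=1)

# Usage sketch: freeze the LM, train only the soft prompt.
# for p in model.parameters():
#     p.requires_grad_(False)
# soft_prompt = SoftPromptEmbedding(n_virtual_tokens=20, embed_dim=model.config.hidden_size)
```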
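The reparametrization family can be illustrated with a LoRA-style layer (LoRA is not listed explicitly above; this is a generic sketch of the idea, not any library's implementation): the pretrained weight is frozen, and the update is factored into two small rank-r matrices, so only r * (d_in + d_out) parameters are trained per layer.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """LoRA-style reparametrization: y = W x + (alpha / r) * B A x,
    where W is frozen and only the rank-r factors A, B are trained."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)  # freeze the pretrained weight
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        # B is zero-initialized, so the layer starts as the unmodified pretrained layer.
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scaling * (x @ self.lora_A.T @ self.lora_B.T)

# Example: wrap a frozen projection layer with a rank-8 update.
# layer = LoRALinear(nn.Linear(768, 768), r=8, alpha=16)
```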
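For the adapter approach, a minimal sketch of the bottleneck module described above: a down-projection, nonlinearity, and up-projection with a residual connection, inserted after a sublayer of each transformer block; only the adapters are trained. Dimensions and names are illustrative.

```python
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Bottleneck adapter (Houlsby et al.): down-project, nonlinearity,
    up-project, plus a residual connection back to the input."""

    def __init__(self, hidden_dim: int, bottleneck_dim: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_dim, bottleneck_dim)
        self.up = nn.Linear(bottleneck_dim, hidden_dim)
        self.act = nn.GELU()
        # Zero-init the up-projection so the adapter starts as an identity mapping.
        nn.init.zeros_(self.up.weight)
        nn.init.zeros_(self.up.bias)

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        return hidden + self.up(self.act(self.down(hidden)))

# Inserted after an attention or feed-forward sublayer:
# hidden = adapter(sublayer_output)
```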
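Finally, a simplified gating sketch of LLaMA-Adapter's zero-init attention (this compresses the paper's formulation; function and variable names are my own): attention over the adaptation prompt is softmaxed separately and scaled by a learnable gate initialized to zero, so the prompt contributes nothing at the start of training.

```python
import torch
import torch.nn.functional as F

def zero_init_gated_attention(q, k_prompt, v_prompt, k_tokens, v_tokens, gate):
    """Simplified zero-init attention in the spirit of LLaMA-Adapter.
    q, k_*, v_*: (batch, heads, len, head_dim); gate: learnable per-head
    scalar, e.g. nn.Parameter(torch.zeros(n_heads, 1, 1))."""
    d = q.size(-1)
    # Ordinary attention over the original tokens (frozen behaviour).
    scores_tok = q @ k_tokens.transpose(-2, -1) / d ** 0.5
    out_tok = F.softmax(scores_tok, dim=-1) @ v_tokens
    # Attention over the adaptation prompt, softmaxed separately and
    # scaled by the zero-initialized gate.
    scores_p = q @ k_prompt.transpose(-2, -1) / d ** 0.5
    out_prompt = torch.tanh(gate) * (F.softmax(scores_p, dim=-1) @ v_prompt)
    # With gate = 0 the layer behaves exactly like the frozen model.
    return out_tok + out_prompt
```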
Library tutorials:
- HuggingFace's peft
- Lightning AI's lit-gpt

However, I would not recommend lit-gpt; peft seems more mature and modularized.
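For peft, a minimal LoRA fine-tuning setup might look like the sketch below. The model name and target module names are placeholders (they depend on the architecture and on Hub access), and the library's API evolves, so check the current peft documentation.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, TaskType, get_peft_model

# Any causal LM from the Hub works here; this name is just an example
# (and is gated, so substitute a model you have access to).
model_name = "meta-llama/Llama-2-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Wrap the frozen base model with LoRA adapters on the attention projections.
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                    # rank of the low-rank update
    lora_alpha=16,          # scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # module names depend on the architecture
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # reports trainable vs. total parameters

# From here, train with the usual transformers Trainer or a custom loop.
```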