1/ What is instruction tuning
Instruction tuning (IT) refers to the process of further training LLMs on a dataset of (instruction, output) pairs in a supervised fashion.
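To make this concrete, here is a minimal sketch of how a single (instruction, output) pair could be turned into a supervised training example. The prompt template and the toy whitespace tokenizer are illustrative assumptions, not from any specific library; the key idea is that the loss is computed only on the output tokens.

```python
# Sketch: converting an (instruction, output) pair into a supervised
# training example for instruction tuning. The template and tokenizer
# below are illustrative assumptions, not a specific library's API.

PROMPT_TEMPLATE = "### Instruction:\n{instruction}\n\n### Response:\n"
IGNORE_INDEX = -100  # conventional label value skipped by the LM loss

def tokenize(text):
    # Stand-in for a real subword tokenizer.
    return text.split()

def build_example(instruction, output):
    prompt_ids = tokenize(PROMPT_TEMPLATE.format(instruction=instruction))
    output_ids = tokenize(output)
    input_ids = prompt_ids + output_ids
    # Supervise only the output tokens: the model is trained to produce
    # the response, not to reproduce the instruction itself.
    labels = [IGNORE_INDEX] * len(prompt_ids) + output_ids
    return {"input_ids": input_ids, "labels": labels}

example = build_example("Translate to French: Hello", "Bonjour")
```

Masking the instruction tokens out of the labels is a common (though not universal) design choice; some recipes also train on the prompt tokens.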
2/ Why instruction tuning is needed
- Problem: mismatch between the training objective and the user’s objective. LLMs are trained on a next-token prediction task, while users want the model to follow their instructions
- The benefits of instruction tuning
- IT bridges the gap between the next-word prediction objective of LLMs and the user’s objective of instruction following.
- IT allows for a more controllable and predictable model behavior compared to standard LLMs. The instructions serve to constrain the model’s outputs to align with the desired response characteristics or domain knowledge.
- IT is computationally efficient and can help LLMs rapidly adapt to a specific domain without extensive retraining or architectural changes.
- The challenges of instruction tuning
- Crafting high-quality instructions that properly cover the desired target behaviors is non-trivial: existing instruction datasets are usually limited in quantity, diversity, and creativity
- There is growing concern that IT only improves performance on tasks that are heavily represented in the IT training dataset
- There is also sharp criticism that IT only captures surface-level patterns and styles (e.g., the output format) rather than genuinely comprehending and learning the task
3/ Outline
- Section 2: general methodology employed in instruction finetuning
- Section 3: construction process of IT datasets
- Section 4: instruction-finetuned models
- Section 5: multi-modality techniques and datasets for instruction tuning (images, speech, video)
- Section 6: adapting LLMs to different domains and applications using IT strategies
- Section 7: making finetuning more efficient, reducing computational and time costs
- Section 8: evaluation of IT models