1/ What is instruction tuning
Instruction tuning (IT) refers to the process of further training LLMs on a dataset of (instruction, output) pairs in a supervised fashion.
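To make this concrete, here is a minimal sketch of how a single (instruction, output) pair could be turned into a supervised training example. The prompt template and the toy whitespace tokenizer are illustrative assumptions, not from any specific library; the key idea is that the loss is computed only on the output tokens.

```python
# Sketch: converting an (instruction, output) pair into a supervised
# training example for instruction tuning. The template and tokenizer
# below are illustrative assumptions, not a specific library's API.

PROMPT_TEMPLATE = "### Instruction:\n{instruction}\n\n### Response:\n"
IGNORE_INDEX = -100  # conventional label value skipped by the LM loss

def tokenize(text):
    # Stand-in for a real subword tokenizer.
    return text.split()

def build_example(instruction, output):
    prompt_ids = tokenize(PROMPT_TEMPLATE.format(instruction=instruction))
    output_ids = tokenize(output)
    input_ids = prompt_ids + output_ids
    # Supervise only the output tokens: the model is trained to produce
    # the response, not to reproduce the instruction itself.
    labels = [IGNORE_INDEX] * len(prompt_ids) + output_ids
    return {"input_ids": input_ids, "labels": labels}

example = build_example("Translate to French: Hello", "Bonjour")
```

Masking the instruction tokens out of the labels is a common (though not universal) design choice; some recipes also train on the prompt tokens.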
2/ Why instruction tuning is needed
- Problem: mismatch between the training objective and the user’s objective. LLMs are trained on a next-token prediction task, while users want the model to follow their instructions
- The benefits of instruction tuning
- IT bridges the gap between the next-word prediction objective of LLMs and the user’s objective of instruction following.
- IT allows for a more controllable and predictable model behavior compared to standard LLMs. The instructions serve to constrain the model’s outputs to align with the desired response characteristics or domain knowledge.
- IT is computationally efficient and can help LLMs rapidly adapt to a specific domain without extensive retraining or architectural changes.
- The challenges of instruction tuning
- Crafting high-quality instructions that properly cover the desired target behaviors is non-trivial: existing instruction datasets are usually limited in quantity, diversity, and creativity
- There is growing concern that IT only improves performance on tasks that are heavily represented in the IT training dataset
- There is also sharp criticism that IT only captures surface-level patterns and styles (e.g., the output format) rather than genuinely comprehending and learning the task
3/ Outline
- Section 2: general methodology employed in instruction finetuning
- Section 3: construction process of IT datasets
- Section 4: instruction-finetuned models
- Section 5: multi-modality techniques and datasets for instruction tuning (images, speech, video)
- Section 6: adapting LLMs to different domains and applications using IT strategies
- Section 7: making finetuning more efficient, reducing computational and time costs
- Section 8: evaluation of IT models