#atom

The refinement phase that transforms raw language models into helpful assistants

Core Idea: Post-training aligns a pre-trained (base) language model with human preferences and intent, typically through supervised fine-tuning followed by reinforcement learning; this stage gives the model its assistant-like persona and makes its pre-trained capabilities usable.

Key Elements

Supervised Fine-Tuning (SFT): fine-tune the base model on curated prompt-response demonstrations so it learns to imitate high-quality assistant behavior (a minimal loss sketch follows).
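
A minimal sketch of the SFT objective in PyTorch: next-token cross-entropy, masked so only response tokens are supervised. The toy tensors and the single `prompt_len` per batch are simplifying assumptions, not anyone's production setup.

```python
import torch
import torch.nn.functional as F

def sft_loss(logits: torch.Tensor, input_ids: torch.Tensor, prompt_len: int) -> torch.Tensor:
    """Next-token cross-entropy, supervised on response tokens only."""
    shift_logits = logits[:, :-1, :]       # position t predicts token t+1
    labels = input_ids[:, 1:].clone()
    labels[:, : prompt_len - 1] = -100     # mask prompt tokens out of the loss
    return F.cross_entropy(
        shift_logits.reshape(-1, shift_logits.size(-1)),
        labels.reshape(-1),
        ignore_index=-100,
    )

# Toy usage: random tensors stand in for a real model's output.
logits = torch.randn(2, 12, 50)            # (batch, seq_len, vocab)
input_ids = torch.randint(0, 50, (2, 12))
print(sft_loss(logits, input_ids, prompt_len=5))
```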

Reinforcement Learning from Human Feedback (RLHF): fit a reward model on human preference comparisons between model outputs, then optimize the policy against that reward (e.g., with PPO); the reward-model loss is sketched below.
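
A sketch of the pairwise (Bradley-Terry) loss used to fit the reward model in the InstructGPT recipe; `reward_chosen` and `reward_rejected` stand in for scalar scores from a reward-model head.

```python
import torch
import torch.nn.functional as F

def preference_loss(reward_chosen: torch.Tensor, reward_rejected: torch.Tensor) -> torch.Tensor:
    """Bradley-Terry loss: -log sigmoid(r_chosen - r_rejected), averaged over pairs."""
    return -F.logsigmoid(reward_chosen - reward_rejected).mean()

# Toy usage: scalar rewards for four preference pairs.
print(preference_loss(torch.randn(4), torch.randn(4)))
```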

Constitutional AI (CAI): reduce reliance on human labels by having the model critique and revise its own outputs against a written set of principles, with an AI judge supplying preference labels for the RL stage (RLAIF); a critique-and-revision sketch follows.
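
A sketch of CAI's critique-and-revision step, assuming a generic `generate` callable that wraps any LLM completion API; the single principle and the prompt templates here are illustrative, not Anthropic's actual constitution.

```python
# A hypothetical principle; the real constitution is a list of many such rules.
PRINCIPLE = "Choose the response that is most helpful, honest, and harmless."

def critique_and_revise(generate, prompt: str, draft: str) -> str:
    """One round of self-critique and revision against a principle."""
    critique = generate(
        f"Principle: {PRINCIPLE}\n"
        f"Prompt: {prompt}\nResponse: {draft}\n"
        "Critique the response against the principle:"
    )
    revision = generate(
        f"Prompt: {prompt}\nResponse: {draft}\nCritique: {critique}\n"
        "Rewrite the response to address the critique:"
    )
    return revision

# Revised responses become SFT data; an AI judge then ranks candidate outputs
# to provide the preference labels for the RL stage (RLAIF).
```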

Thinking/Reasoning Enhancement: apply RL with verifiable, rule-based rewards (e.g., checking math answers and output format) to elicit long chain-of-thought reasoning, as in DeepSeek-R1; a toy reward function is sketched below.
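
A toy verifiable reward in the spirit of DeepSeek-R1's rule-based rewards (accuracy plus format). The `<think>` tag convention matches the paper, but the weights and the exact-match answer check are simplifying assumptions.

```python
import re

def reasoning_reward(completion: str, gold_answer: str) -> float:
    """Rule-based reward: format check plus exact-match accuracy."""
    reward = 0.0
    # Format reward: the chain of thought must be wrapped in <think> tags.
    if re.search(r"<think>.*?</think>", completion, flags=re.DOTALL):
        reward += 0.1   # weight is an assumption
    # Accuracy reward: the final answer after the think block must match.
    final = completion.split("</think>")[-1].strip()
    if final == gold_answer.strip():
        reward += 1.0
    return reward

# Toy usage
print(reasoning_reward("<think>2 + 2 = 4</think>4", "4"))  # 1.1
```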

Connections

References

  1. Bai et al., "Constitutional AI: Harmlessness from AI Feedback" (Anthropic, 2022)
  2. Ouyang et al., "Training language models to follow instructions with human feedback" (OpenAI, 2022) — the InstructGPT paper describing RLHF
  3. DeepSeek-AI, "DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning" (2025)

#LLM #post-training #RLHF #alignment #fine-tuning

