#atom

Eliciting step-by-step reasoning in language models

Core Idea: Chain-of-Thought prompting generates a sequence of intermediate reasoning steps before arriving at a final answer, significantly improving performance on complex reasoning tasks, especially with larger models.

Key Elements

Key Principles

  1. Step-by-Step Reasoning:

    • Breaking down complex problems into sequential reasoning steps
    • Creates explicit "working memory" within the generation context
    • Mimics the human think-aloud approach to problem solving
  2. Model Size Dependence:

    • Benefits emerge primarily in larger models (roughly 100B+ parameters in the original study)
    • Represents an emergent capability not explicitly trained for
    • Effectiveness scales with model size and quality
  3. Task Complexity:

    • Most effective for tasks requiring multi-step reasoning
    • Particularly valuable for mathematics, logic, and complex problem solving
    • Helps overcome working memory limitations in LLMs

Implementation Approaches

  1. Few-shot CoT:

    • Provide examples with detailed reasoning chains
    • Model learns to mimic the reasoning pattern
    • Typically 2-8 examples are sufficient for strong results
    • Example format: "Question... Let's think about this step by step. First... Next... Therefore..."
  2. Zero-shot CoT:

    • Use phrases like "Let's think step by step" to trigger reasoning
    • No examples needed, works through direct instruction
    • Surprisingly effective across many reasoning tasks
    • Less powerful than few-shot but more efficient to implement
  3. Step Separation:

    • Separate reasoning steps with newlines; cleaner formatting tends to improve performance
    • Clear delineation between reasoning steps and final answer
    • Consistent formatting across examples
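The few-shot and zero-shot variants above can be sketched as plain string templates. This is a minimal illustration; the example question, demo text, and function names are hypothetical, not from any specific library:

```python
# Sketch of prompt construction for the two CoT variants.
# The baker example below is an illustrative placeholder demo.

FEWSHOT_EXAMPLE = (
    "Question: A baker has 24 rolls and packs them 6 to a box. "
    "How many boxes does she fill?\n"
    "Answer: Let's think about this step by step.\n"
    "First, each box holds 6 rolls.\n"
    "Next, 24 / 6 = 4.\n"
    "Therefore, the answer is 4.\n"
)

def few_shot_cot_prompt(question: str, examples: list[str]) -> str:
    """Prepend worked examples (with reasoning chains) to the new question."""
    demos = "\n".join(examples)
    return (f"{demos}\nQuestion: {question}\n"
            "Answer: Let's think about this step by step.\n")

def zero_shot_cot_prompt(question: str) -> str:
    """No examples; the trigger phrase alone elicits step-by-step reasoning."""
    return f"Question: {question}\nAnswer: Let's think step by step.\n"
```

Either string would then be sent to the model as-is; the newline-separated steps in the demo follow the step-separation convention above.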

Advanced Techniques

  1. Self-consistency:

    • Generate multiple reasoning paths
    • Select most consistent answer across paths
    • Reduces impact of reasoning errors in any single path
  2. Verification Steps:

    • Encourage the model to verify intermediate conclusions
    • Explicitly fact-check generated statements
    • Recognize and correct errors mid-reasoning
  3. Tree of Thoughts:

    • Explores multiple reasoning branches
    • Allows backtracking when paths lead to errors
    • More comprehensive exploration of solution space
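Self-consistency reduces to a majority vote over the final answers of several sampled reasoning paths. A minimal sketch, assuming each path ends its chain with a "Therefore," sentence (the function names and answer format are assumptions for illustration):

```python
from collections import Counter

def extract_answer(reasoning: str) -> str:
    """Take the text after the last 'Therefore,' as the final answer
    (assumes the answer format used in this note's examples)."""
    marker = "Therefore,"
    tail = reasoning.rsplit(marker, 1)[-1] if marker in reasoning else reasoning
    return tail.strip().rstrip(".").lower()

def self_consistent_answer(paths: list[str]) -> str:
    """Majority vote over answers extracted from sampled reasoning paths."""
    counts = Counter(extract_answer(p) for p in paths)
    return counts.most_common(1)[0][0]
```

In practice the paths come from sampling the same CoT prompt several times at nonzero temperature; an error in any single chain is outvoted by the consistent majority.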

Applications

Why It Matters

Example

Question: Marty has 100 centimeters of ribbon that he must cut into 4 equal parts. Each of the cut parts must be divided into 5 equal parts. How long will each final cut be?

Answer: Let's think step by step.

First, I need to find the length of each of the 4 equal parts.
100 centimeters ÷ 4 = 25 centimeters per part.

Next, each 25-centimeter part needs to be divided into 5 equal parts.
25 centimeters ÷ 5 = 5 centimeters per final cut.

Therefore, each final cut will be 5 centimeters long.
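The two division steps in the worked answer can be checked directly:

```python
# Verifying the ribbon example's arithmetic.
total_cm = 100
part = total_cm / 4       # 25.0 cm per part after the first cut
final_piece = part / 5    # 5.0 cm per final piece
print(final_piece)        # 5.0
```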

Connections

References

  1. Wei, J., et al. (2022). "Chain-of-Thought Prompting Elicits Reasoning in Large Language Models." NeurIPS.
  2. Kojima, T., et al. (2022). "Large Language Models are Zero-Shot Reasoners." NeurIPS.
  3. Wang, X., et al. (2023). "Self-Consistency Improves Chain of Thought Reasoning in Language Models." ICLR.
  4. Fu, Y., et al. (2023). "Complexity-Based Prompting for Multi-step Reasoning." ICLR.

#ChainOfThought #CoT #Reasoning #StepByStep #ProblemSolving #MathematicalReasoning #PromptEngineering
