Test1

2025-02-28

🧩 Background

Chain-of-Thought (CoT) prompting enables powerful multi-step reasoning in LLMs. But transferring this ability into small language models (SLMs) remains a major challenge.


🧪 Experiment Design

We study three axes:

  1. Granularity of intermediate steps
  2. Supervision format: explanation-only vs answer+reasoning
  3. Teacher model: GPT-3.5 vs GPT-4

We evaluate on 7 datasets, including GSM8K, SVAMP, and DROP.
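As a minimal sketch of the supervision-format axis (all field names and templates here are illustrative assumptions, not the actual data schema), the teacher traces could be serialized into fine-tuning targets like this:

```python
# Sketch: building distillation targets under different supervision formats.
# Templates and format names are illustrative assumptions.

def make_target(question: str, reasoning: str, answer: str, fmt: str) -> str:
    """Serialize a teacher trace into a single fine-tuning target string."""
    if fmt == "explanation_only":
        # Student is trained to reproduce the reasoning chain alone.
        return f"Question: {question}\nReasoning: {reasoning}"
    if fmt == "answer_plus_reasoning":
        # Student is trained on the reasoning followed by the final answer.
        return f"Question: {question}\nReasoning: {reasoning}\nAnswer: {answer}"
    if fmt == "answer_only":
        # Baseline with no reasoning signal at all.
        return f"Question: {question}\nAnswer: {answer}"
    raise ValueError(f"unknown format: {fmt}")

example = make_target(
    "If 3 pens cost $6, what do 5 pens cost?",
    "Each pen costs 6/3 = $2, so 5 pens cost 5*2 = $10.",
    "$10",
    "answer_plus_reasoning",
)
```

The point of the sketch is that the three formats differ only in which parts of the teacher trace reach the student's loss, which is exactly what the second axis varies.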


📈 Key Findings

  • Fine-grained step supervision leads to more generalizable reasoning
  • Answer-only distillation fails to transfer reasoning skills
  • GPT-4 outputs yield more structured logic chains than GPT-3.5
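The granularity axis behind the first finding can be sketched as follows: instead of one monolithic explanation target, a coarse teacher chain is split into per-step prediction pairs. The sentence-level split below is an assumed heuristic for illustration, not the actual segmentation method:

```python
# Sketch: turning one coarse teacher chain into fine-grained step supervision.
# Sentence-boundary splitting is an illustrative heuristic (it would break
# on decimals, abbreviations, etc.).

def split_steps(chain: str) -> list[str]:
    """Split a reasoning chain into individual steps at sentence ends."""
    return [s.strip() for s in chain.split(".") if s.strip()]

def step_targets(question: str, chain: str) -> list[tuple[str, str]]:
    """Build (prefix, next_step) pairs: predict each step from the prior ones."""
    steps = split_steps(chain)
    pairs = []
    for i, step in enumerate(steps):
        prefix = (question + " " + " ".join(steps[:i])).strip()
        pairs.append((prefix, step))
    return pairs

chain = "Each pen costs 6/3 = $2. So 5 pens cost 5*2 = $10."
targets = step_targets("If 3 pens cost $6, what do 5 pens cost?", chain)
# One training pair per fine-grained step.
```

Each pair supervises a single reasoning step in context, giving the student a denser signal than a single chain-level target.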

💡 Takeaway

Distillation is not copying, it's translation.

"The way we teach reasoning determines what reasoning emerges."