📝 Blog

Thoughts, experiments, and insights across RLHF, LLMs, Embodied AI, and beyond.

Tags: LLMs, CoT Distillation, Model Compression, RLHF, Philosophy, AI Ethics, Embodied AI, Neuroscience, RL

Test1

2025-02-28

How granularity, supervision format, and teacher models affect CoT distillation into smaller LMs across 7 benchmarks.

2024-03-12

Reinforcement Learning from Human Feedback isn’t just a technical optimization but also a philosophical imperative in agent design.

2024-01-25

Why grounding intelligence in a physical world is essential for the next generation of agents.