Audio Edition: How Distillation Makes AI Models Smaller and Cheaper
5/14/2026 · 8 min
Fundamental technique lets researchers use a big, expensive “teacher” model to train a “student” model for less.
The story “How Distillation Makes AI Models Smaller and Cheaper” first appeared on Quanta Magazine.
Transcript preview
First 90 seconds
Susan Valot · Host · 0:00
[upbeat music] Welcome to the Quanta Audio Edition. In each of these biweekly episodes, we bring you a story direct from the Quanta website about developments in basic science and mathematics. I'm Susan Valot. The Chinese AI company DeepSeek released a chatbot last year called R1, which drew a huge amount of attention. Much of the news coverage implied that the company discovered a new, more efficient way to build AI. But instead, it used a fundamental technique. That's next.

[upbeat music] Quanta Magazine is an editorially independent online publication supported by the Simons Foundation to enhance public understanding of science.

[upbeat music] DeepSeek’s R1 chatbot sent a ripple through the industry. Most of the attention focused on the fact that a relatively small and unknown company said it had built a chatbot that rivaled the performance of those from the world’s most famous AI companies but using a fraction of the computer power and cost. As a result, the stocks of many Western tech companies plummeted. Nvidia, which sells the chips that run leading AI models, lost more stock value in a single day than any company in history.