The data black hole at the center of AI
6/19/2026 · 12 min
Read the transcript here.
Thanks to Mercury for sponsoring this essay!
Mercury just released a new feature called Command, which gives me AI right in my banking platform. And since I use Mercury to run basically my entire business, Command has access to all the info it needs to get real work done. I can ask it to send invoices, or categorize expenses, or even transfer money… and Command just handles it. Learn more at mercury.com/command
Timestamps:
(00:00:00) – What is really driving AI progress?
(00:03:11) – Comparing human vs AI sample efficiency
(00:08:46) – Does sample efficiency matter?
Get full access to Dwarkesh Podcast at www.dwarkesh.com/subscribe
Transcript preview
First 90 seconds
Dwarkesh Patel · Host · 0:00
So one definition of intelligence is sample efficiency. That is to say, how much data do you need in a given domain to operate fluently and competently? And it's actually not clear that we've made that much progress in training sample efficiency over the last few years. It seems like more so we've just dramatically widened and improved the data distribution.
The main way that AIs have been getting better is from adding more and better data and scaling the compute required to develop that data in the first place. Obviously, RL is the main way that this has happened. You can think of RL as basically a kind of synthetic data generation, where you dump a ton of compute against a verifier or a rubric if you have another one as a judge, and you do this in order to find out what the good data is in the first place. And then you train your model to predict these correct rollouts, much in the same way that you might train that model to predict the next word in internet text.
For this process to work, the model must have at least some prior probability to anticipate the correct solution in the first place, which is why you need mind-stretching amounts of human expert trajectories in every single field and skill that you want the model to eventually be competent in. It's hard to overstate how task-specific and bespoke this human expert data is. If you want some intuition, I recommend checking out the job descriptions on Mercor or Surge's websites. There are listings for Word specialists who will convert legacy documents into polished Word files, and legal experts who will write realistic M&A diligences or securities filings, and management consultants who will write up template market research.
And it is not only that the data have to be so domain-specific, but there has to be so much of it. Each skill corresponds to at least