Gradient Descent Inside Prompts
14:48–15:31 · 43s
He suggests in-context learning might literally run tiny gradient-descent loops within a transformer’s layers.
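The idea can be illustrated with the standard toy construction from the literature: for a linear-regression task, one unnormalized linear-attention readout over in-context examples produces the same prediction as one gradient-descent step on the squared loss from zero weights. This is a minimal sketch of that equivalence, not the speaker's own code; the dimensions, learning rate, and random data are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 4, 8      # feature dimension, number of in-context examples (assumed)
lr = 0.1         # learning rate, folded into the attention readout below

# In-context examples (x_i, y_i) and a held-out query x_q.
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d)          # targets from a hidden linear map
x_q = rng.normal(size=d)

# One gradient-descent step on L(w) = 0.5 * sum_i (y_i - w.x_i)^2, from w = 0.
w = np.zeros(d)
grad = -X.T @ (y - X @ w)           # gradient at w = 0 is -X.T @ y
w_gd = w - lr * grad
pred_gd = w_gd @ x_q

# Unnormalized linear attention: keys x_i, values y_i, query x_q.
pred_attn = lr * np.sum(y * (X @ x_q))

# The attention readout and the one-step GD prediction coincide.
print(np.isclose(pred_gd, pred_attn))
```

The point of the sketch: the attention sum `Σ y_i (x_i · x_q)` is algebraically identical to predicting with the weights produced by one gradient step, which is the sense in which a transformer layer could "run" a step of gradient descent in-context.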