The memory wall on context
1:53:07–1:54:29 · 83s
Reiner argues that context length stalls around 100–200K tokens because of memory bandwidth and capacity, not compute; sparse attention only helps so much.
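A rough back-of-the-envelope sketch of why memory could be the binding constraint. It is not taken from the clip: the model shape (80 layers, 8 KV heads, head dimension 128, fp16) and the 3 TB/s bandwidth figure are hypothetical values chosen for illustration, and the formula assumes a standard dense-attention KV cache that is read in full for every decoded token.

```python
def kv_cache_bytes(context_tokens, n_layers, n_kv_heads, head_dim, bytes_per_elem=2):
    """Size of a dense-attention KV cache for one sequence.

    The leading factor of 2 accounts for storing both keys and values.
    """
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem * context_tokens


def decode_read_seconds(cache_bytes, bandwidth_bytes_per_s):
    """Lower bound on time per decoded token if the whole cache is read once."""
    return cache_bytes / bandwidth_bytes_per_s


if __name__ == "__main__":
    # Hypothetical model shape and hardware numbers, for illustration only.
    context = 128 * 1024          # 128K tokens
    cache = kv_cache_bytes(context, n_layers=80, n_kv_heads=8, head_dim=128)
    bandwidth = 3e12              # assumed 3 TB/s of memory bandwidth

    print(f"KV cache: {cache / 2**30:.1f} GiB per sequence")
    print(f"Read time per decoded token: {decode_read_seconds(cache, bandwidth) * 1e3:.1f} ms")
```

Under these assumptions the cache grows linearly with context, so doubling the context doubles both the capacity needed per sequence and the bytes that must be streamed per generated token, regardless of how much compute is available.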