
Publisher: drli.blog (Dr. Robert Li)
That title is a pratfall. “Compute‑Efficient Frontier, the limits of current generation LLMs and is that really a bad thing?” promises hard limits and delivers soft pudding. If you say “frontier,” show a boundary with numbers, methods, and costs. The edge moves because the math, the data, and the hardware keep getting better. The post flirts with clever caution (“we can leave research to the researchers”) yet selectively ignores research that contradicts its thesis. If you take a contrarian position, you must engage the strongest counterarguments and recent evidence, not cherry-pick comforting historical metaphors.
The post misreads Kaplan. Those scaling curves were measurements, not some gravestone for progress. Then it ignores the obvious follow‑ups that changed the budget math. Chinchilla showed how to split compute between model size and data so you don’t burn money on bloated weights. Retrieval lets you pull facts instead of memorizing everything. Sparse Mixture‑of‑Experts routes tokens to only a few specialists. Parameter‑efficient fine‑tuning like LoRA lets you adapt models without retraining the planet. Distillation and quantization cut inference cost. These are not hype slides. They are why the “frontier” keeps shifting.
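The Chinchilla point is easy to make concrete. A minimal sketch, using two common rules of thumb (the constants here are approximations, not gospel): training compute C ≈ 6·N·D FLOPs for N parameters and D tokens, and a compute‑optimal ratio of roughly 20 training tokens per parameter:

```python
import math

def chinchilla_split(flops_budget: float) -> tuple[float, float]:
    """Split a training-compute budget between parameters and tokens.

    Uses two standard rules of thumb (approximations, not exact fits):
      * training FLOPs:   C ~= 6 * N * D   (N params, D tokens)
      * compute-optimal:  D ~= 20 * N      (Chinchilla-style ratio)
    Substituting gives C ~= 120 * N**2, so N = sqrt(C / 120).
    """
    n_params = math.sqrt(flops_budget / 120)
    n_tokens = 20 * n_params
    return n_params, n_tokens

# A budget of ~5.8e23 FLOPs lands near Chinchilla's actual shape:
# roughly 70B parameters trained on roughly 1.4T tokens.
n, d = chinchilla_split(5.8e23)
print(f"params ~ {n / 1e9:.0f}B, tokens ~ {d / 1e12:.1f}T")
```

Same compute, smaller model, more data: that is the whole point. A Kaplan‑era allocation spends the identical budget on a bigger, worse model, which is why curves measured under one allocation policy say nothing about where the frontier sits under a better one.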
The data doomsaying is lazy. “We’ve consumed the Internet” is a slogan, not a fact. Data isn’t a single pile of stale web text. Private corpora exist. Synthetic data can be generated and filtered. Retrieval and caching push long tails out of the model and into storage. Quality beats quantity more often than the hand‑wringers admit. If you want to claim a terminal shortage, bring evidence.
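The quality‑over‑quantity claim has a simple mechanical core: before you argue the well is dry, dedupe and filter what you have. A toy sketch of that shape (real pipelines use MinHash/LSH for fuzzy dedup and learned quality classifiers; the names and thresholds below are illustrative assumptions):

```python
import hashlib
import re

def normalize(text: str) -> str:
    """Crude canonical form: lowercase and collapse whitespace."""
    return re.sub(r"\s+", " ", text.lower()).strip()

def clean_corpus(docs: list[str], min_words: int = 5) -> list[str]:
    """Toy curation pass: drop exact duplicates (after normalization)
    and junk fragments too short to carry signal."""
    seen: set[str] = set()
    kept: list[str] = []
    for doc in docs:
        norm = normalize(doc)
        if len(norm.split()) < min_words:   # junk: too short to be useful
            continue
        digest = hashlib.sha1(norm.encode()).hexdigest()
        if digest in seen:                  # duplicate after normalization
            continue
        seen.add(digest)
        kept.append(doc)
    return kept

docs = [
    "The quick brown fox jumps over the lazy dog.",
    "the  quick brown fox jumps over the lazy dog.",  # dup after normalization
    "click here",                                     # junk fragment
]
print(len(clean_corpus(docs)))
```

Run curation like this over "the Internet" and the effective corpus shrinks, but what survives trains better per token, which is exactly why raw token counts are the wrong axis for a shortage argument.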
Tone and authority are a mess. Dr. Li drops a credential and slides into sweeping history as cover for not doing the recent reading. If your argument rests on a 2020 pre-ChatGPT paper and ignores the last three years of hard, practical advances that *change the economics of scaling*, you don’t have an argument. If you want to be the sober adult, engage the strongest counterexamples from the last three years and show numbers.