Posts
All the articles I've posted.
-
Stop Fine-Tuning. Start Retrieving. (Usually.)
A decision framework for RAG versus fine-tuning that is not "it depends." Three questions settle most of it, and the cases where fine-tuning actually wins are narrower than the budget requests suggest.
-
Evals Are the New Unit Tests (And You're Not Writing Them)
Shipping an LLM feature with no evals is shipping with no tests, and almost everyone is doing it. A small, hand-written harness you run on every change, plus the honest limits of grading with another model.
-
Your RAG Is Bad Because Your Chunking Is Bad
A year into production RAG, the retrieval problems teams keep blaming on the model are almost always chunking, metadata, and document structure. Concrete fixes, with the splitting code I actually run.
-
Five Hundred Engineers, Four Countries, and Conway's Law
Across four countries and a few hundred engineers, the system came to look exactly like the org chart. After years of fighting that, I started using it on purpose.