Posts

All the articles I've posted.

Stop Fine-Tuning. Start Retrieving. (Usually.)

19 Mar, 2024

A decision framework for RAG versus fine-tuning that is not "it depends." Three questions settle most of it, and the cases where fine-tuning actually wins are narrower than the budget requests suggest.

Evals Are the New Unit Tests (And You're Not Writing Them)

13 Feb, 2024

Shipping an LLM feature with no evals is shipping with no tests, and almost everyone is doing it. A small, hand-written harness you run on every change, plus the honest limits of grading with another model.

Your RAG Is Bad Because Your Chunking Is Bad

16 Jan, 2024

A year into production RAG, the retrieval problems teams keep blaming on the model are almost always chunking, metadata, and document structure. Concrete fixes, with the splitting code I actually run.

Five Hundred Engineers, Four Countries, and Conway's Law

12 Dec, 2023

Across four countries and a few hundred engineers, the system came to look exactly like the org chart. After years of fighting that, I started using it on purpose.

Stop Fine-Tuning. Start Retrieving. (Usually.)