Archives

All the articles I've archived.

2026 ¹⁴

July ²

Skynet Has a Ticker Symbol

20 Jul, 2026

The AI labs talk like they are going to run the world. They do not own themselves. One firm sits near the top of the shareholder register at all of them, votes those shares, supplies the risk model the rest of finance looks through, and is buying the power stations and buildings the whole boom physically stands on.

Motto vs Mechanism

2 Jul, 2026

American Express held reserves and paid a subsidiary's debts. Amazon lets bad reviews kill sales. Grab and Shopee have mottos too, and the margin squeeze is deciding what they mean.

June ⁴

OpenClaw and the Agents That Act: What Non-Technical People Should Actually Know

13 Jun, 2026

A plain-English guide to OpenClaw and the new wave of AI that does things instead of just talking. What it is, how it compares to the big-company versions, and the one idea everyone needs before they hand it the keys.

I Asked My AI Agent for an Itemized Bill. It Got Awkward.

13 Jun, 2026

A month of heavy AI agent use, itemized: only a fifth of the tokens wrote code, almost half could run on a cheaper model, and the prompt cache quietly bills you for every coffee break.

What Eighteen Years of Platform Builds Taught Me About AI Hype

9 Jun, 2026

After eighteen years of watching big data, mobile, microservices, cloud, and now AI agents arrive on the same script, here is how I separate the durable capability from the narrative.

Building GenAI for Regulated Industries Without Getting Fired

2 Jun, 2026

Two ways to fail when you ship GenAI into a regulated business: never ship at all, or ship something nobody can audit. The narrow path between them, from building this in financial services and industrial operations.

May ²

MCP a Year In: What Held Up, What Didn't

19 May, 2026

Sixteen months of building production systems on the Model Context Protocol. The interoperability bet paid off. Auth, versioning, and the demo-to-production gap are still where teams bleed.

Fine-Tuning Small Models in 2026: A Practical Pipeline

5 May, 2026

An end-to-end pipeline for fine-tuning a small model in 2026: distill the data, train adapters, hold an eval bar, ship behind a canary, and watch for the drift that quietly eats your accuracy.

April ²

Human-in-the-Loop Is a Design Problem, Not a Safety Net

21 Apr, 2026

Bolting a human approval step onto a badly designed agent does not make it safe. It makes a rubber stamp. The human-in-the-loop work is UX and architecture, not a checkbox.

The Cost of Agents: A FinOps Model for Token-Hungry Systems

7 Apr, 2026

An agent that loops and calls tools spends tokens like a single prompt never does. Here is the cost model that explains the bill, and the levers that cut it without making the agent dumber.

March ²

Eval-Driven Development: How I Actually Build LLM Features Now

24 Mar, 2026

My day-to-day loop for LLM features in 2026: write the eval first, then the prompt, then the code, and fold every production failure back in as a case.

Agentic Transaction Systems: Moving Money With Machines You Can Audit

10 Mar, 2026

An agent that moves money is only acceptable if every decision it makes can be explained, replayed, and pinned to a responsible party after the fact. The audit layer is the product.

February ²

Resolving Stuck Receivables With RAG and Agents

24 Feb, 2026

A production system that resolves stuck accounts-receivable mismatches by retrieving over invoices, contracts, remittances, and email, then proposing a fix a human approves before any money moves.

The Enterprise AI OS: My Thesis for the Next Five Years

10 Feb, 2026

The durable enterprise AI layer is not a model or a chatbot. It is an operating system that gives agents identity, permissions, tools, memory, and an audit trail over the systems a company already runs.

2025 ¹¹

November ¹

Multi-Agent Systems: When One Agent Isn't Enough

18 Nov, 2025

Most multi-agent designs are one agent's job split across five processes that now have to argue with each other. The few cases where splitting actually pays, and the complexity to refuse.

October ¹

The LLM Observability Stack I Wish I'd Built Sooner

21 Oct, 2025

What to instrument for LLM and agent apps before the first incident: full request and tool-call tracing, token and cost per request, latency breakdown, eval scores in production, and turning real failures into eval cases.

September ¹

Anomaly Detection for Cash Flow: Less Magic, More Plumbing

16 Sep, 2025

Catching reconciliation anomalies across institutional financial records, where the model is a rounding error and the matching and normalization are the whole job.

August ¹

Why I Left a Director Seat to Build Again

19 Aug, 2025

I left a senior leadership seat at a large company to found an AI venture and write code again. The pull, the discomfort, and what eighteen years of platform work left me wanting to do.

July ¹

Multimodal AI in the Field: Voice, Image, Form, Action

15 Jul, 2025

A closed-loop field inspection system that turns voice, a photo, and a half-filled form into a structured action, built for places where the network drops for hours.

June ¹

RAG Over Enterprise Records: The Boring Parts That Matter

17 Jun, 2025

Enterprise RAG is trustworthy because of the unglamorous parts: per-user permissions enforced at retrieval, freshness, lineage, and handling records that change. Retrieval is an access-control problem wearing a search costume.

May ¹

Sovereign AI: Running GPUs On-Prem When the Cloud Isn't an Option

20 May, 2025

For regulated workloads where the data legally cannot leave a building, on-prem GPU inference is back. The build-vs-rent math, the constraints nobody prices in, and the software that makes a fixed fleet feel like a platform.

April ¹

Small Fine-Tuned Models Are Beating Frontier on My Workloads

15 Apr, 2025

On narrow, high-volume tasks a fine-tuned small model matches frontier quality at a fraction of the cost and latency. Here is the pipeline, the eval bar, and the maintenance bill nobody quotes you.

March ¹

Agentic Workflows Need Guardrails, Not Vibes

18 Mar, 2025

How to put real constraints around an agent that touches money or production: bounded tools, approval gates on irreversible actions, dry-run modes, spend limits, and a tool-call audit trail you can actually read.

February ¹

Building an MCP Server Fabric for Financial Operations

18 Feb, 2025

Instead of one large agent wired to every financial system, a fabric of small MCP servers, each wrapping one system with tightly scoped tools and an approval gate on anything that writes.

January ¹

MCP Is the USB-C of AI Tools. Here's Why I'm Betting on It.

21 Jan, 2025

The Model Context Protocol standardizes how models reach tools and data, the way a connector standard kills a drawer full of adapters. The ecosystem is thin. I'm betting on the protocol anyway.

2024 ¹²

December ¹

Shipping ML to Twenty Teams: The Platform Bet That Paid Off

10 Dec, 2024

Two years of running a self-service ML platform across twenty-odd product teams. What the paved paths got right, what we built too early, and the only success metric that turned out to matter.

November ¹

Data Residency Is an Architecture Constraint, Not a Checkbox

19 Nov, 2024

A national regulator's residency and sovereignty rules redrew the topology of a payments system. Where data lives, what can leave, and where the keys sit are architecture decisions, not a config flag you toggle at the end.

October ¹

Agents Are Coming. Most Demos Are Lying.

15 Oct, 2024

A skeptical look at agent reliability in late 2024: where the impressive demos quietly fall apart in production, and the narrow places where agents already pull their weight.

September ¹

Post-Merger Tech Integration: Ten Systems, Nine Months, Zero Downtime

17 Sep, 2024

Two large platforms merged, and now we owned two of everything. How we collapsed the overlap into one stack without a single customer feeling it, and why the politics were harder than the code.

August ¹

Getting JSON Out of LLMs Without Crying

20 Aug, 2024

Function calling and JSON mode get you syntactically valid JSON. They do nothing about a model that fills the right shape with confident nonsense. The validation-and-repair layer you still have to write.

July ¹

Negotiating a Nine-Figure Cloud Deal: What Engineers Should Know

16 Jul, 2024

A multi-year hyperscaler commitment is an architecture decision wearing a procurement costume, and engineers who skip the room get the bill.

June ¹

The Lakehouse Won. Here's the Migration Nobody Warns You About.

18 Jun, 2024

Moving a multi-petabyte warehouse to an open table format over object storage. The format is a weekend. The operations are the project, and nobody puts that in the deck.

May ¹

Hybrid Search: BM25 and Embeddings Are Better Together

14 May, 2024

Pure vector search quietly fails on the exact terms, codes, and acronyms users actually type. Combining BM25 with dense retrieval, fusing the two, and paying the latency bill that comes with it.

April ¹

vLLM, Quantization, and Serving LLMs on a Budget

16 Apr, 2024

Self-hosting an open model when GPUs are scarce and finance is reading the bill. Continuous batching, KV-cache, what quantization actually costs you, and when to just call a hosted API instead.

March ¹

Stop Fine-Tuning. Start Retrieving. (Usually.)

19 Mar, 2024

A decision framework for RAG versus fine-tuning that is not "it depends." Three questions settle most of it, and the cases where fine-tuning actually wins are narrower than the budget requests suggest.

February ¹

Evals Are the New Unit Tests (And You're Not Writing Them)

13 Feb, 2024

Shipping an LLM feature with no evals is shipping with no tests, and almost everyone is doing it. A small, hand-written harness you run on every change, plus the honest limits of grading with another model.

January ¹

Your RAG Is Bad Because Your Chunking Is Bad

16 Jan, 2024

A year into production RAG, the retrieval problems teams keep blaming on the model are almost always chunking, metadata, and document structure. Concrete fixes, with the splitting code I actually run.

2023 ¹²

December ¹

Five Hundred Engineers, Four Countries, and Conway's Law

12 Dec, 2023

Across four countries and a few hundred engineers, the system came to look exactly like the org chart. After years of fighting that, I started using it on purpose.

November ¹

BNPL From Scratch: Underwriting, Disbursement, Collections

14 Nov, 2023

Building a buy-now-pay-later product end to end and selling it to banks. The credit model is the easy part. Disbursement, reconciliation, and collections are where it lives or dies.

October ¹

Fraud Detection at Sub-200ms: The Latency Budget Nobody Talks About

17 Oct, 2023

Real-time fraud scoring that lives inside the payment authorization path has a tiny latency budget, and a slightly worse model that fits it beats a better one that does not.

September ¹

A Forecasting Ensemble That Actually Ships

19 Sep, 2023

A demand-forecasting ensemble (a classical statistical model, a sequence model, and gradient boosting) that took accuracy far enough to cut inventory hard, plus the boring data problems that mattered more than the model.

August ¹

Llama 2 Is Here. Should You Self-Host?

15 Aug, 2023

The week Llama 2 dropped, half my inbox asked whether to pull inference in-house. The break-even math, the GPU scarcity, and the on-call tax nobody puts in the spreadsheet.

July ¹

Platform Engineering: Paving Roads vs Building Cages

18 Jul, 2023

An internal ML platform that dozens of teams actually used, and the one test that told me whether I was paving a road or building a cage around it.

June ¹

The Real Cost of a Customer Data Platform

13 Jun, 2023

Unifying a customer record across several business units sounds like a data project. It is mostly an ownership fight, and the maintenance never ends.

May ¹

pgvector vs the Vector DB Gold Rush

9 May, 2023

Most teams adding semantic search this year should start in the Postgres they already run, not a new vector database. Where pgvector holds, where it doesn't, and how to tell which side of the line you are on.

April ¹

Building a RAG Pipeline Before LangChain Was Cool

18 Apr, 2023

A production retrieval pipeline over a few hundred thousand internal documents, hand-rolled in early 2023. The model is the easy part. Retrieval is where the quality lives or dies.

March ¹

Your Recommendation Engine Doesn't Need Deep Learning (Yet)

21 Mar, 2023

Collaborative filtering and plain co-occurrence carried a marketplace recommender to hundreds of millions of recs a day. The exact point where deep ranking earned its complexity, and why most teams reach for it a year early.

February ¹

FinOps Is Just Capacity Planning With a Better Hat

14 Feb, 2023

Owning a large cloud P&L taught me that cost is an engineering-culture problem, not a dashboard. Where the tooling earns its keep and where it is theater.

January ¹

Everyone Wants ChatGPT in Their Product. Most Should Wait.

17 Jan, 2023

Weeks after ChatGPT launched, every exec wants it shipped into the product. Here is the production math most teams have not done yet, and the short list of who should not wait.

2022 ⁶

December ¹

Alternative Credit Scoring When the Bureau Has No File

6 Dec, 2022

Scoring hundreds of thousands of small merchants with no credit-bureau file, where the ensemble was easy and the fairness, reason codes, and feedback loops were the part that took a year.

November ¹

ClickHouse Saved Us Real Money. Here's What It Cost.

15 Nov, 2022

Moving a large analytics workload to ClickHouse cut query latency and the bill hard, but only after we stopped designing tables the way Postgres taught us.

October ¹

Payment Orchestration When Every Method Fails Differently

11 Oct, 2022

A payment orchestration layer across dozens of methods is not a router. It is a failure-handling system, because every method breaks in its own way and you only find out at checkout.

September ¹

Migrating Petabytes Across Four Clouds Without a War Room

20 Sep, 2022

A long migration of hundreds of services and multi-petabyte data across four clouds plus colo, run as boring reversible waves instead of a heroic weekend.

August ¹

The Feature Store Nobody Asked For

9 Aug, 2022

A feature store fixes training/serving skew and lets teams share features, but most adopt it a year early and pay the operational tax for nothing. Here is the line where it flips.

July ¹

Data Mesh Is an Org Chart, Not an Architecture

12 Jul, 2022

Rolling out data mesh across dozens of business units taught me that domain ownership of data is a reporting-line and incentive change first, and a technology choice a distant second.
