writing

I like writing about things I'm actually building — real numbers from real projects, honest about what worked and what didn't. Each collection is a long-running series on one project. New ones show up as I start working on new things.

last updated · may 19, 2026 · actively shipping
collections
what is inference, actually?
1 article · standalone

A walk through what actually happens between hitting enter on a chatbot and the answer finishing. Anchored to Llama 3.1 8B on an H100, end to end: tokenization, embedding, all 32 transformer layers, the decode loop, and where every inference company is competing for an edge.

inference · transformers · gpu
search engine in rust
14 articles · in progress

Started as a chapter-by-chapter walk through Manning, Raghavan, and Schütze's Introduction to Information Retrieval, building each piece in Rust on the 20 Newsgroups corpus. Now running BM25F over AWS documentation: 14,266 markdown files across 18 services. Scoring is field-aware across title, headers, code, and body. CamelCase + underscore tokenization means API_RunInstances matches a query for "run instances" too. Next: PageRank.

rust · information retrieval · bm25f
just shipped BM25F + CamelCase tokenization · 3 new pieces
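
For a concrete picture of the CamelCase + underscore tokenization mentioned above, here is a minimal sketch. It is not the project's code: the function names `split_camel` and `tokenize` are hypothetical, and it assumes tokens are lowercased and split on any non-alphanumeric character (which covers underscores) before the case boundaries are handled.

```rust
/// Split one alphanumeric chunk on case boundaries and lowercase the parts.
/// Hypothetical helper for illustration only.
fn split_camel(chunk: &str) -> Vec<String> {
    let chars: Vec<char> = chunk.chars().collect();
    let mut parts = Vec::new();
    let mut start = 0;

    for i in 1..chars.len() {
        let prev = chars[i - 1];
        let cur = chars[i];
        let next_is_lower = chars.get(i + 1).map_or(false, |c| c.is_lowercase());

        // Boundary 1: lowercase letter or digit followed by uppercase ("runInstances").
        let lower_to_upper =
            (prev.is_lowercase() || prev.is_ascii_digit()) && cur.is_uppercase();
        // Boundary 2: end of an acronym run ("DBInstances" -> "DB" | "Instances").
        let acronym_end = prev.is_uppercase() && cur.is_uppercase() && next_is_lower;

        if lower_to_upper || acronym_end {
            parts.push(chars[start..i].iter().collect::<String>().to_lowercase());
            start = i;
        }
    }
    if start < chars.len() {
        parts.push(chars[start..].iter().collect::<String>().to_lowercase());
    }
    parts
}

/// Tokenize text: split on non-alphanumeric characters (including '_'),
/// then split each chunk on case boundaries.
pub fn tokenize(text: &str) -> Vec<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|s| !s.is_empty())
        .flat_map(split_camel)
        .collect()
}
```

Under these assumptions, `tokenize("API_RunInstances")` yields `api`, `run`, `instances`, so a query tokenized the same way as `run`, `instances` shares two terms with the identifier. A real index might also keep the unsplit identifier as an extra token so exact matches still rank highest; that choice is left out of the sketch.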
filmsearch
2 articles · complete

A semantic movie search engine. The first attempt failed: I was throwing LLMs at retrieval without understanding retrieval. The second attempt worked: BM25 from scratch with multi-zone ranking. A startup founder used it to track down a Japanese animated film he couldn't find anywhere else.

bm25 · semantic search · failed → shipped
what's coming
I take on one project at a time and write through it. When the Rust search engine wraps, the next collection shows up here.