LLM – {IT}

Two brown leaf-mimic butterflies on white celosia flower spikes, one in the foreground left, the other at the top right of the frame

The literal reader

Why writing culture matters more in the AI era: LLM agents are “literal readers” that do exactly what your docs say and fill in none of the gaps a human silently patches. Built on a shared STATUS.md, the case for writing things down, and where it tips into over-documenting.

A lone sailboat on a calm, wide-open sea

The 1M context window, and what it actually costs you 💸

A 1M-token LLM context window is a tool, not a target. How context actually works, why long threads cost more on every turn, and when to start a fresh chat versus keep going, with practical Claude Code tips and /context.

STATUS.md: a shared file for multi-agent work

A file-based pattern for coordinating multiple LLM agents on one task: a single shared, per-feature STATUS.md the agents read and write under explicit rules, how to make them follow a protocol and survive context loss, and when not to use it.

A minimal LLM Ops stack with tracing and model costs

Building a minimal LLM Ops stack: a FastAPI “customer support reply drafter” instrumented with Langfuse for request tracing, grounded retrieval, and per-request model cost tracking, so every LLM call is inspectable.

RAG: A (mostly) no-buzzword explanation

Retrieval-Augmented Generation (RAG) explained without buzzwords: how it gives an LLM the right data at answer time to fix stale knowledge and hallucinations, the step-by-step flow, its benefits over fine-tuning, and when RAG is not the answer.