Ai – {IT}

The 1M context window, and what it actually costs you 💸

A 1M-token LLM context window is a tool, not a target. This post covers how context actually works, why long threads cost more on every turn, and when to start a fresh chat versus keep going, with practical Claude Code tips, including /context.
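
To make the "cost more on every turn" point concrete, here is a rough sketch of the arithmetic. It assumes a simplified model that is not taken from the post: every request resends the full conversation history as input, each turn adds a fixed number of tokens, and there is no prompt caching. The function name and the token figures are illustrative only.

```python
def cumulative_input_tokens(turn_sizes):
    """Sum the input tokens sent over a whole thread.

    Assumption: each request resends all prior turns, so the input for
    turn k is the sum of the sizes of turns 1..k (no caching discounts).
    """
    total = 0
    history = 0
    for size in turn_sizes:
        history += size   # the thread grows by this turn
        total += history  # and the whole thread is sent again
    return total


# Hypothetical workload: 20 turns of 1,000 tokens each.
one_long_thread = cumulative_input_tokens([1000] * 20)

# Same 20 turns, split into two fresh chats of 10 turns each.
two_fresh_chats = 2 * cumulative_input_tokens([1000] * 10)

print(one_long_thread)  # 210000
print(two_fresh_chats)  # 110000
```

Under these assumptions the total input grows roughly with the square of the thread length, which is why splitting a long thread can cut the total noticeably. Real numbers will differ once caching and varying turn sizes are involved.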

A minimal LLM Ops stack with tracing and model costs

Building a minimal LLM Ops stack: a FastAPI “customer support reply drafter” instrumented with Langfuse for request tracing, grounded retrieval, and per-request model cost tracking, so every LLM call is inspectable.
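
As a rough illustration of what per-request cost tracking boils down to, the sketch below computes the cost of one call from its token counts. It is not the post's implementation and does not use the Langfuse API; the `Pricing` class, the `request_cost` function, and the per-million-token prices are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Hypothetical price table entry, in dollars per million tokens."""
    input_per_million: float
    output_per_million: float


def request_cost(input_tokens: int, output_tokens: int, pricing: Pricing) -> float:
    """Return the dollar cost of a single LLM call from its token usage."""
    return (
        input_tokens * pricing.input_per_million
        + output_tokens * pricing.output_per_million
    ) / 1_000_000


# Placeholder prices, not real rates for any specific model.
example_pricing = Pricing(input_per_million=3.0, output_per_million=15.0)

# One support-reply draft: 2,000 prompt tokens in, 500 tokens out.
cost = request_cost(2000, 500, example_pricing)
print(f"${cost:.4f}")  # $0.0135
```

In a traced setup, a value like this would typically be attached to each request's trace so that cost can be inspected alongside the prompt, retrieved context, and response.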