A minimal LLM Ops stack with tracing and model costs

This post builds a minimal LLM Ops stack: a FastAPI “customer support reply drafter” with grounded retrieval, instrumented with Langfuse for request tracing and per-request model cost tracking, so every LLM call is inspectable.

January 14, 2026 · 11 min