A minimal LLM Ops stack with tracing and model costs
January 14, 2026 by Igor Computers
I built a minimal FastAPI “customer support reply drafter” with TF-IDF retrieval and Langfuse tracing. You’ll see exactly what context the model used, where latency came from, and what each request cost, plus the trade-offs behind the design.