<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Ai on {IT}</title><link>https://igortkanov.com/computers/ai/</link><description>Recent content in Ai on {IT}</description><generator>Hugo</generator><language>en-us</language><copyright>Copyright © 2026 {IT}. All rights reserved. Unless otherwise stated, all text, images, diagrams, and other original content on this blog may not be reproduced, distributed, or used without prior written permission.</copyright><lastBuildDate>Wed, 14 Jan 2026 12:30:35 +0000</lastBuildDate><atom:link href="https://igortkanov.com/computers/ai/index.xml" rel="self" type="application/rss+xml"/><item><title>A minimal LLM Ops stack with tracing and model costs</title><link>https://igortkanov.com/minimal-llm-ops-stack-with-tracing-and-model-costs-langfuse/</link><pubDate>Wed, 14 Jan 2026 12:30:35 +0000</pubDate><guid>https://igortkanov.com/minimal-llm-ops-stack-with-tracing-and-model-costs-langfuse/</guid><description>&lt;p&gt;Many &amp;ldquo;LLM app&amp;rdquo; demos stop the moment the model produces a decent-looking answer. Once the app moves past the demo stage, though, harder questions come up:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;What &lt;strong&gt;context&lt;/strong&gt; did the model actually see?&lt;/li&gt;
&lt;li&gt;Did &lt;strong&gt;retrieval&lt;/strong&gt; find anything useful, or nothing at all?&lt;/li&gt;
&lt;li&gt;What did this request &lt;strong&gt;cost&lt;/strong&gt;? How do you compare it to another request?&lt;/li&gt;
&lt;li&gt;Did a &amp;ldquo;small prompt tweak&amp;rdquo; quietly &lt;strong&gt;break&lt;/strong&gt; refund handling?&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;To make those questions easier to answer, I built a tiny &lt;strong&gt;&lt;a href="https://fastapi.tiangolo.com" target="_blank" rel="noopener noreferrer"&gt;FastAPI&lt;/a&gt;&lt;/strong&gt; &amp;ldquo;customer support reply drafter&amp;rdquo; app and integrated it with Langfuse. The goal was a workflow where &lt;strong&gt;every request leaves a trail&lt;/strong&gt; you can inspect, and where the effect of each change is measurable.&lt;/p&gt;</description></item></channel></rss>