<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"><channel><title>Faizan · Writing</title><description>Notes on real-time LLM systems, retrieval, and agent safety.</description><link>https://faizanraza.dev/</link><language>en-us</language><item><title>Steady State</title><link>https://faizanraza.dev/writing/steady-state/</link><guid isPermaLink="true">https://faizanraza.dev/writing/steady-state/</guid><description>Six months, a model release most weeks, and the ground never stopped moving. The through-line: production AI stopped being a modeling problem and became a control-systems problem.</description><pubDate>Sat, 04 Jul 2026 00:00:00 GMT</pubDate><category>LLM systems</category><category>control systems</category><category>reliability</category><category>manifesto</category></item><item><title>The Frontier Is Now a Menu</title><link>https://faizanraza.dev/writing/the-frontier-is-a-menu/</link><guid isPermaLink="true">https://faizanraza.dev/writing/the-frontier-is-a-menu/</guid><description>GPT-5.6 shipped as three tiers. Claude comes in Fable and Sonnet. The labs unbundled &apos;the best model&apos; into a price-quality menu, and your new job is per-request capital allocation.</description><pubDate>Wed, 01 Jul 2026 00:00:00 GMT</pubDate><category>LLM systems</category><category>routing</category><category>economics</category><category>strategy</category></item><item><title>Agentjacking Was Inevitable</title><link>https://faizanraza.dev/writing/agentjacking-was-inevitable/</link><guid isPermaLink="true">https://faizanraza.dev/writing/agentjacking-was-inevitable/</guid><description>Fake Sentry errors hijacked coding agents at 2,388 orgs with an 85% success rate. No malware, no phishing, every step authorized. This is what happens when data and instructions share a channel.</description><pubDate>Thu, 18 Jun 2026 00:00:00 GMT</pubDate><category>security</category><category>agents</category><category>prompt injection</category><category>production</category></item><item><title>Long Context Didn&apos;t Kill Retrieval</title><link>https://faizanraza.dev/writing/long-context-didnt-kill-retrieval/</link><guid isPermaLink="true">https://faizanraza.dev/writing/long-context-didnt-kill-retrieval/</guid><description>Million-token windows killed lazy retrieval, not retrieval. Context is a budget you allocate under a latency SLO, and &apos;stuff everything in&apos; is the least defensible allocation there is.</description><pubDate>Thu, 04 Jun 2026 00:00:00 GMT</pubDate><category>RAG</category><category>long context</category><category>latency</category><category>LLM systems</category></item><item><title>A Hundred Agents Is Not a Plan</title><link>https://faizanraza.dev/writing/a-hundred-agents-is-not-a-plan/</link><guid isPermaLink="true">https://faizanraza.dev/writing/a-hundred-agents-is-not-a-plan/</guid><description>Swarm architectures multiply an unreliable unit and call it scale. The math of chained success rates says the opposite: fewer agents, tighter loops, structural correction.</description><pubDate>Tue, 26 May 2026 00:00:00 GMT</pubDate><category>agents</category><category>multi-agent</category><category>reliability</category><category>architecture</category></item><item><title>The Subsidy Era Is Ending</title><link>https://faizanraza.dev/writing/the-subsidy-era-is-ending/</link><guid isPermaLink="true">https://faizanraza.dev/writing/the-subsidy-era-is-ending/</guid><description>Anthropic is projecting its first operating profit and OpenAI is reportedly prepping an S-1. When your suppliers start caring about margins, your architecture inherits the problem.</description><pubDate>Tue, 12 May 2026 00:00:00 GMT</pubDate><category>economics</category><category>industry</category><category>strategy</category><category>LLM systems</category></item><item><title>Assume the Benchmark Is Gamed</title><link>https://faizanraza.dev/writing/the-benchmark-is-gamed/</link><guid isPermaLink="true">https://faizanraza.dev/writing/the-benchmark-is-gamed/</guid><description>Berkeley researchers showed every major agent benchmark can be exploited to near-perfect scores. Production telemetry says deployed agents succeed 56.6% of the time. Measure like an SRE instead.</description><pubDate>Thu, 30 Apr 2026 00:00:00 GMT</pubDate><category>evals</category><category>agents</category><category>benchmarks</category><category>reliability</category></item><item><title>Pilots Don&apos;t Die in Demos. They Die in Month Three.</title><link>https://faizanraza.dev/writing/pilots-die-in-month-three/</link><guid isPermaLink="true">https://faizanraza.dev/writing/pilots-die-in-month-three/</guid><description>88% of enterprise agent pilots never reach production. The autopsy almost never says &apos;the model was too dumb.&apos; It says nobody built the loop that keeps a working system working.</description><pubDate>Thu, 16 Apr 2026 00:00:00 GMT</pubDate><category>agents</category><category>production</category><category>enterprise</category><category>reliability</category></item><item><title>Token Prices Are Collapsing. Your AI Bill Isn&apos;t.</title><link>https://faizanraza.dev/writing/cost-per-successful-task/</link><guid isPermaLink="true">https://faizanraza.dev/writing/cost-per-successful-task/</guid><description>Prices per token fall up to 900x a year while agentic tasks burn 5-30x more tokens. The metric that decides your unit economics is cost per successful task, and its biggest lever is reliability.</description><pubDate>Tue, 24 Mar 2026 00:00:00 GMT</pubDate><category>economics</category><category>agents</category><category>reliability</category><category>LLM systems</category></item><item><title>MCP Won. Now Comes the Hard Part.</title><link>https://faizanraza.dev/writing/mcp-won-now-the-hard-part/</link><guid isPermaLink="true">https://faizanraza.dev/writing/mcp-won-now-the-hard-part/</guid><description>97 million monthly downloads, 9,400 servers, 30+ CVEs in eight weeks. The protocol standardized the easy 20%. Production is the other 80%, and I&apos;ve shipped it.</description><pubDate>Tue, 10 Mar 2026 00:00:00 GMT</pubDate><category>MCP</category><category>agents</category><category>security</category><category>production</category><category>LLM systems</category></item><item><title>The Model Is Not Your Moat</title><link>https://faizanraza.dev/writing/the-model-is-not-your-moat/</link><guid isPermaLink="true">https://faizanraza.dev/writing/the-model-is-not-your-moat/</guid><description>A dozen frontier releases in 28 days means a lead now has a half-life of weeks. The durable asset is everything around the model: evals, routing, data, rollback.</description><pubDate>Thu, 26 Feb 2026 00:00:00 GMT</pubDate><category>LLM systems</category><category>strategy</category><category>evals</category><category>industry</category></item><item><title>Designing SLO-Aware RAG</title><link>https://faizanraza.dev/writing/designing-slo-aware-rag/</link><guid isPermaLink="true">https://faizanraza.dev/writing/designing-slo-aware-rag/</guid><description>Why production retrieval should treat latency and cost as constraints to control, not numbers to hope for, and how difficulty-adaptive routing gets there.</description><pubDate>Tue, 10 Feb 2026 00:00:00 GMT</pubDate><category>RAG</category><category>LLM systems</category><category>latency</category><category>SLOs</category></item><item><title>A Year After R1</title><link>https://faizanraza.dev/writing/a-year-after-r1/</link><guid isPermaLink="true">https://faizanraza.dev/writing/a-year-after-r1/</guid><description>DeepSeek-R1 was a pricing event disguised as a research event. One year later, the weights are the weapon and the price floor is the wound.</description><pubDate>Tue, 27 Jan 2026 00:00:00 GMT</pubDate><category>open weights</category><category>economics</category><category>LLM systems</category><category>industry</category></item></channel></rss>