If the assumption that we've lived with for years, that scale brings lower marginal costs, isn't true, we really have to rethink a lot of things.

Anjali Shrivastava is an independent researcher and data scientist looking into the unit economics of AI. Her work probes how the token, the seemingly simple unit of cost for generative AI services, is actually highly variable, creating hidden margin risks that undermine conventional subscription and usage-based pricing models. Anjali joined Tim to talk through her recent work, particularly as it relates to code-generation tools like Cursor and Claude Code.

Over the course of a very interesting hour, they got into the distinctions between input tokens and output tokens, and between reasoning tokens and output tokens, and why those distinctions matter; the challenges providers face in building systems that give users, and even the providers themselves, visibility into costs, and why this is not only a business problem but one at the heart of AI products as they currently exist; how AI service providers differ from traditional SaaS businesses in ways that make the SaaS pricing model obsolete (even though it's the model AI providers continue to use for now); and how AI providers can balance, and even temper, demand as use scales. They also discussed the efficiency opportunities new hardware may provide, one potential future where AI gets much pricier, why efficiency alone won't solve the pricing issue, how external bottlenecks like electricity influence pricing, and much more.

Watch it now.

Follow O'Reilly on LinkedIn, Facebook, Instagram, and Bluesky.