Local LLM Hardware in 2025: prices and token per second for NVIDIA, Apple, AMD, Intel…

In this presentation I give an overview of the hardware you can buy today for running large language models locally. Specifically, I discuss the following systems:

* Apple Mac Studio M3 Ultra 512 GB and Apple MacBook Pro M4 Max
* NVIDIA DGX Spark (also sold as the ASUS Ascent GX10) and DGX Station
* NVIDIA RTX 6000 Blackwell Generation, RTX 5090, RTX 4090, RTX 3090
* NVIDIA Jetson AGX Orin
* Intel Project Battlematrix
* AMD Ryzen AI Max+ Pro 395 (Strix Halo)
* AMD Radeon Pro W7900

For each system I present the purchase price and some general insights, and I predict token/s values for two Llama models (Llama 3.3 70B Q8 and Llama 3.1 405B 4-bit quant). Will it fit?

I encourage viewers to comment below about any hardware I might have missed, or if your token/s values or purchase prices differ from mine.

Subscribe to our newsletter to get this presentation:

And if you want a deep dive with your company, book a free 30 min strategy call now (limited seats available each month):

Subscribe to my channel for more tips on AI for managers, entrepreneurs and business people, and for upcoming AI tools that will save you time and make you money.
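For readers who want a rough feel for the "Will it fit?" question before watching, here is a minimal sketch of a weights-only memory estimate. It is not the method used in the presentation. It assumes Q8 means exactly 8 bits per weight and the 4-bit quant exactly 4 bits per weight (real quantization formats add some overhead), it ignores the KV cache and runtime buffers, and the `headroom` factor of 0.8 is a hypothetical safety margin chosen only for illustration. The function names `weights_gb` and `fits` are made up for this example.

```python
def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GB (10**9 bytes)."""
    # params_billion * 1e9 weights * (bits / 8) bytes each, divided by 1e9
    return params_billion * bits_per_weight / 8


def fits(params_billion: float, bits_per_weight: float,
         memory_gb: float, headroom: float = 0.8) -> bool:
    """True if the weights fit into `headroom` * memory_gb.

    `headroom` is an assumed margin for KV cache, activations and the OS;
    it is not a measured value.
    """
    return weights_gb(params_billion, bits_per_weight) <= memory_gb * headroom


# The two models named in the description: (parameters in billions, bits per weight)
MODELS = {
    "Llama 3.3 70B Q8": (70, 8),
    "Llama 3.1 405B 4-bit": (405, 4),
}

if __name__ == "__main__":
    memory_gb = 512  # Mac Studio M3 Ultra configuration named in the description
    for name, (params, bits) in MODELS.items():
        size = weights_gb(params, bits)
        verdict = "fits" if fits(params, bits, memory_gb) else "does not fit"
        print(f"{name}: ~{size:.1f} GB of weights, {verdict} in {memory_gb} GB")
```

Under these assumptions the 70B Q8 model needs about 70 GB for weights and the 405B 4-bit model about 202.5 GB, which is why a single 24 GB card such as the RTX 4090 cannot hold either model on its own.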

Local LLM Hardware in 2025: prices and token per second for NVIDIA, Apple, AMD, Intel Battlematrix
