At CES earlier this year, AMD unveiled its next big product for the data centre market, the Instinct MI300A, an APU offering a new 3D chiplet design and the improved CDNA 3 architecture. Now, six months later, the MI300X is ready for launch, alongside the new AMD Instinct Platform.
The MI300A is an APU, combining Zen 4 CPU cores with CDNA 3 graphics, both sharing a unified 128GB pool of HBM3 memory, paving the way for fast, coherent access to data from either side. The chip comprises 5nm and 6nm chiplets with eight stacks of HBM3 memory. During CES, AMD claimed as much as a 5x improvement in performance per watt for AI tasks with the MI300 versus the older MI250X.
The maximum configuration, the MI300X, offers 304 compute units versus the 220 CUs found on the MI250X, while the APU-style MI300A pairs its GPU with up to 24 Zen 4 CPU cores. In all, AMD expects to ship 70,000 MI300A chips in 2023, with 40,000 of those going towards the El Capitan supercomputer.
The MI300X is aimed at companies looking to rapidly scale up AI services – Amazon, Microsoft and Meta come to mind in that regard. In a demo of the Falcon-40B large language model running on a single MI300X accelerator, AMD showed how quickly the chip can work through an inference task, and data centres running multiple accelerators will be able to process such workloads at an even greater rate.
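For context, the kind of single-accelerator inference AMD demoed can be reproduced with off-the-shelf tooling. Below is a minimal, hypothetical sketch using Hugging Face Transformers with the openly available Falcon-40B checkpoint; the model ID, prompt and generation settings are our assumptions, as AMD has not published its exact demo code.

```python
# Hypothetical sketch of single-accelerator Falcon inference, assuming a ROCm
# build of PyTorch and the public tiiuae/falcon-40b checkpoint from Hugging Face.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/falcon-40b"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# In bfloat16, Falcon-40B's weights come to roughly 80GB, which fits within a
# single MI300X's 192GB of HBM3 without splitting the model across accelerators.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

prompt = "Write a poem about San Francisco."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```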
Hardware is just one part of the equation, so AMD is also improving its software. ROCm 5 aims to unlock the AMD Instinct advantage with a comprehensive suite of data centre optimisations, out-of-the-box support for leading AI models and frameworks, and an open, portable ecosystem.
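As a small illustration of what out-of-the-box framework support means in practice, the sketch below shows the expected workflow on a ROCm build of PyTorch: ROCm maps onto PyTorch's existing "cuda" device namespace, so code written for other accelerators should run unmodified on Instinct hardware. This is a generic example, not AMD sample code.

```python
# Minimal sketch: a PyTorch workload on an AMD Instinct GPU via ROCm.
# Assumes a ROCm build of PyTorch is installed; ROCm reuses PyTorch's "cuda"
# device namespace, so no AMD-specific API calls are required.
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
if device.type == "cuda":
    print("Accelerator:", torch.cuda.get_device_name(0))  # e.g. an Instinct part

# A large matrix multiply, the core operation behind transformer inference.
a = torch.randn(4096, 4096, device=device)
b = torch.randn(4096, 4096, device=device)
c = a @ b
print(c.shape)  # torch.Size([4096, 4096])
```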
AMD is also introducing the AMD Instinct Platform today, combining eight MI300X accelerators and their collective 1.5TB of high-bandwidth memory (192GB of HBM3 per accelerator) in an easily deployable, industry-standard design that can drop into existing infrastructure with minimal changes. This in turn reduces customers' time to market and deployment costs.
The AMD MI300A is already sampling to customers. The MI300X will begin going out to partners in Q3, with mass production set to ramp up in Q4.
Nvidia currently has a strong foothold in the AI market and is rapidly selling its H100 AI accelerators, so there is tough competition ahead.
As part of its Data Center and AI technology presentation today, AMD also announced updates to its continued partnerships with companies like Amazon (AWS) and Oracle, delivering new cloud instances with improved performance. On AWS, customers can now opt for the more powerful EC2 M7a instance, which offers a 50 percent performance improvement over the previous generation, while Oracle customers will soon get access to “Genoa”-powered E5 instances, offering a 33 percent performance-per-watt uplift compared to E4 instances.
With its new EPYC “Bergamo” chips, AMD is looking to lead the market in cloud-native performance, with the new processors offering up to 128 Zen 4c cores and 82 billion transistors. Meta will be among the first customers buying up Bergamo chips to power its compute infrastructure, and the chips are shipping to customers as of today.
Microsoft Azure is also offering Genoa and Genoa-X solutions. Over the past four years, Azure has seen a 4x performance improvement using AMD products, showing how far AMD's data centre offerings have come.
According to AMD's market analysis, the data centre AI accelerator hardware market will be worth roughly $150 billion by 2027.
KitGuru Says: AMD is pushing for a bigger piece of the HPC and AI markets. Nvidia has a firm grip on this space, so it will be interesting to see how the MI300X stacks up against the H100.