At SC25, the annual supercomputing conference held this year in St. Louis, Dell Technologies announced a set of AI and high performance computing updates that focus on two practical challenges for practitioners: fitting more GPUs into standard racks and delivering data fast enough to keep those GPUs fully utilized on large-scale workloads. Dell detailed new PowerEdge systems with dense accelerator configurations, Ethernet-based AI fabrics built on its SONiC networking stack, storage features that offload KV cache and vector search from GPU memory into the storage tier, and updates to rack-level control and cooling at the facility level.
Rack-scale GPU Density and Liquid Cooling
One of the key server updates for system architects is the PowerEdge XE8712, a rack-scale system that Dell describes as a self-monitoring server platform. In a press briefing, Dell said the XE8712’s monitoring features rely on OpenManage, iDRAC, the Integrated Rack Controller and the rack-mounted Coolant Distribution Unit, which together are meant to provide real-time telemetry and rack-level control. Each IR7000 rack can support up to 36 nodes and 144 NVIDIA Blackwell (B200) GPUs (four GPUs per node), with direct liquid cooling, modular power and integrated networking managed through a single control plane.
The Dell PowerEdge XE8712 server (Image courtesy Dell)
Dell is aiming this design at facilities that need to increase accelerator density without a complete data center rebuild. The company also noted that accelerator density is increasingly limited by facility constraints, which is why it is framing the rack, rather than the individual server, as the main unit of deployment for upcoming AI and HPC systems.
Dell is also extending its AMD-based line with the PowerEdge XE9785 and the liquid-cooled XE9785L. Both systems pair dual AMD EPYC processors with eight AMD Instinct MI355X GPUs per node and Pollara 400 AI NICs, and tie into Dell PowerSwitch AI fabrics. The air-cooled XE9785 is geared toward sites that are not yet ready for liquid cooling, while the XE9785L targets higher-density, direct liquid-cooled deployments. Dell cites internal MLPerf 5.1 results showing up to 2.7x faster training performance, along with 288 GB of HBM3E per GPU and increased memory bandwidth for larger models and longer sequences.
Ethernet Fabrics and SONiC on Spectrum X
Networking is also a central part of Dell’s SC25 updates, with the company outlining Ethernet-based AI fabrics built on SONiC and high-radix switches for large GPU clusters. The company introduced the PowerSwitch Z9964 series, based on the Broadcom Tomahawk 6 ASIC. The switches provide 102.4 Tbps of switching capacity and will be available in both air-cooled and direct liquid-cooled versions. Dell said the higher radix allows customers to connect up to 131,072 GPUs in a two-tier, 200 GbE fabric using adaptive routing. This design removes the need for a traditional three-tier layout, which reduces the number of switches and links and can lower power, cooling and cabling overhead.
PowerSwitch Z9964 (Image courtesy Dell)
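The scale claim follows from the switch radix. Here is a rough sketch of the arithmetic, assuming a non-blocking two-tier leaf/spine design built from the figures Dell cites (the layout itself is an assumption, not Dell's published topology):

```python
# Back-of-the-envelope check on the two-tier scale claim. This is not Dell's
# sizing tool; it assumes a non-blocking leaf/spine fabric built entirely
# from the figures quoted in the text.
switch_capacity_tbps = 102.4   # Tomahawk 6 switching capacity
port_speed_gbps = 200          # per-GPU link speed cited by Dell

ports_per_switch = int(switch_capacity_tbps * 1000 / port_speed_gbps)  # 512

# Each leaf splits its radix evenly: half the ports face GPUs, half face spines.
gpus_per_leaf = ports_per_switch // 2      # 256 GPU-facing ports per leaf
leaves = ports_per_switch                  # each spine port reaches one leaf

max_gpus = leaves * gpus_per_leaf          # 512 * 256 = 131,072
print(f"{ports_per_switch} ports/switch -> up to {max_gpus:,} GPUs in two tiers")
```

At 200 GbE, a 102.4 Tbps switch exposes 512 ports, and the two-tier limit of 512 x 256 lands exactly on the 131,072-GPU figure Dell quotes.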
On the software side, Dell’s Enterprise SONiC Distribution now runs on NVIDIA Spectrum-X platforms in addition to the switch silicon it already supported. SmartFabric Manager for SONiC provides a single view of the fabric, with support for end-to-end lossless transport, congestion management and detailed telemetry, including monitoring at the optics level. This is meant to let multi-vendor Ethernet fabrics be managed as a single system while keeping the option to change silicon suppliers over time.
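The underlying telemetry in SONiC fabrics is typically exposed over standard interfaces such as gNMI, which SONiC supports. A hypothetical sketch using the open-source pygnmi client, where the address, credentials, port and YANG paths are all assumptions (SONiC's gNMI surface varies by release, and this is not SmartFabric Manager's API):

```python
# Hypothetical sketch: pulling per-port and optics state from a SONiC switch
# over gNMI with the open-source pygnmi client. The target address,
# credentials, port, and YANG paths below are illustrative assumptions only.
from pygnmi.client import gNMIclient

TARGET = ("192.0.2.10", 8080)   # example switch address and gNMI port

with gNMIclient(target=TARGET, username="admin",
                password="admin", insecure=True) as gc:
    # Per-port packet and error counters
    counters = gc.get(path=[
        "openconfig-interfaces:interfaces/interface[name=Ethernet0]/state/counters"
    ])
    # Platform components, which on many builds include transceiver (optics)
    # temperature and power readings
    optics = gc.get(path=["openconfig-platform:components"])
    print(counters, optics)
```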
Data Engines and KV Cache Offload
The third major focus is data, and specifically the bottlenecks that appear when long-context language models meet GPU memory limits. Dell and NVIDIA have worked together to build a KV cache offload path that uses Dell’s storage systems as a backing tier. PowerScale and ObjectScale, Dell’s file and object storage engines, now integrate with NVIDIA’s NIXL library, part of the Dynamo inference framework, to move KV cache from GPU memory into the storage layer, where it can be reloaded on later requests instead of being recomputed. A Dell connector ties NIXL into the storage engines.
According to Dell’s internal testing, the system produces its first output token in about one second at a 131,000-token context window, which the company says is roughly 19x faster than a baseline vLLM configuration that took more than 17 seconds at the same context length. For practitioners running long-context models for tasks such as document analysis and retrieval-augmented generation, this type of offload could shift the balance between GPU count, memory capacity and storage performance.
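To make the offload pattern concrete, here is a minimal, self-contained sketch, with numpy arrays standing in for GPU KV blocks and a local directory standing in for the PowerScale/ObjectScale tier. It illustrates the idea only; it is not Dell's connector and does not use the real NIXL or Dynamo APIs:

```python
# Minimal two-tier KV cache: a small "GPU" tier spills least-recently-used
# blocks to a storage tier instead of dropping them, so later requests that
# share a token prefix can reload cached KV state rather than recompute it.
import hashlib
from collections import OrderedDict
from pathlib import Path

import numpy as np

STORE = Path("kv_store")             # stand-in for the storage tier
STORE.mkdir(exist_ok=True)

GPU_CAPACITY = 4                     # max KV blocks held "in GPU memory"
gpu_tier: OrderedDict[str, np.ndarray] = OrderedDict()

def block_key(token_ids: list[int]) -> str:
    """Key a KV block by a hash of the token prefix it caches."""
    return hashlib.sha256(str(token_ids).encode()).hexdigest()[:16]

def put_block(token_ids: list[int], kv: np.ndarray) -> None:
    """Insert a block into the fast tier, spilling the LRU block to storage."""
    key = block_key(token_ids)
    gpu_tier[key] = kv
    gpu_tier.move_to_end(key)
    while len(gpu_tier) > GPU_CAPACITY:
        old_key, old_kv = gpu_tier.popitem(last=False)   # evict LRU block...
        np.save(STORE / f"{old_key}.npy", old_kv)        # ...offload, not drop

def get_block(token_ids: list[int]) -> np.ndarray | None:
    """Fetch a block: GPU tier first, then storage tier, else a true miss."""
    key = block_key(token_ids)
    if key in gpu_tier:
        gpu_tier.move_to_end(key)    # refresh LRU position
        return gpu_tier[key]
    path = STORE / f"{key}.npy"
    if path.exists():
        kv = np.load(path)           # reload instead of re-running prefill
        put_block(token_ids, kv)
        return kv
    return None                      # miss: prefill must recompute this block
```

The time-to-first-token gain Dell describes comes from the middle branch: at long context lengths, reloading cached KV blocks from fast storage is far cheaper than recomputing the prefill on the GPU.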
Dell is also adding new capabilities to these file and object storage engines to support parallel access and AI-centric search. On the file side, PowerScale is gaining support for parallel NFS (pNFS) with Flex Files layouts, which opens multiple data paths into a cluster. Spreading I/O across several paths can reduce metadata contention and increase throughput, which matters for training nodes that would otherwise sit idle waiting for data. Dell also plans to offer PowerScale software as a standalone license on qualified PowerEdge servers, starting with an AMD-based, all-flash configuration. The decoupled model is intended to let sites adopt newer CPUs and NVMe drives while keeping a consistent file system.
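On a Linux client, one way to verify that pNFS layouts are actually in play is to inspect the kernel's per-mount operation counters. A small check, assuming the share is already mounted over NFSv4.2 (the version Flex Files requires) and noting that the mountstats format varies by kernel:

```python
# Count pNFS LAYOUTGET operations per NFS mount. Nonzero counts indicate the
# client is receiving layouts and can stripe I/O across multiple data paths.
from pathlib import Path

stats_file = Path("/proc/self/mountstats")
if not stats_file.exists():
    raise SystemExit("no /proc/self/mountstats -- Linux NFS client required")

for block in stats_file.read_text().split("device ")[1:]:
    header = block.splitlines()[0]
    if "fstype nfs" not in header:
        continue                                  # skip non-NFS mounts
    mountpoint = header.split(" mounted on ")[1].split()[0]
    for line in block.splitlines():
        if line.strip().startswith("LAYOUTGET:"):
            ops = int(line.split()[1])            # first field: op count
            print(f"{mountpoint}: {ops} LAYOUTGET operations")
```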
ObjectScale is gaining two new APIs, S3 Tables and S3 Vector. S3 Tables lets users manage structured data directly on the object store instead of moving it into a separate database, with integrations for Spark and Trino. Dell reports internal results showing up to twice the ingest rate and more than 4x faster query performance for some workloads. S3 Vector brings vector search into the storage layer, with early results indicating roughly one-second query times over billions of vectors. Both features are aimed at AI workloads that mix structured tables, unstructured objects and vector-based search.
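As a rough sketch of what working against these APIs could look like, assuming ObjectScale's S3 Tables and S3 Vector surfaces mirror the AWS-style APIs that boto3 already models (the endpoint, credentials, names and parameters below are all assumptions; Dell's actual API surface may differ):

```python
# Hypothetical client sketch against an ObjectScale endpoint. The endpoint
# URL, bucket/index names, and the assumption of AWS-compatible S3 Tables
# and S3 Vectors APIs are illustrative, not confirmed Dell interfaces.
import boto3

ENDPOINT = "https://objectscale.example.com"   # assumed ObjectScale endpoint

# S3 Tables: create an Iceberg-format table directly on the object store
tables = boto3.client("s3tables", endpoint_url=ENDPOINT)
bucket_arn = tables.create_table_bucket(name="analytics")["arn"]
tables.create_namespace(tableBucketARN=bucket_arn, namespace=["sales"])
tables.create_table(tableBucketARN=bucket_arn, namespace="sales",
                    name="orders", format="ICEBERG")

# S3 Vector: index embeddings and query them in the storage layer
vectors = boto3.client("s3vectors", endpoint_url=ENDPOINT)
vectors.create_index(vectorBucketName="embeddings", indexName="docs",
                     dataType="float32", dimension=3, distanceMetric="cosine")
vectors.put_vectors(
    vectorBucketName="embeddings", indexName="docs",
    vectors=[{"key": "doc-1",
              "data": {"float32": [0.12, 0.05, 0.91]},
              "metadata": {"source": "report.pdf"}}],
)
hits = vectors.query_vectors(
    vectorBucketName="embeddings", indexName="docs",
    queryVector={"float32": [0.10, 0.07, 0.88]},
    topK=5, returnMetadata=True,
)
print(hits["vectors"])
```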
PowerCool, Dell’s rack-mounted coolant distribution unit rated for up to 160 kW of liquid cooling (Image courtesy Dell)
Rack-level Control and Facility Monitoring
The SC25 announcements also include updates to rack-level control and cooling, aimed at facility teams managing the power, cooling and monitoring demands of dense GPU racks. The Integrated Rack Controller now ships with Dell Integrated Rack Scalable Systems and ties into OpenManage Enterprise and iDRAC. It provides rack-level telemetry and can detect and respond to liquid leaks with sensitivity down to tens of microliters, Dell said. OpenManage Enterprise has also been expanded to manage up to 25,000 devices from a single console.
Dell also introduced the PowerCool rack-mounted coolant distribution unit, rated for up to 160 kW of rack cooling and validated for dense deployments, including racks with NVIDIA and AMD GPU configurations. The unit can be monitored and maintained through Dell’s existing service infrastructure.
For HPC sites that are already power and cooling constrained, these operational features may prove as important as the server and storage updates, since high-density GPU racks place as much demand on facilities as they do on compute hardware. Dell’s overall message at SC25 is that rack-scale integration, telemetry and cooling must be considered alongside server performance when building AI and HPC systems.
