Ultra Ethernet Consortium Formed, Plans to Adapt Ethernet for AI and HPC Needs
The newly formed Ultra Ethernet Consortium (UEC) plans to rework Ethernet to meet the growing demands of high-performance computing (HPC) and artificial intelligence (AI) workloads. The consortium, overseen by the Linux Foundation and backed by industry leaders including AMD, Intel, Microsoft, and Cisco, aims to develop the Ultra Ethernet Transport (UET) protocol to address the low-latency and scalability requirements of AI and HPC systems.
As AI models grow more complex, they require larger clusters and faster networking. The Ultra Ethernet Consortium aims to enhance Ethernet so that it can deliver the performance these data-intensive workloads demand.
Dr. Earl Joseph, CEO of Hyperion Research, highlighted the challenges faced by HPC and AI users, stating that weaknesses in system interconnect capabilities have limited their performance.
To achieve its goals, the Ultra Ethernet Consortium plans to refine Ethernet technology while preserving cost efficiency and interoperability. This will involve improvements to both the software and physical layers of Ethernet, without altering its fundamental structure.
The consortium’s technical goals include developing specifications, APIs, and source code for Ultra Ethernet communications. It will update existing link and transport protocols, create new telemetry and congestion mechanisms, and address the specific needs of AI and HPC clusters through separate profiles within the UET protocol.
Justin Hotard, EVP and General Manager of HPC & AI at Hewlett Packard Enterprise, emphasized the importance of developing an open, scalable, and cost-effective Ethernet-based communication stack to support high-performance workloads.
The Ultra Ethernet Consortium will focus on refining the Physical Layer, Link Layer, Transport Layer, and Software Layer of Ethernet technology. The consortium’s founding members, who are leaders in CPU and GPU design, network infrastructure, and supercomputing, bring valuable experience to the effort.
While the consortium does not explicitly mention any competing technologies, Ultra Ethernet is expected to compete directly with InfiniBand, the preferred networking technology for low-latency, HPC-style networks. NVIDIA, a major supplier of both Ethernet and InfiniBand hardware, is notably absent from the consortium.
The UEC members are already strategizing how to integrate the upcoming UET technology into their products, with AMD’s CTO, Mark Papermaster, highlighting the benefits of reduced congestion and improved security.
The Ultra Ethernet Consortium, hosted by the Linux Foundation, will continue its work through four dedicated working groups focused on different aspects of Ethernet technology.
While the consortium has not provided an estimated timeline for finalizing the UET specification, it is expected to seek certification from the IEEE, adding another layer of validation.
The Ultra Ethernet Consortium has said it intends to accept new members starting in Q4 2023, inviting interested technology giants involved in AI and HPC work to join its mission.
Overall, the Ultra Ethernet Consortium’s goal of adapting Ethernet for AI and HPC reflects an industry-wide push to improve networking for increasingly demanding, data-intensive workloads. Through collaboration on an open standard, the consortium aims to enable high-performance clusters to operate efficiently.