About InfiniBand™

InfiniBand is an industry-standard specification that defines an input/output architecture used to interconnect servers, communications infrastructure equipment, storage and embedded systems.

A true fabric architecture, InfiniBand leverages switched, point-to-point channels with data transfer rates that generally lead the industry, both in chassis backplane applications and through external copper and optical fiber connections. Reliable messaging (send/receive) and memory-manipulation semantics (RDMA), carried out without software intervention in the data movement path, ensure the lowest latency and highest application performance.
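
To make those semantics concrete, the following is a minimal sketch in C, using the open-source libibverbs API, of posting a one-sided RDMA write. It assumes a connected, reliable queue pair, an already-registered local buffer, and a peer buffer address and rkey exchanged out of band; all of that setup is omitted here.

    /* Minimal sketch: posting a one-sided RDMA WRITE with libibverbs.
     * Queue-pair setup, memory registration, and the out-of-band
     * exchange of the peer's address and rkey are assumed done. */
    #include <infiniband/verbs.h>
    #include <stdint.h>
    #include <string.h>

    int rdma_write_example(struct ibv_qp *qp, struct ibv_mr *mr,
                           uint64_t remote_addr, uint32_t rkey)
    {
        struct ibv_sge sge = {
            .addr   = (uint64_t)(uintptr_t)mr->addr, /* local source buffer */
            .length = (uint32_t)mr->length,
            .lkey   = mr->lkey,
        };

        struct ibv_send_wr wr, *bad_wr = NULL;
        memset(&wr, 0, sizeof(wr));
        wr.opcode              = IBV_WR_RDMA_WRITE; /* one-sided: remote CPU not involved */
        wr.sg_list             = &sge;
        wr.num_sge             = 1;
        wr.send_flags          = IBV_SEND_SIGNALED; /* request a completion when done */
        wr.wr.rdma.remote_addr = remote_addr;       /* peer's registered buffer */
        wr.wr.rdma.rkey        = rkey;              /* peer's remote access key */

        /* The HCA moves the data directly between registered buffers;
         * no software touches the data movement path. */
        return ibv_post_send(qp, &wr, &bad_wr);
    }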

This low-latency, high-bandwidth interconnect requires only minimal processing overhead and is ideal for carrying multiple traffic types (clustering, communications, storage, management) over a single connection. As a mature and field-proven technology, InfiniBand is used in thousands of data centers for both HPC and AI clusters that efficiently scale up to thousands of nodes. This scalability is crucial for frameworks that require distributed processing across multiple compute nodes to handle complex and computationally intensive tasks. Through long-reach InfiniBand over metro and WAN technologies, InfiniBand can extend RDMA performance between data centers, across a campus, or around the globe.

NDR 400Gb/s InfiniBand is shipping today, and the technology has a robust roadmap that defines increased speeds well into the future. The current roadmap projects continued demand for higher bandwidth, with GDR 1.6Tb/s InfiniBand products planned for the 2028 timeframe.

Advantages

Superior performance: InfiniBand offers high data transfer rates, ranging from 10 to 400Gb/s. Most of the world’s fastest supercomputers leverage InfiniBand: it connects 63 of the top 100 systems, and a total of 200 InfiniBand-connected systems appear on the June 2023 TOP500 list. This high bandwidth is also important for AI applications that involve large-scale data processing and analysis, such as deep learning and training neural networks on vast datasets.

Low latency: InfiniBand’s ultra-low latencies, with measured end-to-end delays of 600ns, accelerate today’s mainstream HPC and AI applications. Such low latency ensures that data can be transferred quickly between the components of a distributed AI system, such as GPUs, CPUs, and storage devices, without significant delays.

High efficiency: InfiniBand supports advanced, reliable transport protocols such as Remote Direct Memory Access (RDMA) to ensure the highest efficiency of customer workload processing. RDMA significantly reduces CPU overhead and latency, making it well suited to AI and HPC workloads that involve frequent data exchanges between nodes.
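
The CPU offload comes from the HCA moving the bytes while host software only registers buffers up front and reaps completions afterwards. Below is a minimal libibverbs sketch of those two steps, assuming the protection domain and completion queue have already been created.

    /* Minimal sketch: memory registration and completion polling with
     * libibverbs. Registration pins the buffer and hands the HCA a key
     * so it can move data without per-byte CPU involvement. */
    #include <infiniband/verbs.h>
    #include <stdio.h>

    struct ibv_mr *register_buffer(struct ibv_pd *pd, void *buf, size_t len)
    {
        /* Grant local write plus remote read/write so the peer's HCA
         * may target this buffer with one-sided operations. */
        return ibv_reg_mr(pd, buf, len,
                          IBV_ACCESS_LOCAL_WRITE |
                          IBV_ACCESS_REMOTE_READ |
                          IBV_ACCESS_REMOTE_WRITE);
    }

    int wait_for_completion(struct ibv_cq *cq)
    {
        struct ibv_wc wc;
        int n;

        /* Poll the completion queue; the data itself was moved by the HCA. */
        while ((n = ibv_poll_cq(cq, 1, &wc)) == 0)
            ;
        if (n < 0 || wc.status != IBV_WC_SUCCESS) {
            fprintf(stderr, "work completion failed: %s\n",
                    ibv_wc_status_str(wc.status));
            return -1;
        }
        return 0;
    }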

Cost effectiveness: InfiniBand Host Channel Adapters (HCAs) and switches are competitively priced and create a compelling price/performance advantage over alternative technologies.

Fabric consolidation and low energy usage: InfiniBand can consolidate networking, clustering, and storage data over a single fabric, which significantly lowers the overall power, real estate, and management overhead required for servers and storage. To support the ever-increasing deployment of virtualized solutions, InfiniBand can address multiple virtual machines connected to a single physical port. This enables a more efficient view of each logical endpoint, significantly reducing the burden on the subnet manager.

Reliable, stable connections: InfiniBand is well suited to the mission-critical needs of today’s enterprise, enabling fully redundant and lossless I/O fabrics. This includes robust error detection and correction mechanisms to ensure reliable data transmission, plus automatic path failover and link-layer multi-pathing to meet the highest levels of availability. Features such as hot-swapping further contribute to high availability and fault tolerance in AI systems.

Data integrity: InfiniBand enables the highest levels of data integrity by performing cyclic redundancy checks (CRCs) at each fabric hop and end-to-end across the fabric to ensure the data is correctly transferred.
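
For illustration, InfiniBand packets carry a 16-bit variant CRC (VCRC) recomputed at each hop and a 32-bit invariant CRC (ICRC) checked end-to-end; the ICRC is based on the same 0x04C11DB7 generator polynomial used in the C sketch below. The exact field coverage and seeding are defined by the InfiniBand specification, so this shows only the principle, not the wire-format calculation.

    /* Illustrative only: a bitwise CRC-32 over the 0x04C11DB7 polynomial.
     * The real ICRC/VCRC computations cover specific header fields and
     * use seed values defined by the InfiniBand specification. */
    #include <stdint.h>
    #include <stddef.h>

    uint32_t crc32_bitwise(const uint8_t *data, size_t len)
    {
        uint32_t crc = 0xFFFFFFFFu;            /* conventional initial seed */
        for (size_t i = 0; i < len; i++) {
            crc ^= (uint32_t)data[i] << 24;    /* fold in next byte, MSB first */
            for (int bit = 0; bit < 8; bit++)
                crc = (crc & 0x80000000u) ? (crc << 1) ^ 0x04C11DB7u
                                          : (crc << 1);
        }
        return crc;                            /* mismatch on receipt => corrupted packet */
    }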

Rich, growing ecosystem: InfiniBand has a well-established ecosystem and community support, with a wide range of hardware and software vendors offering compatible products and solutions. This makes it easier for researchers, developers and IT infrastructure engineers to access the necessary tools and technologies for building high-performance AI and scientific computing systems.

Highly interoperable environment: InfiniBand Compliance and Interoperability testing conducted by the InfiniBand Trade Association (IBTA) results in a highly interoperable environment, benefiting end users in terms of product choice and vendor independence.