Superior performance: InfiniBand offers high data transfer rates, ranging from 10 to 400 Gb/s. Many of the world’s fastest supercomputers rely on it: InfiniBand connects 63 of the top 100 systems and a total of 200 systems on the June 2023 TOP500 list. This high bandwidth also matters for AI applications that involve large-scale data processing and analysis, such as training deep neural networks on vast datasets.
Low latency: InfiniBand’s ultra-low latency, with measured end-to-end delays of 600 ns, accelerates today’s mainstream HPC and AI applications. Data moves between the components of a distributed AI system, such as GPUs, CPUs, and storage devices, without significant delay.
High efficiency: InfiniBand supports advanced reliable transport mechanisms such as Remote Direct Memory Access (RDMA) to maximize the efficiency of workload processing. RDMA moves data directly between the memory of communicating nodes without involving the host CPU in each transfer, which significantly reduces CPU overhead and latency. This makes it well suited to AI and HPC workloads that involve frequent data exchanges between nodes.
Cost effectiveness: InfiniBand Host Channel Adapters (HCAs) and switches are competitively priced and create a compelling price/performance advantage over alternative technologies.
Fabric consolidation and low energy usage: InfiniBand can consolidate networking, clustering, and storage traffic over a single fabric, which significantly lowers the power, real estate, and management overhead required for servers and storage. To support the ever-increasing deployment of virtualized solutions, InfiniBand can address multiple virtual machines connected to a single physical port. This enables a more efficient view of each logical endpoint and significantly reduces the burden on the subnet manager.
Reliable, stable connections: InfiniBand meets the mission-critical needs of today’s enterprise by enabling fully redundant, lossless I/O fabrics. Robust error detection and correction mechanisms ensure reliable data transmission, while automatic path failover and link-layer multipathing deliver high availability. Support for features such as hot-swapping further contributes to fault tolerance in AI systems.
Data integrity: InfiniBand enables the highest levels of data integrity by performing cyclic redundancy checks (CRCs) at each fabric hop and end-to-end across the fabric to ensure the data is correctly transferred.
Rich, growing ecosystem: InfiniBand has a well-established ecosystem and community support, with a wide range of hardware and software vendors offering compatible products and solutions. This makes it easier for researchers, developers and IT infrastructure engineers to access the necessary tools and technologies for building high-performance AI and scientific computing systems.
Highly interoperable environment: Compliance and interoperability testing conducted by the InfiniBand Trade Association (IBTA) results in a highly interoperable environment, which gives end users broader product choice and vendor independence.