InfiniBand™ Architecture Specification

Frequently Asked Questions

InfiniBand is an industry standard, channel-based, switched fabric interconnect architecture for server and storage connectivity.

High-performance applications – such as bioscience and drug research, data mining, digital rendering, electronic design automation, fluid dynamics and weather analysis – require high-performance message passing and I/O to accelerate computation and storage of large datasets.

AI workloads, particularly those involving large and complex models, are computationally intensive. To expedite model training and the processing of vast datasets, AI practitioners have turned to distributed computing, which spreads the workload across multiple servers or nodes connected by a high-speed, low-latency network.

InfiniBand technology has been a driving force behind large-scale supercomputing deployments for complex distributed scientific computing and has become the de facto network for training large, complex models. With ultra-low latencies, InfiniBand has become a linchpin for accelerating today’s mainstream scientific computing and AI applications.

In addition, enterprise applications – such as customer relationship management, fraud detection, database, virtualization and web services, as well as key vertical markets such as financial services, insurance services and retail – demand the highest possible performance from their computing systems.

The combination of InfiniBand interconnect solutions with servers and storage built on multi-core processors and accelerated computing delivers the performance needed to meet these challenges.

InfiniBand offers multiple levels of link performance, currently reaching speeds as high as 800Gb/s. Each of these link speeds also provides low-latency communication within the fabric, enabling higher aggregate throughput than other protocols. This uniquely positions InfiniBand as the ideal data center I/O interconnect.

The growth of multi-core servers and the use of multiple virtual machines per server are driving the need for more I/O connectivity per physical server. Typical VMware ESX server environments, for example, require multiple Ethernet NICs and Fibre Channel HBAs, which increases I/O cost, cabling and management complexity.

InfiniBand I/O virtualization solves these problems by providing unified I/O across the compute server farm, enabling significantly higher LAN and SAN performance from virtual machines. It allows for effective segregation of the compute, LAN and SAN domains to enable independent scaling of resources. The result is a more change-ready virtual infrastructure.

Finally, in VMware ESX environments, the virtual machines, applications and vCenter-based infrastructure management operate on familiar NIC and HBA interfaces, making it easy for the IT manager to take advantage of these benefits with minimal disruption and learning.

InfiniBand optimizes data center productivity in enterprise vertical applications such as customer relationship management, database, financial services, insurance services, retail, virtualization, cloud computing and web services. InfiniBand-based servers provide data center IT managers with a unique combination of performance and energy efficiency, resulting in a hardware platform that delivers peak productivity, flexibility, scalability and reliability to optimize TCO.

RoCE (RDMA over Converged Ethernet) is an industry standard transport that enables Remote Direct Memory Access (RDMA) to operate on ordinary Ethernet Layer 2 and Layer 3 networks.

RDMA is a technology that allows data to be directly transferred between the memory of remote systems, GPUs, and storage without involving the CPUs of those systems. It enables high-speed, low-latency data transfers over a network. In traditional networking, data transfer involves multiple steps, where data is copied from the source system’s memory to the network stack, then sent over the network, and finally copied into the destination system’s memory. RDMA bypasses these intermediate steps, resulting in more efficient data transfers.
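As a concrete illustration of the zero-copy path described above, the sketch below shows an RDMA write using the Linux ibverbs API. It assumes a reliable-connected queue pair has already been created and connected, and that the peer's buffer address and remote key (rkey) were exchanged out of band; those setup steps and error cleanup are omitted, and the function and variable names are illustrative rather than taken from the specification.

/* Minimal sketch of an RDMA write with the Linux ibverbs API (libibverbs).
 * Assumes a reliable-connected queue pair `qp` is already connected and
 * that the peer's buffer address and rkey were exchanged out of band
 * (for example over a TCP socket); that setup is omitted here. */
#include <infiniband/verbs.h>
#include <stdint.h>

int rdma_write_example(struct ibv_pd *pd, struct ibv_qp *qp,
                       uint64_t remote_addr, uint32_t rkey)
{
    static char buf[4096] = "payload";

    /* Register the local buffer so the adapter can DMA from it directly. */
    struct ibv_mr *mr = ibv_reg_mr(pd, buf, sizeof(buf),
                                   IBV_ACCESS_LOCAL_WRITE);
    if (!mr)
        return -1;

    struct ibv_sge sge = {
        .addr   = (uintptr_t)buf,
        .length = sizeof(buf),
        .lkey   = mr->lkey,
    };

    /* RDMA_WRITE places the data directly into the peer's registered
     * memory; the remote CPU is not involved in the transfer. */
    struct ibv_send_wr wr = {
        .wr_id      = 1,
        .sg_list    = &sge,
        .num_sge    = 1,
        .opcode     = IBV_WR_RDMA_WRITE,
        .send_flags = IBV_SEND_SIGNALED,
        .wr.rdma    = { .remote_addr = remote_addr, .rkey = rkey },
    };

    struct ibv_send_wr *bad_wr = NULL;
    return ibv_post_send(qp, &wr, &bad_wr);
}

The same verbs interface is used unchanged whether the underlying fabric is InfiniBand or RoCE, which is why applications written against RDMA carry over between the two.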

RDMA first became widely adopted in the High Performance Computing (HPC) industry with InfiniBand, but it is now being leveraged by enterprise Ethernet networks with RoCE (pronounced like “rocky”). Today, with the adoption of GPU computing and large-scale AI use cases within cloud environments, Ethernet can be a practical solution when running RoCE.

Given its broad expertise in RDMA technology, the IBTA developed the RoCE standard and released its first specification in 2010. The RoCE standard is defined within the overarching InfiniBand Architecture specification.

InfiniBand was one of the first industry standard specifications to support software-defined networking (SDN). An entity called the subnet manager provides traffic routing functionality and enables greater flexibility in how the fabric is architected. The fabric can be built with multiple paths between nodes and, based on the needs of the applications running on it, the subnet manager can determine optimal routes between nodes. Running the subnet manager on a single entity (as opposed to having network management on each switch throughout the fabric) enables the use of simple switches within the fabric, with the associated cost savings.

In addition to flexible but efficient traffic routing, the subnet manager enables multiple levels of Quality of Service that guarantee different minimum shares of the available bandwidth. Applications can be configured to place traffic on virtual lanes appropriate to the priority of their data. This gives an InfiniBand-based Software Defined Infrastructure the ability to support a variety of communication needs.
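To make the idea of priority lanes concrete, the following sketch (again using the ibverbs API) shows where an application selects a service level when connecting a queue pair; the subnet manager's SL-to-VL mapping then determines which virtual lane, and therefore which bandwidth share, that traffic receives. The destination LID, queue pair number and the choice of port 1 are placeholder values assumed for illustration.

/* Sketch: selecting an InfiniBand service level (SL) for a queue pair.
 * The SL chosen here is mapped by the subnet manager's SL-to-VL tables
 * onto a virtual lane, which determines the share of link bandwidth the
 * traffic receives. The surrounding QP setup is omitted; dest_lid and
 * dest_qpn are assumed to come from an out-of-band exchange. */
#include <infiniband/verbs.h>

int set_service_level(struct ibv_qp *qp, uint16_t dest_lid,
                      uint32_t dest_qpn, uint8_t service_level)
{
    struct ibv_qp_attr attr = {
        .qp_state           = IBV_QPS_RTR,
        .path_mtu           = IBV_MTU_4096,
        .dest_qp_num        = dest_qpn,
        .rq_psn             = 0,
        .max_dest_rd_atomic = 1,
        .min_rnr_timer      = 12,
        .ah_attr = {
            .dlid     = dest_lid,
            .sl       = service_level,  /* priority class for this traffic */
            .port_num = 1,
        },
    };

    return ibv_modify_qp(qp, &attr,
                         IBV_QP_STATE | IBV_QP_AV | IBV_QP_PATH_MTU |
                         IBV_QP_DEST_QPN | IBV_QP_RQ_PSN |
                         IBV_QP_MAX_DEST_RD_ATOMIC | IBV_QP_MIN_RNR_TIMER);
}

In practice, higher-level middleware such as MPI libraries typically exposes the service level as a configuration parameter rather than requiring applications to set it directly.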

The InfiniBand specifications standardize the fabric management infrastructure. InfiniBand fabrics are managed via InfiniBand consoles, and InfiniBand fabric management is expected to snap into existing enterprise management solutions.

The InfiniBand Trade Association has grown from 7 companies to more than 40 since its launch in August 1999. Membership is open to any company, government department or academic institution interested in the development of the InfiniBand architecture. To see a list of current trade association members, please visit the member roster.

The InfiniBand architecture is complementary to Fibre Channel and Ethernet but offers higher performance and better I/O efficiency than either of these technologies. InfiniBand is uniquely positioned to become the I/O interconnect of choice and is replacing Fibre Channel in many data centers. Ethernet connects seamlessly into the edge of the InfiniBand fabric and benefits from better access to InfiniBand architecture-enabled compute resources. This will enable IT managers to better balance I/O and processing resources within an InfiniBand fabric.

In addition to a board form-factor connection, InfiniBand supports active and passive copper cabling (for in-rack and neighbor-rack links at up to 400Gb/s), active optical cables (for connector-free optical links inside data centers up to 100 meters), and optical transceivers (for fiber links up to 10km).

The InfiniBand Architecture is capable of supporting tens of thousands of nodes in a single subnet. The scalability is further extended with InfiniBand routers supporting virtually unlimited cluster sizes.

The InfiniBand Architecture

MEMBERS: Go directly to the members download area (login required). In addition to published specifications, members can download, review, and comment on specifications in development.

NON-MEMBERS: Please email IBTA Administration (administration@infinibandta.org) for more information.

What’s New in Volume 1 Release 1.8