Maria Deutscher
2024-09-23 17:49:24
siliconangle.com
Broadcom Inc. today debuted a new chip, the Sian2, for powering the high-speed optical networks that underpin artificial intelligence clusters.
The company says that the chip provides twice the bandwidth of its predecessor. Additionally, it includes reliability features that prevent errors from finding their way into data while it zips across the network.
The typical large language model runs on not one but multiple servers. Each server hosts a small fragment of the LLM. The model’s fragments must regularly exchange data with one another to coordinate their work, which is only possible if the servers on which they run are linked together by a shared network.
Data center operators often implement the networks that link together their artificial intelligence servers using fiber optic technology. Because optical signals can carry far more data, over longer distances and with less signal loss, than electrical signals over metal, fiber optic cables provide higher data transport speeds than traditional copper wires. That makes the former technology well-suited for bandwidth-intensive AI clusters in which large volumes of data regularly travel between servers.
The graphics card inside an AI server represents data in the form of electrical signals. As a result, the data has to be turned into light before it can be sent over a fiber optic network to another server in the cluster. That task is managed by a specialized networking device called a transceiver. The Sian2, the new chip that Broadcom introduced today, is designed to power data center transceivers.
The Sian2 is a digital signal processor, an integrated circuit optimized to turn data stored as electrical signals into light. It’s also capable of performing the reverse operation. When a server connected to an optical network receives a piece of data in the form of light pulses, the Sian2 processor turns the light into electrical signals that the server can understand.
Optical networks are organized into so-called lanes that each process a separate stream of data. Broadcom’s previous-generation transceiver chip, the Sian, can move up to 100 gigabits of data per second through a single lane. The Sian2 can move 200 gigabits per second.
Doubling the bandwidth available per lane halves the number of transceivers needed to build a fiber optic network of a given capacity. That reduction in hardware requirements, in turn, lowers procurement costs. Cutting the number of chips in a network also lowers power requirements and thereby drives further savings.
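The arithmetic behind that claim can be sketched in a few lines of Python. This is only an illustration: the 1,600-gigabit-per-second target and the `lanes_needed` helper are assumptions made for the example, not figures or names from Broadcom. The sketch counts lanes, which is the unit the per-lane speeds above apply to.

```python
import math


def lanes_needed(target_gbps: int, lane_gbps: int) -> int:
    """Return how many lanes are required to reach the target capacity."""
    return math.ceil(target_gbps / lane_gbps)


# Hypothetical aggregate capacity for one link, in gigabits per second.
TARGET_GBPS = 1600

sian_lanes = lanes_needed(TARGET_GBPS, 100)   # previous generation: 100G per lane
sian2_lanes = lanes_needed(TARGET_GBPS, 200)  # Sian2: 200G per lane

print(f"100G lanes needed: {sian_lanes}")
print(f"200G lanes needed: {sian2_lanes}")
```

Under these assumptions the faster chip reaches the same capacity with half as many lanes, which is where the hardware and power savings would come from.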
The Sian2 is made using a five-nanometer manufacturing process. Alongside circuits for turning electrical signals into light and vice versa, it includes a so-called laser driver. That’s a component with a key role in generating the light used by optical networks to move data.
As light beams zip across an AI cluster’s fiber-optic network, they scatter inside the cabling, which causes errors in the data they carry. Those errors can accumulate and interfere with AI models’ processing if they’re not fixed. To address the challenge, the Sian2 chip supports a popular error mitigation technology called FEC.
FEC, short for forward error correction, increases network reliability by transmitting redundant information along with the data. In its simplest form, each piece of data is sent multiple times rather than only once. The result is that the server on the other end receives multiple copies of the data. If all the copies are identical, the server can conclude that no errors found their way into the information during transmission and proceed with processing.
The technology is also useful in situations when errors do emerge. Usually, errors take the form of differences between the different copies of a data point that an AI server receives over a fiber-optic connection. Using FEC, the server can take an educated guess about which copies are correct and use them in calculations.
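The copy-and-compare scheme described above corresponds to what is known as a repetition code with majority voting. The Python sketch below illustrates that idea only; it is not the Sian2's actual FEC implementation, and the `encode` and `decode` names are hypothetical. Production optical links commonly rely on more compact codes, such as Reed-Solomon, that add parity symbols instead of whole copies.

```python
from collections import Counter


def encode(bits, copies=3):
    """Repeat every bit `copies` times (a simple repetition code)."""
    return [bit for bit in bits for _ in range(copies)]


def decode(received, copies=3):
    """Recover each original bit by majority vote over its copies."""
    decoded = []
    for start in range(0, len(received), copies):
        group = received[start:start + copies]
        # The most common value among the copies is the "educated guess".
        decoded.append(Counter(group).most_common(1)[0][0])
    return decoded


message = [1, 0, 1, 1]
sent = encode(message)

# Simulate transmission errors by flipping one copy of two different bits.
corrupted = sent.copy()
corrupted[1] ^= 1
corrupted[9] ^= 1

recovered = decode(corrupted)
print(recovered == message)
```

With three copies per bit, the receiver can correct any single flipped copy within a group. Two flips in the same group would outvote the correct value, which is one reason real systems use stronger codes.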
“200G/lane DSP is foundational to high-speed optical links for next generation scale-up and scale-out networks in the AI infrastructure,” said Vijay Janapaty, vice president and general manager of Broadcom’s physical layer products division.
Broadcom is currently sampling the Sian2 to early customers.
Photo: Broadcom