How Cerebras Is Accelerating the Future with AI Compute Power

Jul 16, 2021
By William King

TL;DR

Artificial intelligence applications run on algorithmic models fueled by massive datasets. Optimized solutions, such as the latest Wafer Scale Engine from Cerebras, illustrate how the industry is moving beyond conventional computing approaches to meet tomorrow’s AI needs. AI’s impact across countless industries is already transforming the world.

Estimated read time: 7 minutes


Skynet superintelligence and Matrix-enforcing agents make for occasionally great Hollywood sci-fantasy, but here in 2021, artificial intelligence is more about helping humans than terminating them. Businesses would be wise to foster this evolution in every way possible, because the ways in which AI can already improve everyday life are impressively broad and deeply beneficial.

For example, in 2019, Google Health software proved more accurate at spotting breast cancer in mammograms than trained radiologists. At Oregon State University, research scientists currently perform AI-based analysis on recordings from 1,500 scattered forest microphones to help the U.S. Forest Service study wildlife behavior and reach conclusions on low-impact logging methods. The GPT-3 API can take a selection of input text and, within seconds, produce startlingly natural and often usable paragraphs that authors can plug into their books and articles. Some of this blog might have been written with GPT-3 and you would never know.

These advances and countless others illustrate AI’s practical and transformative role in our society—not someday, but right now. AI is fueled by machine learning, a subset of which is deep learning. Deep learning seeks to mimic how the human brain processes data without direct guidance, much like how young children algorithmically tackle tasks from walking to building stable towers from blocks. Children’s data for learning is the physical world around them, while machines learn from massive datasets that, depending on the application, can easily span into petabytes.

Currently, advances in AI and machine learning are limited by computing power. The bigger the dataset and more complex the learning algorithms, the more compute resources and surrounding infrastructure are needed. According to AI systems manufacturer Cerebras, “AI compute demand is doubling every 3.5 months.” The challenge is—and will continue to be—for computing hardware to keep pace with this demand.
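To put the quoted doubling rate in perspective, the arithmetic can be sketched in a few lines of Python. This is an illustration only: it assumes the 3.5-month doubling period holds steadily over the whole interval, which the quote does not promise, and the function name is our own.

```python
def growth_factor(months: float, doubling_period_months: float = 3.5) -> float:
    """Return how many times demand multiplies over `months`,
    assuming it doubles once every `doubling_period_months`."""
    return 2 ** (months / doubling_period_months)


# Print the implied growth over one doubling period, one year, and two years.
for months in (3.5, 12, 24):
    print(f"{months:>4} months: ~{growth_factor(months):.1f}x")
```

Under that assumption, demand would grow by roughly 11x in a single year, which is far faster than conventional hardware roadmaps improve.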

Evolving into Optimized ASICs

Again, just as children learn from “training” on real-world inputs along with try/fail experimentation, AI learning stems from training on a given model backed by large datasets. AI training can take many weeks and cost hundreds of thousands of dollars, especially as data scales into petabytes.

Analyzing these massive data loads requires significant computing resources. General-purpose CPUs typically lack the parallelism that AI workloads depend on. This is why GPUs from the likes of NVIDIA, which use highly parallelized processing architectures, have been central to AI advances over the past decade.

A GPU may be better than a CPU for AI, but that doesn’t mean a GPU is optimal for such tasks. Modern deep learning uses multidimensional data arrays called tensors. One of the most common tools for AI training is the free, open-source software library called TensorFlow. Knowing that AI applications would continue to grow in use and importance, and that much AI training would likely require cloud-based compute resources, Google designed its Tensor Processing Unit (TPU) architecture as a task-optimized, application-specific integrated circuit (ASIC). Today, TPUs run AI-assisted services such as Google Translate and Gmail.
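For readers new to the term, a tensor is simply an array with some number of dimensions. The sketch below uses plain Python nested lists as a stand-in; real frameworks such as TensorFlow use their own optimized array types, and the helper name and sample data here are hypothetical.

```python
def tensor_shape(tensor) -> tuple:
    """Return the shape of a regularly nested list, one entry per dimension."""
    shape = []
    while isinstance(tensor, list):
        shape.append(len(tensor))
        # Descend into the first element; stop if the list is empty.
        tensor = tensor[0] if tensor else None
    return tuple(shape)


scalar = 3.0                        # 0 dimensions
vector = [1.0, 2.0, 3.0]            # 1 dimension
matrix = [[1.0, 2.0], [3.0, 4.0]]   # 2 dimensions
# A made-up batch of two 2x2 RGB images: (batch, height, width, channels).
images = [[[[0, 0, 0] for _ in range(2)] for _ in range(2)] for _ in range(2)]

for name, value in [("scalar", scalar), ("vector", vector),
                    ("matrix", matrix), ("images", images)]:
    print(name, tensor_shape(value))
```

Training a model means running the same arithmetic across every element of arrays like these, which is why hardware built around tensor operations can pay off.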

Because ASICs are designed for very specific uses and computation methods, they tend to be the most efficient processor type for niche applications such as AI training. Thus, it’s no surprise that several companies now jockey for position in the ASIC-driven AI space. For example, Graphcore sells its second generation of Colossus Intelligence Processing Units (IPUs). Dell offers a version of its 4U rack-mounted DSS8440 server outfitted with eight PCI Express cards, each of which carries two second-gen IPUs. Note that Dell also offers GPU versions of the DSS8440, illustrating that different processing architectures will likely suit different AI training methods and users will need to buy accordingly.

Cerebras: High-Density AI ASICs

Aiming for an evolution on the ASIC concept, Cerebras developed its Wafer Scale Engine (WSE). Usually, manufacturers divide 300mm semiconductor wafers into individual chips, such as CPUs and GPUs. These chips then integrate into packages (like CPUs), which might then be built onto adapter cards (like graphics cards). Cerebras designed its ASIC architecture from the ground up to make nearly the entire wafer one giant, internetworked processor. Rather than being a multicore processor, the WSE is a multiprocessor wafer. The wafer mounts into a custom-designed machine called a Cerebras System, which is now in its second generation (CS-2). As Cerebras describes it, “At 15 RU, using max system power of 23kW, the CS-2 packs the performance of a room full of servers into a single unit the size of a dorm room mini-fridge.” Many CS-2s can be built into a data center cluster.

For a spec comparison against the NVIDIA A100, see the table in this IEEE article. It’s eye-popping. While both Cerebras and NVIDIA use a 7nm fabrication process, Cerebras features 1000x the on-chip memory and well over 100x the number of cores. In its white paper, Cerebras claims 1.2 Tb/s of I/O at the edge of a CS-2 cluster. This is very close to the 1.372 Tb/s record set by OpenIO’s object storage challenge.
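The ratios above are easy to check with a little arithmetic. In the sketch below, the two I/O figures come from this article. The core counts and on-chip memory sizes are not stated here; they are assumed from publicly reported specifications for the WSE-2 and the NVIDIA A100 GPU, so treat them as placeholders to verify against the vendors' own data sheets.

```python
# Figures quoted in this article.
CS2_EDGE_IO_TBPS = 1.2
OPENIO_RECORD_TBPS = 1.372

# Assumed figures from public spec reports (not stated in this article).
WSE2_CORES = 850_000
A100_CUDA_CORES = 6_912
WSE2_ON_CHIP_MEMORY_MB = 40_000  # 40 GB
A100_ON_CHIP_MEMORY_MB = 40      # 40 MB

core_ratio = WSE2_CORES / A100_CUDA_CORES
memory_ratio = WSE2_ON_CHIP_MEMORY_MB / A100_ON_CHIP_MEMORY_MB
io_fraction = CS2_EDGE_IO_TBPS / OPENIO_RECORD_TBPS

print(f"Cores:          ~{core_ratio:.0f}x")
print(f"On-chip memory: ~{memory_ratio:.0f}x")
print(f"Edge I/O:       {io_fraction:.1%} of the OpenIO record")
```

With these assumed inputs, the core ratio lands a little above 120x and the memory ratio at 1000x, in line with the claims in the text.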

Which ASIC is “best”? It depends. As Mahmoud Khairy detailed in his amazing Medium review, you can’t judge them on performance alone, if only because the parameters used in running tests can vary between AI platforms. Adopters can examine performance per dollar, per watt, per physical deployment size, and so on. Then there are concerns around how well a given architecture scales as dozens or hundreds of servers are networked. Khairy’s conclusions largely boil down to “it depends,” and he suggests prospective adopters use “efficiency metrics as key measurements rather than relying only on training time.”
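One way to act on that advice is to normalize a benchmark result by cost, power, and rack space. The sketch below is a minimal illustration, not Khairy's method. Only the 23 kW and 15 RU values for the first system echo the CS-2 figures quoted earlier; every throughput and price, the second system, and all names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class System:
    name: str
    throughput: float  # samples/second on your own workload (hypothetical)
    price_usd: float   # purchase price (hypothetical)
    power_kw: float    # maximum system power
    rack_units: int    # physical size in the rack

    def efficiency(self) -> dict:
        """Normalize raw throughput by cost, power, and space."""
        return {
            "per_dollar": self.throughput / self.price_usd,
            "per_kw": self.throughput / self.power_kw,
            "per_rack_unit": self.throughput / self.rack_units,
        }


systems = [
    System("system_a", throughput=100_000, price_usd=2_000_000,
           power_kw=23, rack_units=15),
    System("system_b", throughput=40_000, price_usd=500_000,
           power_kw=6.5, rack_units=4),
]

for metric in ("per_dollar", "per_kw", "per_rack_unit"):
    best = max(systems, key=lambda s: s.efficiency()[metric])
    print(f"Best {metric}: {best.name}")
```

In this made-up comparison, system_a has the higher raw throughput, yet system_b wins on all three normalized metrics, which is the kind of reversal that training time alone would hide.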

That said, Cerebras CEO Andrew Feldman noted in comments to ZDNet that the WSE-2 was a “monstrous jump, and it just continues our absolute domination of the high end” of AI computing.

Opening Doors

If your impression of AI is limited to the novelty of trouncing humans on a trivia game show, à la IBM’s Watson on Jeopardy! a decade ago, think again. Similarly, we are not on an apocalyptic collision course with AI overlords (probably). The truth, which lies somewhere in the middle, is that AI is set to impact every aspect of future life and disrupt most industries. The list of AI-influenced fields is nearly as vast as the scope of business itself, but a few examples may help to illustrate the possibilities.

Genomic processing and the other omic fields are frontiers for AI working through gargantuan datasets. Even back in 2015, the PLOS biology journal predicted that by 2025 the amount of data produced for genomics research would outstrip YouTube by up to 20x. In that same year, Expert Biosystems predicted that omics data would explode total storage requirements to 600 exabytes. Much of that omics data depends on AI for acquisition, filtering, and analysis.

Within retail/e-tail, think of Amazon. When the online shopping titan feeds you recommendations for what to buy, that’s AI at work. And getting those purchases to you in the least possible time? It’s AI tackling the back-end logistics.

AI abounds in finance. Ever started a support session with a chatbot? Ever invested in a fund that uses an automated trading system? All AI.

From autonomous cars to agriculture to increasingly autonomous robots, everything is accelerating and becoming more efficient thanks to AI. We mentioned mammogram analysis earlier, but AI in healthcare now goes even further into the fields of disease prediction and preventative care. With robot-assisted telesurgery now a 20-year-old field, you might even wonder if AI might someday take over some surgical procedures entirely. Believe it or not, the first such unaided operation was performed in Italy in 2006. Telesurgery, both conventional and AI-driven, continues to evolve, but network performance constraints are critical. No one can afford packet jitter and lag in such settings.

Creating a New Future

The book AI Superpowers describes the rise of AI technologies in the U.S. and China with equal measures of wonder, caution, and hope. The author’s forecasts of impacts on employment show no signs of being wrong, yet great opportunities for regular people wait on the far side of widespread AI adoption. AI is a great power, and those who wield it carry great responsibility.

Many companies are accelerating their adoption and use of AI applications. Doing so has become virtually essential to remain competitive. However, for AI to deliver on its potential within organizations, bottlenecks must be removed from the underlying infrastructure. At the edge, 5G rollout plays a key role in bottleneck removal. Similarly, the chip architectures described above and their continuing improvement are vital. The networks that carry AI workloads and their massive training sets must let all that data flow without impairment.

We foresee a future where faster connections and optimized processing come together to create a more connected, effective, and rewarding world for everyone. Accelerating communication is an important step toward that future. Learn more about how Subspace has created a real-time global network that, among many other use cases, will help foster faster AI training and more performant AI-driven applications.

Want to start building on Subspace today? Sign up here.

