Cerebras Systems Inc. on Tuesday unveiled a tool for artificial intelligence developers that allows them to run applications using the startup's mega chip, calling it a much cheaper alternative to industry-standard Nvidia processors.
Access to Nvidia graphics processing units (GPUs), typically rented through cloud providers, to train and deploy large AI models for applications such as OpenAI's ChatGPT can be hard to come by and expensive to use. Running a trained model to produce output is a process developers refer to as inference. Cerebras says it is delivering performance that GPUs can't, doing it with the highest accuracy, and offering it at the lowest price.
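For readers unfamiliar with the term, a minimal sketch of what inference looks like in practice, using the open-source Hugging Face transformers library and a small public model (neither is a Cerebras product; they stand in purely for illustration):

    from transformers import pipeline

    # Inference: running an already-trained model to generate output
    # for new input. gpt2 is a small public model used as a stand-in.
    generator = pipeline("text-generation", model="gpt2")
    result = generator("Wafer-scale chips are", max_new_tokens=20)
    print(result[0]["generated_text"])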
The inference segment of the AI market is expected to be fast-growing and attractive - ultimately worth tens of billions of dollars if consumers and businesses adopt AI tools.
The Sunnyvale, California-based company plans to offer several types of inference products via a developer key and its cloud. The company will also sell its AI systems to customers who want to operate their own data centers.
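The article does not describe Cerebras' API, but developer-key-based cloud inference services generally follow a common pattern: authenticate with the key, post a prompt, receive generated text. A hedged sketch of that pattern, with a placeholder URL, environment variable, and model name that are assumptions rather than Cerebras specifics:

    import os
    import requests

    # Hypothetical endpoint, key name, and payload for illustration only;
    # the actual Cerebras API paths, parameters, and models may differ.
    API_KEY = os.environ["CEREBRAS_API_KEY"]  # the developer key
    resp = requests.post(
        "https://api.example-inference.com/v1/completions",  # placeholder URL
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"model": "example-llm", "prompt": "Hello", "max_tokens": 50},
        timeout=30,
    )
    resp.raise_for_status()
    print(resp.json())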
Cerebras' chips - each the size of a dinner plate and known as wafer-scale engines - avoid a problem with AI data crunching: the large models that power AI applications typically don't fit on a single chip and can require hundreds or thousands of chips strung together. Because a single wafer-scale engine is big enough to hold a model, it sidesteps that inter-chip communication, which is how Cerebras' chips can achieve faster performance. The company plans to charge users 10 cents per million tokens, tokens being one of the ways the amount of data output by a large model is measured.
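At the quoted rate of 10 cents per million tokens, cost scales linearly with output volume. A short worked example (the workload size is invented for illustration):

    # Price quoted in the article: $0.10 per million output tokens.
    PRICE_PER_MILLION_TOKENS = 0.10

    def inference_cost(tokens: int) -> float:
        """Return the cost in dollars for a given number of output tokens."""
        return tokens / 1_000_000 * PRICE_PER_MILLION_TOKENS

    # Example: a workload producing 250 million tokens would cost $25.00.
    print(f"${inference_cost(250_000_000):.2f}")  # -> $25.00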