Groq’s AI chips are faster than Nvidia’s? AI startup hits the spotlight with ‘lightning-fast’ engine

AI startup Groq (not Elon Musk’s Grok) has unveiled an artificial intelligence (AI) chip built on a Language Processing Unit (LPU) architecture that it claims delivers near-instantaneous response times. The launch comes amid an AI boom, with companies such as OpenAI, Meta and Google hard at work on their suites of AI tools such as Sora, Gemma and more. Groq, however, claims outright that it delivers “the world’s fastest large language models.”

Groq claims its LPUs are faster than Nvidia’s Graphics Processing Units (GPUs). Given that Nvidia has dominated the spotlight in AI chips so far, that is a startling claim. To back it up, Gizmodo reports that Groq’s demonstrations were “lightning-fast” and even made “…current versions of ChatGPT, Gemini and even Grok look sluggish.”

Groq AI chip

The AI chip developed by Groq has specialized processing units that run Large Language Models (LLMs) with nearly instantaneous response times. The novel processing unit, known as the Tensor Streaming Processor (TSP), has been classified as an LPU rather than a Graphics Processing Unit (GPU). The company says it provides the “fastest inference for computationally intensive applications with a sequential component to them”, such as AI applications or LLMs.

What are the benefits? 

It eliminates the need for complex scheduling hardware in favour of a more streamlined approach to processing, the company claims. Groq’s LPU is designed to overcome compute density and memory bandwidth – the two bottlenecks that plague LLMs. For LLMs, the company says, an LPU has greater compute capacity than a GPU or CPU, reducing the time needed to calculate each word. The result is much faster text generation.
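Why both bottlenecks matter can be seen with some back-of-the-envelope arithmetic: when generating one token at a time, per-token latency is bounded by whichever is slower – streaming the model’s weights from memory, or doing the math. The numbers below are purely illustrative, not Groq’s specifications.

```python
def time_per_token(params_b: float, bandwidth_tb_s: float, compute_tflops: float) -> float:
    """Rough lower bound on seconds per generated token.

    params_b       -- model size in billions of parameters (fp16 => 2 bytes each)
    bandwidth_tb_s -- memory bandwidth in TB/s
    compute_tflops -- usable compute in TFLOP/s

    Illustrative model only; real latency depends on batch size, caching, etc.
    """
    weight_bytes = params_b * 1e9 * 2                       # fp16 weights
    memory_bound = weight_bytes / (bandwidth_tb_s * 1e12)   # time to stream weights once
    flops_needed = 2 * params_b * 1e9                       # ~2 FLOPs per parameter per token
    compute_bound = flops_needed / (compute_tflops * 1e12)  # time to do the math
    return max(memory_bound, compute_bound)                 # slower bound wins
```

For a hypothetical 7B-parameter fp16 model on 2 TB/s of bandwidth, `time_per_token(7, 2.0, 100.0)` gives about 7 ms per token (~140 tokens/s) no matter how much raw compute sits idle – which is why a chip aimed at LLMs has to attack memory bandwidth as well as compute density.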

Calling it an “Inference Engine”, the company says its new AI processor supports standard machine learning (ML) frameworks such as PyTorch, TensorFlow and ONNX for inference. However, the LPU Inference Engine does not currently support model training.

Groq’s hardware enables faster and more efficient processing, with lower latency and consistent throughput. However, it is not an AI chatbot and is not meant to replace one; rather, it is meant to make chatbots run faster. Those who wish to try Groq can use open-source LLMs such as Llama 2 or Mixtral 8x7B.
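In practice, hosted inference services for such open models are typically driven through a chat-completion style API. The sketch below shows the general shape of such a request; the model identifier is an assumption based on the Mixtral model named above, not a documented Groq value – check the provider’s docs for the exact name and endpoint.

```python
def build_chat_request(model: str, prompt: str) -> dict:
    """Assemble a chat-completion payload in the common OpenAI-style shape.

    The model name passed in is illustrative; each inference provider
    documents its own exact model identifiers and endpoint URL.
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

# Hypothetical model identifier for the Mixtral 8x7B model mentioned above:
payload = build_chat_request("mixtral-8x7b", "Explain what an LPU is in one sentence.")

# Sending it would then be an HTTP POST of this JSON payload to the
# provider's chat-completions endpoint, with an API key in the headers.
```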


In a demo shared by HyperWrite CEO Matt Shumer on X, Groq provided multiple responses to a query, complete with citations, in seconds. Another demo, a side-by-side comparison with GPT-3.5, showed Groq completing the same task nearly 4 times faster. According to benchmarks, Groq can hit almost 500 tokens a second, compared with the 30-50 tokens per second GPT-3.5 handles.
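The quoted benchmark numbers can be turned into a rough throughput comparison (illustrative arithmetic only; 40 tokens/s is simply the midpoint of the 30-50 range cited for GPT-3.5):

```python
groq_tps = 500               # tokens per second, per the cited benchmarks
gpt35_tps = (30 + 50) / 2    # midpoint of the quoted 30-50 range

speedup = groq_tps / gpt35_tps
print(f"~{speedup:.1f}x raw token throughput")
```

That works out to roughly 12.5x raw generation throughput, versus the ~4x seen in the end-to-end demo – a gap presumably down to overheads such as networking and prompt processing, though the article itself does not break this down.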

Also read other top stories today:

Demand for Deepfake regulation! Artificial intelligence experts and industry executives, including ‘AI godfather’ Yoshua Bengio, have signed an open letter calling for more regulation around the creation of deepfakes.

Sora raises fears! Since OpenAI rolled out its text-to-video AI generation platform, leading content creators have been fearing that they may be the latest professionals about to be replaced by algorithms.

Microsoft to build a home-grown processor! Microsoft has become a customer of Intel’s made-to-order chip business. The company will use Intel’s 18A manufacturing technology to make a forthcoming chip that the software maker designed in-house.

