Introducing BOLT2.5B: Unleashing CPU Power in Generative AI

Experience the Revolutionary Generative LLM Pre-Trained Exclusively on CPUs!

Welcome to a new epoch in the world of AI! ThirdAI is elated to present BOLT2.5B, the world’s first Generative LLM with exclusive CPU-only pre-training. Navigate through the evolution and capabilities of BOLT2.5B and witness the future of AI unfold!


Rapid & Efficient Pre-training

Experience unparalleled pre-training speeds, processing 2 billion tokens daily on just 10 CPU servers.

Dynamic Sparsity

ThirdAI’s innovative technology powers BOLT2.5B, making retraining, fine-tuning, and inference dramatically cheaper and faster.

Unlocking the Power of CPUs for Generative AI

Thanks to ThirdAI’s ‘dynamic sparsity,’ this model processed an astounding 2 billion tokens daily despite its modest computing resources, and it has already processed around 40 billion tokens. This is nearly 160x more efficient than GPT-2 XL.
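To put the quoted figures in perspective, here is a minimal back-of-the-envelope sketch in Python. It uses only the numbers stated on this page (2 billion tokens per day, 10 CPU servers, about 40 billion tokens processed). The function and variable names are illustrative and not part of any ThirdAI API, and the even split of work across servers is an assumption.

```python
# Figures quoted on this page.
TOKENS_PER_DAY = 2_000_000_000       # 2 billion tokens processed per day
NUM_SERVERS = 10                     # CPU servers used for pre-training
TOKENS_PROCESSED = 40_000_000_000    # about 40 billion tokens so far

SECONDS_PER_DAY = 24 * 60 * 60


def pretraining_summary(tokens_per_day, num_servers, tokens_processed):
    """Derive simple throughput numbers from the quoted figures.

    Assumes (hypothetically) that the work is spread evenly across servers.
    """
    days_of_training = tokens_processed / tokens_per_day
    tokens_per_server_per_day = tokens_per_day / num_servers
    tokens_per_server_per_second = tokens_per_server_per_day / SECONDS_PER_DAY
    return {
        "days_of_training": days_of_training,
        "tokens_per_server_per_day": tokens_per_server_per_day,
        "tokens_per_server_per_second": tokens_per_server_per_second,
    }


summary = pretraining_summary(TOKENS_PER_DAY, NUM_SERVERS, TOKENS_PROCESSED)
for name, value in summary.items():
    print(f"{name}: {value:,.0f}")
```

Under these assumptions, 40 billion tokens corresponds to roughly 20 days of pre-training at the quoted rate, or about 200 million tokens per server per day.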

Train Your Own

With ThirdAI’s innovative approach, pre-training an LLM from scratch is now astonishingly simple. Users can harness cloud CPUs or data centers to craft their custom BOLT LLMs with ease.

Fine-tuning Redefined

BOLT2.5B’s ‘dynamic sparse’ architecture allows even older desktops to fine-tune the model, with rates reaching up to 50 tokens per second. For instance, the model was fine-tuned on a Shakespearean corpus, completing one epoch of fine-tuning in just 20 minutes on a single-socket 2012 Intel

Swift Inference

Designed with CPUs in mind, BOLT2.5B delivers rapid inference, producing tokens at a rate of 20 per second without the need for specialized treatments like quantization or pruning.


Our current evaluation puts BOLT2.5B’s capabilities on par with OpenAI’s GPT-2 XL model, a widely recognized foundational model with 1.5 billion parameters. Notably, GPT-2 XL was trained on 128 V100s over a span of 10 days, processing 170 billion tokens. On downstream tasks like classification and summarization, BOLT is very competitive, and on retrieval it handily beats GPT-2.
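As a rough illustration, the training budgets quoted on this page can be expressed in device-days. The sketch below uses only the stated figures (128 V100s for 10 days and 170 billion tokens for GPT-2 XL; 10 CPU servers at 2 billion tokens per day and about 40 billion tokens for BOLT2.5B). The helper name is hypothetical, and a CPU server and a single GPU are not equivalent units of cost or power, so this is raw arithmetic only and not a derivation of the efficiency figure quoted earlier.

```python
def device_days_and_rate(num_devices, days, tokens):
    """Return (device-days, tokens per device-day) for a training run."""
    device_days = num_devices * days
    return device_days, tokens / device_days


# GPT-2 XL, as quoted: 128 V100 GPUs, 10 days, 170 billion tokens.
gpt2_device_days, gpt2_rate = device_days_and_rate(128, 10, 170_000_000_000)

# BOLT2.5B, as quoted: 10 CPU servers, about 40 billion tokens so far
# at 2 billion tokens per day, which implies roughly 20 days.
bolt_days = 40_000_000_000 / 2_000_000_000
bolt_device_days, bolt_rate = device_days_and_rate(10, bolt_days, 40_000_000_000)

print(f"GPT-2 XL: {gpt2_device_days:,.0f} GPU-days, "
      f"{gpt2_rate:,.0f} tokens per GPU-day")
print(f"BOLT2.5B: {bolt_device_days:,.0f} server-days, "
      f"{bolt_rate:,.0f} tokens per server-day")
```

Note that the two runs processed different token counts on different hardware, so these numbers describe the size of each run rather than a like-for-like benchmark.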

