Experience the Revolutionary Generative LLM Pre-Trained Exclusively on CPUs!
Welcome to a new epoch in the world of AI! ThirdAI is elated to present BOLT2.5B, the world’s first generative LLM pre-trained exclusively on CPUs. Explore the capabilities of BOLT2.5B and witness the future of AI unfold!
Experience unparalleled pre-training speeds, processing 2 billion tokens daily on just 10 CPU servers.
ThirdAI’s innovative technology powers BOLT2.5B, making retraining, fine-tuning, and inference dramatically cheaper and faster.
Thanks to ThirdAI’s ‘dynamic sparsity,’ the model processed an astounding 2 billion tokens per day despite its modest computing resources and has already processed around 40 billion tokens, making its pre-training nearly 160x more efficient than that of GPT-2 XL. Read the Blog
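The core idea behind dynamic sparsity is that, for each input, only a small input-dependent subset of a layer’s neurons is activated, so most of the dense matrix multiply is never computed. ThirdAI’s published work selects those neurons with locality-sensitive hashing; the sketch below illustrates the principle with a simple random-hyperplane (SimHash) scheme. All sizes and names here are illustrative toy values, not BOLT2.5B’s actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

DIM, NEURONS, BITS = 64, 1024, 8   # toy layer: 1024 neurons, 8-bit hash

W = rng.standard_normal((NEURONS, DIM))    # one weight vector per neuron
planes = rng.standard_normal((BITS, DIM))  # shared hash hyperplanes

def simhash(v):
    """8-bit sign signature of a vector under the shared hyperplanes."""
    return int("".join("1" if p @ v > 0 else "0" for p in planes), 2)

# Pre-bucket every neuron by the hash of its weight vector (done once).
buckets = {}
for i, w in enumerate(W):
    buckets.setdefault(simhash(w), []).append(i)

def sparse_forward(x):
    """Activate only neurons whose hash bucket matches the input's hash.

    Hash collision approximates high dot-product similarity, so the few
    neurons we compute are the ones most likely to fire anyway.
    """
    active = buckets.get(simhash(x), [])
    out = np.zeros(NEURONS)
    out[active] = W[active] @ x            # compute only the active dots
    return out, active

x = rng.standard_normal(DIM)
out, active = sparse_forward(x)
print(f"active neurons: {len(active)} / {NEURONS}")
```

With an 8-bit hash, the 1,024 neurons fall into up to 256 buckets, so only a handful of dot products are computed per input instead of all 1,024; this is the kind of saving that lets a large model train and run on commodity CPUs.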
BOLT2.5B’s ‘dynamic sparse’ architecture allows even older desktops to fine-tune the model, at rates reaching up to 50 tokens per second. For instance, the model was fine-tuned on a Shakespearean corpus, completing one epoch in just 20 minutes on a single-socket 2012 Intel CPU.
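As a quick back-of-envelope check on those figures (rate and epoch time are from the text; the implied corpus size is our own inference, not a stated fact):

```python
# Fine-tuning throughput and epoch time as stated above.
rate_tokens_per_sec = 50
epoch_minutes = 20

# Tokens processed in one epoch = rate x time.
implied_corpus_tokens = rate_tokens_per_sec * epoch_minutes * 60
print(implied_corpus_tokens)  # 60000 tokens per epoch at these rates
```

So at 50 tokens per second, a 20-minute epoch corresponds to a corpus on the order of tens of thousands of tokens.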
Designed with CPUs in mind, BOLT2.5B delivers rapid inference, producing 20 tokens per second without specialized optimizations such as quantization or pruning.
Our current evaluation puts BOLT2.5B’s capabilities on par with OpenAI’s GPT-2 XL, a widely recognized foundational model with 1.5 billion parameters. Notably, GPT-2 XL was trained on 128 V100 GPUs over 10 days, processing 170 billion tokens. On downstream tasks such as classification and summarization, BOLT2.5B is highly competitive, and on retrieval it handily beats GPT-2 XL.
ThirdAI is on a mission to make sophisticated large language models (LLMs) and other cutting-edge AI technologies accessible for everyone. Our goal is to build customized, private AI that is trained on commodity hardware with ultra-low latency inference for every organization.