Announcement: PocketLLM was featured on ProductHunt! | Check out the latest AWS blog on ThirdAI benchmarks 

ThirdAI’s Pocket-LLM: A Completely Free App for AI-Assisted Document Management on Windows and Mac

Convert your text data into a private, searchable, and interactive knowledge-base using the power of LLMs. No hallucinations or data transfer. Runs without internet access.

Download the app from this url:

PocketLLM | Private document search
The power of next-generation deep learning technology is now available on your laptop. With dynamic sparsity, ThirdAI delivers the first large language model for interacting with your text documents (PDF or DOCX) on your device. The model can be trained, fine-tuned, and deployed on your local machine (Windows or Mac) with complete air-gapped privacy. Yes, the model is trained from scratch just for your needs, without any connection to the internet. Feel free to interact with your personal documents offline, with complete privacy.
PocketLLM App available at
You can double-click the application and point it to a long PDF (thousands of pages is fine!) or a folder containing as many PDFs as you want. Click train, and voila! The LLM will memorize all the PDFs via sparse backpropagation in a few minutes. You can train the default 50-million-parameter language model (sufficient for most needs) or even a one-billion-parameter model, which gives superior results and is ideally suited to an extensive collection of documents. Note that the one-billion-parameter model requires at least 8 GB of free RAM. Irrespective of model size, training finishes in a few minutes thanks to ThirdAI’s pioneering dynamic sparsity-based algorithms.

PocketLLM in Action

Most users are accustomed to entering just one or two keywords in a PDF search bar to have any hope of finding matches. The screenshot above shows a neural search running on your machine. Users can describe what they are looking for in as much detail as they like. For example, the model in the picture was trained on a 600-page PDF of the Criminal Handbook, which took only a few minutes. Now you can type anything, including long questions as shown above, and get the answer.
Query your documents via questions, keywords, or any terms you think are helpful. You can even issue paragraph-length queries that describe what you are looking for in precise detail. The app will provide you with the most relevant information available within the text. There is no pre-trained model anywhere in our pipeline, so there are no issues with hallucinations. The model only sees the training data that you provide and points you only to the information available.
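To build intuition for why long, descriptive queries can work better than one or two keywords, here is a toy retrieval sketch in plain Python (a simple TF-IDF ranker over passages; this is only a conceptual stand-in, not ThirdAI's actual neural search, and the example passages are invented):

```python
import math
from collections import Counter

def tokenize(text):
    return [w.lower().strip(".,?!") for w in text.split()]

class TfIdfIndex:
    """Toy passage index. Longer, more specific queries contribute
    more scoring evidence, so descriptive queries tend to rank the
    right passage higher than a single keyword would."""

    def __init__(self, passages):
        self.passages = passages
        self.docs = [Counter(tokenize(p)) for p in passages]
        df = Counter()
        for d in self.docs:
            df.update(d.keys())
        n = len(passages)
        self.idf = {t: math.log(n / df[t]) + 1.0 for t in df}

    def search(self, query, top_k=1):
        q = Counter(tokenize(query))
        scored = []
        for i, d in enumerate(self.docs):
            s = sum(q[t] * d[t] * self.idf.get(t, 0.0) ** 2 for t in q)
            norm = math.sqrt(sum((d[t] * self.idf[t]) ** 2 for t in d)) or 1.0
            scored.append((s / norm, i))
        scored.sort(reverse=True)
        return [self.passages[i] for _, i in scored[:top_k]]

passages = [
    "A no contest plea means the defendant accepts conviction without admitting guilt.",
    "Bail may be set by the court at the first appearance.",
    "Sentencing guidelines depend on the severity of the offense.",
]
index = TfIdfIndex(passages)
print(index.search("what does it mean to plead no contest without admitting guilt")[0])
```

Every distinctive term in the descriptive query ("plead", "contest", "admitting", "guilt") adds evidence for the correct passage, which is exactly the behavior a single-keyword search bar cannot give you.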

Another advantage of sparsity is that the neural model can be updated and personalized on your device, in real time, by interacting with the displayed results. Simply click a result that you like more than the others and hit update. You are updating, or fine-tuning, the model with your new preferences. Each time you hit update, a sparse back-propagation algorithm kicks in and the complete model gets updated. With only a small amount of interaction, you can personalize the model to your taste. In the example above, after typing the query “What is a no contest plea,” I can pick the response I like most and hit update. This updates the whole neural model. When you hit the discover button again, you will see new search results from the updated model. In this way, you can personalize the model to any extent you wish.
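The click-to-update loop can be pictured with a toy relevance-feedback sketch (a simple term-weight boost in plain Python, standing in for the sparse back-propagation the app actually performs; the ranker and passages here are invented for illustration):

```python
from collections import Counter

def tokenize(text):
    return [w.lower().strip(".,?!") for w in text.split()]

class FeedbackRanker:
    """Toy ranker: clicking 'update' on a preferred result boosts the
    weight of that result's terms, so similar future queries rank it
    (and passages like it) higher."""

    def __init__(self, passages):
        self.passages = passages
        self.weights = Counter()  # learned per-term boosts

    def score(self, query, passage):
        q = set(tokenize(query))
        return sum(1.0 + self.weights[t] for t in tokenize(passage) if t in q)

    def search(self, query):
        return max(self.passages, key=lambda p: self.score(query, p))

    def update(self, preferred):
        # "Fine-tune": reward the terms of the result the user picked.
        for t in tokenize(preferred):
            self.weights[t] += 0.5

passages = [
    "A no contest plea lets a defendant accept punishment without admitting guilt.",
    "A guilty plea is an admission of guilt entered before the court.",
]
ranker = FeedbackRanker(passages)
ranker.update(passages[0])  # user clicked the first result and hit update
print(ranker.search("what is a no contest plea"))
```

In the real app the feedback adjusts millions of neural weights rather than a term table, but the loop is the same: click a preferred result, hit update, and subsequent searches reflect your preference.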

You also have the option to use the summarize button, which provides a basic summary of the top three answers.
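As a rough picture of what a basic extractive summary of the top three answers could look like, here is a minimal stdlib sketch (taking the lead sentence of each answer; this is a guess at the flavor of the feature, not the app's actual summarizer, and the answers are invented):

```python
def summarize(top_answers, max_answers=3):
    """Toy extractive summary: join the first sentence of each of the
    top answers. A stand-in for PocketLLM's basic summarization."""
    first_sentences = []
    for passage in top_answers[:max_answers]:
        sentence = passage.split(". ")[0].rstrip(".") + "."
        first_sentences.append(sentence)
    return " ".join(first_sentences)

answers = [
    "A no contest plea accepts conviction without admitting guilt. It is common in plea bargains.",
    "Bail is set at the first appearance. The amount depends on the charge.",
    "Sentencing follows statutory guidelines. Judges retain some discretion.",
]
print(summarize(answers))
```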

Try it Out!

The current app is free for personal use. The present version does not allow the trained model to be saved, so you need to retrain whenever you open the app or change the documents you want to interact with. Don’t worry; training only takes a few minutes. Currently, the features are limited to information extraction and basic summarization.

This is an alpha release. Stay tuned for more features, and watch this space.

The application can easily be extended to handle millions or even billions of documents on-premises or in the cloud. No hardware infrastructure changes are required; existing CPUs are more than enough. For commercial use or any other feature request, please reach out to

For Developers: The app is built using our Universal Deep Transformers (UDT) search and embedding model described here. We look forward to seeing what other applications you can build! To get started, apply for a free UDT license here and unlock the power of training billion-parameter models on everyday CPU devices.