Running large language models locally
Model Shop
Ollama and Open WebUI let you join the AI revolution without relying on the cloud.
Large language models (LLMs) such as the ones behind OpenAI's [1] ChatGPT [2] are too resource intensive to run on your own computer. That's why they're deployed as online services that you pay for. Since ChatGPT's release, however, smaller LLMs have advanced significantly. Many of these smaller LLMs are open source or have a liberal license (see the "Licenses" box). You can run them on your own computer without sending your input to a cloud server and without paying a fee to an online service.
Because these LLMs are computationally intensive and need a lot of RAM, running them on your CPU can be slow. For good performance, you need a GPU, which offers many parallel compute cores and plenty of dedicated RAM. An NVIDIA or AMD GPU with 8GB or more of video RAM is recommended.
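To get a feel for why GPU memory matters, a rough back-of-the-envelope estimate can help. The sketch below only counts the memory needed to hold a model's weights. The 7-billion-parameter model size and the 16-, 8-, and 4-bit precisions are illustrative assumptions rather than figures from this article, and real memory use is higher because of context and runtime overhead.

```python
def weight_memory_gb(num_params, bits_per_param):
    """Approximate memory for the model weights alone, in GB (10^9 bytes).

    Ignores context (KV cache), activations, and runtime overhead,
    so treat the result as a lower bound.
    """
    return num_params * bits_per_param / 8 / 1e9


# Hypothetical 7-billion-parameter model at three common precisions
for bits in (16, 8, 4):
    print(f"7B parameters at {bits}-bit: ~{weight_memory_gb(7e9, bits):.1f} GB")
```

Under these assumptions, a 7-billion-parameter model needs roughly 14GB at 16-bit precision but only about 3.5GB when quantized to 4 bits, which is why quantized models of this size can fit on an 8GB GPU.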
In addition to the hardware and the models, you also need software that enables you to run the models. One popular package is Ollama [3], named for Meta AI's large language model Llama [4]. Ollama is a command-line application that runs on Linux, macOS, and Windows, and you can also run it as a server that other software connects to.
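As a minimal sketch of what "other software connects to" can look like, the following Python snippet sends a prompt to a locally running Ollama server over its HTTP API. It assumes the server listens on its default address, `http://localhost:11434`, and that a model has already been downloaded; the model name `llama3` and the helper function names are placeholders chosen for illustration, not something this article prescribes.

```python
import json
import urllib.request

# Assumed default address of a local Ollama server
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(prompt, model="llama3", url=OLLAMA_URL):
    """Build a POST request for Ollama's /api/generate endpoint.

    "llama3" is a placeholder; use the name of a model you have pulled.
    """
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, model="llama3"):
    """Send the prompt to the server and return the generated text."""
    request = build_generate_request(prompt, model)
    with urllib.request.urlopen(request, timeout=120) as response:
        return json.loads(response.read().decode("utf-8"))["response"]


# Usage (requires a running Ollama server):
# print(generate("Why is the sky blue?"))
```

Setting `"stream"` to `False` asks the server for a single JSON reply instead of a stream of partial responses, which keeps the client code short.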
[...]