Running Large Language Models locally

Ollama is currently a popular option for running LLMs locally. Newer versions can also download models other than Llama, such as Google's Gemma or smaller task-specific models.

You can download Ollama from ollama.com; it is available for Windows, macOS, and Linux. Again: you need a fairly powerful computer for this, preferably with a recent NVIDIA graphics card and plenty of free storage.

Once installed, Ollama runs in the background. On Windows, you can still see its icon among the icons in the lower right of the taskbar.

After installation, open a terminal (press the Windows key, type cmd, and press Enter) and run ollama commands from there.

Ollama starter tips

No idea where to start? Type 

ollama -h

See all installed models by typing

ollama list

Add a model with the following command. Replace [model name] with one of the models you can find in the Ollama model list. If the model is not installed yet, it is downloaded first, and then a chat with it starts.

ollama run [model name]

All models (and instructions, see below) are saved as blob files in C:\Users\[Username]\.ollama\models\blobs. These files are named by hash rather than by model name, so removing a model by hand is not practical. To delete a model, type

ollama rm [model name]

Create your own instructions

One way to modify a model is to give it instructions before it starts conversation mode. Ollama handles this through models and Modelfiles. Please note that this is not the same as training your own model; it is just an additional set of instructions for a pre-trained model. The way to do this is to copy the instructions of an existing model (like llama3.2), modify those instructions, and then create a new model in Ollama using the new instructions. For a long description of this process, see here.

Step 1: copy a model file

You can make a copy of an existing model in Ollama by using the following command:

ollama show [modelname] --modelfile > [newname]

where [modelname] is one of the models you have already installed, and [newname] is the filename of the new instruction set you want to create. Where does it save the new file? In the folder you are in when you type the command. On Windows, a newly opened terminal starts in C:\Users\[Username] by default. To keep everything in one place, it's a good idea to navigate to the folder where you want your working files to be before running these commands.

We also have two prepared for you here: story and emo. These use the llama3:8b model, which will download automatically when you install and run this 'story' or 'emo' model (see below).

Step 2: modify the model file

Open the model file in a text editor. There are lots of things you can edit here; see the full description on the Ollama page. If you only want to change the character of the AI you are talking with, add a descriptor at "system Job Description:". In the 'story' file, the assistant writes all responses as a short story. With the 'emo' file, the assistant replies with only one word.
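
To give an idea of what such a file can look like, here is a minimal Python sketch that builds the text of a model file from two parts: a FROM line naming the pre-trained model and a SYSTEM instruction holding the character description. The function name build_modelfile and the example prompt are made up for illustration, and the prepared 'story' file may contain more lines than this.

```python
def build_modelfile(base_model: str, system_prompt: str) -> str:
    """Return the text of a minimal model file.

    FROM names the pre-trained model to build on.
    SYSTEM holds the instructions the model gets before the conversation starts.
    """
    return f'FROM {base_model}\nSYSTEM """{system_prompt}"""\n'


# Hypothetical example prompt, in the spirit of the 'story' file.
print(build_modelfile("llama3:8b", "Write every response as a short story."))
```

Saving the printed text in a file called story (no extension) gives you something you can use in Step 3.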

Step 3: install your model

To install your modified model file, type:

ollama create [name] -f [file location]

Under Windows, this will often give an error along the lines of: 1 argument expected, 4 given. If that's the case, make sure your command line is in the folder where the modified model file is located, and then use .\[filename] as the location. So when you want to save the story file you just edited as a model called 'story', type:

ollama create story -f .\story

Check: when successful, your new model should now show up under ollama list.
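
If you prefer to do this check from a script, the sketch below reads the text that ollama list prints and looks for your model name. It assumes the output is a header row followed by one model per row with the name in the first column; the IDs and dates in the sample are made up. The function names are only for illustration.

```python
def installed_models(list_output: str) -> list[str]:
    """Extract the model names from the text printed by `ollama list`."""
    names = []
    for line in list_output.splitlines()[1:]:  # skip the header row
        parts = line.split()
        if parts:
            names.append(parts[0])
    return names


def is_installed(name: str, list_output: str) -> bool:
    """True if `name` appears in the list, with or without a tag like ':latest'."""
    return any(
        entry == name or entry.split(":")[0] == name
        for entry in installed_models(list_output)
    )


# Made-up sample of what the list could look like.
sample = """NAME            ID              SIZE      MODIFIED
story:latest    0123456789ab    4.7 GB    2 minutes ago
llama3:8b       ba9876543210    4.7 GB    3 days ago
"""
print(is_installed("story", sample))
```

To check your real list instead of the sample, you could pass in the output of subprocess.run(["ollama", "list"], capture_output=True, text=True).stdout.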

All modified models still use the same pre-trained model file. If all your modified models are based on the llama3:8b model, that model is downloaded only once (the list command shows each of them as 5 GB, but that is not the space they take up on your disk).
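
To see how much space the models really take, you can add up the files in the blobs folder mentioned earlier. This is a rough sketch: it assumes the default location under your user folder and only counts files directly inside that folder. The function name folder_size_gb is made up for illustration.

```python
from pathlib import Path


def folder_size_gb(folder: Path) -> float:
    """Sum the sizes of all files directly inside `folder`, in GB (10**9 bytes)."""
    if not folder.is_dir():
        return 0.0  # folder missing, e.g. no model downloaded yet
    total_bytes = sum(f.stat().st_size for f in folder.iterdir() if f.is_file())
    return total_bytes / 1e9


# Assumed default location; adjust if your models are stored elsewhere.
blobs = Path.home() / ".ollama" / "models" / "blobs"
print(f"{blobs}: {folder_size_gb(blobs):.1f} GB")
```

With several modified models based on one pre-trained model, this number should stay close to the size of that single model.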

Step 4: run your model

ollama run [modelname]

Step 5: edit your model?

Once installed, you cannot edit a model. You can edit the model file you saved locally, but this will NOT update the model in ollama. To update a model: change the local file, remove the old model, create the new model.
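
If you change your model file often, the remove and create steps can be wrapped in a small script. This is a sketch that only strings together the two commands from this page; the function names are made up, and it stops with an error if a command fails (for example when the old model does not exist yet).

```python
import subprocess


def update_commands(name: str, modelfile: str) -> list[list[str]]:
    """Return the two commands that replace a model: remove it, then create it again."""
    return [
        ["ollama", "rm", name],
        ["ollama", "create", name, "-f", modelfile],
    ]


def update_model(name: str, modelfile: str) -> None:
    """Run the commands one by one, stopping at the first one that fails."""
    for command in update_commands(name, modelfile):
        print("running:", " ".join(command))
        subprocess.run(command, check=True)
```

After editing the story file, calling update_model("story", r".\story") from the folder that holds the file would then refresh the 'story' model.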