Installing Ollama
Ollama can be installed in a variety of ways, and it even runs inside a Docker container. Ollama is noticeably quicker when running on a GPU (Nvidia, AMD, Intel), but it can also fall back to the CPU and system RAM. To install Ollama without any other prerequisites, you can download and run their installer from ollama.com.
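For example, on Linux the install is a single command, and the official Docker image works as well. These are the commands as I understand them at the time of writing; double-check Ollama's docs if anything has changed:

curl -fsSL https://ollama.com/install.sh | sh

# or run it in Docker instead (add --gpus=all if you have an Nvidia GPU):
docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama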
After their installer completes, if you're on Windows, you should see an entry in the Start menu to run it.

You should also have access to the ollama CLI via PowerShell or CMD.

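A quick way to sanity-check the install from a terminal is to ask the CLI for its version and the list of local models (which will be empty for now):

ollama --version
ollama list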
After Ollama is installed, you can go ahead and pull the models you want to run. Here are the commands to pull my favorite tool-compatible model and embedding model as of April 2025:
ollama pull llama3.1:8b
ollama pull mxbai-embed-large
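Once the pulls finish, you can confirm both models are available locally and even chat with the LLM interactively, straight from the terminal:

ollama list
ollama run llama3.1:8b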
You can also make sure it's running by going to http://localhost:11434 (port 11434 being the “normal” Ollama port), where you should get the following plain-text response:

Ollama is running
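The same check works from a terminal, if you prefer:

curl http://localhost:11434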
Now that you have Ollama up and running and a few models pulled, you're ready to go ahead and start using Ollama as both a chat provider and an embedding provider!
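As a quick taste of what's next, both providers are just HTTP endpoints on that same port. As of this writing, a minimal smoke test looks roughly like this (check the Ollama API docs for the current request schema):

curl http://localhost:11434/api/chat -d '{
  "model": "llama3.1:8b",
  "messages": [{ "role": "user", "content": "Hello there!" }],
  "stream": false
}'

curl http://localhost:11434/api/embeddings -d '{
  "model": "mxbai-embed-large",
  "prompt": "Ollama makes local embeddings easy"
}'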
