AI - How To Run Llama 2 Locally
There are advantages to integrating with hosted large language models, but there are also downsides. How do you control costs? How do you protect sensitive data? How do you pin model versions to avoid unexpected regressions? Running an LLM on your local machine or your own server is an effective answer to these engineering challenges.
Integrating with the OpenAI API is reasonably simple and there are many tutorials on how to do this. But, if you want to run a local model, it’s harder to find the right on-ramps. Here’s a simple guide to running Llama 2 on your computer.
Running Llama 2 Locally
Here are four easy steps to running Llama 2 locally:
- Install Ollama
- Download the Llama 2 model
- Serve the model using the Ollama CLI
- Integrate with the model’s REST API
1. Install Ollama
Download the latest version of the Ollama CLI.
Install it on your computer.
The installer should add ollama to your PATH variable. You can confirm that the CLI is installed correctly by running ollama -v from your terminal.
2. Download the Llama 2 Model (or other Pre-trained model)
Browse the Ollama Model Library and select your preferred model.
For Llama 2 7B (the compact model), the model ID is llama2.
Pull the model by running the terminal command ollama pull llama2
Wait until the model is fully downloaded locally.
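Once the pull finishes, you can confirm the model is available locally. Besides the CLI, Ollama exposes a GET /api/tags endpoint that lists the models on your machine. Here is a minimal sketch using only the Python standard library; the helper names (local_model_names, list_local_models) are our own, and the default port assumes an unmodified Ollama install.

```python
import json
import urllib.request

def local_model_names(tags_json):
    """Extract model names from a parsed /api/tags response body (a dict)."""
    return [m["name"] for m in tags_json.get("models", [])]

def list_local_models(base_url="http://localhost:11434"):
    """Fetch the names of locally available models from a running Ollama server."""
    with urllib.request.urlopen(base_url + "/api/tags") as resp:
        return local_model_names(json.loads(resp.read()))
```

With llama2 pulled and the server running, list_local_models() should include an entry such as "llama2:latest".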
3. Serve the Model
Run the terminal command ollama serve. By default, this starts a local HTTP server on port 11434.
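Before wiring up an integration, it can be handy to check from code that the server is actually up. A small sketch, assuming the default port 11434 (the helper name server_running is ours, not part of Ollama):

```python
import urllib.request
import urllib.error

def server_running(url="http://localhost:11434/", timeout=2.0):
    """Return True if an HTTP server answers with a 200 status at the given URL."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused or timed out: nothing is listening there.
        return False
```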
4. Integrate with the model’s REST API
Use curl or your favorite HTTP/REST client.
curl http://localhost:11434/api/generate -d '{
"model": "llama2",
"prompt": "Say Hello World and nothing else",
"stream": false
}'
If everything is running correctly, you should get back a JSON response body whose response value says "Hello World!"
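The curl call above maps directly to code. Here is a sketch of the same request from Python using only the standard library; the function names build_payload and generate are our own, while the endpoint and request fields match the curl example.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # default Ollama endpoint

def build_payload(model, prompt, stream=False):
    """Serialize the JSON request body expected by /api/generate."""
    return json.dumps(
        {"model": model, "prompt": prompt, "stream": stream}
    ).encode("utf-8")

def generate(prompt, model="llama2"):
    """Send a prompt to the locally served model and return its response text."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=build_payload(model, prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

With the server from step 3 running, generate("Say Hello World and nothing else") should return the same "Hello World!" text as the curl call.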
For more integration details, refer to the Ollama API documentation.
Summary
Running Llama 2 locally is a straightforward process that opens up a world of possibilities for developers and AI enthusiasts alike. Happy coding, and may your local AI endeavors lead to innovative and exciting outcomes!
Did you get stuck somewhere? Got another question? Comment below.