
OllProx

Proxy for an Ollama server that provides authentication and caching

Ollama is great, but it supports neither authentication nor caching. This small wrapper adds API-key-based authentication and a Redis TTL cache in front of an Ollama server.

To run:

1. Install NVIDIA GPU Docker support on your host. Instructions are on the Ollama Docker page; follow them up to, but not including, the `docker run` step.

2. Create a `.env` file based on the example:

   - If you already have a set of salted keys, set `API_KEY_FILE` to the local path of the file containing the keys, one per line, and set `API_KEY_SALT` to the salt.
   - If you only have plain-text keys, leave `API_KEY_SALT` empty and a salt will be created for you.
   - If you don't have any API keys yet, you can either create some random ones and add them to the key file, or leave it empty and a key will be generated for you (see step 4).
   - Make sure the `OLLAMA_HOST` and `OLLAMA_PORT` variables point to your Ollama server. The values in the example work if you launch Ollama with its default settings on Linux.

3. Launch with `docker compose up`.

4. If you didn't provide any keys, check the terminal output for a valid API key.

5. Once it's running, launch your models (they will be pulled if needed), e.g. `docker exec -it ollama_server ollama run gpt-oss:20b`.
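For orientation, a minimal `.env` might look like the sketch below. The variable names are the ones mentioned above, but every value here is illustrative — check the example file shipped with the repo for the real defaults (the host/port values assume Ollama running with its defaults on Linux, reachable from the container via the Docker bridge gateway):

```shell
# Path to a file with one API key per line (illustrative path)
API_KEY_FILE=./keys.txt
# Leave empty if your keys are plain text; a salt will be created for you
API_KEY_SALT=
# Where the proxy can reach your Ollama server (illustrative values)
OLLAMA_HOST=172.17.0.1
OLLAMA_PORT=11434
# Seconds before a change to the key file takes effect (illustrative value)
KEY_REFRESH=60
```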

To call:

You can now reach your Ollama server's `/api/generate` endpoint by POSTing to the ollprox container's `call_model` endpoint; by default this is mapped to `http://localhost:8000/call_model`. Requests have the same format that Ollama expects (see the Ollama API documentation), but you must also add the API key to the request in a header called `apikey`.

This can be done with curl:

```bash
curl -X POST http://localhost:8000/call_model \
  -H "APIKEY: secretgardenkey" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-oss:20b",
    "prompt": "What is the capital of France?"
  }'
```

Or in Python with requests:

```python
import requests

response = requests.post(
    "http://localhost:8000/call_model",
    headers={"APIKEY": "secretgardenkey"},
    json={"model": "llama2", "prompt": "What is the capital of France?"},
)

print(response.json())
```
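Note that Ollama's `/api/generate` streams newline-delimited JSON by default, so depending on how the proxy forwards responses, `response.json()` can fail on a multi-line body. A small helper sketch, assuming the proxy passes the request body through to Ollama unchanged (so the standard `"stream": false` option applies) and using the default URL and example key from above:

```python
import requests

PROXY_URL = "http://localhost:8000/call_model"  # default mapping


def build_request(prompt, model="gpt-oss:20b", api_key="secretgardenkey"):
    """Headers and JSON body for a single non-streaming generate call."""
    headers = {"APIKEY": api_key, "Content-Type": "application/json"}
    body = {"model": model, "prompt": prompt, "stream": False}
    return headers, body


def call_model(prompt, **kwargs):
    """POST through the proxy and return the generated text."""
    headers, body = build_request(prompt, **kwargs)
    resp = requests.post(PROXY_URL, headers=headers, json=body, timeout=120)
    resp.raise_for_status()  # a 401 here usually means a bad or revoked key
    return resp.json().get("response")
```

`raise_for_status()` turns authentication failures into visible exceptions instead of silently returning an error body.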

To modify keys:

- Just edit the key file.
- If you revoke a key, it may take up to `KEY_REFRESH` seconds for it to be invalidated.
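If you need new keys to add to the file, any sufficiently random string works. A quick sketch using Python's `secrets` module (the filename is illustrative — use whatever path your `API_KEY_FILE` points at):

```python
import secrets


def generate_keys(n=3, nbytes=32):
    """Return n random, URL-safe strings usable as API keys."""
    return [secrets.token_urlsafe(nbytes) for _ in range(n)]


# Write one key per line, the format the key file uses
with open("keys.txt", "w") as f:
    for key in generate_keys():
        f.write(key + "\n")
```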

About

A small proxy for Ollama that supports API-key-based authentication and a Redis TTL cache.
