Ollama is great, but it supports neither authentication nor caching. This small wrapper provides API-key-based authentication in front of it.
- Install NVIDIA GPU Docker support on your host. Instructions are on the Ollama Docker page; follow the setup steps but skip the final `docker run` command (Compose will start the container for you).
- Create a `.env` file based on the example.
  - If you already have a set of salted keys, add the local path of the file containing the keys, one per line, to the `API_KEY_FILE` variable, and add the salt to the `API_KEY_SALT` variable.
  - If you only have plain-text keys, keep the `API_KEY_SALT` variable empty and a salt will be created for you.
  - If you don't have any API keys yet, you can either create some random ones and add them to the key file, or provide none and let the proxy generate one for you (see below).
  - Make sure the `OLLAMA_HOST` and `OLLAMA_PORT` variables point to your Ollama server. The values included in the example work with a default Ollama setup on Linux.
- Launch with `docker compose up`.
- If you didn't provide any keys, check the terminal output for a generated API key.
- Once the stack is running, launch your models (they will be pulled if needed), e.g. `docker exec -it ollama_server ollama run gpt-oss:20b`
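If you need to create some random keys for the key file, a minimal sketch using Python's standard `secrets` module (the file name `keys.txt` is an illustration; use whatever path your `API_KEY_FILE` variable points to):

```python
import secrets

def generate_keys(n: int, nbytes: int = 32) -> list[str]:
    """Generate n random, URL-safe API keys."""
    return [secrets.token_urlsafe(nbytes) for _ in range(n)]

keys = generate_keys(3)
# One key per line, the format the key file expects.
with open("keys.txt", "w") as f:
    f.write("\n".join(keys) + "\n")
print(keys)
```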
You can now call your Ollama's `/api/generate` endpoint by POSTing to the ollprox container's `call_model` endpoint, mapped by default to `http://localhost:8000/call_model`. Requests have the same format Ollama expects according to its documentation, but you must now add the API key to the request in a header called `apikey`.
This can be done with cURL:

```shell
curl -X POST http://localhost:8000/call_model \
  -H "APIKEY: secretgardenkey" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-oss:20b",
    "prompt": "What is the capital of France?"
  }'
```
Or with Python `requests`:

```python
import requests

response = requests.post(
    "http://localhost:8000/call_model",
    headers={"APIKEY": "secretgardenkey"},
    json={"model": "llama2", "prompt": "What is the capital of France?"},
)
print(response.json())
```
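Note that Ollama's `/api/generate` streams newline-delimited JSON chunks by default. Whether the proxy forwards the stream unchanged is an assumption here, but if it does, `response.json()` will fail and you will need to either pass `"stream": false` in the payload or join the chunks yourself. A sketch of the joining step, using chunk shapes like those in Ollama's streaming output:

```python
import json

def join_stream(lines):
    """Concatenate the 'response' fields of streamed NDJSON chunks."""
    return "".join(json.loads(line)["response"] for line in lines if line.strip())

# Example chunks shaped like Ollama's streaming responses.
chunks = [
    '{"response": "Par", "done": false}',
    '{"response": "is.", "done": false}',
    '{"response": "", "done": true}',
]
print(join_stream(chunks))  # → Paris.
```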
- To add or rotate keys, just edit the keys file.
- If you revoke a key, it may take up to `KEY_REFRESH` seconds for it to be invalidated.
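The refresh lag can be pictured as a small cache that rereads the key file at most once every `KEY_REFRESH` seconds. The class below is an illustration of that idea, not the proxy's actual code:

```python
import time

class KeyCache:
    """Reload a set of API keys from disk at most once per `refresh` seconds."""

    def __init__(self, path: str, refresh: float):
        self.path = path
        self.refresh = refresh
        self._keys: set[str] = set()
        self._loaded_at = float("-inf")  # force a load on first use

    def valid(self, key: str) -> bool:
        now = time.monotonic()
        if now - self._loaded_at >= self.refresh:
            # Cache expired: reread the key file (one key per line).
            with open(self.path) as f:
                self._keys = {line.strip() for line in f if line.strip()}
            self._loaded_at = now
        return key in self._keys
```

Until the cache expires, a revoked key still validates, which is why invalidation can lag by up to `KEY_REFRESH` seconds.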