-
Notifications
You must be signed in to change notification settings - Fork 102
bug: generate_from_raw hanging when ollama under load #650
Description
In the process of testing #567 I encountered an issue where test/core/test_component_typing.py::test_generating was timing out after 15 minutes and failing. Seems like I had overloaded ollama by running for i in {1..10}; do uv run pytest test/core -v; done and the test failed on the fourth or fifth run.
Doing a little bit of digging, it seems like the issue happened here:
mellea/mellea/backends/ollama.py
Line 466 in c4cb59f
| responses = await asyncio.gather(*coroutines, return_exceptions=True) |
Because there's no timeout, if Ollama stalls on any of the concurrent requests (like it seems happened when I ran this test over and over and over again) the entire call hangs indefinitely.
There's a couple of different ways we could fix this, but probably the easiest would be to set a timeout on the underlying ollama.AsyncClient:
mellea/mellea/backends/ollama.py
Line 201 in c4cb59f
| _async_client = ollama.AsyncClient(self._base_url) |
Ideally we'd make the timeout configurable too (so users could set it when creating the backend?)