Summary
When Kafka is briefly unavailable (container rebuild/restart), aiokafka floods the logs with `Unable to update metadata` / `Unable connect to node` at ~1000 lines/sec. The default `retry_backoff_ms=100` combined with the very verbose INFO/ERROR logging causes the flood.
Reproduced locally: `docker compose restart kafka` while `devices` is running → several thousand log lines within seconds.
Changes
- In `KafkaProvider` (`src/dependencies/kafka.py`), pass `KafkaBroker(...)` with:
  - `retry_backoff_ms=1000`
  - `reconnect_backoff_ms=500`
  - `reconnect_backoff_max_ms=10000`
- In `main.py`, during logging setup, call `logging.getLogger("aiokafka").setLevel(logging.WARNING)` in addition to setting the app-level log level
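The two changes can be sketched as follows. This is a minimal sketch, assuming FastStream's `KafkaBroker` forwards these keyword arguments to the underlying aiokafka client; the parameter names mirror the aiokafka client options, and `KAFKA_BACKOFF_KWARGS` / `quiet_aiokafka` are hypothetical names for illustration:

```python
import logging

# Backoff settings to pass through to KafkaBroker(...) in KafkaProvider.
# Assumption: KafkaBroker forwards unknown kwargs to the aiokafka client.
KAFKA_BACKOFF_KWARGS = {
    "retry_backoff_ms": 1000,            # was 100: wait 1 s between retries
    "reconnect_backoff_ms": 500,         # initial delay before reconnecting
    "reconnect_backoff_max_ms": 10_000,  # cap for the growing reconnect delay
}

# e.g. in KafkaProvider:
#   broker = KafkaBroker(bootstrap_servers, **KAFKA_BACKOFF_KWARGS)


def quiet_aiokafka() -> None:
    """Suppress aiokafka's per-retry INFO/ERROR chatter during outages.

    Called from main.py during logging setup, in addition to the app level.
    """
    logging.getLogger("aiokafka").setLevel(logging.WARNING)
```

With the larger backoff, each failed node connection retries at most about once per second instead of ten times per second, and the WARNING threshold drops the routine retry messages entirely.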
Verification
- Start the stack with `make dev`, then run `docker compose restart kafka` → verify the `devices` logs produce at most a few lines per second during the outage window
- After Kafka is back, verify the consumer reconnects and continues processing events
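For the "few lines per second" check, a small helper can tally captured log lines by their timestamp prefix and report the peak rate. This is a hypothetical convenience, not part of the change set, and it assumes each `devices` log line starts with an ISO-8601 timestamp (`YYYY-MM-DDTHH:MM:SS...`):

```python
from collections import Counter


def peak_lines_per_second(log_lines: list[str]) -> int:
    """Return the highest number of log lines sharing one second.

    Groups lines by their first 19 characters, which for an
    ISO-8601-prefixed line is the timestamp truncated to seconds.
    """
    counts = Counter(line[:19] for line in log_lines if line.strip())
    return max(counts.values(), default=0)
```

During the outage window, feeding the captured `devices` log into this helper should report a single-digit peak with the new backoff settings, versus hundreds per second before the change.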