feat(kafka): quiet aiokafka reconnect log spam #6

@MarkCesium

Summary

When Kafka is briefly unavailable (for example, during a container rebuild or restart), aiokafka floods the logs with "Unable to update metadata" / "Unable connect to node" messages at ~1000 lines/sec. The causes are the default retry_backoff_ms=100 and the verbose INFO/ERROR levels at which aiokafka logs these failures.

Reproduced locally: docker compose restart kafka while devices is running → several thousand log lines within seconds.

Changes

  • In KafkaProvider (src/dependencies/kafka.py), construct KafkaBroker(...) with:
    • retry_backoff_ms=1000
    • reconnect_backoff_ms=500
    • reconnect_backoff_max_ms=10000
  • In main.py during logging setup, call logging.getLogger("aiokafka").setLevel(logging.WARNING) to raise the aiokafka logger to WARNING, in addition to setting the app level
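
A minimal sketch of the logging change, using only the standard library. The function name configure_logging and the KAFKA_BACKOFF_SETTINGS dict are hypothetical names for illustration; the dict only mirrors the values listed above, and whether KafkaBroker accepts each keyword in the installed version is not confirmed here.

```python
import logging

# Hypothetical container mirroring the backoff values proposed above.
# It is not passed to any library here; check the broker's signature
# before unpacking it into KafkaBroker(...).
KAFKA_BACKOFF_SETTINGS = {
    "retry_backoff_ms": 1000,
    "reconnect_backoff_ms": 500,
    "reconnect_backoff_max_ms": 10000,
}


def configure_logging(app_level: int = logging.INFO) -> None:
    """Set the app log level, then quiet the aiokafka logger tree."""
    logging.basicConfig(level=app_level)
    # Child loggers (e.g. "aiokafka.conn") inherit this level unless
    # they have been given an explicit level of their own.
    logging.getLogger("aiokafka").setLevel(logging.WARNING)


configure_logging()
```

Setting the level to WARNING drops INFO records from aiokafka but still lets ERROR records through, so the longer backoff values are what reduce the rate of the remaining messages.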

Verification

  • Run make dev
  • docker compose restart kafka → verify devices logs produce at most a few lines per second during the outage window
  • After Kafka is back, verify the consumer reconnects and continues processing events
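
One way to make the "a few lines per second" check less subjective is to count log lines per wall-clock second. The helper below is a hypothetical sketch: it assumes the timestamps have already been parsed into datetime objects, and how to obtain them depends on the log format in use.

```python
from collections import Counter
from datetime import datetime
from typing import Iterable


def peak_lines_per_second(timestamps: Iterable[datetime]) -> int:
    """Return the largest number of log lines that share one second."""
    # Truncate each timestamp to whole seconds and count per bucket.
    per_second = Counter(ts.replace(microsecond=0) for ts in timestamps)
    return max(per_second.values(), default=0)


# Illustrative data only: three lines in one second, one in the next.
sample = [
    datetime(2024, 1, 1, 12, 0, 0, 100_000),
    datetime(2024, 1, 1, 12, 0, 0, 500_000),
    datetime(2024, 1, 1, 12, 0, 0, 900_000),
    datetime(2024, 1, 1, 12, 0, 1, 0),
]
peak = peak_lines_per_second(sample)
```

Comparing the peak during the outage window before and after the change gives a single number to record in the issue.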
