
KeplerFS


A leaderless, distributed peer-to-peer file system. Every node runs the same binary — acting simultaneously as HTTP gateway, gossip peer, and chunk store. No master. No single point of failure.

[Browser UI] ──HTTP──► [Node A] ◄──gossip/UDP──► [Node B] ◄──► [Node C]
                            │                          │
                       local chunks              local chunks

Features

  • Masterless — all nodes are peers; any node can accept reads and writes
  • Content-addressed storage — every chunk is identified by its SHA-256 digest, giving deduplication and integrity verification for free
  • Consistent hash ring — deterministic, low-disruption chunk placement across the cluster
  • SWIM-inspired gossip — automatic peer discovery and failure detection over UDP
  • Crash-safe replication — SQLite-backed replication queue ensures at-least-once delivery after restarts
  • Atomic writes — tmp → fsync → rename sequence prevents torn chunks on disk
  • Configurable durability — choose between quorum (strong) and fast (async) write acknowledgement per request
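The content-addressed storage scheme can be sketched as follows (an illustrative sketch only; the helper names and chunking loop are assumptions, not the actual KeplerFS code):

```typescript
import { createHash } from "node:crypto";

const CHUNK_SIZE = 4 * 1024 * 1024; // 4 MB, matching the CHUNK_SIZE_MB default

// Split a buffer into fixed-size chunks, each identified by its SHA-256 digest.
// Identical chunks hash to the same ID, so duplicates are stored only once.
function chunkAndHash(data: Buffer): { id: string; bytes: Buffer }[] {
  const chunks: { id: string; bytes: Buffer }[] = [];
  for (let offset = 0; offset < data.length; offset += CHUNK_SIZE) {
    const bytes = data.subarray(offset, offset + CHUNK_SIZE);
    const id = createHash("sha256").update(bytes).digest("hex");
    chunks.push({ id, bytes });
  }
  return chunks;
}

// Integrity check on read: recompute the digest and compare to the chunk ID.
function verifyChunk(id: string, bytes: Buffer): boolean {
  return createHash("sha256").update(bytes).digest("hex") === id;
}
```

Because the chunk ID *is* the hash of its contents, a corrupted chunk is detected simply by rehashing it on read.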

Architecture

System Overview

Each node exposes three interfaces simultaneously:

| Interface      | Protocol | Purpose                              |
| -------------- | -------- | ------------------------------------ |
| HTTP gateway   | TCP/HTTP | Accepts client uploads and downloads |
| Gossip peer    | UDP      | Peer discovery and failure detection |
| Chunk transfer | TCP      | Direct chunk streaming between nodes |

How a File Write Works

  1. Chunking & Hashing — The receiving node splits the file into fixed-size chunks (default 4 MB). Each chunk is content-addressed with SHA-256, providing deduplication and integrity verification for free.

  2. Consistent Hash Ring — Every node holds a position on a consistent hash ring derived from its node ID. Each chunk is mapped to the N closest nodes clockwise (N = replication factor, default 3). The receiving node immediately knows which chunks are local and which must be forwarded.

  3. Replication Pipeline — Local chunks are written atomically (tmp → fsync → rename). Remote chunks are enqueued in SQLite and streamed to target nodes over TCP. The client receives 200 only after local writes + (N−1) remote ACKs (configurable via ?durability=quorum|fast).

  4. Gossip — Nodes discover each other and maintain the ring via a SWIM-inspired UDP gossip protocol. Missed heartbeats promote peers to suspect, then dead. Dead peers trigger automatic re-replication of their chunks across surviving nodes.

  5. Rebalancing on Join — When a new node joins, its clockwise neighbour detects it now owns fewer chunks and streams the handoff set in the background. The ring is updated immediately; handoff happens asynchronously with a grace-period deletion timer.
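The consistent-hash placement in step 2 can be sketched as a small pure function (a simplified model; real rings often add virtual nodes per physical node, and this is not the actual KeplerFS code):

```typescript
import { createHash } from "node:crypto";

// Map any ID (node or chunk) to a position on a 32-bit ring.
function ringPosition(id: string): number {
  return createHash("sha256").update(id).digest().readUInt32BE(0);
}

// Return the N nodes closest to the chunk's position, walking clockwise.
// N is the replication factor (default 3).
function placementFor(chunkId: string, nodeIds: string[], n = 3): string[] {
  const ring = nodeIds
    .map((id) => ({ id, pos: ringPosition(id) }))
    .sort((a, b) => a.pos - b.pos);
  const target = ringPosition(chunkId);
  // First node at or after the chunk's position, wrapping around the ring.
  let start = ring.findIndex((node) => node.pos >= target);
  if (start === -1) start = 0;
  const owners: string[] = [];
  for (let i = 0; i < ring.length && owners.length < n; i++) {
    owners.push(ring[(start + i) % ring.length].id);
  }
  return owners;
}
```

Because positions depend only on hashed IDs, every node computes the same placement independently, and adding or removing one node only moves the chunks adjacent to it on the ring.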


Monorepo Structure

KeplerFS/
├── apps/
│   └── frontend/           # SvelteKit 5 web UI
└── packages/
    └── node/               # Bun P2P node binary
| Package         | Description                                    |
| --------------- | ---------------------------------------------- |
| `packages/node` | HTTP gateway, gossip peer, chunk store         |
| `apps/frontend` | Upload, download, and cluster visualisation UI |

Getting Started

Prerequisites

| Tool | Version | Purpose                                  |
| ---- | ------- | ---------------------------------------- |
| Bun  | ≥ 1.2   | Runtime and compiler for `packages/node` |
| pnpm | ≥ 10    | Monorepo package manager                 |

Install Dependencies

pnpm install

Run a Local 3-Node Cluster

Open three terminals and run each command in its own session:

# Terminal 1 — seed node
cd packages/node
NODE_ID=node-a HTTP_PORT=3001 GOSSIP_PORT=4001 TRANSFER_PORT=5001 \
  pnpm run dev

# Terminal 2
NODE_ID=node-b HTTP_PORT=3002 GOSSIP_PORT=4002 TRANSFER_PORT=5002 \
  SEEDS=127.0.0.1:4001 pnpm run dev

# Terminal 3
NODE_ID=node-c HTTP_PORT=3003 GOSSIP_PORT=4003 TRANSFER_PORT=5003 \
  SEEDS=127.0.0.1:4001 pnpm run dev

Once the frontend dev server is running, the UI is available at http://localhost:5173.

Compile to a Single Binary

cd packages/node
pnpm run build
# Output: dist/kepler-node

Run Tests

cd packages/node
bun test

Configuration

All configuration is supplied via environment variables.

| Variable                   | Default    | Description                                       |
| -------------------------- | ---------- | ------------------------------------------------- |
| `NODE_ID`                  | *required* | Unique identifier for this node                   |
| `HTTP_PORT`                | 3000       | Port for the HTTP gateway                         |
| `GOSSIP_PORT`              | 4000       | UDP port for gossip messages                      |
| `TRANSFER_PORT`            | 5000       | TCP port for chunk transfer                       |
| `SEEDS`                    | (none)     | Comma-separated `host:gossip_port` of known peers |
| `REPLICATION_FACTOR`       | 3          | Number of nodes each chunk is replicated to       |
| `CHUNK_SIZE_MB`            | 4          | Maximum chunk size in megabytes                   |
| `REBALANCE_BANDWIDTH_MBPS` | 50         | Rebalance bandwidth cap per node (MB/s)           |

Data Model

Each node maintains a local SQLite database with the following schema:

-- Chunks stored on this node
chunks(
  id           TEXT PRIMARY KEY,
  file_id      TEXT,
  chunk_index  INT,       -- position within the file ("index" is a reserved word in SQLite)
  size         INT,
  path         TEXT,
  created_at   INT
)

-- File manifests propagated via gossip
files(
  id           TEXT PRIMARY KEY,
  name         TEXT,
  size         INT,
  chunk_ids    JSON,
  owner_node   TEXT,
  created_at   INT
)

-- Known peers and their health status
peers(
  id            TEXT PRIMARY KEY,
  addr          TEXT,
  gossip_port   INT,
  transfer_port INT,
  last_seen     INT,
  status        TEXT        -- 'alive' | 'suspect' | 'dead'
)

-- Crash-safe outbound replication queue
replication_queue(
  chunk_id       TEXT,
  target_node_id TEXT,
  attempts       INT,
  last_attempt   INT,
  PRIMARY KEY (chunk_id, target_node_id)
)
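The `peers.status` lifecycle (`alive` → `suspect` → `dead`) can be modelled as a pure function of heartbeat silence. The timeout values below are illustrative assumptions; a real SWIM-style protocol also uses indirect probes before declaring a peer suspect:

```typescript
type PeerStatus = "alive" | "suspect" | "dead";

// Illustrative thresholds, not KeplerFS's actual tuning.
const SUSPECT_AFTER_MS = 5_000;
const DEAD_AFTER_MS = 15_000;

// Derive a peer's status from the time since its last heartbeat,
// mirroring the peers.status column above. Crossing into "dead"
// is what triggers re-replication of that peer's chunks.
function peerStatus(lastSeenMs: number, nowMs: number): PeerStatus {
  const silence = nowMs - lastSeenMs;
  if (silence >= DEAD_AFTER_MS) return "dead";
  if (silence >= SUSPECT_AFTER_MS) return "suspect";
  return "alive";
}
```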

Fault Tolerance

| Problem                   | Mitigation                                                                  |
| ------------------------- | --------------------------------------------------------------------------- |
| Split brain               | AP design — accept divergence, reconcile on merge; fencing tokens on writes |
| Thundering herd on rejoin | Per-node rebalance bandwidth cap (default 50 MB/s)                          |
| Chunk transfer atomicity  | Write to `*.tmp`, `fsync`, then `rename(2)`                                 |
| Crash mid-replication     | `replication_queue` table; pipeline resumes on restart                      |
| Node failure              | Gossip detects dead peers and triggers automatic re-replication             |

Contributing

Contributions are welcome. Please open an issue to discuss significant changes before submitting a pull request.

  1. Fork the repository
  2. Create a feature branch: git checkout -b feat/your-feature
  3. Commit your changes following the existing code conventions
  4. Open a pull request with a clear description of what was changed and why

License

Released under the MIT License. © 2026 KeplerFS Contributors.
