Conversation

Contributor

@mrzeszutko mrzeszutko commented Dec 22, 2025

Summary

  • Add comprehensive blob storage documentation for node operators
  • Rename BLOB_SINK_ARCHIVE_API_URL to BLOB_ARCHIVE_API_URL (cleanup after BlobSink removal)
  • Remove dead environment variables BLOB_SINK_PORT and BLOB_SINK_URL

Description

Following the removal of the BlobSink HTTP server (#19143), this PR:

  1. Adds new documentation (blob_storage.md) explaining how Aztec nodes store and retrieve blob data, including:

    • Overview of blob sources (FileStore, L1 Consensus, Archive API)
    • PeerDAS and supernode requirements for L1 consensus
    • Configuration examples for GCS, S3, and Cloudflare R2
    • Authentication setup
    • Troubleshooting guide
  2. Adds blob upload documentation (blob_upload.md) for node operators who want to contribute to the network by hosting a blob file store, including:

    • Upload configuration with BLOB_FILE_STORE_UPLOAD_URL
    • How to expose public HTTP endpoints for GCS, S3, and R2
    • Authentication with write permissions
  3. Cleans up legacy naming by renaming BLOB_SINK_ARCHIVE_API_URL to BLOB_ARCHIVE_API_URL - the "sink" terminology is no longer accurate since the HTTP server was removed

  4. Removes dead code - the BLOB_SINK_PORT and BLOB_SINK_URL env vars that were left behind after the BlobSink removal (see the sketch below this list)
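
For operators upgrading across this change, a minimal before/after sketch of the environment variables described above (all values shown are placeholders, not real endpoints):

```bash
# Before (no longer recognised after the BlobSink removal; values are placeholders):
# BLOB_SINK_PORT=5052                                      # removed
# BLOB_SINK_URL=http://blob-sink.example.com               # removed
# BLOB_SINK_ARCHIVE_API_URL=https://archive.example.com    # renamed

# After:
BLOB_ARCHIVE_API_URL=https://archive.example.com
```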


Fixes A-389

@mrzeszutko mrzeszutko changed the title from "Blob storage documentation" to "docs: blob storage documentation" Dec 22, 2025
@mrzeszutko mrzeszutko marked this pull request as ready for review December 22, 2025 21:07
Contributor

@spalladino spalladino left a comment

Let's split the instructions related to retrieval and storage, since they are meant for different users. Also let's please delete the generic or redundant instructions inserted by Claude, like "search logs for troubleshooting".

@mrzeszutko mrzeszutko force-pushed the feature/blob-storage-docs branch from 2078cd3 to 3c4be61 on January 5, 2026 12:12

### Environment variables

Configure blob sources using environment variables in your `.env` file:
Contributor

Suggested change
- Configure blob sources using environment variables in your `.env` file:
+ Configure blob sources using environment variables:

Unless we're elsewhere making a point to use .env files to run an Aztec node.

Contributor Author

fixed

### Cloudflare R2 configuration

```bash
BLOB_FILE_STORE_URLS=s3://my-bucket/?endpoint=https://[ACCOUNT_ID].r2.cloudflarestorage.com
```
Contributor

Is this the proper R2 config for a consumer from cloudflare? Seems weird it goes through S3.

Contributor Author

I based it on the syncing best practices description from the docs and actually tested that it works (I just tested parts of the code on my own R2), but I cannot find any information about this format in the Cloudflare docs.

Contributor Author

Cloudflare's API is S3-compatible

Contributor

Ngl, looks odd. @spypsy can you confirm? I'm worried the other docs may also be wrong.
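
For context on the S3-compatible endpoint question, one hedged way to sanity-check an R2 bucket with standard S3 tooling (bucket name, account ID, and credentials below are placeholders; Cloudflare's R2 S3 API expects the region `auto`):

```bash
# Placeholder credentials for an R2 API token with read access.
export AWS_ACCESS_KEY_ID=<r2-access-key-id>
export AWS_SECRET_ACCESS_KEY=<r2-secret-access-key>
export AWS_DEFAULT_REGION=auto

# List the bucket through R2's S3-compatible endpoint.
aws s3 ls "s3://my-bucket/" --endpoint-url "https://ACCOUNT_ID.r2.cloudflarestorage.com"
```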

Comment on lines 125 to 135
### Docker Compose integration

Add the environment variables to your `docker-compose.yml`:

```yaml
environment:
# ... other environment variables
BLOB_FILE_STORE_URLS: ${BLOB_FILE_STORE_URLS}
L1_CONSENSUS_HOST_URLS: ${L1_CONSENSUS_HOST_URLS}
BLOB_ARCHIVE_API_URL: ${BLOB_ARCHIVE_API_URL}
```
Contributor

Let's not go into different ways to configure env vars here. We should just explain the env vars to set, and let the operator decide how to handle them.

Contributor Author

fixed

Comment on lines 14 to 20
:::tip Automatic Configuration
When using `--network [NETWORK_NAME]`, blob sources are automatically configured for you. Most users don't need to manually configure blob storage.
:::
Contributor

Let's also clarify what happens if the user manually sets one of the env vars: they end up replacing the blob sources, not appending to the ones automatically configured.

Contributor Author

I added a warning with the clarification, and additionally updated the initial tip to mention that only file store sources are configured at the network level.
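
To illustrate the replace-not-append behaviour discussed above, a hypothetical example (the first URL stands in for a network-provided store and is made up, and the comma-separated list format is an assumption based on the variable's plural name):

```bash
# Hypothetical: setting this variable replaces the blob sources configured by
# --network rather than adding to them, so list every source the node should use.
BLOB_FILE_STORE_URLS=https://network-blob-store.example.com,gs://my-own-bucket/blobs/
```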

```yaml
BLOB_ARCHIVE_API_URL: ${BLOB_ARCHIVE_API_URL}
```

## Authentication
Contributor

Isn't this whole section only relevant for upload?

Contributor Author

I don't think so - it depends on the storage configuration, but you might need read access keys/permissions to read the data; only HTTP access should always work without any additional permissions.

Contributor

> only HTTP access should always work without any additional permissions

My guess is that most endpoints (S3, R2, whatever it is) will just be accessed via public HTTP; that's why I was pushing for removing this section. But I guess we can leave it in in case someone wants to set up a permissioned repository.
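
If someone does run a permissioned store, a minimal sketch of supplying read credentials, assuming (not confirmed in this PR) that the node's storage clients honour the standard AWS and Google SDK environment variables:

```bash
# Assumption: the blob store client reads standard SDK credentials from the environment.
export AWS_ACCESS_KEY_ID=<read-only-access-key>          # S3 / R2
export AWS_SECRET_ACCESS_KEY=<read-only-secret-key>      # S3 / R2
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/read-only-service-account.json  # GCS
```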


- **Google Cloud Storage** - `gs://bucket-name/path/`
- **Amazon S3** - `s3://bucket-name/path/`
- **Cloudflare R2** - `s3://bucket-name/path/?endpoint=https://[ACCOUNT_ID].r2.cloudflarestorage.com`
Contributor

Same question on R2 URL

Contributor Author

@mrzeszutko mrzeszutko Jan 7, 2026

Same answer - these are URLs already used in the docs, and they do work.

```bash
BLOB_FILE_STORE_UPLOAD_URL=file:///data/blobs
```

### Docker Compose integration
Contributor

Same as in the other doc

Contributor Author

fixed

Contributor

Let's add a section on the .healthcheck file, and request to exclude it from any pruning policies.

Contributor Author

Added
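
As an illustration of the pruning caveat, a hypothetical cleanup job for a local `file://` store (the path matches the upload example above; the 30-day retention is made up) that skips the `.healthcheck` file:

```bash
# Hypothetical pruning job: delete blob files older than 30 days,
# but never the .healthcheck file the node uses to verify the store.
find /data/blobs -type f -mtime +30 ! -name '.healthcheck' -delete
```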

@mrzeszutko mrzeszutko force-pushed the feature/blob-storage-docs branch from 3c4be61 to 66e0fdc on January 7, 2026 11:09
@mrzeszutko mrzeszutko requested a review from spalladino January 7, 2026 11:10
@spalladino spalladino added this pull request to the merge queue Jan 7, 2026
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jan 7, 2026
@spalladino
Contributor

@mrzeszutko heads up this failed due to spellchecks

@mrzeszutko mrzeszutko force-pushed the feature/blob-storage-docs branch from 66e0fdc to 556a9a9 on January 8, 2026 08:28
@mrzeszutko mrzeszutko added this pull request to the merge queue Jan 8, 2026
@mrzeszutko
Contributor Author

> @mrzeszutko heads up this failed due to spellchecks

@spalladino I added missing new words to docs-words.txt

Merged via the queue into next with commit 651732d Jan 8, 2026
16 checks passed
@mrzeszutko mrzeszutko deleted the feature/blob-storage-docs branch January 8, 2026 09:08
@spypsy
Member

spypsy commented Jan 8, 2026

Bit late, but just to confirm @spalladino: that format does look correct and it's what we currently use for snapshot uploading to our bucket. We also have a `&publicBaseUrl=https://aztec-labs-snapshots.com` at the end, but I don't think it's needed for uploading.

AztecBot pushed a commit that referenced this pull request Jan 9, 2026
## Summary
- Add `docs-network`, `docs-developers`, `developer_versioned_docs`, and `network_versioned_docs` to `.rebuild_patterns` so that changes to markdown files in these directories invalidate the test cache for spellcheck

## Problem
PR #19194 passed spellcheck on the PR but failed on merge queue. This happened because:
1. The spellcheck test cache hash is computed from `.rebuild_patterns`
2. `.rebuild_patterns` didn't include the docs content directories (`docs-network/`, etc.)
3. On PR: hash didn't change → cache hit → spellcheck "passed" using stale cached result
4. On merge queue: `USE_TEST_CACHE=0` → spellcheck ran fresh → found spelling errors

## Solution
Add the missing docs content directories to `.rebuild_patterns` so any markdown changes invalidate the cache and spellcheck runs fresh.

## Test plan
- [x] Verify `.rebuild_patterns` syntax is correct
- [ ] CI should run spellcheck with the new hash

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Josh Crites <[email protected]>
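
For illustration only, the kind of addition the commit describes, assuming `.rebuild_patterns` takes one path pattern per line (the exact pattern syntax of that file is an assumption here; the directory names come from the commit message above):

```bash
# Appended to .rebuild_patterns (pattern syntax assumed).
docs-network/
docs-developers/
developer_versioned_docs/
network_versioned_docs/
```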