Skip to content

Conversation

@sendya
Copy link
Member

@sendya sendya commented Jan 27, 2026

This pull request introduces significant enhancements to the storage tiering and caching system, focusing on tiered storage configuration, new storage interface methods for object promotion/demotion, and improvements to cache key handling and processor structure. It also includes some refactoring for clarity and efficiency, and updates to logging and configuration files.

Storage Tiering and Interface Enhancements:

  • Added new storage tier types (TypeWarm, TypeCold) and clarified the relationship between normal and warm types in indexdb.go.
  • Expanded the Storage and Bucket interfaces to support tiering operations, including methods for promoting/demoting objects between tiers and selecting buckets by tier type. Introduced a Migration interface for handling tier transitions. [1] [2] [3] [4]
  • Updated the storage configuration structs in conf.go to support a new Tiering configuration, including detailed promote/demote policies and per-bucket tiering overrides.
  • Added example tiering configuration to config.example.yaml for easier setup and documentation.

Caching and Processor Improvements:

  • Refactored the ProcessorChain from a slice to a struct, added a touchChan for asynchronous touch operations, and introduced a touchArgs struct for better encapsulation.
  • Updated processor chain iteration to use the new struct format, improving maintainability and clarity. [1] [2]
  • Improved the cache key generation logic by simplifying newObjectIDFromRequest to always return an object ID (never an error), and updated all usages accordingly. This change also removes unnecessary error handling and makes the cache key logic more robust. [1] [2] [3] [4] [5] [6] [7] [8]
  • Added a TryLock method to the Caching struct to support non-blocking locking operations.

Other Improvements and Fixes:

  • Changed the LRU cache Get method to return a value (not a pointer), simplifying usage and preventing potential nil dereferences.
  • Improved logging granularity for chunk proxy operations, changing some logs from Info to Debug for better log level control.
  • Enhanced the setXCache method to include the storage type in the cache status header for better observability.
  • Minor code cleanups, such as removing unnecessary imports and updating test code for new function signatures. [1] [2]

These changes collectively improve the flexibility, maintainability, and observability of the caching and storage subsystem, especially around tiered storage management and cache key handling.

Copilot AI review requested due to automatic review settings January 27, 2026 13:44
@sendya sendya marked this pull request as draft January 27, 2026 13:44
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request implements a comprehensive storage tiering and caching enhancement system that introduces warm, hot, and cold storage tiers with automatic object promotion and demotion capabilities based on access patterns.

Changes:

  • Adds storage tier types (TypeWarm, TypeCold) and tiering configuration with promote/demote policies
  • Implements Migration interface for moving objects between tiers, with promotion/demotion logic in storage operations
  • Refactors ProcessorChain to use struct with async touch operations for cache hit tracking and promotion triggers
  • Simplifies cache key generation by making newObjectIDFromRequest always return a value instead of an error
  • Changes LRU cache Get to return values instead of pointers

Reviewed changes

Copilot reviewed 18 out of 18 changed files in this pull request and generated 15 comments.

Show a summary per file
File Description
storage/storage.go Implements tiered storage with warm/hot/cold selectors and promotion/demotion logic
storage/marked/wrap_storage.go Adds SelectWithType and Promote wrapper methods; simplifies Buckets()
storage/marked/bucket.go Adds Touch, MoveTo, and SetMigration wrapper methods
storage/builder.go Passes tiering configuration to bucket options
storage/bucket/memory/memory.go Implements Touch, MoveTo, SetMigration; changes log level
storage/bucket/empty/empty.go Adds stub implementations for Touch, MoveTo, SetMigration
storage/bucket/disk/disk.go Implements tiering logic with Touch for promotion, eviction with demotion, MoveTo for migration; changes Exist to use cache
server/middleware/caching/processor.go Refactors ProcessorChain to struct with async touch worker for promotion triggers
server/middleware/caching/locker.go Adds TryLock method for non-blocking lock acquisition
server/middleware/caching/internal.go Simplifies cache key generation; adds storage type to cache status header; refactors chunk logic
server/middleware/caching/caching_vary.go Updates vary processing to use simplified cache key generation
server/middleware/caching/caching_chunkpart_test.go Updates test for new cache key signature
server/middleware/caching/caching.go Changes log level for chunk proxy operations
pkg/algorithm/lru/lru.go Changes Get to return value instead of pointer
config.example.yaml Adds tiering configuration example
conf/conf.go Adds Tiering, Promote, and Demote configuration structs
api/defined/v1/storage/storage.go Adds Touch, MoveTo operations; adds Migration interface; adds SelectWithType and Promote to Storage
api/defined/v1/storage/indexdb.go Adds TypeWarm and TypeCold constants

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

// Exist implements storage.Bucket.
func (d *diskBucket) Exist(ctx context.Context, id []byte) bool {
return d.indexdb.Exist(ctx, id)
return d.cache.Has(object.IDHash(id))
Copy link

Copilot AI Jan 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Exist method implementation has been changed from checking the indexdb to only checking the in-memory LRU cache. This is a significant behavioral change that could cause issues. The cache may not contain all objects that exist in the indexdb, especially during cold starts, after evictions, or when objects are directly added to the indexdb. This could lead to false negatives where Exist returns false for objects that actually exist in storage. The chainSelector in storage.go line 216 relies on Exist to determine if an object is present in a bucket, which could now give incorrect results.

Suggested change
return d.cache.Has(object.IDHash(id))
// Fast path: check in-memory cache first.
if d.cache.Has(object.IDHash(id)) {
return true
}
// Fallback to indexdb as the source of truth to avoid false negatives on cache miss.
md, err := d.indexdb.Get(ctx, id)
if err != nil || md == nil {
return false
}
return true

Copilot uses AI. Check for mistakes.
wrapped = append(wrapped, wrapBucket(b, w.checker))
}
return wrapped
return w.base.Buckets()
Copy link

Copilot AI Jan 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Buckets() method has been simplified to return w.base.Buckets() directly, removing the wrapping logic that was previously applied to each bucket. This means the buckets returned by this method will no longer have the Checker wrapper applied, which is inconsistent with how Select and SelectWithType wrap their returned buckets (lines 35 and 65). This inconsistency could lead to unexpected behavior where buckets obtained through Buckets() don't have expiration marking logic applied, while buckets obtained through Select do.

Suggested change
return w.base.Buckets()
baseBuckets := w.base.Buckets()
if baseBuckets == nil {
return nil
}
wrapped := make([]storagev1.Bucket, 0, len(baseBuckets))
for _, b := range baseBuckets {
wrapped = append(wrapped, wrapBucket(b, w.checker))
}
return wrapped

Copilot uses AI. Check for mistakes.
var targetTier string
switch src.StoreType() {
case storage.TypeCold:
targetTier = storage.TypeWarm
Copy link

Copilot AI Jan 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inconsistent tier promotion logic. In the Promote method, when promoting from TypeCold, the target is set to TypeWarm (line 196). However, TypeWarm is semantically equivalent to TypeNormal according to the comments and implementation throughout the codebase. The SelectWithType method treats TypeWarm and TypeNormal as the same (line 155), so this works. However, for consistency and clarity, it would be better to use storage.TypeNormal here since that's what's actually used in the Demote method (line 175) for the reverse operation.

Suggested change
targetTier = storage.TypeWarm
targetTier = storage.TypeNormal

Copilot uses AI. Check for mistakes.
Comment on lines 188 to +209
if index < len(availableChunks) {

chunkFile, _ := getSliceChunkFile(c, availableChunks[index])
if chunkFile != nil {
if err := checkChunkSize(c, chunkFile, idx); err != nil {
_ = c.bucket.Discard(context.Background(), c.id)
return nil, 0, err
}
}

// 找到一个起始块,需要补齐前面的缺失块到当前这个块
toByte := min(c.md.Size-1, uint64(availableChunks[index]*uint32(partSize))-1)
req := c.req.Clone(context.Background())
newRange := fmt.Sprintf("bytes=%d-%d", fromByte, toByte)
req.Header.Set("Range", newRange)
// 找到一个起始块,需要补齐前面的缺失块到当前这个块
toByte := min(c.md.Size-1, uint64(availableChunks[index]*uint32(partSize))-1)
req := c.req.Clone(context.Background())
newRange := fmt.Sprintf("bytes=%d-%d", fromByte, toByte)
req.Header.Set("Range", newRange)

reader, err := c.getUpstreamReader(fromByte, toByte, true)
if err != nil {
return nil, 0, err
}
reader, err := c.getUpstreamReader(fromByte, toByte, true)
if err != nil {
return nil, 0, err
}

return iobuf.PartsReadCloser(reader, chunkFile), int(availableChunks[index]-reqChunks[from]) + 1, nil
return iobuf.PartsReadCloser(reader, chunkFile), int(availableChunks[index]-reqChunks[from]) + 1, nil
}
Copy link

Copilot AI Jan 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code formatting change here removes the closing brace and changes the indentation, but the logic appears incomplete. The if chunkFile != nil block starting at line 191 doesn't have a proper else clause anymore. Looking at the original code flow, if chunkFile is nil, the code should still execute the "fill" logic at the bottom (now starting at line 212). However, the restructured code makes this flow less clear. Consider restructuring this to make the control flow more explicit and easier to follow.

Copilot uses AI. Check for mistakes.
Comment on lines +212 to +221
go func() {
for args := range pc.touchChan {
ctx, cancel := context.WithTimeout(context.Background(), 1*time.Second)
if err := args.bucket.Touch(ctx, args.id); err != nil {
log.Warnf("failed to touch object %s: %v", args.id.String(), err)
}
args.unlock()
cancel()
}
}()
Copy link

Copilot AI Jan 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Double goroutine launch in startWorker. The method itself is called with 'go pc.startWorker()' on line 205, which makes it run in a goroutine. Then inside startWorker on line 212, there's another 'go func()' that launches the actual worker. This creates an unnecessary extra goroutine that immediately exits. Either remove the 'go' from line 205 or remove the 'go func()' wrapper on line 212.

Suggested change
go func() {
for args := range pc.touchChan {
ctx, cancel := context.WithTimeout(context.Background(), 1*time.Second)
if err := args.bucket.Touch(ctx, args.id); err != nil {
log.Warnf("failed to touch object %s: %v", args.id.String(), err)
}
args.unlock()
cancel()
}
}()
for args := range pc.touchChan {
ctx, cancel := context.WithTimeout(context.Background(), 1*time.Second)
if err := args.bucket.Touch(ctx, args.id); err != nil {
log.Warnf("failed to touch object %s: %v", args.id.String(), err)
}
args.unlock()
cancel()
}

Copilot uses AI. Check for mistakes.
Comment on lines +100 to +112
MinHits int `json:"min_hits" yaml:"min_hits"` // 时间窗口内命中 >= N
Window time.Duration `json:"window" yaml:"window"` // 时间窗口 10m
MaxObjectSize uint64 `json:"max_object_size" yaml:"max_object_size"` // 最大对象大小
}
Demote struct {
MinHits int `json:"min_hits" yaml:"min_hits"` // 时间窗口内命中 <= N
Window time.Duration `json:"window" yaml:"window"` // 时间窗口 10m
Occupancy float64 `json:"occupancy" yaml:"occupancy"` // 热盘存储占用率 >= N%
}
Tiering struct {
Enabled bool `json:"enabled" yaml:"enabled"`
Promote Promote `json:"promote" yaml:"promote"` // 升温
Demote Demote `json:"demote" yaml:"demote"` // 降温
Copy link

Copilot AI Jan 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment on line 100 contains Chinese text "时间窗口内命中 >= N" which translates to "hits within time window >= N". For consistency and to make the codebase more accessible to all contributors, consider using English for comments. The same issue appears on lines 101, 102, 105, 106, and 107.

Suggested change
MinHits int `json:"min_hits" yaml:"min_hits"` // 时间窗口内命中 >= N
Window time.Duration `json:"window" yaml:"window"` // 时间窗口 10m
MaxObjectSize uint64 `json:"max_object_size" yaml:"max_object_size"` // 最大对象大小
}
Demote struct {
MinHits int `json:"min_hits" yaml:"min_hits"` // 时间窗口内命中 <= N
Window time.Duration `json:"window" yaml:"window"` // 时间窗口 10m
Occupancy float64 `json:"occupancy" yaml:"occupancy"` // 热盘存储占用率 >= N%
}
Tiering struct {
Enabled bool `json:"enabled" yaml:"enabled"`
Promote Promote `json:"promote" yaml:"promote"` // 升温
Demote Demote `json:"demote" yaml:"demote"` // 降温
MinHits int `json:"min_hits" yaml:"min_hits"` // hits within the time window >= N
Window time.Duration `json:"window" yaml:"window"` // time window (for example, 10m)
MaxObjectSize uint64 `json:"max_object_size" yaml:"max_object_size"` // maximum object size
}
Demote struct {
MinHits int `json:"min_hits" yaml:"min_hits"` // hits within the time window <= N
Window time.Duration `json:"window" yaml:"window"` // time window (for example, 10m)
Occupancy float64 `json:"occupancy" yaml:"occupancy"` // hot storage disk occupancy >= N%
}
Tiering struct {
Enabled bool `json:"enabled" yaml:"enabled"`
Promote Promote `json:"promote" yaml:"promote"` // promote (move to a hotter tier)
Demote Demote `json:"demote" yaml:"demote"` // demote (move to a colder tier)

Copilot uses AI. Check for mistakes.
func (w *wrappedStorage) Promote(ctx context.Context, id *object.ID, src storagev1.Bucket) error {
return w.base.Promote(ctx, id, src)
}

Copy link

Copilot AI Jan 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing Demote method implementation in wrappedStorage. While the Promote method is implemented on line 68-70, there is no corresponding Demote method. This is inconsistent with the Migration interface which requires both Promote and Demote methods (as defined in api/defined/v1/storage/storage.go lines 94-96). This will cause compilation errors or runtime failures when Demote is called on a wrappedStorage instance.

Suggested change
func (w *wrappedStorage) Demote(ctx context.Context, id *object.ID, src storagev1.Bucket) error {
return w.base.Demote(ctx, id, src)
}

Copilot uses AI. Check for mistakes.
Comment on lines +197 to +198
case storage.TypeWarm:
targetTier = storage.TypeHot
Copy link

Copilot AI Jan 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Promote method case for TypeWarm is unreachable. The TypeNormal case on line 176 in the Demote method handles both TypeNormal and TypeWarm together, but in the Promote method on line 197, TypeWarm has its own case. However, buckets are categorized into normalBucket for both TypeNormal and TypeWarm (line 87), and their StoreType() will return whatever was configured. If a bucket is configured as "warm" type, this case would be reachable. But the inconsistency in how TypeNormal is handled (with a comment noting TypeWarm is the same) versus having a separate case for TypeWarm creates confusion. Consider either handling both together or ensuring the pattern is consistent between Promote and Demote.

Copilot uses AI. Check for mistakes.
Comment on lines +206 to +216
// Touch implements [storage.Bucket].
func (m *memoryBucket) Touch(ctx context.Context, id *object.ID) error {
mark := m.cache.Get(id.Hash())
if mark.LastAccess() <= 0 {
return nil
}

mark.SetLastAccess(time.Now().Unix())
mark.SetRefs(mark.Refs() + 1)
m.cache.Set(id.Hash(), mark)
return nil
Copy link

Copilot AI Jan 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Touch implementation for memory bucket doesn't check if the object exists in indexdb before touching. Unlike the disk bucket implementation which validates the object exists, this implementation directly gets from cache and modifies it. If mark.LastAccess() is 0 or less, it returns early, but this check is insufficient. If the cache entry doesn't exist, Get() will return a zero-value Mark, and the code will proceed to set incorrect values into the cache. Consider adding validation to ensure the object actually exists in the indexdb before updating the cache.

Copilot uses AI. Check for mistakes.
}

resp.Header.Set(constants.ProtocolCacheStatusKey, strings.Join([]string{c.cacheStatus.String(), "from", c.opt.Hostname, "(tavern/4.0)"}, " "))
resp.Header.Set(constants.ProtocolCacheStatusKey, strings.Join([]string{c.cacheStatus.String(), "from", c.opt.Hostname, c.bucket.StoreType(), "(tavern/4.0)"}, " "))
Copy link

Copilot AI Jan 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The setXCache method will panic if c.bucket is nil. The method attempts to call c.bucket.StoreType() on line 94 without checking if bucket is nil. While this might be protected by callers, the method should be defensive and handle the nil case, especially since this is called in the response path where stability is critical.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants