-
Notifications
You must be signed in to change notification settings - Fork 3
[WIP] feat: warm cold levels cache #36
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This pull request implements a comprehensive storage tiering and caching enhancement system that introduces warm, hot, and cold storage tiers with automatic object promotion and demotion capabilities based on access patterns.
Changes:
- Adds storage tier types (TypeWarm, TypeCold) and tiering configuration with promote/demote policies
- Implements Migration interface for moving objects between tiers, with promotion/demotion logic in storage operations
- Refactors ProcessorChain to use struct with async touch operations for cache hit tracking and promotion triggers
- Simplifies cache key generation by making newObjectIDFromRequest always return a value instead of an error
- Changes LRU cache Get to return values instead of pointers
Reviewed changes
Copilot reviewed 18 out of 18 changed files in this pull request and generated 15 comments.
Show a summary per file
| File | Description |
|---|---|
| storage/storage.go | Implements tiered storage with warm/hot/cold selectors and promotion/demotion logic |
| storage/marked/wrap_storage.go | Adds SelectWithType and Promote wrapper methods; simplifies Buckets() |
| storage/marked/bucket.go | Adds Touch, MoveTo, and SetMigration wrapper methods |
| storage/builder.go | Passes tiering configuration to bucket options |
| storage/bucket/memory/memory.go | Implements Touch, MoveTo, SetMigration; changes log level |
| storage/bucket/empty/empty.go | Adds stub implementations for Touch, MoveTo, SetMigration |
| storage/bucket/disk/disk.go | Implements tiering logic with Touch for promotion, eviction with demotion, MoveTo for migration; changes Exist to use cache |
| server/middleware/caching/processor.go | Refactors ProcessorChain to struct with async touch worker for promotion triggers |
| server/middleware/caching/locker.go | Adds TryLock method for non-blocking lock acquisition |
| server/middleware/caching/internal.go | Simplifies cache key generation; adds storage type to cache status header; refactors chunk logic |
| server/middleware/caching/caching_vary.go | Updates vary processing to use simplified cache key generation |
| server/middleware/caching/caching_chunkpart_test.go | Updates test for new cache key signature |
| server/middleware/caching/caching.go | Changes log level for chunk proxy operations |
| pkg/algorithm/lru/lru.go | Changes Get to return value instead of pointer |
| config.example.yaml | Adds tiering configuration example |
| conf/conf.go | Adds Tiering, Promote, and Demote configuration structs |
| api/defined/v1/storage/storage.go | Adds Touch, MoveTo operations; adds Migration interface; adds SelectWithType and Promote to Storage |
| api/defined/v1/storage/indexdb.go | Adds TypeWarm and TypeCold constants |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // Exist implements storage.Bucket. | ||
| func (d *diskBucket) Exist(ctx context.Context, id []byte) bool { | ||
| return d.indexdb.Exist(ctx, id) | ||
| return d.cache.Has(object.IDHash(id)) |
Copilot
AI
Jan 27, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Exist method implementation has been changed from checking the indexdb to only checking the in-memory LRU cache. This is a significant behavioral change that could cause issues. The cache may not contain all objects that exist in the indexdb, especially during cold starts, after evictions, or when objects are directly added to the indexdb. This could lead to false negatives where Exist returns false for objects that actually exist in storage. The chainSelector in storage.go line 216 relies on Exist to determine if an object is present in a bucket, which could now give incorrect results.
| return d.cache.Has(object.IDHash(id)) | |
| // Fast path: check in-memory cache first. | |
| if d.cache.Has(object.IDHash(id)) { | |
| return true | |
| } | |
| // Fallback to indexdb as the source of truth to avoid false negatives on cache miss. | |
| md, err := d.indexdb.Get(ctx, id) | |
| if err != nil || md == nil { | |
| return false | |
| } | |
| return true |
| wrapped = append(wrapped, wrapBucket(b, w.checker)) | ||
| } | ||
| return wrapped | ||
| return w.base.Buckets() |
Copilot
AI
Jan 27, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Buckets() method has been simplified to return w.base.Buckets() directly, removing the wrapping logic that was previously applied to each bucket. This means the buckets returned by this method will no longer have the Checker wrapper applied, which is inconsistent with how Select and SelectWithType wrap their returned buckets (lines 35 and 65). This inconsistency could lead to unexpected behavior where buckets obtained through Buckets() don't have expiration marking logic applied, while buckets obtained through Select do.
| return w.base.Buckets() | |
| baseBuckets := w.base.Buckets() | |
| if baseBuckets == nil { | |
| return nil | |
| } | |
| wrapped := make([]storagev1.Bucket, 0, len(baseBuckets)) | |
| for _, b := range baseBuckets { | |
| wrapped = append(wrapped, wrapBucket(b, w.checker)) | |
| } | |
| return wrapped |
| var targetTier string | ||
| switch src.StoreType() { | ||
| case storage.TypeCold: | ||
| targetTier = storage.TypeWarm |
Copilot
AI
Jan 27, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inconsistent tier promotion logic. In the Promote method, when promoting from TypeCold, the target is set to TypeWarm (line 196). However, TypeWarm is semantically equivalent to TypeNormal according to the comments and implementation throughout the codebase. The SelectWithType method treats TypeWarm and TypeNormal as the same (line 155), so this works. However, for consistency and clarity, it would be better to use storage.TypeNormal here since that's what's actually used in the Demote method (line 175) for the reverse operation.
| targetTier = storage.TypeWarm | |
| targetTier = storage.TypeNormal |
| if index < len(availableChunks) { | ||
|
|
||
| chunkFile, _ := getSliceChunkFile(c, availableChunks[index]) | ||
| if chunkFile != nil { | ||
| if err := checkChunkSize(c, chunkFile, idx); err != nil { | ||
| _ = c.bucket.Discard(context.Background(), c.id) | ||
| return nil, 0, err | ||
| } | ||
| } | ||
|
|
||
| // 找到一个起始块,需要补齐前面的缺失块到当前这个块 | ||
| toByte := min(c.md.Size-1, uint64(availableChunks[index]*uint32(partSize))-1) | ||
| req := c.req.Clone(context.Background()) | ||
| newRange := fmt.Sprintf("bytes=%d-%d", fromByte, toByte) | ||
| req.Header.Set("Range", newRange) | ||
| // 找到一个起始块,需要补齐前面的缺失块到当前这个块 | ||
| toByte := min(c.md.Size-1, uint64(availableChunks[index]*uint32(partSize))-1) | ||
| req := c.req.Clone(context.Background()) | ||
| newRange := fmt.Sprintf("bytes=%d-%d", fromByte, toByte) | ||
| req.Header.Set("Range", newRange) | ||
|
|
||
| reader, err := c.getUpstreamReader(fromByte, toByte, true) | ||
| if err != nil { | ||
| return nil, 0, err | ||
| } | ||
| reader, err := c.getUpstreamReader(fromByte, toByte, true) | ||
| if err != nil { | ||
| return nil, 0, err | ||
| } | ||
|
|
||
| return iobuf.PartsReadCloser(reader, chunkFile), int(availableChunks[index]-reqChunks[from]) + 1, nil | ||
| return iobuf.PartsReadCloser(reader, chunkFile), int(availableChunks[index]-reqChunks[from]) + 1, nil | ||
| } |
Copilot
AI
Jan 27, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code formatting change here removes the closing brace and changes the indentation, but the logic appears incomplete. The if chunkFile != nil block starting at line 191 doesn't have a proper else clause anymore. Looking at the original code flow, if chunkFile is nil, the code should still execute the "fill" logic at the bottom (now starting at line 212). However, the restructured code makes this flow less clear. Consider restructuring this to make the control flow more explicit and easier to follow.
| go func() { | ||
| for args := range pc.touchChan { | ||
| ctx, cancel := context.WithTimeout(context.Background(), 1*time.Second) | ||
| if err := args.bucket.Touch(ctx, args.id); err != nil { | ||
| log.Warnf("failed to touch object %s: %v", args.id.String(), err) | ||
| } | ||
| args.unlock() | ||
| cancel() | ||
| } | ||
| }() |
Copilot
AI
Jan 27, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Double goroutine launch in startWorker. The method itself is called with 'go pc.startWorker()' on line 205, which makes it run in a goroutine. Then inside startWorker on line 212, there's another 'go func()' that launches the actual worker. This creates an unnecessary extra goroutine that immediately exits. Either remove the 'go' from line 205 or remove the 'go func()' wrapper on line 212.
| go func() { | |
| for args := range pc.touchChan { | |
| ctx, cancel := context.WithTimeout(context.Background(), 1*time.Second) | |
| if err := args.bucket.Touch(ctx, args.id); err != nil { | |
| log.Warnf("failed to touch object %s: %v", args.id.String(), err) | |
| } | |
| args.unlock() | |
| cancel() | |
| } | |
| }() | |
| for args := range pc.touchChan { | |
| ctx, cancel := context.WithTimeout(context.Background(), 1*time.Second) | |
| if err := args.bucket.Touch(ctx, args.id); err != nil { | |
| log.Warnf("failed to touch object %s: %v", args.id.String(), err) | |
| } | |
| args.unlock() | |
| cancel() | |
| } |
| MinHits int `json:"min_hits" yaml:"min_hits"` // 时间窗口内命中 >= N | ||
| Window time.Duration `json:"window" yaml:"window"` // 时间窗口 10m | ||
| MaxObjectSize uint64 `json:"max_object_size" yaml:"max_object_size"` // 最大对象大小 | ||
| } | ||
| Demote struct { | ||
| MinHits int `json:"min_hits" yaml:"min_hits"` // 时间窗口内命中 <= N | ||
| Window time.Duration `json:"window" yaml:"window"` // 时间窗口 10m | ||
| Occupancy float64 `json:"occupancy" yaml:"occupancy"` // 热盘存储占用率 >= N% | ||
| } | ||
| Tiering struct { | ||
| Enabled bool `json:"enabled" yaml:"enabled"` | ||
| Promote Promote `json:"promote" yaml:"promote"` // 升温 | ||
| Demote Demote `json:"demote" yaml:"demote"` // 降温 |
Copilot
AI
Jan 27, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment on line 100 contains Chinese text "时间窗口内命中 >= N" which translates to "hits within time window >= N". For consistency and to make the codebase more accessible to all contributors, consider using English for comments. The same issue appears on lines 101, 102, 105, 106, and 107.
| MinHits int `json:"min_hits" yaml:"min_hits"` // 时间窗口内命中 >= N | |
| Window time.Duration `json:"window" yaml:"window"` // 时间窗口 10m | |
| MaxObjectSize uint64 `json:"max_object_size" yaml:"max_object_size"` // 最大对象大小 | |
| } | |
| Demote struct { | |
| MinHits int `json:"min_hits" yaml:"min_hits"` // 时间窗口内命中 <= N | |
| Window time.Duration `json:"window" yaml:"window"` // 时间窗口 10m | |
| Occupancy float64 `json:"occupancy" yaml:"occupancy"` // 热盘存储占用率 >= N% | |
| } | |
| Tiering struct { | |
| Enabled bool `json:"enabled" yaml:"enabled"` | |
| Promote Promote `json:"promote" yaml:"promote"` // 升温 | |
| Demote Demote `json:"demote" yaml:"demote"` // 降温 | |
| MinHits int `json:"min_hits" yaml:"min_hits"` // hits within the time window >= N | |
| Window time.Duration `json:"window" yaml:"window"` // time window (for example, 10m) | |
| MaxObjectSize uint64 `json:"max_object_size" yaml:"max_object_size"` // maximum object size | |
| } | |
| Demote struct { | |
| MinHits int `json:"min_hits" yaml:"min_hits"` // hits within the time window <= N | |
| Window time.Duration `json:"window" yaml:"window"` // time window (for example, 10m) | |
| Occupancy float64 `json:"occupancy" yaml:"occupancy"` // hot storage disk occupancy >= N% | |
| } | |
| Tiering struct { | |
| Enabled bool `json:"enabled" yaml:"enabled"` | |
| Promote Promote `json:"promote" yaml:"promote"` // promote (move to a hotter tier) | |
| Demote Demote `json:"demote" yaml:"demote"` // demote (move to a colder tier) |
| func (w *wrappedStorage) Promote(ctx context.Context, id *object.ID, src storagev1.Bucket) error { | ||
| return w.base.Promote(ctx, id, src) | ||
| } | ||
|
|
Copilot
AI
Jan 27, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing Demote method implementation in wrappedStorage. While the Promote method is implemented on line 68-70, there is no corresponding Demote method. This is inconsistent with the Migration interface which requires both Promote and Demote methods (as defined in api/defined/v1/storage/storage.go lines 94-96). This will cause compilation errors or runtime failures when Demote is called on a wrappedStorage instance.
| func (w *wrappedStorage) Demote(ctx context.Context, id *object.ID, src storagev1.Bucket) error { | |
| return w.base.Demote(ctx, id, src) | |
| } |
| case storage.TypeWarm: | ||
| targetTier = storage.TypeHot |
Copilot
AI
Jan 27, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Promote method case for TypeWarm is unreachable. The TypeNormal case on line 176 in the Demote method handles both TypeNormal and TypeWarm together, but in the Promote method on line 197, TypeWarm has its own case. However, buckets are categorized into normalBucket for both TypeNormal and TypeWarm (line 87), and their StoreType() will return whatever was configured. If a bucket is configured as "warm" type, this case would be reachable. But the inconsistency in how TypeNormal is handled (with a comment noting TypeWarm is the same) versus having a separate case for TypeWarm creates confusion. Consider either handling both together or ensuring the pattern is consistent between Promote and Demote.
| // Touch implements [storage.Bucket]. | ||
| func (m *memoryBucket) Touch(ctx context.Context, id *object.ID) error { | ||
| mark := m.cache.Get(id.Hash()) | ||
| if mark.LastAccess() <= 0 { | ||
| return nil | ||
| } | ||
|
|
||
| mark.SetLastAccess(time.Now().Unix()) | ||
| mark.SetRefs(mark.Refs() + 1) | ||
| m.cache.Set(id.Hash(), mark) | ||
| return nil |
Copilot
AI
Jan 27, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Touch implementation for memory bucket doesn't check if the object exists in indexdb before touching. Unlike the disk bucket implementation which validates the object exists, this implementation directly gets from cache and modifies it. If mark.LastAccess() is 0 or less, it returns early, but this check is insufficient. If the cache entry doesn't exist, Get() will return a zero-value Mark, and the code will proceed to set incorrect values into the cache. Consider adding validation to ensure the object actually exists in the indexdb before updating the cache.
| } | ||
|
|
||
| resp.Header.Set(constants.ProtocolCacheStatusKey, strings.Join([]string{c.cacheStatus.String(), "from", c.opt.Hostname, "(tavern/4.0)"}, " ")) | ||
| resp.Header.Set(constants.ProtocolCacheStatusKey, strings.Join([]string{c.cacheStatus.String(), "from", c.opt.Hostname, c.bucket.StoreType(), "(tavern/4.0)"}, " ")) |
Copilot
AI
Jan 27, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The setXCache method will panic if c.bucket is nil. The method attempts to call c.bucket.StoreType() on line 94 without checking if bucket is nil. While this might be protected by callers, the method should be defensive and handle the nil case, especially since this is called in the response path where stability is critical.
This pull request introduces significant enhancements to the storage tiering and caching system, focusing on tiered storage configuration, new storage interface methods for object promotion/demotion, and improvements to cache key handling and processor structure. It also includes some refactoring for clarity and efficiency, and updates to logging and configuration files.
Storage Tiering and Interface Enhancements:
TypeWarm,TypeCold) and clarified the relationship betweennormalandwarmtypes inindexdb.go.StorageandBucketinterfaces to support tiering operations, including methods for promoting/demoting objects between tiers and selecting buckets by tier type. Introduced aMigrationinterface for handling tier transitions. [1] [2] [3] [4]conf.goto support a newTieringconfiguration, including detailed promote/demote policies and per-bucket tiering overrides.config.example.yamlfor easier setup and documentation.Caching and Processor Improvements:
ProcessorChainfrom a slice to a struct, added atouchChanfor asynchronous touch operations, and introduced atouchArgsstruct for better encapsulation.newObjectIDFromRequestto always return an object ID (never an error), and updated all usages accordingly. This change also removes unnecessary error handling and makes the cache key logic more robust. [1] [2] [3] [4] [5] [6] [7] [8]TryLockmethod to theCachingstruct to support non-blocking locking operations.Other Improvements and Fixes:
Getmethod to return a value (not a pointer), simplifying usage and preventing potential nil dereferences.InfotoDebugfor better log level control.setXCachemethod to include the storage type in the cache status header for better observability.These changes collectively improve the flexibility, maintainability, and observability of the caching and storage subsystem, especially around tiered storage management and cache key handling.