Add how-to guide for storage data migration #2228

@leighmcculloch

What problem does your feature solve?

When contracts upgrade and their data structures change (e.g., adding new fields), developers need clear guidance on how to handle existing stored data. The intuitive approaches don't work as expected, and developers often discover this the hard way.

Intuitive approach 1 - Just read with the new type:

#[contracttype]
pub struct DataV1 { a: i64, b: i64 }

#[contracttype]
pub struct DataV2 { a: i64, b: i64, c: Option<i64> }

// Developer expects c = None for old entries
let data: DataV2 = env.storage().persistent().get(&key).unwrap();
// TRAPS with Error(Object, UnexpectedSize)

Intuitive approach 2 - Use try_from_val fallback:

let raw: Val = env.storage().persistent().get(&key).unwrap();
if let Ok(v2) = DataV2::try_from_val(&env, &raw) {
    v2
} else {
    // Never reached - host traps before returning Err
    let v1 = DataV1::try_from_val(&env, &raw).unwrap();
    DataV2 { a: v1.a, b: v1.b, c: None }
}

Both approaches trap at the host level before the SDK can handle the mismatch. There is no documentation explaining this behavior or the correct pattern to use.
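To make the failure mode concrete, here is a toy model in plain Rust (no `soroban-sdk`). It assumes only what is described above: the field count is checked strictly, so a 2-field entry never decodes as a 3-field type, even when the extra field is an `Option`. `StoredFields`, `DecodeError`, and `decode_exact` are hypothetical names for illustration, not host or SDK APIs. The model returns `Err` so it can be exercised off-chain; on-chain the same condition is a trap, so no fallback branch ever runs.

```rust
/// Toy stand-in for a stored struct: just its field values, in order.
type StoredFields = Vec<i64>;

#[derive(Debug, PartialEq)]
enum DecodeError {
    /// Mirrors the shape of the reported error; not the real host type.
    UnexpectedSize { expected: usize, found: usize },
}

/// Decode exactly `N` fields; any other field count is rejected outright.
fn decode_exact<const N: usize>(stored: &StoredFields) -> Result<[i64; N], DecodeError> {
    <[i64; N]>::try_from(stored.as_slice()).map_err(|_| DecodeError::UnexpectedSize {
        expected: N,
        found: stored.len(),
    })
}
```

In this model an entry written with two fields decodes with `decode_exact::<2>` and is rejected by `decode_exact::<3>`; nothing fills the missing third field with a default.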

What would you like to see?

A new how-to guide added to https://developers.stellar.org/docs/build/guides/storage explaining storage data migration using the version marker pattern:

use soroban_sdk::{contracttype, Env};

#[contracttype]
pub struct DataV1 { a: i64, b: i64 }

#[contracttype]
pub struct DataV2 { a: i64, b: i64, c: Option<i64> }

#[contracttype]
pub enum DataKey {
    DataVersion(u32),  // version marker keyed by id
    Data(u32),         // data keyed by id
}

fn read_data(env: &Env, id: u32) -> DataV2 {
    let version: u32 = env.storage().persistent()
        .get(&DataKey::DataVersion(id))
        .unwrap_or(1);  // default to v1 for entries without version marker
    
    match version {
        1 => {
            let v1: DataV1 = env.storage().persistent().get(&DataKey::Data(id)).unwrap();
            DataV2 { a: v1.a, b: v1.b, c: None }
        }
        _ => env.storage().persistent().get(&DataKey::Data(id)).unwrap(),
    }
}

fn write_data(env: &Env, id: u32, data: &DataV2) {
    env.storage().persistent().set(&DataKey::DataVersion(id), &2u32);
    env.storage().persistent().set(&DataKey::Data(id), data);
}
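One variation the guide could show is lazy migration with write-back: the first read of a v1 entry converts it and stores it as v2, so later reads skip the conversion. The sketch below is a plain-Rust model, not contract code. `Store` is a hypothetical in-memory stand-in for `env.storage().persistent()`, and its typed getters panic on a layout mismatch to mimic the trap described above.

```rust
use std::collections::HashMap;

#[derive(Clone, Debug, PartialEq)]
struct DataV1 { a: i64, b: i64 }

#[derive(Clone, Debug, PartialEq)]
struct DataV2 { a: i64, b: i64, c: Option<i64> }

// The layout an entry was written with.
#[derive(Clone, Debug, PartialEq)]
enum Stored { V1(DataV1), V2(DataV2) }

// Hypothetical in-memory stand-in for persistent storage.
#[derive(Default)]
struct Store {
    versions: HashMap<u32, u32>, // plays the role of DataKey::DataVersion(id)
    data: HashMap<u32, Stored>,  // plays the role of DataKey::Data(id)
}

impl Store {
    // Typed reads panic on a layout mismatch, mimicking the host trap.
    fn get_v1(&self, id: u32) -> Option<DataV1> {
        match self.data.get(&id)? {
            Stored::V1(d) => Some(d.clone()),
            Stored::V2(_) => panic!("layout mismatch: expected V1"),
        }
    }

    fn get_v2(&self, id: u32) -> Option<DataV2> {
        match self.data.get(&id)? {
            Stored::V2(d) => Some(d.clone()),
            Stored::V1(_) => panic!("layout mismatch: expected V2"),
        }
    }

    // Always writes the marker together with the data.
    fn write_v2(&mut self, id: u32, data: DataV2) {
        self.versions.insert(id, 2);
        self.data.insert(id, Stored::V2(data));
    }
}

// Read an entry as V2; if it is still V1, convert it and write it back.
fn read_and_upgrade(store: &mut Store, id: u32) -> Option<DataV2> {
    // Entries without a marker are assumed to be V1.
    let version = store.versions.get(&id).copied().unwrap_or(1);
    match version {
        1 => {
            let v1 = store.get_v1(id)?;
            let v2 = DataV2 { a: v1.a, b: v1.b, c: None };
            store.write_v2(id, v2.clone());
            Some(v2)
        }
        _ => store.get_v2(id),
    }
}
```

Writing back turns a read into a write, so whether to do it on every entry point is a design choice; the `read_data` shown above converts in memory only and leaves the stored entry as v1 until the next `write_data`.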

The guide should cover:

  1. Why intuitive approaches fail - The host validates the field count and traps before the SDK can handle the mismatch
  2. Version marker pattern - Store a version alongside the data and check it before reading
  3. Lazy vs eager migration - Convert on read (lazy) vs batch migration (eager), noting that lazy migration is generally preferred on blockchains since batch migrations can be prohibitively expensive or hit resource limits
  4. Testing migrations - How to test upgrade paths
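For items 3 and 4, a rough sketch of what an eager, bounded batch migration and its test setup might look like, again as a plain-Rust model and not contract code. `Store`, `seed_v1`, and `migrate_batch` are hypothetical names. The `max` parameter stands in for whatever per-invocation resource limit applies, and the caller is assumed to know which ids to visit.

```rust
use std::collections::HashMap;

// The layout an entry was written with (hypothetical model types).
#[derive(Clone, Debug, PartialEq)]
enum Stored {
    V1 { a: i64, b: i64 },
    V2 { a: i64, b: i64, c: Option<i64> },
}

// Hypothetical in-memory stand-in for persistent storage.
#[derive(Default)]
struct Store {
    versions: HashMap<u32, u32>, // version marker keyed by id
    data: HashMap<u32, Stored>,  // data keyed by id
}

// Seed an entry the way the pre-upgrade contract would have:
// V1 layout and no version marker.
fn seed_v1(store: &mut Store, id: u32, a: i64, b: i64) {
    store.data.insert(id, Stored::V1 { a, b });
}

// Convert at most `max` of the given ids from V1 to V2.
// Returns how many entries were converted in this call.
fn migrate_batch(store: &mut Store, ids: &[u32], max: usize) -> usize {
    let mut converted = 0;
    for &id in ids {
        if converted == max {
            break; // budget for this call is used up
        }
        let version = store.versions.get(&id).copied().unwrap_or(1);
        if version != 1 {
            continue; // already migrated
        }
        if let Some(Stored::V1 { a, b }) = store.data.get(&id).cloned() {
            store.data.insert(id, Stored::V2 { a, b, c: None });
            store.versions.insert(id, 2);
            converted += 1;
        }
    }
    converted
}
```

A test for the upgrade path can seed entries with `seed_v1`, call `migrate_batch` repeatedly with a small `max`, and assert that every entry ends up as V2 with a marker and that a further call converts nothing. The same seeding step, writing old-layout data without a marker, is also the natural starting point for testing the lazy path.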

What alternatives are there?

  • Developers currently discover this through trial and error

Related:
