SATtxt - Spectrally Distilled Representations Aligned with Instruction-Augmented LLMs for Satellite Imagery
Minh Kha Do, Wei Xiang, Kang Han, Di Wu, Khoa Phan, Yi-Ping Phoebe Chen, Gaowen Liu, Ramana Rao Kompella
La Trobe University, Cisco Research
| Date | Update |
|---|---|
| Mar 9, 2026 | We have released model code and weights. |
| Feb 23, 2026 | SATtxt is accepted at CVPR 2026. We thank the reviewers and ACs. |
SATtxt is a vision-language foundation model for satellite imagery. It aligns a frozen DINOv3 vision encoder with a frozen LLM2Vec text encoder; only the lightweight projection heads are trained.
| Component | Architecture | Status |
|---|---|---|
| Vision Encoder | DINOv3 ViT-L/16 | Frozen |
| Text Encoder | LLM2Vec Llama-3-8B | Frozen |
| Vision Head | Transformer Projection | Trained |
| Text Head | Linear Projection | Trained |
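As a rough sketch of this design (all dimensions and weights below are illustrative stand-ins, not the repo's actual internals): the frozen encoders produce fixed features, and only two small projection heads map them into a shared embedding space, where cosine similarity compares images and text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (not the actual model sizes)
d_vision, d_text, d_shared = 1024, 4096, 512

# Frozen encoder outputs: fixed features for one image and two captions
img_feat = rng.standard_normal(d_vision)
txt_feats = rng.standard_normal((2, d_text))

# The only trainable parameters: two projection heads
W_vision = rng.standard_normal((d_vision, d_shared)) * 0.02
W_text = rng.standard_normal((d_text, d_shared)) * 0.02

def normalize(x):
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

img_emb = normalize(img_feat @ W_vision)   # shape (d_shared,)
txt_embs = normalize(txt_feats @ W_text)   # shape (2, d_shared)

# Cosine similarities between the image and each caption (unit vectors)
sims = txt_embs @ img_emb
print(sims.shape)
```

Since both encoders stay frozen, only the heads receive gradients during alignment training, which keeps the trainable parameter count small.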
```shell
git clone https://github.com/ikhado/sattxt.git
cd sattxt
pip install -r requirements.txt
pip install flash-attn --no-build-isolation  # required for LLM2Vec
```

Download the required weights:
| Component | Source |
|---|---|
| DINOv3 ViT-L/16 | facebookresearch/dinov3 → dinov3_vitl16_pretrain_sat493m.pth |
| LLM2Vec | McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp-unsup-simcse |
| Vision Head | sattxt_vision_head.pt |
| Text Head | sattxt_text_head.pt |
Clone DINOv3 into the thirdparty folder:

```shell
cd thirdparty && git clone https://github.com/facebookresearch/dinov3.git
```

```python
import sys
from pathlib import Path

import torch

# Make the vendored DINOv3 repo importable
sys.path.insert(0, str(Path(__file__).resolve().parent / "thirdparty" / "dinov3"))

from sattxt.model import SATtxt
from sattxt.utils import image_loader, get_preprocess, zero_shot_classify

device = "cuda:0" if torch.cuda.is_available() else "cpu"

model = SATtxt(
    dinov3_weights_path="/PATH/TO/dinov3_vitl16_pretrain_sat493m-eadcf0ff.pth",
    sattxt_vision_head_pretrain_weights="/PATH/TO/sattxt_vision_head.pt",
    text_encoder_id="McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp",
    sattxt_text_head_pretrain_weights="/PATH/TO/sattxt_text_head.pt",
).to(device).eval()

categories = [
    "AnnualCrop", "Forest", "HerbaceousVegetation", "Highway", "Industrial",
    "Pasture", "PermanentCrop", "Residential", "River", "SeaLake",
]

image = image_loader("./asset/Residential_167.jpg")
image_tensor = get_preprocess(is_ms=False, all_bands=False)(image).unsqueeze(0).to(device)

logits, pred_idx = zero_shot_classify(model, image_tensor, categories)
print("Prediction:", categories[pred_idx.item()])
```

Please check demo.py for more details.
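Under the hood, CLIP-style zero-shot classification typically embeds one text prompt per class and picks the class whose embedding is most similar to the image embedding. A minimal sketch of that logic follows; the prompt template and the encoder stub are illustrative assumptions, not SATtxt's actual implementation.

```python
import numpy as np

def encode_text(prompt: str, dim: int = 512) -> np.ndarray:
    """Stand-in for a text tower: deterministic pseudo-embedding per prompt."""
    seed = abs(hash(prompt)) % (2**32)
    v = np.random.default_rng(seed).standard_normal(dim)
    return v / np.linalg.norm(v)

def zero_shot_classify_sketch(image_emb, categories):
    # One prompt per class; this template is an assumption, not the repo's.
    text_embs = np.stack([encode_text(f"a satellite image of {c}") for c in categories])
    logits = text_embs @ image_emb  # cosine similarities (all unit-norm vectors)
    return logits, int(np.argmax(logits))

categories = ["Forest", "Highway", "Residential"]
# Pretend image embedding: identical to the "Residential" prompt embedding,
# so that class is guaranteed to score highest.
image_emb = encode_text("a satellite image of Residential")
logits, pred = zero_shot_classify_sketch(image_emb, categories)
print(categories[pred])  # Residential
```

The real `zero_shot_classify` operates on model embeddings rather than these stubs, but the prompt-then-argmax structure is the standard pattern.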
```bibtex
@misc{do2026sattxt,
  title={Spectrally Distilled Representations Aligned with Instruction-Augmented LLMs for Satellite Imagery},
  author={Minh Kha Do and Wei Xiang and Kang Han and Di Wu and Khoa Phan and Yi-Ping Phoebe Chen and Gaowen Liu and Ramana Rao Kompella},
  year={2026},
  eprint={2602.22613},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2602.22613},
}
```

We pretrained the model with: Lightning-Hydra-Template
We use evaluation scripts from: MS-CLIP and Pangaea-Bench
We also use LLMs (such as ChatGPT and Claude) for code refactoring.
This work was supported in part by the Australian Government through the Australian Research Council’s Discovery Projects Funding Scheme under Project DP220101634, and by the NVIDIA Academic Grant Program.
We welcome contributions and issues to further improve SATtxt.
