FlashPPI: Linear-time prediction of proteome-scale microbial protein interactions

Model Description

FlashPPI is a contrastively trained model for protein-protein interaction (PPI) prediction, grounded in residue-level interactions, that enables full-proteome interaction prediction in minutes.

By reframing PPI prediction as a dense retrieval task, FlashPPI circumvents the $\mathcal{O}(N^2)$ computational bottleneck of traditional all-vs-all structural screening.

Scalable: Reduces proteome-wide screening from days or months to minutes.
Interpretable: Predicts fine-grained, residue-level 2D contact maps for retrieved interaction candidates.
Genomic Priors: Leverages gLM2 initialization to capture cross-protein, multi-gene co-evolutionary signals.
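
The dense-retrieval framing can be illustrated with a minimal sketch: each protein is embedded once (N encoder calls), and candidate partners are then found by comparing embeddings rather than by running a structural model on every pair. The `toy_embed` function below is a hypothetical stand-in (amino-acid composition) and is not the FlashPPI encoder; the brute-force similarity loop is likewise only for illustration, and a real pipeline might use a vector index for that step.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def toy_embed(seq):
    """Hypothetical encoder stand-in: L2-normalised amino-acid composition."""
    counts = [seq.count(aa) for aa in AMINO_ACIDS]
    norm = math.sqrt(sum(c * c for c in counts)) or 1.0
    return [c / norm for c in counts]


def top_k_partners(proteome, k=2):
    """Embed every protein once, then rank partners by dot-product similarity."""
    names = list(proteome)
    # One encoder call per protein: N calls in total, not one per pair.
    embeddings = {name: toy_embed(proteome[name]) for name in names}
    hits = {}
    for query in names:
        scored = [
            (sum(a * b for a, b in zip(embeddings[query], embeddings[target])), target)
            for target in names
            if target != query
        ]
        scored.sort(reverse=True)  # highest similarity first
        hits[query] = [target for _, target in scored[:k]]
    return hits


# Toy proteome with made-up sequences.
demo = {"protA": "AAAA", "protB": "AAAC", "protC": "WWWW"}
hits = top_k_partners(demo, k=1)
```

Only the retrieved candidates would then need the more expensive residue-level contact prediction.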

Web Server

FlashPPI is integrated into seqhub.org. You can upload a FASTA file and interactively explore whole-proteome networks and contact maps. Explore an example network here.

Installation

pip install -r requirements.txt

Optionally, install Flash Attention for faster inference on GPU:

pip install flash-attn --no-build-isolation

Usage

Fast Proteome-wide PPI Screening (All-vs-All)

Run the prediction script on your proteome FASTA file. It writes a predictions file listing the predicted interacting protein pairs and their confidence scores. Note: this requires a machine with at least one GPU.

python predict_proteome.py --fasta my_proteome.fasta --output predictions.csv
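
Once predictions are written, a common next step is to keep only the high-confidence pairs. The sketch below assumes the CSV has columns named `protein_a`, `protein_b`, and `score`; these names are hypothetical, so check the header of your own `predictions.csv` and adjust them. The 0.5 cutoff is also arbitrary.

```python
import csv
import io


def load_confident_pairs(handle, min_score=0.5):
    """Read prediction rows and keep pairs scoring >= min_score, best first."""
    pairs = [
        (row["protein_a"], row["protein_b"], float(row["score"]))
        for row in csv.DictReader(handle)
        if float(row["score"]) >= min_score
    ]
    return sorted(pairs, key=lambda pair: pair[2], reverse=True)


# In-memory stand-in for predictions.csv (assumed column names, made-up rows).
sample = io.StringIO(
    "protein_a,protein_b,score\n"
    "dnaA,dnaN,0.91\n"
    "rpoB,ftsZ,0.12\n"
    "rpoB,rpoC,0.97\n"
)
confident = load_confident_pairs(sample, min_score=0.5)
```

For a real run, replace `sample` with `open("predictions.csv", newline="")`.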

Cross-Proteome PPI Screening (Host–Viral)

Predict interactions between two proteomes, for example the proteins of a virus and those of its host.

python predict_cross_proteome.py \
    --host_fasta host.fasta \
    --viral_fasta virus.fasta \
    --output predictions.csv
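
Before a cross-proteome run it can help to confirm that record IDs are unambiguous, so each predicted pair can be traced back to one host and one viral protein. This is a general precaution rather than a documented requirement of the script. The helper names below are hypothetical.

```python
def read_fasta_ids(lines):
    """Return record IDs (first word after '>') in file order."""
    return [
        line[1:].split()[0]
        for line in lines
        if line.startswith(">") and line[1:].strip()
    ]


def find_id_problems(host_lines, viral_lines):
    """Report IDs repeated within a file and IDs shared between the two files."""
    host_ids = read_fasta_ids(host_lines)
    viral_ids = read_fasta_ids(viral_lines)
    duplicates = sorted(
        {i for ids in (host_ids, viral_ids) for i in ids if ids.count(i) > 1}
    )
    shared = sorted(set(host_ids) & set(viral_ids))
    return duplicates, shared


# Made-up records standing in for host.fasta and virus.fasta.
host = [">hostA example protein", "MKT", ">hostB", "MST", ">hostA", "MAA"]
virus = [">gp1", "MKK", ">hostB", "MLL"]
duplicates, shared = find_id_problems(host, virus)
```

With real files, pass open file handles (for example `open("host.fasta")`) in place of the lists.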

Visualizing contact predictions

import torch
import matplotlib.pyplot as plt
from transformers import AutoModel, AutoTokenizer

seq1 = "MKTAYIAKQRQISFVKSHFSRQL"
seq2 = "MSTAGKVIKCKAAVLW"

device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained("tattabio/flashppi", trust_remote_code=True)
model = AutoModel.from_pretrained("tattabio/flashppi", trust_remote_code=True).to(device).eval()

inputs1 = tokenizer(seq1, return_tensors="pt").to(device)
inputs2 = tokenizer(seq2, return_tensors="pt").to(device)

with torch.no_grad():
    outputs = model(
        input_ids1=inputs1["input_ids"],
        attention_mask1=inputs1["attention_mask"],
        input_ids2=inputs2["input_ids"],
        attention_mask2=inputs2["attention_mask"],
        return_dict=True
    )

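# Extract the first (only) map in the batch and trim padded positions
contact_map = outputs.contact_map[0].cpu().numpy()
len1 = int(inputs1["attention_mask"].sum().item())
len2 = int(inputs2["attention_mask"].sum().item())
contact_map = contact_map[:len1, :len2]

# Rows index seq1 positions, columns index seq2 positions
plt.imshow(contact_map, cmap="Blues", vmin=0, vmax=1)
plt.xlabel("seq2 position")
plt.ylabel("seq1 position")
plt.colorbar(label="contact score")
plt.savefig("contact_map.png", dpi=200, bbox_inches="tight")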
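
Alongside the image, it is often useful to list the strongest predicted contacts. The sketch below is a hypothetical helper, not part of the FlashPPI package. It assumes the map holds scores in [0, 1], as the plotting range above suggests, and it returns zero-based positions within the trimmed map. If the tokenizer adds special tokens, these positions may be offset from residue numbers.

```python
import numpy as np


def top_contacts(contact_map, k=5, threshold=0.5):
    """Return up to k (row, col, score) entries with score >= threshold, best first."""
    flat = contact_map.ravel()
    order = np.argsort(flat)[::-1][:k]  # indices of the k largest scores
    rows, cols = np.unravel_index(order, contact_map.shape)
    return [
        (int(i), int(j), float(contact_map[i, j]))
        for i, j in zip(rows, cols)
        if contact_map[i, j] >= threshold
    ]


# Made-up 2x2 map standing in for a trimmed model output.
demo_map = np.array([[0.1, 0.9], [0.6, 0.2]])
contacts = top_contacts(demo_map, k=3, threshold=0.5)
```

Passing the trimmed `contact_map` from the snippet above in place of `demo_map` gives the top-scoring position pairs for `seq1` and `seq2`.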

Training

# Multi-GPU
accelerate launch --config_file configs_accelerate/multi_gpu.yaml -m flashppi.train configs_train/flashppi.yaml

# Single CPU (testing)
accelerate launch --config_file configs_accelerate/cpu.yaml -m flashppi.train configs_train/flashppi_ESM_small.yaml

License

This repository is licensed under CC BY-NC 4.0 and is free for academic and research use.

Citing

If you use FlashPPI or our datasets in your research, please cite:

@article{Cornman2026FlashPPI,
	author = {Cornman, Andre and Tranzillo, Matt and Zulaybar, Nicolo G and Bouzit, Imane and Hwang, Yunha},
	title = {Linear-time prediction of proteome-scale microbial protein interactions},
	year = {2026},
	doi = {10.64898/2026.03.01.708874},
	publisher = {Cold Spring Harbor Laboratory},
	URL = {https://www.biorxiv.org/content/early/2026/03/02/2026.03.01.708874},
	journal = {bioRxiv}
}
