3 changes: 3 additions & 0 deletions docs.json
@@ -333,6 +333,7 @@
"tutorials/utility/video-segment-sam3",
"tutorials/utility/remove-background-birefnet",
"tutorials/utility/moge",
"tutorials/utility/pid-latent-upscale/pid-latent-upscale",
{
"group": "Face Detection",
"pages": [
@@ -2759,6 +2760,7 @@
"zh/tutorials/utility/video-segment-sam3",
"zh/tutorials/utility/remove-background-birefnet",
"zh/tutorials/utility/moge",
"zh/tutorials/utility/pid-latent-upscale/pid-latent-upscale",
{
"group": "人脸检测",
"pages": [
@@ -5190,6 +5192,7 @@
"ja/tutorials/utility/video-segment-sam3",
"ja/tutorials/utility/remove-background-birefnet",
"ja/tutorials/utility/moge",
"ja/tutorials/utility/pid-latent-upscale/pid-latent-upscale",
{
"group": "顔検出",
"pages": [
106 changes: 106 additions & 0 deletions ja/tutorials/utility/pid-latent-upscale/pid-latent-upscale.mdx
@@ -0,0 +1,106 @@
---
title: "PiD 潜在空間アップスケール ComfyUI ワークフロー例"
description: "PiD(ピクセル拡散デコーダー)は、拡散モデルの潜在表現を4ステップのピクセル空間蒸留で4倍超解像画像に変換します。個別のVAEデコードは不要です。"
sidebarTitle: "PiD 潜在空間アップスケール"
---

import UpdateReminder from '/snippets/ja/tutorials/update-reminder.mdx'

**PiD(ピクセル拡散デコーダー)** は、拡散モデルの **潜在表現** を 4 ステップのピクセル空間蒸留で直接 **4 倍超解像画像** に変換します。個別の VAE デコードは不要です。このワークフローでは、PiD を使用して Z-Image-Turbo の潜在表現を **1024px から 4096px** にアップスケールする方法を紹介します。

**関連リンク**:
- [Comfy-Org/PixelDiT Hugging Face リポジトリ](https://huggingface.co/Comfy-Org/PixelDiT)
- [nvidia/PixelDiT-1300M-1024px(公式リリース)](https://huggingface.co/nvidia/PixelDiT-1300M-1024px)

<img src="https://raw.githubusercontent.com/Comfy-Org/workflow_templates/main/templates/utility_pid_latent_upscale_dit-1.webp" alt="PiD 潜在空間アップスケールワークフロー" />

<UpdateReminder />

<CardGroup cols={1}>
<Card title="ワークフローをダウンロード" icon="download" href="https://github.com/Comfy-Org/workflow_templates/blob/main/templates/utility_pid_latent_upscale_dit.json">
JSON をダウンロード、またはテンプレートライブラリで "PiD Latent Upscale" を検索
</Card>
</CardGroup>

## PiD の動作原理

PiD は、アップストリームモデルの **VAE/潜在空間** に基づいてチェックポイントを選択します(モデル名だけでは判断しません)。初期生成に使用したモデルの VAE 潜在空間に対応する PiD チェックポイントを選択する必要があります。

このワークフローでは **Z-Image-Turbo**(1024px 潜在空間 → 4096px 出力)を使用します。Z-Image-Turbo は Flux.1 と同じ 16チャンネル VAE を共有しています。

<Card title="サブグラフについて" icon="book-open" href="/ja/interface/features/subgraph">
このワークフローはサブグラフノードを使用してモジュール化された処理を行います。サブグラフのドキュメントを参照して、ワークフローをカスタマイズおよび拡張する方法を学んでください。
</Card>

### 利用可能な PiD チェックポイント

すべてのチェックポイントは [Comfy-Org/PixelDiT](https://huggingface.co/Comfy-Org/PixelDiT) からダウンロードします → `models/diffusion_models/`。

| チェックポイント | 入力 → 出力 | 互換性のある潜在空間(VAE バックボーン) |
|---|---|---|
| [`pid_flux1_512_to_2048_4step_bf16`](https://huggingface.co/Comfy-Org/PixelDiT/resolve/main/diffusion_models/pid_flux1_512_to_2048_4step_bf16.safetensors) | 512 → 2048 | Flux1-dev 16-ch VAE(Flux.1, Z-Image) |
| [`pid_flux1_1024_to_4096_4step_bf16`](https://huggingface.co/Comfy-Org/PixelDiT/resolve/main/diffusion_models/pid_flux1_1024_to_4096_4step_bf16.safetensors) | 1024 → 4096 | Flux1-dev 16-ch VAE(Flux.1, Z-Image)**(本ワークフロー)** |
| [`pid_flux2_512_to_2048_4step_bf16`](https://huggingface.co/Comfy-Org/PixelDiT/resolve/main/diffusion_models/pid_flux2_512_to_2048_4step_bf16.safetensors) | 512 → 2048 | Flux2-dev 128-ch VAE(Flux.2) |
| [`pid_flux2_1024_to_4096_4step_bf16`](https://huggingface.co/Comfy-Org/PixelDiT/resolve/main/diffusion_models/pid_flux2_1024_to_4096_4step_bf16.safetensors) | 1024 → 4096 | Flux2-dev 128-ch VAE(Flux.2) |
| [`pid_sd3_512_to_2048_4step_bf16`](https://huggingface.co/Comfy-Org/PixelDiT/resolve/main/diffusion_models/pid_sd3_512_to_2048_4step_bf16.safetensors) | 512 → 2048 | SD3 medium 16-ch VAE |
| [`pid_sd3_1024_to_4096_4step_bf16`](https://huggingface.co/Comfy-Org/PixelDiT/resolve/main/diffusion_models/pid_sd3_1024_to_4096_4step_bf16.safetensors) | 1024 → 4096 | SD3 medium 16-ch VAE |

### サブグラフ設定

**Latent Upscale Decode (PiD)** サブグラフノードで以下のパラメータを設定します:

| パラメータ | 値 | 説明 |
|---|---|---|
| `latent_format` | `flux` | `flux` は Flux.1/Flux.2/Z-Image、`sd3` は SD3 用(Flux.2 は 128 チャンネルで自動検出) |
| `degrade_sigma` | `0.0` | 入力潜在表現の「完成度」。`0.0` は完全ノイズ除去済み(デフォルト)、`0.1–0.8` は部分ノイズ除去、`1.0` は純粋ノイズ |

### 実行手順

1. **潜在表現を生成** — T2I ワークフロー(例:Z-Image-Turbo)で潜在画像を生成
2. **PiD に接続** — 潜在表現を **Latent Upscale Decode (PiD)** サブグラフノードに入力
3. **チェックポイントを選択** — アップストリームモデルの VAE 潜在空間に一致する PiD チェックポイントを選択
4. **出力サイズを設定** — PiD の出力サイズを入力潜在解像度の **4倍** に設定
5. **実行** — サブグラフが 1 回の 4 ステップ推論でデコードと超解像を実行

## モデルダウンロード

PiD は PixelDiT モデルファミリーの一部です。このワークフローでは初期生成に Z-Image-Turbo モデルも必要です。

<CardGroup cols={2}>
<Card title="PiD Flux.1 1024→4096" icon="download" href="https://huggingface.co/Comfy-Org/PixelDiT/resolve/main/diffusion_models/pid_flux1_1024_to_4096_4step_bf16.safetensors">
pid_flux1_1024_to_4096_4step_bf16.safetensors — Flux.1 / Z-Image 潜在空間用 PiD チェックポイント
</Card>
<Card title="Z-Image-Turbo 拡散モデル" icon="download" href="https://huggingface.co/Comfy-Org/z_image_turbo/resolve/main/split_files/diffusion_models/z_image_turbo_bf16.safetensors">
z_image_turbo_bf16.safetensors — PiD アップスケール前の初期生成に使用
</Card>
</CardGroup>

<CardGroup cols={2}>
<Card title="テキストエンコーダー (Z-Image)" icon="download" href="https://huggingface.co/Comfy-Org/z_image_turbo/resolve/main/split_files/text_encoders/qwen_3_4b.safetensors">
qwen_3_4b.safetensors — Z-Image-Turbo テキストエンコーダー
</Card>
<Card title="VAE (Z-Image)" icon="download" href="https://huggingface.co/Comfy-Org/z_image_turbo/resolve/main/split_files/vae/ae.safetensors">
ae.safetensors — Z-Image-Turbo の VAE
</Card>
</CardGroup>

> PiD は内蔵の **pixel_space** VAE を使用するため、PiD 自体に別途 VAE ファイルは不要です。

### モデル保存場所

```
📂 ComfyUI/
├── 📂 models/
│ ├── 📂 text_encoders/
│ │ └── qwen_3_4b.safetensors
│ ├── 📂 diffusion_models/
│ │ ├── pid_flux1_1024_to_4096_4step_bf16.safetensors
│ │ └── z_image_turbo_bf16.safetensors
│ └── 📂 vae/
│ └── ae.safetensors
```


### サンプル出力

<img src="https://raw.githubusercontent.com/Comfy-Org/workflow_templates/main/output/utility_pid_latent_upscale_dit.png" alt="PiD サンプル出力" />
106 changes: 106 additions & 0 deletions tutorials/utility/pid-latent-upscale/pid-latent-upscale.mdx
@@ -0,0 +1,106 @@
---
title: "PiD Latent Upscale ComfyUI Workflow Example"
description: "PiD (Pixel Diffusion Decoder) turns a diffusion latent into a 4× super-resolved image in 4 distilled pixel-space steps — no separate VAE decode needed."
sidebarTitle: "PiD Latent Upscale"
---

import UpdateReminder from '/snippets/tutorials/update-reminder.mdx'

**PiD (Pixel Diffusion Decoder)** turns a diffusion **latent** into a **4× super-resolved image** in 4 distilled pixel-space steps — no separate VAE decode needed. This workflow demonstrates using PiD to upscale a Z-Image-Turbo latent from **1024px → 4096px**.

**Related Links**:
- [Comfy-Org/PixelDiT on Hugging Face](https://huggingface.co/Comfy-Org/PixelDiT)
- [nvidia/PixelDiT-1300M-1024px (official release)](https://huggingface.co/nvidia/PixelDiT-1300M-1024px)

<img src="https://raw.githubusercontent.com/Comfy-Org/workflow_templates/main/templates/utility_pid_latent_upscale_dit-1.webp" alt="PiD Latent Upscale workflow" />

<UpdateReminder />

<CardGroup cols={1}>
<Card title="Download Workflow" icon="download" href="https://github.com/Comfy-Org/workflow_templates/blob/main/templates/utility_pid_latent_upscale_dit.json">
Download the JSON, or search for "PiD Latent Upscale" in the Template Library
</Card>
</CardGroup>

## How PiD works

PiD checkpoints are matched to the **VAE / latent space** of the upstream model (the encoder side), not to the diffusion model's name alone. Select the PiD checkpoint that corresponds to the latent space of the model used for the initial generation.

This workflow uses **Z-Image-Turbo** (1024px latent → 4096px output), which shares Flux.1's 16-ch VAE and therefore pairs with the Flux.1 PiD checkpoints.

<Card title="Learn about Subgraph" icon="book-open" href="/interface/features/subgraph">
This workflow uses Subgraph nodes for modular processing. Check out the Subgraph documentation to learn how to customize and extend the workflow.
</Card>

### Available PiD checkpoints

Download all checkpoints from [Comfy-Org/PixelDiT](https://huggingface.co/Comfy-Org/PixelDiT) and place them in `models/diffusion_models/`.

| Checkpoint | Input → Output | Compatible latent (VAE backbone) |
|---|---|---|
| [`pid_flux1_512_to_2048_4step_bf16`](https://huggingface.co/Comfy-Org/PixelDiT/resolve/main/diffusion_models/pid_flux1_512_to_2048_4step_bf16.safetensors) | 512 → 2048 | Flux1-dev 16-ch VAE (Flux.1, Z-Image) |
| [`pid_flux1_1024_to_4096_4step_bf16`](https://huggingface.co/Comfy-Org/PixelDiT/resolve/main/diffusion_models/pid_flux1_1024_to_4096_4step_bf16.safetensors) | 1024 → 4096 | Flux1-dev 16-ch VAE (Flux.1, Z-Image) **(this workflow)** |
| [`pid_flux2_512_to_2048_4step_bf16`](https://huggingface.co/Comfy-Org/PixelDiT/resolve/main/diffusion_models/pid_flux2_512_to_2048_4step_bf16.safetensors) | 512 → 2048 | Flux2-dev 128-ch VAE (Flux.2) |
| [`pid_flux2_1024_to_4096_4step_bf16`](https://huggingface.co/Comfy-Org/PixelDiT/resolve/main/diffusion_models/pid_flux2_1024_to_4096_4step_bf16.safetensors) | 1024 → 4096 | Flux2-dev 128-ch VAE (Flux.2) |
| [`pid_sd3_512_to_2048_4step_bf16`](https://huggingface.co/Comfy-Org/PixelDiT/resolve/main/diffusion_models/pid_sd3_512_to_2048_4step_bf16.safetensors) | 512 → 2048 | SD3 medium 16-ch VAE |
| [`pid_sd3_1024_to_4096_4step_bf16`](https://huggingface.co/Comfy-Org/PixelDiT/resolve/main/diffusion_models/pid_sd3_1024_to_4096_4step_bf16.safetensors) | 1024 → 4096 | SD3 medium 16-ch VAE |
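
The naming pattern in the table above is regular enough to express as a small lookup. The following is a minimal sketch, not part of ComfyUI: the function names and the family labels `flux1`, `flux2`, and `sd3` are hypothetical and simply mirror the filename prefixes listed in the table.

```python
# Hypothetical helper: builds a PiD checkpoint filename from the table above.
# The family labels mirror the filename prefixes; they are not ComfyUI identifiers.
BASE_URL = "https://huggingface.co/Comfy-Org/PixelDiT/resolve/main/diffusion_models"

VAE_FAMILIES = {"flux1", "flux2", "sd3"}  # prefixes used in the table
INPUT_SIZES = {512, 1024}                 # input sizes listed in the table


def pid_checkpoint(vae_family: str, input_px: int) -> str:
    """Return the PiD checkpoint filename for a VAE family and input size."""
    if vae_family not in VAE_FAMILIES:
        raise ValueError(f"unknown VAE family: {vae_family!r}")
    if input_px not in INPUT_SIZES:
        raise ValueError(f"unsupported input size: {input_px}")
    output_px = input_px * 4  # every listed checkpoint upscales 4x
    return f"pid_{vae_family}_{input_px}_to_{output_px}_4step_bf16.safetensors"


def pid_checkpoint_url(vae_family: str, input_px: int) -> str:
    """Return the download URL for the matching checkpoint."""
    return f"{BASE_URL}/{pid_checkpoint(vae_family, input_px)}"


# Z-Image-Turbo shares the Flux.1 VAE, so it uses the "flux1" family.
print(pid_checkpoint("flux1", 1024))
```

For this workflow the call resolves to `pid_flux1_1024_to_4096_4step_bf16.safetensors`, the row marked **(this workflow)**.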

### Subgraph settings

Configure these on the **Latent Upscale Decode (PiD)** subgraph:

| Setting | Value | Description |
|---|---|---|
| `latent_format` | `flux` | `flux` for Flux.1/Flux.2/Z-Image, `sd3` for SD3 (Flux.2 is auto-detected from its 128 latent channels) |
| `degrade_sigma` | `0.0` | How "finished" the input latent is. `0.0` for fully denoised (default), `0.1–0.8` for partially denoised, `1.0` for pure noise |
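
The two settings above can be captured as a small validated record. This is an illustrative sketch only: `PidSettings` and `latent_format_for` are hypothetical names, the upstream labels are made up for the example, and the `0.0` to `1.0` range for `degrade_sigma` is an assumption taken from the endpoints in the table.

```python
from dataclasses import dataclass

# Hypothetical labels for upstream models, mapped to the latent_format
# values from the settings table. Not a ComfyUI API.
LATENT_FORMATS = {
    "flux1": "flux",
    "flux2": "flux",   # Flux.2 is auto-detected by its 128 channels
    "z-image": "flux",
    "sd3": "sd3",
}


def latent_format_for(upstream: str) -> str:
    """Return the latent_format value for a (hypothetical) upstream label."""
    try:
        return LATENT_FORMATS[upstream]
    except KeyError:
        raise ValueError(f"no PiD latent format known for {upstream!r}") from None


@dataclass(frozen=True)
class PidSettings:
    latent_format: str = "flux"
    degrade_sigma: float = 0.0  # 0.0 = fully denoised input (default)

    def __post_init__(self):
        if self.latent_format not in ("flux", "sd3"):
            raise ValueError(f"latent_format must be 'flux' or 'sd3', got {self.latent_format!r}")
        # Assumed valid range, based on the 0.0 and 1.0 endpoints in the table.
        if not 0.0 <= self.degrade_sigma <= 1.0:
            raise ValueError(f"degrade_sigma must be in [0.0, 1.0], got {self.degrade_sigma}")


settings = PidSettings(latent_format=latent_format_for("z-image"))
print(settings)
```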

### Steps to run

1. **Generate a latent** — use a T2I workflow (e.g., Z-Image-Turbo) to produce a latent image
2. **Connect to PiD** — feed the latent into the **Latent Upscale Decode (PiD)** subgraph node
3. **Select checkpoint** — choose the PiD checkpoint matching your upstream model's VAE latent space
4. **Set output size** — set PiD output size to **4×** the input latent resolution
5. **Run** — the subgraph decodes and upscales in a single 4-step pass
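
The output size in step 4 is plain arithmetic on the pixel dimensions of the image the latent represents. A minimal sketch, assuming the 4× factor applies to each side independently (`pid_output_size` is a hypothetical helper, and only the square 512 and 1024 cases are confirmed by the checkpoint table):

```python
PID_SCALE = 4  # PiD upscales each side by 4x


def pid_output_size(width: int, height: int) -> tuple[int, int]:
    """Return the PiD output size for a given input image size in pixels."""
    return width * PID_SCALE, height * PID_SCALE


out_w, out_h = pid_output_size(1024, 1024)
print(out_w, out_h)                       # 4096 4096
print(out_w * out_h // (1024 * 1024))     # 16 (16x as many pixels as the input)
```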

## Model downloads

PiD is part of the PixelDiT model family. This workflow also requires the Z-Image-Turbo model for initial generation.

<CardGroup cols={2}>
<Card title="PiD Flux.1 1024→4096" icon="download" href="https://huggingface.co/Comfy-Org/PixelDiT/resolve/main/diffusion_models/pid_flux1_1024_to_4096_4step_bf16.safetensors">
pid_flux1_1024_to_4096_4step_bf16.safetensors — PiD checkpoint for Flux.1 / Z-Image latent space
</Card>
<Card title="Z-Image-Turbo diffusion" icon="download" href="https://huggingface.co/Comfy-Org/z_image_turbo/resolve/main/split_files/diffusion_models/z_image_turbo_bf16.safetensors">
z_image_turbo_bf16.safetensors — used for initial generation before PiD upscale
</Card>
</CardGroup>

<CardGroup cols={2}>
<Card title="Text Encoder (Z-Image)" icon="download" href="https://huggingface.co/Comfy-Org/z_image_turbo/resolve/main/split_files/text_encoders/qwen_3_4b.safetensors">
qwen_3_4b.safetensors — text encoder for Z-Image-Turbo
</Card>
<Card title="VAE (Z-Image)" icon="download" href="https://huggingface.co/Comfy-Org/z_image_turbo/resolve/main/split_files/vae/ae.safetensors">
ae.safetensors — VAE for Z-Image-Turbo
</Card>
</CardGroup>

> PiD uses a built-in **pixel_space** VAE — no separate VAE file is needed for PiD itself.

### Model storage location

```
📂 ComfyUI/
├── 📂 models/
│ ├── 📂 text_encoders/
│ │ └── qwen_3_4b.safetensors
│ ├── 📂 diffusion_models/
│ │ ├── pid_flux1_1024_to_4096_4step_bf16.safetensors
│ │ └── z_image_turbo_bf16.safetensors
│ └── 📂 vae/
│ └── ae.safetensors
```
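
Before running the workflow, it can help to confirm that the files are where the tree above expects them. The script below is a small sketch using only the Python standard library. `missing_models` is a hypothetical helper, and the `ComfyUI` root path is an assumption to adjust for your install.

```python
from pathlib import Path

# File layout copied from the directory tree above.
EXPECTED_FILES = {
    "text_encoders": ["qwen_3_4b.safetensors"],
    "diffusion_models": [
        "pid_flux1_1024_to_4096_4step_bf16.safetensors",
        "z_image_turbo_bf16.safetensors",
    ],
    "vae": ["ae.safetensors"],
}


def missing_models(comfyui_root: str) -> list[Path]:
    """Return the expected model files that are not present on disk."""
    models_dir = Path(comfyui_root) / "models"
    return [
        models_dir / subdir / name
        for subdir, names in EXPECTED_FILES.items()
        for name in names
        if not (models_dir / subdir / name).is_file()
    ]


if __name__ == "__main__":
    # "ComfyUI" is an assumed install location relative to the current directory.
    for path in missing_models("ComfyUI"):
        print(f"missing: {path}")
```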


### Sample output

<img src="https://raw.githubusercontent.com/Comfy-Org/workflow_templates/main/output/utility_pid_latent_upscale_dit.png" alt="PiD sample output" />