@willg-nv willg-nv commented Dec 17, 2025

What does this PR do?

Type of change: new feature

Overview: This PR integrates an automated Q/DQ placement tool into ModelOpt. It is part 2 of 4 of the changes.

Part 1: #701
Part 2: #702
Part 3: #703
Part 4: #704

This PR contains the following changes:

  1. Implement RegionPattern to represent the topological structure of Regions. InsertionPoints are also defined on RegionPattern. Regions with the same pattern are optimized at the same time.
  2. Implement the RegionSearch class to divide an ONNX graph into small regions.
  3. The RegionSearch Python file also provides an entry point to print the region structures.
  4. Unit tests for the new classes.
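
As a rough illustration of item 1, regions that share a pattern signature can be grouped so that one decision applies to the whole group. The sketch below is self-contained and does not use the PR's classes: `ToyRegion`, `signature`, and `group_by_signature` are hypothetical names, and only the "Op1->Op2" and "EMPTY" formats are taken from the signature formats quoted later in this conversation.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyRegion:
    """Hypothetical stand-in for a region: an id plus its ops in order."""

    region_id: int
    ops: tuple


def signature(region: ToyRegion) -> str:
    # Leaf format "Op1->Op2->Op3"; a region with no ops maps to "EMPTY".
    return "->".join(region.ops) if region.ops else "EMPTY"


def group_by_signature(regions):
    # Map each signature to the ids of all regions that share it.
    groups = defaultdict(list)
    for region in regions:
        groups[signature(region)].append(region.region_id)
    return dict(groups)


regions = [
    ToyRegion(0, ("Conv", "Relu")),
    ToyRegion(1, ("MatMul", "Add")),
    ToyRegion(2, ("Conv", "Relu")),
]
groups = group_by_signature(regions)
```

Here regions 0 and 2 land in the same group, so a Q/DQ placement choice made for one could be reused for the other.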

Usage

python -m modelopt.onnx.quantization.autotune.region_search --model model.onnx --verbose

Example output:

    ├─ Region 212 (Level 0, Type: COMPOSITE)
    │  ├─ Direct nodes: 0
    │  ├─ Total nodes (recursive): 9
    │  ├─ Children: 1
    │  ├─ Inputs: 3 tensors
    │  │    - xxx
    │  │    - xxx
    │  │    - xxx
    │  └─ Outputs: 1 tensors
    │       - xxx
    │
    │  Child regions:
    │
      ├─ Region 209 (Level 2, Type: LEAF) 
      │  ├─ Direct nodes: 9
      │  ├─ Total nodes (recursive): 9
      │  ├─ Children: 0
      │  ├─ Inputs: 11 tensors
      │  │    - xxx

Testing

Implemented unit tests for the new classes. All unit tests pass locally.

Before your PR is "Ready for review"

  • Make sure you read and follow Contributor guidelines and your commits are signed.
  • Is this change backward compatible?: Yes
  • Did you write any new necessary tests?: Yes
  • Did you add or update any necessary documentation?: No, documentation changes will be in part 4.
  • Did you update Changelog?: No. The changelog update will be included in part 4.

Additional Information

@willg-nv willg-nv requested a review from a team as a code owner December 17, 2025 06:29
@willg-nv willg-nv requested a review from ajrasane December 17, 2025 06:29

copy-pr-bot bot commented Dec 17, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@willg-nv willg-nv changed the title from "Dev willg integrate auto qdq placement part2" to "Integrate Automated QDQ placement tool - Part 2" Dec 17, 2025
@willg-nv
Author

Hi @ajrasane, could you help review this PR? Thanks!

@willg-nv willg-nv force-pushed the dev-willg-integrate-auto-qdq-placement-part2 branch from 3f7ff31 to d3a6765 Compare December 31, 2025 02:16
@property
def is_empty(self) -> bool:
    """Check if pattern represents an empty region."""
    return self.signature == "EMPTY" or self.size == 0
Contributor

Should this be "and"? Do we have a case where one is empty/0 and the other is not?

Author

Simplified. self.signature == "EMPTY" and self.size == 0 are equivalent: both mean "this region has no nodes or child regions".


return scheme

def format_tree(self, region: Region, graph: gs.Graph, indent: int = 0) -> str:
Contributor

Could you provide a small example of how this tree looks?

Author

@willg-nv willg-nv Jan 8, 2026

I have added region search tests that print the tree structure. Below is an example:

tests/unit/onnx/quantization/autotune/test_region_search.py::TestPrintTree::test_print_tree_top_down_builder
============================================================
Region Tree Structure:
============================================================
├─ Region 0 (Level 0, Type: LEAF)
│  ├─ Direct nodes: 2
│  ├─ Total nodes (recursive): 2
│  ├─ Children: 0
│  ├─ Inputs: 1 tensors
│  │    - input
│  └─ Outputs: 1 tensors
│       - output
│
│  Nodes in this region:
│    - Node 0: Conv (name: conv)
│    - Node 1: Relu (name: relu)
│
  ├─ <child region if exists>
============================================================

Currently, the two-stage region partitioner only creates regions with depth <= 2.
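
For readers who want to see how such a tree could be rendered, here is a minimal sketch that reproduces only the first few fields of the layout above. It is not the PR's `format_tree`: `ToyRegion` and `format_toy_tree` are hypothetical, the function returns a list of lines rather than a string, and tensors and node listings are omitted.

```python
from dataclasses import dataclass, field


@dataclass
class ToyRegion:
    """Hypothetical stand-in for a region in the printed tree."""

    region_id: int
    level: int
    nodes: list = field(default_factory=list)
    children: list = field(default_factory=list)

    @property
    def kind(self) -> str:
        # A region with child regions is COMPOSITE, otherwise LEAF.
        return "COMPOSITE" if self.children else "LEAF"

    def total_nodes(self) -> int:
        # Direct nodes plus all nodes of nested child regions.
        return len(self.nodes) + sum(c.total_nodes() for c in self.children)


def format_toy_tree(region: ToyRegion, indent: int = 0) -> list:
    pad = "  " * indent
    lines = [
        f"{pad}├─ Region {region.region_id} (Level {region.level}, Type: {region.kind})",
        f"{pad}│  ├─ Direct nodes: {len(region.nodes)}",
        f"{pad}│  ├─ Total nodes (recursive): {region.total_nodes()}",
        f"{pad}│  └─ Children: {len(region.children)}",
    ]
    # Child regions are rendered one indentation level deeper.
    for child in region.children:
        lines.extend(format_toy_tree(child, indent + 1))
    return lines


leaf = ToyRegion(209, 2, nodes=[f"node{i}" for i in range(9)])
root = ToyRegion(212, 0, children=[leaf])
lines = format_toy_tree(root)
print("\n".join(lines))
```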


Signature formats:
- Empty region: "EMPTY"
- Leaf region: "Op1->Op2->Op3" or "Op1[params]->Op2[params]"
Contributor

Can this be saved as LEAF(ops)?

Comment on lines +424 to +432
@staticmethod
def _compute_signature_recursive(region: Region, graph: gs.Graph) -> str:
Contributor

Can we simplify this to something like:

@staticmethod
def _compute_signature_recursive(region: Region, graph: gs.Graph) -> str:
    """Recursively compute structural signature for a region.
    
    ... (docstring unchanged) ...
    """
    nodes_list = list(graph.nodes)
    node_indices_set = region.get_nodes()
    
    # Collect direct node operations
    node_ops = [
        RegionPattern._make_node_with_params_signature(nodes_list[idx], graph, node_indices_set)
        for idx in sorted(node_indices_set)
        if idx < len(nodes_list)
    ]
    
    children = region.get_children()
    
    # LEAF region - no children
    if not children:
        return RegionPattern._make_node_signature(node_ops) if node_ops else "EMPTY"
    
    # COMPOSITE region - sort by (-level, size) for deterministic signatures
    sorted_children = sorted(children, key=lambda r: (-r.get_level(), r.get_total_size()))
    child_sigs = [RegionPattern._compute_signature_recursive(c, graph) for c in sorted_children]
    joined_children = RegionPattern._join_signatures(child_sigs)
    
    if node_ops:
        return f"COMPOSITE({RegionPattern._make_node_signature(node_ops)}|{joined_children})"
    return f"COMPOSITE({joined_children})"
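
To make the ordering idea in this suggestion concrete, the following toy version runs without onnx_graphsurgeon. It is a sketch under assumptions: `ToyRegion` and `toy_signature` are hypothetical, and joining child signatures with "+" is inferred from the res-block example elsewhere in this conversation rather than from the actual `_join_signatures` implementation.

```python
from dataclasses import dataclass, field


@dataclass
class ToyRegion:
    """Hypothetical stand-in for a region: a level, direct ops, child regions."""

    level: int
    ops: list = field(default_factory=list)
    children: list = field(default_factory=list)

    def total_size(self) -> int:
        return len(self.ops) + sum(c.total_size() for c in self.children)


def toy_signature(region: ToyRegion) -> str:
    ops_sig = "->".join(region.ops)
    # LEAF region: no children.
    if not region.children:
        return ops_sig if ops_sig else "EMPTY"
    # COMPOSITE region: sort children by (-level, size) so that the result
    # does not depend on the order in which children were attached.
    ordered = sorted(region.children, key=lambda r: (-r.level, r.total_size()))
    joined = "+".join(toy_signature(c) for c in ordered)  # "+" is an assumption
    return f"COMPOSITE({ops_sig}|{joined})" if ops_sig else f"COMPOSITE({joined})"


a = ToyRegion(1, ops=["Conv", "Relu"])
b = ToyRegion(1, ops=["Add"])
sig_ab = toy_signature(ToyRegion(2, children=[a, b]))
sig_ba = toy_signature(ToyRegion(2, children=[b, a]))
```

Both orderings give the same signature here. In this sketch, children with equal level and equal size keep their input order because Python's `sorted` is stable, so such ties would need an extra key to be order-independent.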

region_node_indices: Set of node indices in the current region

Returns:
Signature string examples:
Contributor

How will the signatures of custom ops look? Could you provide an example?

Author

example res-block signature:

COMPOSITE(Conv[kernel_shape=3x3]->BatchNormalization->Relu->Conv[kernel_shape=3x3]->BatchNormalization+Conv[kernel_shape=1x1]->BatchNormalization+Add->Relu)

graph structure:

                         Input
                           │
            ┌──────────────┴──────────────┐
            │                             │
            ▼                             ▼
    ┌───────────────┐             ┌───────────────┐
    │ Conv (3x3)    │             │ Conv (1x1)    │ (projection)
    └───────────────┘             └───────────────┘
            │                             │
            ▼                             ▼
    ┌───────────────┐             ┌───────────────┐
    │ BatchNorm     │             │ BatchNorm     │
    └───────────────┘             └───────────────┘
            │                             │
            ▼                             │
    ┌───────────────┐                     │
    │ Relu          │                     │
    └───────────────┘                     │
            │                             │
            ▼                             │
    ┌───────────────┐                     │
    │ Conv (3x3)    │                     │
    └───────────────┘                     │
            │                             │
            ▼                             │
    ┌───────────────┐                     │
    │ BatchNorm     │                     │
    └───────────────┘                     │
            │                             │
            └──────────────┬──────────────┘
                           ▼
                      ┌─────────┐
                      │   Add   │
                      └─────────┘
                           │
                           ▼
                      ┌─────────┐
                      │  Relu   │
                      └─────────┘

For custom plugin nodes, their names will also be added to the signature.

return f"COMPOSITE({RegionPattern._join_signatures(child_signatures)})"

@staticmethod
def _make_node_with_params_signature(
Contributor

This function has a lot of branching. Can we simplify it to something like:

@staticmethod
def _make_node_with_params_signature(
    node: gs.Node, graph: gs.Graph, region_node_indices: set
) -> str:
    """Create signature for a single node including its parameters."""
    op = node.op
    source_sig = ""
    attr_sig = ""

    # Build source signature for symmetric operations
    if op in SYMMETRIC_OPERATIONS and len(node.inputs) > 1:
        nodes_list = list(graph.nodes)
        node_to_idx = {id(n): idx for idx, n in enumerate(nodes_list)}

        input_sources = []
        for inp in node.inputs:
            if inp is None or not hasattr(inp, "inputs") or not inp.inputs:
                input_sources.append(("external", "input-or-constant"))
            else:
                producer_node = inp.inputs[0] if inp.inputs else None
                if producer_node and id(producer_node) in node_to_idx:
                    producer_idx = node_to_idx[id(producer_node)]
                    location = "internal" if producer_idx in region_node_indices else "external"
                    input_sources.append((location, producer_node.op))
                else:
                    input_sources.append(("external", "unknown"))

        sorted_sources = sorted(input_sources)
        source_sig = f"<{','.join(f'{src[0]}:{src[1]}' for src in sorted_sources)}>"

    # Build attribute signature
    if node.attrs:
        attr_parts = []
        for key in sorted(node.attrs.keys()):
            value = node.attrs[key]
            if isinstance(value, (list, tuple)) and value and all(isinstance(v, (int, float)) for v in value):
                value_str = "x".join(f"{v:.4g}" if isinstance(v, float) else str(v) for v in value)
            elif isinstance(value, (list, tuple)):
                value_str = ",".join(str(v) for v in value)
            elif isinstance(value, float):
                value_str = f"{value:.4g}"
            elif isinstance(value, bool):
                value_str = "1" if value else "0"
            elif isinstance(value, bytes):
                hex_str = value.hex()
                value_str = hex_str if len(hex_str) <= 16 else f"{hex_str[:16]}..."
            else:
                value_str = str(value)
            attr_parts.append(f"{key}={value_str}")
        attr_sig = f"[{','.join(attr_parts)}]"

    return f"{op}{attr_sig}{source_sig}"
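
The attribute-formatting branch of this suggestion can be tried in isolation. The sketch below copies its value-formatting rules into two standalone helpers that take a plain dict instead of a gs.Node; `format_attr_value` and `format_attrs` are hypothetical names, not functions from the PR.

```python
def format_attr_value(value) -> str:
    """Format one attribute value using the same rules as the suggestion above."""
    is_seq = isinstance(value, (list, tuple))
    if is_seq and value and all(isinstance(v, (int, float)) for v in value):
        # Numeric sequences such as kernel_shape=[3, 3] become "3x3".
        return "x".join(f"{v:.4g}" if isinstance(v, float) else str(v) for v in value)
    if is_seq:
        return ",".join(str(v) for v in value)
    if isinstance(value, float):
        return f"{value:.4g}"
    if isinstance(value, bool):
        return "1" if value else "0"
    if isinstance(value, bytes):
        # Long byte strings are truncated to the first 16 hex characters.
        hex_str = value.hex()
        return hex_str if len(hex_str) <= 16 else f"{hex_str[:16]}..."
    return str(value)


def format_attrs(attrs: dict) -> str:
    """Build the bracketed attribute part of a node signature; keys are sorted."""
    if not attrs:
        return ""
    parts = [f"{key}={format_attr_value(attrs[key])}" for key in sorted(attrs)]
    return f"[{','.join(parts)}]"


conv_sig = "Conv" + format_attrs({"strides": [1, 1], "kernel_shape": [3, 3]})
```

Sorting the keys means two nodes with the same attributes produce the same signature regardless of the order in which the attributes were stored.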

@willg-nv willg-nv force-pushed the dev-willg-integrate-auto-qdq-placement-part2 branch from d3a6765 to 616285d Compare January 8, 2026 08:13
@willg-nv willg-nv force-pushed the dev-willg-integrate-auto-qdq-placement-part2 branch from 616285d to c95939a Compare January 8, 2026 08:35