
perf: improve compression speed and ratio in lzsst2_file_compression.pl #5

Closed

Copilot wants to merge 2 commits into master from copilot/improve-performance-compression

Conversation


Copilot AI commented Mar 4, 2026

The LZ77+Huffman compressor had O(n² log n) Huffman tree construction, no early exit in the match search, redundant lazy-match probes, and a suboptimal chunk size and chain length.

LZ77 (find_match / lz77_compression)

  • Early termination: last if ($best_n > $max_len) in the match loop — no point scanning the chain further once a maximum-length (258-byte) match is found
  • Selective hash updates: for matches > 32 bytes, only insert the first 4 and last 4 positions into the hash table instead of every position
  • Skip redundant lazy match: only probe la+1 when a match was already found at la ($n1 > 1)
  • max_chain_len: 48 → 64

Huffman (mktree_from_freq)

Replaced the O(n² log n) full re-sort loop with an O(n log n) insertion approach, using binary search on the already-sorted remainder:

my ($lo, $hi) = (0, scalar(@nodes));
while ($lo < $hi) {
    my $mid = ($lo + $hi) >> 1;
    if ($nodes[$mid][1] < $weight) { $lo = $mid + 1 }
    else                           { $hi = $mid     }
}
splice(@nodes, $lo, 0, $new_node);

Single-symbol frequency tables are explicitly wrapped so the symbol receives code '0' (≥1 bit), matching original behavior and preventing a zero-length encoding that breaks decompression.
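The wrapping can be sketched in isolation. The following is a hypothetical standalone reduction (the script's actual walk takes extra arguments, and its node layout may differ in detail): wrapping the lone leaf in a parent node makes the tree walk descend one level, so the symbol receives code '0' rather than the empty string.

```perl
use strict;
use warnings;

# Minimal sketch with a simplified walk(): assign codes by descending
# the tree; a leaf stores its symbol in slot 0.
sub walk {
    my ($node, $code, $dict) = @_;
    my ($children) = @$node;
    if (!ref $children) {                 # leaf: slot 0 is the symbol
        $dict->{$children} = $code;
        return $dict;
    }
    walk($children->[0], $code . '0', $dict);
    walk($children->[1], $code . '1', $dict) if defined $children->[1];
    return $dict;
}

my %freq  = (65 => 10);                   # degenerate table: one symbol
my @nodes = map { [$_, $freq{$_}] } keys %freq;

# Wrap the single leaf in a parent node so walk() emits '0', not ''
@nodes = ([[$nodes[0]], $nodes[0][1]]) if @nodes == 1;

my $codes = walk($nodes[0], '', {});
print "$codes->{65}\n";                   # prints: 0
```

Without the wrap, walk() would return immediately at the root leaf and assign the empty string, producing a zero-length code that the decompressor cannot consume.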

find_distance_index

Linear scan → binary search (O(n) → O(log n)).

CHUNK_SIZE

1 << 19 → 1 << 21 (512 KB → 2 MB) for a larger LZ77 window and a better compression ratio.

Original prompt

Objective

Improve both the runtime performance and compression ratio of Compression/lzsst2_file_compression.pl, which implements LZ77 compression (LZSS variant with hash tables and lazy matching) + Huffman coding, using a DEFLATE-like approach.

The changes must preserve full backward-compatible decompression — i.e., archives produced by the new code must still decompress correctly, and the decompressor should remain unchanged in behavior.

File to modify

Compression/lzsst2_file_compression.pl

Current bottlenecks and areas for improvement

After careful analysis, these are the specific improvements to make:

1. LZ77 Compression — Performance (lz77_compression and find_match)

a) Early termination in find_match when max_len is reached:

In find_match, the inner foreach loop iterates over all positions in the chain even after finding a maximum-length match (258). Add an early exit:

if ($n > $best_n) {
    $best_p = $p;
    $best_n = $n;
    last if ($best_n > $max_len);  # can't do better
}

b) Skip updating the hash table for positions already covered by a long match:

Currently, every position inside a match is inserted into the hash table. For very long matches this is wasteful. When $best_n is large (e.g., > 32), only insert the first few and last few positions into the hash table instead of all of them, to save time without significantly affecting compression ratio:

# Only insert boundary positions for long matches to save time
if ($best_n > 32) {
    # Insert first 4 and last 4 positions
    foreach my $i (0 .. 3, $best_n - $min_len - 3 .. $best_n - $min_len) {
        next if $i < 0 or $i > length($matched) - $min_len;
        my $key = substr($matched, $i, $min_len);
        unshift @{$table{$key}}, $la + $i;
        if (scalar(@{$table{$key}}) > $max_chain_len) {
            pop @{$table{$key}};
        }
    }
}
else {
    # Original behavior for short matches
    foreach my $i (0 .. length($matched) - $min_len) {
        my $key = substr($matched, $i, $min_len);
        unshift @{$table{$key}}, $la + $i;
        if (scalar(@{$table{$key}}) > $max_chain_len) {
            pop @{$table{$key}};
        }
    }
}

c) Avoid redundant find_match call at la+1 when no match exists at la:

The lazy matching currently always tries to find a match at la+1 even when it's unlikely to help. Only do the second find_match call when the first match was found (i.e., $n1 > 1):

if ($n1 > 1 and exists($table{$lookahead2})) {
    ($n2, $p2) = find_match(\$str, $la + 1, $min_len, $max_len, $end, \%table, \@symbols);
}

2. Increase default CHUNK_SIZE for better compression ratio

The current CHUNK_SIZE is 1 << 19 (512 KB). Increase it to 1 << 21 (2 MB). Larger chunks give the LZ77 engine a bigger window to find matches, improving compression ratio at the cost of slightly more memory (which is acceptable for modern systems). The script already says "higher value = better compression":

CHUNK_SIZE => 1 << 21,    # higher value = better compression

3. find_distance_index — Use binary search instead of linear scan

The find_distance_index function uses a linear scan through @DISTANCE_SYMBOLS to find the right bucket for a distance value. Replace this with a binary search for O(log n) lookup:

sub find_distance_index ($dist, $distance_symbols) {
    my ($lo, $hi) = (0, $#{$distance_symbols});
    while ($lo < $hi) {
        my $mid = ($lo + $hi + 1) >> 1;
        if ($distance_symbols->[$mid][0] <= $dist) {
            $lo = $mid;
        }
        else {
            $hi = $mid - 1;
        }
    }
    return $lo;
}
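As a sanity check, the binary search must return the largest index whose base distance is ≤ $dist, exactly as the linear scan did. The miniature table below is made up for illustration (the real @DISTANCE_SYMBOLS holds the DEFLATE-style distance buckets), and the sub unpacks @_ instead of using signatures so it runs on older perls:

```perl
use strict;
use warnings;
use feature 'say';

# Same binary-search logic as proposed, written without signatures.
sub find_distance_index {
    my ($dist, $distance_symbols) = @_;
    my ($lo, $hi) = (0, $#{$distance_symbols});
    while ($lo < $hi) {
        my $mid = ($lo + $hi + 1) >> 1;
        if ($distance_symbols->[$mid][0] <= $dist) { $lo = $mid }
        else                                       { $hi = $mid - 1 }
    }
    return $lo;
}

# Hypothetical [base_distance, extra_bits] pairs, bases sorted ascending
my @DISTANCE_SYMBOLS = ([1, 0], [2, 0], [3, 0], [4, 0], [5, 1], [7, 1], [9, 2]);

say find_distance_index(1,  \@DISTANCE_SYMBOLS);   # 0 (first bucket)
say find_distance_index(6,  \@DISTANCE_SYMBOLS);   # 4 (base 5 covers 5..6)
say find_distance_index(99, \@DISTANCE_SYMBOLS);   # 6 (last bucket)
```

The `($lo + $hi + 1) >> 1` midpoint biases upward, which is what makes the loop converge on the *last* index satisfying the `<=` test.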

4. Huffman tree construction — avoid repeated full sorts

In mktree_from_freq, the "poor man's priority queue" re-sorts the full @nodes array on every iteration. This is O(n² log n). Instead, use an insertion-sort approach: after splicing the two smallest nodes and creating a merged node, insert the new node into the already-sorted remainder at the correct position using binary search:

sub mktree_from_freq ($freq) {

    my @nodes = map { [$_, $freq->{$_}] } sort { $a <=> $b } keys %$freq;
    @nodes = sort { $a->[1] <=> $b->[1] } @nodes;

    while (@nodes > 1) {
        my ($x, $y) = splice(@nodes, 0, 2);
        my $new_node;
        if (defined($y)) {
            $new_node = [[$x, $y], $x->[1] + $y->[1]];
        }
        else {
            $new_node = [[$x], $x->[1]];
        }

        # Binary search for insertion position
        my $weight = $new_node->[1];
        my ($lo, $hi) = (0, scalar(@nodes));
        while ($lo < $hi) {
            my $mid = ($lo + $hi) >> 1;
            if ($nodes[$mid][1] < $weight) {
                $lo = $mid + 1;
            }
            else {
                $hi = $mid;
            }
        }
        splice(@nodes, $lo, 0, $new_node);
    }

    walk($nodes[0], '', {}, {});
}
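The insertion step can be exercised on its own. Below is a standalone sketch (the sample weights are an arbitrary textbook-style example, not taken from the script): after each merge, the re-inserted node must leave @nodes sorted by weight, which is the invariant the binary search depends on.

```perl
use strict;
use warnings;

# Repeatedly merge the two lightest nodes and re-insert the merged node
# via binary search; assert the array stays sorted by weight throughout.
my @nodes = map { [undef, $_] } sort { $a <=> $b } (5, 9, 12, 13, 16, 45);

while (@nodes > 1) {
    my ($x, $y) = splice(@nodes, 0, 2);
    my $new_node = [[$x, $y], $x->[1] + $y->[1]];

    # Binary search for the insertion position among the remaining nodes
    my $weight = $new_node->[1];
    my ($lo, $hi) = (0, scalar(@nodes));
    while ($lo < $hi) {
        my $mid = ($lo + $hi) >> 1;
        if ($nodes[$mid][1] < $weight) { $lo = $mid + 1 }
        else                           { $hi = $mid }
    }
    splice(@nodes, $lo, 0, $new_node);

    # Invariant: @nodes is still sorted by weight
    for my $i (1 .. $#nodes) {
        die "queue not sorted" if $nodes[$i - 1][1] > $nodes[$i][1];
    }
}

print $nodes[0][1], "\n";   # root weight = sum of inputs: 100
```

Each iteration does O(log n) comparisons plus an O(n) splice, versus a full O(n log n) sort per iteration in the original loop.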

5. Increase `max_cha...

This pull request was created from Copilot chat.




Co-authored-by: trizen <614513+trizen@users.noreply.github.com>
Copilot AI changed the title [WIP] Improve runtime performance and compression ratio in lzsst2 perf: improve compression speed and ratio in lzsst2_file_compression.pl Mar 4, 2026
@trizen trizen closed this Mar 4, 2026
@trizen trizen deleted the copilot/improve-performance-compression branch March 4, 2026 17:14
