Skip to content

fix(ascend):update README#673

Merged
chenghuaWang merged 2 commits into
UbiquitousLearning:mainfrom
lywbarca:ascend
Jun 8, 2026
Merged

fix(ascend):update README#673
chenghuaWang merged 2 commits into
UbiquitousLearning:mainfrom
lywbarca:ascend

Conversation

@lywbarca

@lywbarca lywbarca commented May 6, 2026

Copy link
Copy Markdown

Please check Guidelines for Contributing.

Summary by CodeRabbit

  • New Features

    • Ascend NPU backend support added with Qwen3 W8A8 quantized inference.
  • Improvements

    • More reliable UTF-8-safe streaming output and buffered token handling for smoother text rendering.
    • Added a “Latest News” announcement (2026-05-02) and updated the supported models listing to link Ascend W8A8 details.

@coderabbitai

coderabbitai Bot commented May 6, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: da29a1fe-2957-4718-af56-69a0c1881f6c

📥 Commits

Reviewing files that changed from the base of the PR and between 3973c6a and 4aaf858.

📒 Files selected for processing (1)
  • README.md

📝 Walkthrough

Walkthrough

This PR adds UTF-8-safe streaming output for Qwen Ascend: the tokenizer initializes with a UTF-8 locale, examples include a takeValidUtf8Prefix utility, the QA generation path gains a prefill warmup and buffered token decoding that prints only complete UTF-8 chunks, and README news/table entries are updated.

Changes

UTF-8 Safe Streaming and Tokenizer Configuration

Layer / File(s) Summary
Locale and Tokenizer Setup
mllm/models/qwen_ascend/tokenization_qwen_ascend.hpp
QwenAscendTokenizer constructor now explicitly initializes the local environment with UTF-8 locale via preprocessor::initLocal("C.UTF-8").
UTF-8 Validation Utilities
examples/qwen_ascend/main.cpp (lines 6–38)
Anonymous namespace function takeValidUtf8Prefix extracts valid UTF-8 prefixes from partial byte sequences; include set expanded to support UTF-8 and string handling.
Buffered Generation and Output
examples/qwen_ascend/main.cpp (lines 223–255)
QA generation path adds a prefill warmup step, decodes tokens into pending_text buffer, validates and prints only complete UTF-8 chunks after each generation step, then ensures a final newline.
Documentation
README.md
Latest News announces Ascend NPU backend support with Qwen3 W8A8 inference; Qwen3-0.6B table entry for W8A8 converted to a hyperlink.

Sequence Diagram

sequenceDiagram
    participant Gen as Token Generator
    participant Buf as UTF-8 Buffer
    participant Val as UTF-8 Validator
    participant Out as Output

    loop Generation Loop
        Gen->>Buf: Decode token → append to pending_text
        Buf->>Val: Check for valid UTF-8 prefix
        Val->>Buf: Return valid prefix + remainder
        Buf->>Out: Print valid UTF-8 chunk
        Buf->>Buf: Keep remainder for next iteration
    end
    Out->>Out: Print final newline
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested reviewers

  • oreomaker
  • yirongjie
  • xumengwei

Poem

🐰 I buffer bytes and nibble code,
Keep UTF-8 safe down the road,
Tokens warm up, prefixes play,
I print the whole, not half away,
Hops and bytes — a tidy mode.

🚥 Pre-merge checks | ✅ 2 | ❌ 3

❌ Failed checks (3 warnings)

Check name Status Explanation Resolution
Title check ⚠️ Warning The title only mentions README update, but the changes significantly include UTF-8 streaming improvements in main.cpp and tokenization changes, making it incomplete and misleading. Update title to reflect all major changes, e.g., 'fix(ascend): add UTF-8 safe streaming and update README for Qwen3 W8A8 support'
Description check ⚠️ Warning The PR description contains only the template boilerplate with no actual description of changes, missing all details about the modifications made. Add a meaningful description explaining the UTF-8 streaming improvements, prefill warmup step, locale initialization changes, and README updates with relevant context.
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@examples/qwen_ascend/main.cpp`:
- Around line 19-35: The helper takeValidUtf8Prefix currently returns empty when
an invalid byte is at the buffer start and never consumes it; update
takeValidUtf8Prefix to detect irrecoverable leading bytes (e.g., a start octet
that would require >4 continuation bytes or otherwise cannot form a valid UTF‑8
sequence) and consume/discard one invalid byte so progress continues, while
still returning empty when nothing valid precedes it; also ensure the final
flush path uses takeValidUtf8Prefix on pending_text (instead of printing
pending_text directly) so only validated UTF‑8 is emitted. Reference: function
takeValidUtf8Prefix and the final tail-flush that prints pending_text.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 1e278d7c-fbd2-4317-b44b-90eb3e3d5fd7

📥 Commits

Reviewing files that changed from the base of the PR and between 729ca4c and 3973c6a.

📒 Files selected for processing (3)
  • README.md
  • examples/qwen_ascend/main.cpp
  • mllm/models/qwen_ascend/tokenization_qwen_ascend.hpp

Comment on lines +19 to +35
std::string takeValidUtf8Prefix(std::string& pending_text) {
auto invalid = utf8::find_invalid(pending_text.begin(), pending_text.end());
if (invalid == pending_text.begin()) {
return {};
}

if (invalid == pending_text.end()) {
std::string ready_text;
ready_text.swap(pending_text);
return ready_text;
}

auto ready_bytes = static_cast<size_t>(std::distance(pending_text.begin(), invalid));
auto ready_text = pending_text.substr(0, ready_bytes);
pending_text.erase(0, ready_bytes);
return ready_text;
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🌐 Web query:

In utf8cpp, what does utf8::find_invalid(begin, end) return for (1) an incomplete multi-byte sequence at buffer start and (2) an irrecoverably invalid leading byte?

💡 Result:

For utf8::find_invalid(begin, end), the function returns an iterator pointing to the first invalid octet; if none are found it returns end. It does this by scanning from start and calling utf8::internal::validate_next(result, end) and immediately returning result when the returned error code is not UTF8_OK. [1] (1) Incomplete multi-byte sequence at buffer start: It returns begin (i.e., the iterator to the first byte of the incomplete sequence). validate_next is invoked starting from result==start, and any non-UTF8_OK (including INCOMPLETE_SEQUENCE) causes find_invalid to return the current iterator (the start of that invalid sequence). [1][2] (2) Irrecoverably invalid leading byte: It returns an iterator pointing to that invalid leading byte (the first byte whose validate_next call fails with a non-UTF8_OK error such as INVALID_LEAD). Since find_invalid returns the iterator value at which validate_next fails, it is the position of the bad leading octet. [1][2]

Citations:


🏁 Script executed:

# Check if the file exists and get its size
wc -l examples/qwen_ascend/main.cpp

Repository: UbiquitousLearning/mllm

Length of output: 101


🏁 Script executed:

# View the takeValidUtf8Prefix helper (lines 19-35)
sed -n '19,35p' examples/qwen_ascend/main.cpp

Repository: UbiquitousLearning/mllm

Length of output: 604


🏁 Script executed:

# View the tail flush logic (lines 250-260 for context around 252-254)
sed -n '250,260p' examples/qwen_ascend/main.cpp

Repository: UbiquitousLearning/mllm

Length of output: 451


🏁 Script executed:

# Check how pending_text is used in the loop - search for the streaming context
sed -n '240,270p' examples/qwen_ascend/main.cpp

Repository: UbiquitousLearning/mllm

Length of output: 981


Handle irrecoverable leading bytes to avoid stalled output and invalid tail flush.

When utf8::find_invalid detects an invalid byte at the buffer start, the helper returns empty without consuming it. An irrecoverably invalid leading byte (e.g., a malformed UTF-8 start octet) will therefore block progress indefinitely—the next iteration finds the same byte, returns empty again, and loops without advancing. Additionally, the final flush at line 252-254 directly prints pending_text without validation, allowing invalid UTF-8 to reach output even though the streaming loop uses the validated takeValidUtf8Prefix.

Discard irrecoverable leading bytes that exceed the maximum UTF-8 sequence length (4 bytes), and apply takeValidUtf8Prefix to the tail flush to ensure output is valid UTF-8.

Proposed fix
 std::string takeValidUtf8Prefix(std::string& pending_text) {
+  if (pending_text.empty()) {
+    return {};
+  }
   auto invalid = utf8::find_invalid(pending_text.begin(), pending_text.end());
   if (invalid == pending_text.begin()) {
-    return {};
+    // Keep short prefixes that may become valid with future bytes.
+    // UTF-8 max code point width is 4 bytes; longer invalid-at-begin likely means malformed lead.
+    if (pending_text.size() <= 4) {
+      return {};
+    }
+    pending_text.erase(0, 1);
+    return {};
   }
@@
-      if (!pending_text.empty()) {
-        fmt::print("{}", pending_text);
-      }
+      auto tail = takeValidUtf8Prefix(pending_text);
+      if (!tail.empty()) {
+        fmt::print("{}", tail);
+      }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@examples/qwen_ascend/main.cpp` around lines 19 - 35, The helper
takeValidUtf8Prefix currently returns empty when an invalid byte is at the
buffer start and never consumes it; update takeValidUtf8Prefix to detect
irrecoverable leading bytes (e.g., a start octet that would require >4
continuation bytes or otherwise cannot form a valid UTF‑8 sequence) and
consume/discard one invalid byte so progress continues, while still returning
empty when nothing valid precedes it; also ensure the final flush path uses
takeValidUtf8Prefix on pending_text (instead of printing pending_text directly)
so only validated UTF‑8 is emitted. Reference: function takeValidUtf8Prefix and
the final tail-flush that prints pending_text.

@chenghuaWang chenghuaWang merged commit 524ac99 into UbiquitousLearning:main Jun 8, 2026
3 of 4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants