Conversation

@rafallezanko rafallezanko commented Jan 19, 2026

https://learn.microsoft.com/en-us/javascript/api/microsoft-cognitiveservices-speech-sdk/propertyid?view=azure-node-latest

SpeechServiceResponse_PostProcessingOption = 39 | A string value specifying which post processing option should be used by service. Allowed values are "TrueText". Added in version 1.7.0

Summary by CodeRabbit

  • New Features
    • Added an optional TrueText post‑processing toggle for the Azure speech‑to‑text plugin. When enabled per instance at initialization, transcriptions receive TrueText formatting/cleanup to improve readability and accuracy. Disabled by default and can be turned on for individual speech‑to‑text instances to enhance output quality.


coderabbitai bot commented Jan 19, 2026

Walkthrough

Adds a boolean true_text_post_processing option to the Azure STT plugin: exposed on STT.__init__, stored in STTOptions, and applied in _create_speech_recognizer() by setting the SpeechConfig post-processing option to "TrueText" when enabled.
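
The flow described above can be sketched without the Azure SDK installed. This is a minimal illustration, not the plugin's actual code: `FakeSpeechConfig`, `apply_post_processing`, and the string key `POST_PROCESSING_OPTION` are hypothetical stand-ins, and `STTOptions` is reduced to the single new field.

```python
from dataclasses import dataclass, field
from typing import Dict

# Hypothetical stand-in for the SDK property key. The plugin itself would pass
# speechsdk.enums.PropertyId.SpeechServiceResponse_PostProcessingOption instead.
POST_PROCESSING_OPTION = "SpeechServiceResponse_PostProcessingOption"


@dataclass
class STTOptions:
    """Reduced options container; only the new flag is modelled here."""

    true_text_post_processing: bool = False


@dataclass
class FakeSpeechConfig:
    """Stand-in for speechsdk.SpeechConfig that records set_property calls."""

    properties: Dict[str, str] = field(default_factory=dict)

    def set_property(self, property_id: str, value: str) -> None:
        self.properties[property_id] = value


def apply_post_processing(config: FakeSpeechConfig, opts: STTOptions) -> FakeSpeechConfig:
    """Set the post-processing option to "TrueText" only when the flag is enabled."""
    if opts.true_text_post_processing:
        config.set_property(POST_PROCESSING_OPTION, "TrueText")
    return config


default_config = apply_post_processing(FakeSpeechConfig(), STTOptions())
enabled_config = apply_post_processing(
    FakeSpeechConfig(), STTOptions(true_text_post_processing=True)
)
```

With the default options no property is set, so existing callers see no change in behavior; only an instance that opts in gets the "TrueText" value.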

Changes

Cohort / File(s) | Summary
Azure STT plugin (livekit-plugins/livekit-plugins-azure/livekit/plugins/azure/stt.py) | Added true_text_post_processing: bool = False to STTOptions; added true_text_post_processing parameter to STT.__init__ and propagated it into STTOptions; updated _create_speech_recognizer() to set SpeechConfig's post-processing option to "TrueText" when the flag is truthy.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐇 I added a switch with a joyful hop,
TrueText whispers and cleans each drop.
A tiny flag to guide the stream,
Words made tidy, neat, and gleam. ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
Check name | Status | Explanation | Resolution
Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00%, which is below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name | Status | Explanation
Description Check | ✅ Passed | Check skipped: CodeRabbit's high-level summary is enabled.
Title check | ✅ Passed | The PR title accurately and specifically describes the main change: adding the TrueText post-processing option to Azure STTOptions, which is the primary focus of the changeset.


📜 Recent review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f363d24 and c09cfdd.

📒 Files selected for processing (1)
  • livekit-plugins/livekit-plugins-azure/livekit/plugins/azure/stt.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • livekit-plugins/livekit-plugins-azure/livekit/plugins/azure/stt.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: unit-tests

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
livekit-plugins/livekit-plugins-azure/livekit/plugins/azure/stt.py (1)

407-413: Use the correct PropertyId enum: SpeechServiceResponse_PostProcessingOption instead of PostProcessingOption.

The enum member speechsdk.enums.PropertyId.PostProcessingOption does not exist in the Azure Speech SDK (1.43.0+). The correct member name is SpeechServiceResponse_PostProcessingOption (ID: 4003), so the current code will raise an AttributeError at runtime.

Fix
-        speech_config.set_property(speechsdk.enums.PropertyId.PostProcessingOption, "TrueText")
+        speech_config.set_property(
+            speechsdk.enums.PropertyId.SpeechServiceResponse_PostProcessingOption,
+            "TrueText",
+        )
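
One way to guard against this kind of mistake is to resolve the member by name and fail with an explicit message. The sketch below is only an illustration: the local `PropertyId` enum is a stand-in for `speechsdk.enums.PropertyId` (with the ID quoted in this comment), and `resolve_property_id` is a hypothetical helper, not part of the SDK or the plugin.

```python
import enum
from typing import Type


class PropertyId(enum.Enum):
    """Stand-in for speechsdk.enums.PropertyId with a single member."""

    # ID as quoted in the review comment above.
    SpeechServiceResponse_PostProcessingOption = 4003


def resolve_property_id(enum_cls: Type[enum.Enum], name: str) -> enum.Enum:
    """Look up an enum member by name, raising AttributeError if it is missing."""
    try:
        return enum_cls[name]
    except KeyError:
        raise AttributeError(f"{enum_cls.__name__} has no member {name!r}") from None


correct = resolve_property_id(PropertyId, "SpeechServiceResponse_PostProcessingOption")

# The short name from the original diff is not a member of the stand-in enum.
short_name_exists = hasattr(PropertyId, "PostProcessingOption")
```

A unit test that performs the same lookup against the real enum would surface a wrong member name at test time rather than when a recognizer is first created.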
🧹 Nitpick comments (1)
livekit-plugins/livekit-plugins-azure/livekit/plugins/azure/stt.py (1)

83-102: Document the new option in the constructor docstring.

The new public parameter isn’t described yet, which makes the API harder to discover.

📌 Suggested docstring update
@@
         Args:
             phrase_list: List of words or phrases to boost recognition accuracy.
                         Azure will give higher priority to these phrases during recognition.
             explicit_punctuation: Controls punctuation behavior. If True, enables explicit punctuation mode
                         where punctuation marks are added explicitly. If False (default), uses Azure's
                         default punctuation behavior.
+            true_text_post_processing: Enables Azure "TrueText" post-processing in the recognition result.

As per coding guidelines, maintain Google-style docstrings for public APIs.
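
A small check along these lines could catch an undocumented parameter before review. It is a sketch under assumptions: `SketchSTT` is a hypothetical class that mirrors only two of the constructor's parameters, and `undocumented_params` is an illustrative helper that looks for a `name:` entry in the docstring rather than parsing the Args section properly.

```python
import inspect
from typing import Callable, List


class SketchSTT:
    """Hypothetical stand-in for the plugin's STT class."""

    def __init__(
        self,
        *,
        explicit_punctuation: bool = False,
        true_text_post_processing: bool = False,
    ) -> None:
        """Create a sketch STT instance.

        Args:
            explicit_punctuation: Controls punctuation behavior.
            true_text_post_processing: Enables Azure "TrueText" post-processing
                in the recognition result.
        """
        self.explicit_punctuation = explicit_punctuation
        self.true_text_post_processing = true_text_post_processing


def undocumented_params(func: Callable[..., object]) -> List[str]:
    """Return parameter names that have no "name:" entry in the docstring."""
    doc = inspect.getdoc(func) or ""
    names = [name for name in inspect.signature(func).parameters if name != "self"]
    return [name for name in names if f"{name}:" not in doc]


def no_docs(alpha: int, beta: int) -> int:
    return alpha + beta


missing_in_sketch = undocumented_params(SketchSTT.__init__)
missing_in_no_docs = undocumented_params(no_docs)
```

The substring match is deliberately crude; a real check would more likely rely on a docstring linter.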

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0722371 and bdf13ad.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (1)
  • livekit-plugins/livekit-plugins-azure/livekit/plugins/azure/stt.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Format code with ruff
Run ruff linter and auto-fix issues
Run mypy type checker in strict mode
Maintain line length of 100 characters maximum
Ensure Python 3.9+ compatibility
Use Google-style docstrings

Files:

  • livekit-plugins/livekit-plugins-azure/livekit/plugins/azure/stt.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: type-check (3.13)
  • GitHub Check: type-check (3.9)
  • GitHub Check: unit-tests
🔇 Additional comments (2)
livekit-plugins/livekit-plugins-azure/livekit/plugins/azure/stt.py (2)

41-62: LGTM: default-off option is sensible.

The new true_text_post_processing field is a safe, backward-compatible addition.


141-157: LGTM: option is correctly propagated into STTOptions.

Wiring looks consistent with the other options.


@rafallezanko rafallezanko force-pushed the main branch 2 times, most recently from fcd3953 to f4346aa, on January 19, 2026 15:49
@rafallezanko commented:

Hi @chenghao-mou, can you have a look at the MR? Thank you in advance :)
