Skip to content

feat: add whisper_detail method to LLMWhispererClientV2#32

Merged
johnyrahul merged 10 commits intomainfrom
feat/add-whisper-detail-method
Mar 16, 2026
Merged

feat: add whisper_detail method to LLMWhispererClientV2#32
johnyrahul merged 10 commits intomainfrom
feat/add-whisper-detail-method

Conversation

@johnyrahul
Copy link
Contributor

@johnyrahul johnyrahul commented Mar 16, 2026

Summary

  • Add whisper_detail(whisper_hash) method to LLMWhispererClientV2 for retrieving extraction job metadata via the /whisper-detail API endpoint
  • Add unit tests (success and not-found cases) and integration tests for the new method
  • Update README: add test running instructions, update client section, remove outdated legacy client references
  • Remove legacy v1 base URL from sample.env

Test plan

  • Unit tests pass: uv run --group test pytest tests/unit/client_v2_test.py -v -k "whisper_detail"
  • Integration tests: uv run --group test pytest tests/integration/client_v2_test.py -v -k "whisper_detail"

🤖 Generated with Claude Code

johnyrahul and others added 2 commits March 16, 2026 13:19
Add whisper_detail() to retrieve extraction job metadata via the
/whisper-detail API endpoint. Includes unit and integration tests.
Also updates README with running tests instructions and removes
outdated legacy client references.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@greptile-apps
Copy link
Contributor

greptile-apps bot commented Mar 16, 2026

Greptile Summary

This PR adds a whisper_detail(whisper_hash) method to LLMWhispererClientV2 that fetches extraction job metadata from the /whisper-detail API endpoint, along with unit and integration tests, README updates, and the removal of the legacy v1 base URL from sample.env.

Key changes:

  • New whisper_detail method with well-structured error handling (empty-body guard, json.JSONDecodeError catch, fallback preview) mirroring the pattern established by whisper_status
  • Unit tests correctly use error_message() and exc.status_code to assert both success and failure paths
  • The integration test_whisper_detail_not_found asserts status_code == 400, but the integration tests have not been run against the live API — if the endpoint returns 404 for unknown hashes (as the webhook endpoint does), this assertion will fail
  • whisper_detail omits the self.logger.error(...) calls present in the analogous whisper_status error branches, reducing log-based observability for the empty-body and non-JSON error paths
  • README cleanup removes the outdated v1 client section and adds clear test-running commands

Confidence Score: 4/5

  • Safe to merge after verifying the live API status code for unknown whisper hashes.
  • The implementation is solid and the previous review concerns have been addressed. The error handling in whisper_detail is correct and mirrors whisper_status. The only remaining concern is the unverified assumption that the live API returns HTTP 400 (not 404) for a nonexistent whisper_hash in the integration test, which has not been run. The missing logging in error paths is a minor observability gap but not a correctness issue.
  • tests/integration/client_v2_test.py — the test_whisper_detail_not_found test's status_code == 400 assertion should be verified against the live API before merging.

Important Files Changed

Filename Overview
src/unstract/llmwhisperer/client_v2.py Adds whisper_detail method with robust error handling (guarded JSON parse, empty-body check, non-JSON fallback) mirroring whisper_status; minor omission of self.logger.error calls in error paths compared to the template method.
tests/unit/client_v2_test.py Adds correct unit tests for both success and not-found cases using error_message() and status_code directly; consistent with the new exception-raising style in whisper_detail.
tests/integration/client_v2_test.py Adds integration tests for whisper_detail; the not-found test hardcodes status_code == 400 but has not been run against the live API — the real endpoint may return a different status code (e.g. 404).
README.md Removes legacy v1 client documentation, promotes LLMWhispererClientV2 as the sole client, and adds clear test-running instructions.
sample.env Removes the deprecated LLMWHISPERER_BASE_URL (v1) entry, leaving only the v2 URL and shared config keys.

Sequence Diagram

sequenceDiagram
    participant Caller
    participant LLMWhispererClientV2
    participant API as LLMWhisperer API

    Caller->>LLMWhispererClientV2: whisper_detail(whisper_hash)
    LLMWhispererClientV2->>API: GET /whisper-detail?whisper_hash=...
    alt HTTP 200
        API-->>LLMWhispererClientV2: JSON metadata dict
        LLMWhispererClientV2-->>Caller: dict (completed_at, mode, pages, etc.)
    else Empty body
        API-->>LLMWhispererClientV2: non-200, empty body
        LLMWhispererClientV2-->>Caller: raise LLMWhispererClientException("API error: empty response body", status_code)
    else Non-JSON response
        API-->>LLMWhispererClientV2: non-200, HTML/text body
        LLMWhispererClientV2-->>Caller: raise LLMWhispererClientException("API error: non-JSON response - ...", status_code)
    else JSON error body
        API-->>LLMWhispererClientV2: non-200, JSON error dict
        LLMWhispererClientV2-->>Caller: raise LLMWhispererClientException(err_dict, status_code)
    end
Loading
Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/unstract/llmwhisperer/client_v2.py
Line: 357-367

Comment:
**Missing error logging in error paths**

The `whisper_detail` method silently raises exceptions without logging the error conditions, while the template it was modelled after — `whisper_status` (lines 611–621) — calls `self.logger.error(...)` before each raise. This makes it harder to diagnose failures in production logs.

For example, `whisper_status` does:
```python
if not (response.text or "").strip():
    self.logger.error(f"API error - empty response body, status code: {response.status_code}")
    raise LLMWhispererClientException(...)
try:
    err = json.loads(response.text)
except json.JSONDecodeError as e:
    ...
    self.logger.error(f"API error - JSON decode failed: {e}; Response preview: {response_preview!r}")
    raise LLMWhispererClientException(...)
```

Consider adding equivalent `self.logger.error(...)` calls in `whisper_detail` for the empty-body and JSON-decode-failure branches.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: tests/integration/client_v2_test.py
Line: 279

Comment:
**Unverified HTTP status code assumption**

The integration test asserts `status_code == 400` for a nonexistent `whisper_hash`, but the integration tests are explicitly marked as **not run** in the PR description (`[ ] Integration tests`). Looking at the existing `test_webhook` integration test (line 235), the API returns `404` for a not-found webhook, which is more conventional HTTP semantics.

If the real `/whisper-detail` endpoint returns `404` (or any other non-200/non-400 code) for an unknown hash, this assertion will fail at runtime. Worth verifying against the live API before merging.

How can I resolve this? If you propose a fix, please make it concise.

Last reviewed commit: 6ccfd6d

Handle empty body and non-JSON error responses consistently with
whisper_status, raising LLMWhispererClientException instead of
letting json.JSONDecodeError propagate.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Signed-off-by: Rahul Johny <116638720+johnyrahul@users.noreply.github.com>
johnyrahul and others added 2 commits March 16, 2026 13:35
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Signed-off-by: Rahul Johny <116638720+johnyrahul@users.noreply.github.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Signed-off-by: Rahul Johny <116638720+johnyrahul@users.noreply.github.com>
…sponses

Align error handling with whisper_status pattern: handle empty body,
non-JSON responses, and pass status_code as separate exception argument.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…d less brittle

Check for status code and message key presence instead of exact
error string, which may change on the API side.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Move whisper_detail tests after usage-sensitive url_in_post tests
to prevent extra extractions from inflating usage counters.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@github-actions
Copy link
Contributor

filepath function $$\textcolor{#23d18b}{\tt{passed}}$$ SUBTOTAL
$$\textcolor{#23d18b}{\tt{tests/integration/client\_v2\_test.py}}$$ $$\textcolor{#23d18b}{\tt{test\_get\_usage\_info}}$$ $$\textcolor{#23d18b}{\tt{1}}$$ $$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{tests/integration/client\_v2\_test.py}}$$ $$\textcolor{#23d18b}{\tt{test\_whisper\_v2}}$$ $$\textcolor{#23d18b}{\tt{9}}$$ $$\textcolor{#23d18b}{\tt{9}}$$
$$\textcolor{#23d18b}{\tt{tests/integration/client\_v2\_test.py}}$$ $$\textcolor{#23d18b}{\tt{test\_highlight}}$$ $$\textcolor{#23d18b}{\tt{1}}$$ $$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{tests/integration/client\_v2\_test.py}}$$ $$\textcolor{#23d18b}{\tt{test\_whisper\_v2\_url\_in\_post}}$$ $$\textcolor{#23d18b}{\tt{4}}$$ $$\textcolor{#23d18b}{\tt{4}}$$
$$\textcolor{#23d18b}{\tt{tests/integration/client\_v2\_test.py}}$$ $$\textcolor{#23d18b}{\tt{test\_webhook}}$$ $$\textcolor{#23d18b}{\tt{1}}$$ $$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{tests/integration/client\_v2\_test.py}}$$ $$\textcolor{#23d18b}{\tt{test\_whisper\_detail}}$$ $$\textcolor{#23d18b}{\tt{1}}$$ $$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{tests/integration/client\_v2\_test.py}}$$ $$\textcolor{#23d18b}{\tt{test\_whisper\_detail\_not\_found}}$$ $$\textcolor{#23d18b}{\tt{1}}$$ $$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{tests/unit/client\_v2\_test.py}}$$ $$\textcolor{#23d18b}{\tt{test\_register\_webhook}}$$ $$\textcolor{#23d18b}{\tt{1}}$$ $$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{tests/unit/client\_v2\_test.py}}$$ $$\textcolor{#23d18b}{\tt{test\_get\_webhook\_details}}$$ $$\textcolor{#23d18b}{\tt{1}}$$ $$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{tests/unit/client\_v2\_test.py}}$$ $$\textcolor{#23d18b}{\tt{test\_whisper\_detail\_success}}$$ $$\textcolor{#23d18b}{\tt{1}}$$ $$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{tests/unit/client\_v2\_test.py}}$$ $$\textcolor{#23d18b}{\tt{test\_whisper\_detail\_not\_found}}$$ $$\textcolor{#23d18b}{\tt{1}}$$ $$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{tests/unit/client\_v2\_test.py}}$$ $$\textcolor{#23d18b}{\tt{test\_whisper\_json\_string\_response\_error}}$$ $$\textcolor{#23d18b}{\tt{1}}$$ $$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{tests/unit/client\_v2\_test.py}}$$ $$\textcolor{#23d18b}{\tt{test\_whisper\_json\_string\_response\_202}}$$ $$\textcolor{#23d18b}{\tt{1}}$$ $$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{tests/unit/client\_v2\_test.py}}$$ $$\textcolor{#23d18b}{\tt{test\_whisper\_invalid\_json\_response\_error}}$$ $$\textcolor{#23d18b}{\tt{1}}$$ $$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{tests/unit/client\_v2\_test.py}}$$ $$\textcolor{#23d18b}{\tt{test\_whisper\_invalid\_json\_response\_202}}$$ $$\textcolor{#23d18b}{\tt{1}}$$ $$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{tests/unit/client\_v2\_test.py}}$$ $$\textcolor{#23d18b}{\tt{test\_retry\_on\_connection\_error}}$$ $$\textcolor{#23d18b}{\tt{1}}$$ $$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{tests/unit/client\_v2\_test.py}}$$ $$\textcolor{#23d18b}{\tt{test\_retry\_on\_timeout}}$$ $$\textcolor{#23d18b}{\tt{1}}$$ $$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{tests/unit/client\_v2\_test.py}}$$ $$\textcolor{#23d18b}{\tt{test\_retry\_on\_429}}$$ $$\textcolor{#23d18b}{\tt{1}}$$ $$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{tests/unit/client\_v2\_test.py}}$$ $$\textcolor{#23d18b}{\tt{test\_retry\_on\_500}}$$ $$\textcolor{#23d18b}{\tt{1}}$$ $$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{tests/unit/client\_v2\_test.py}}$$ $$\textcolor{#23d18b}{\tt{test\_no\_retry\_on\_400}}$$ $$\textcolor{#23d18b}{\tt{1}}$$ $$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{tests/unit/client\_v2\_test.py}}$$ $$\textcolor{#23d18b}{\tt{test\_no\_retry\_on\_401}}$$ $$\textcolor{#23d18b}{\tt{1}}$$ $$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{tests/unit/client\_v2\_test.py}}$$ $$\textcolor{#23d18b}{\tt{test\_retries\_exhausted\_raises}}$$ $$\textcolor{#23d18b}{\tt{1}}$$ $$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{tests/unit/client\_v2\_test.py}}$$ $$\textcolor{#23d18b}{\tt{test\_retries\_exhausted\_500\_returns\_response}}$$ $$\textcolor{#23d18b}{\tt{1}}$$ $$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{tests/unit/client\_v2\_test.py}}$$ $$\textcolor{#23d18b}{\tt{test\_retry\_disabled}}$$ $$\textcolor{#23d18b}{\tt{1}}$$ $$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{tests/unit/client\_v2\_test.py}}$$ $$\textcolor{#23d18b}{\tt{test\_whisper\_post\_uses\_min\_of\_api\_timeout\_and\_wait\_timeout}}$$ $$\textcolor{#23d18b}{\tt{1}}$$ $$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{tests/unit/client\_v2\_test.py}}$$ $$\textcolor{#23d18b}{\tt{test\_whisper\_post\_uses\_wait\_timeout\_when\_smaller}}$$ $$\textcolor{#23d18b}{\tt{1}}$$ $$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{tests/unit/client\_v2\_test.py}}$$ $$\textcolor{#23d18b}{\tt{test\_send\_request\_deadline\_caps\_timeout}}$$ $$\textcolor{#23d18b}{\tt{1}}$$ $$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{tests/unit/client\_v2\_test.py}}$$ $$\textcolor{#23d18b}{\tt{test\_send\_request\_deadline\_stops\_retries}}$$ $$\textcolor{#23d18b}{\tt{1}}$$ $$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{tests/utils\_test.py}}$$ $$\textcolor{#23d18b}{\tt{test\_redact\_key\_normal}}$$ $$\textcolor{#23d18b}{\tt{1}}$$ $$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{tests/utils\_test.py}}$$ $$\textcolor{#23d18b}{\tt{test\_redact\_key\_different\_reveal\_length}}$$ $$\textcolor{#23d18b}{\tt{1}}$$ $$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{tests/utils\_test.py}}$$ $$\textcolor{#23d18b}{\tt{test\_redact\_key\_non\_string\_input}}$$ $$\textcolor{#23d18b}{\tt{1}}$$ $$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{TOTAL}}$$ $$\textcolor{#23d18b}{\tt{42}}$$ $$\textcolor{#23d18b}{\tt{42}}$$

@johnyrahul johnyrahul merged commit f42e662 into main Mar 16, 2026
2 of 5 checks passed
@johnyrahul johnyrahul deleted the feat/add-whisper-detail-method branch March 16, 2026 10:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants