Skip to content

feat(lvms): add /lvms:run-integration-tests skill for autonomous LVMS testing on TNF#208

Open
lucaconsalvi wants to merge 4 commits into
openshift-eng:mainfrom
lucaconsalvi:feat/lvms-run-integration-tests
Open

feat(lvms): add /lvms:run-integration-tests skill for autonomous LVMS testing on TNF#208
lucaconsalvi wants to merge 4 commits into
openshift-eng:mainfrom
lucaconsalvi:feat/lvms-run-integration-tests

Conversation

@lucaconsalvi

@lucaconsalvi lucaconsalvi commented Jun 25, 2026

Copy link
Copy Markdown
Contributor

Summary

  • Adds /lvms:run-integration-tests, a re-entrant Claude Code skill that
    automates the full LVMS QE integration test pipeline on TNF clusters
  • Designed for RC/EC builds where the redhat-operators OLM catalog does
    not include the lvms-operator package (use /lvms:setup-prereq for
    released builds)
  • Bumps plugin version to 1.2.0

How it works

The skill uses a 3-phase state machine — invoke it once to start, invoke
it again when tests complete to collect results:

  • Phase 1 (fresh): deploys LVMS from source (make deploy), applies
    LVMCluster CR, builds test binary, starts MNO/SNO suite via nohup
  • Phase 2 (running): shows live progress, prints monitoring command, exits
  • Phase 3 (done): parses OTE JSON output, generates Markdown report,
    optionally posts to JIRA via MCP

Test plan

  • Verified end-to-end on OCP 4.22.0 TNF cluster
  • MNO suite: 38/40 passed across two runs (2 failures are cleanup race
    conditions in new upstream tests, not LVMS bugs)
  • All 3 phases exercised: fresh deploy → in-progress detection → results parsing

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added an LVMS command to run end-to-end integration tests (/lvms:run-integration-tests).
  • Documentation

    • Updated the LVMS README with the new command, example invocations, and additional environment prerequisites.
    • Added a new integration-test skill guide covering a re-entrant multi-phase workflow, reporting, and troubleshooting (including log handling/cleanup).
  • Chores

    • Bumped the LVMS plugin version from 1.1.0 to 1.2.0 in the plugin manifest and marketplace metadata.

…ng on TNF

Adds /lvms:run-integration-tests, a re-entrant skill that automates the
full LVMS QE integration test pipeline on TNF clusters. Designed for
RC/EC builds where the redhat-operators OLM catalog does not include the
lvms-operator package.

The skill uses a 3-phase state machine, detecting cluster state on each
invocation:
- Phase 1 (fresh): deploy from source, build binary, start nohup test run
- Phase 2 (running): show progress, print monitoring command, exit
- Phase 3 (done): parse OTE JSON results, generate Markdown report, post to JIRA

Verified end-to-end on OCP 4.22.0 TNF cluster: 38/40 MNO tests passed
across two runs. The 2 failures are cleanup race conditions in new upstream
tests, not LVMS bugs.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
@openshift-ci openshift-ci Bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 25, 2026
@openshift-ci

openshift-ci Bot commented Jun 25, 2026

Copy link
Copy Markdown

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-ci

openshift-ci Bot commented Jun 25, 2026

Copy link
Copy Markdown

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: lucaconsalvi

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 25, 2026
@coderabbitai

coderabbitai Bot commented Jun 25, 2026

Copy link
Copy Markdown
Contributor

Walkthrough

The PR bumps LVMS plugin metadata and adds README and skill documentation for /lvms:run-integration-tests, including prerequisites, phased execution, result reporting, error handling, and operational notes.

Changes

LVMS integration-test documentation

Layer / File(s) Summary
Metadata and README updates
plugins/lvms/.claude-plugin/plugin.json, .claude-plugin/marketplace.json, plugins/lvms/README.md
The LVMS plugin version changes to 1.2.0, and the README adds /lvms:run-integration-tests entries, usage examples, and added prerequisites.
Skill frontmatter and Phase 1
plugins/lvms/skills/run-integration-tests/SKILL.md
The new skill frontmatter, synopsis, prerequisites, and Phase 1 instructions define the JIRA argument, remote-state checks, deployment steps, readiness checks, and test launch behavior.
Phase 2, Phase 3, and support notes
plugins/lvms/skills/run-integration-tests/SKILL.md
Phase 2 counts partial results, Phase 3 parses final logs and formats reports, and the document adds error handling and operational notes.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

Suggested labels

ready-for-human-review

🚥 Pre-merge checks | ✅ 10 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Ai-Attribution ⚠️ Warning HEAD uses Co-Authored-By: Claude Sonnet 4.6 and lacks the required Assisted-by/Generated-by trailer. Update the commit/PR attribution to a Red Hat-style Assisted-by or Generated-by trailer and remove Co-Authored-By for AI tooling.
✅ Passed checks (10 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly matches the main change: adding the new /lvms:run-integration-tests skill for autonomous TNF testing.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
No-Weak-Crypto ✅ Passed Changed files are docs/JSON; precise searches found no MD5/SHA1/DES/RC4/3DES/BF/ECB or secret/token comparisons, only result-label string checks.
Container-Privileges ✅ Passed PR only touches docs/plugin metadata; no container/K8s manifests or privileged settings are present in the LVMS changes.
No-Sensitive-Data-In-Logs ✅ Passed PASS: The new skill logs only generic progress and trimmed test output; no hardcoded passwords, tokens, PII, or real hostnames appear in the changed files.
No-Hardcoded-Secrets ✅ Passed No hardcoded secrets, credential literals, embedded creds, or long base64 blobs found in the changed files.
No-Injection-Vectors ✅ Passed Changed files are docs/manifests only; no SQL, eval/exec, yaml.load, shell=True, os.system, or unsafe HTML sinks were introduced.
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 markdownlint-cli2 (0.22.1)
plugins/lvms/skills/run-integration-tests/SKILL.md

markdownlint-cli2 v0.22.1 (markdownlint v0.40.0)
Finding: plugins/lvms/skills/run-integration-tests/SKILL.md !node_modules/** !two-node-toolbox/**
Linting: 1 file(s)
Summary: 0 error(s)
AggregateError: Unable to import module 'markdownlint-cli2-formatter-pretty'.
at importModule (file:///usr/local/lib/node_modules/markdownlint-cli2/markdownlint-cli2.mjs:90:11)
at async Promise.all (index 0)
at async outputResults (file:///usr/local/lib/node_modules/markdownlint-cli2/markdownlint-cli2.mjs:838:9)
at async main (file:///usr/local/lib/node_modules/markdownlint-cli2/markdownlint-cli2.mjs:1029:5)
at async file:///usr/local/lib/node_modules/markdownlint-cli2/markdownlint-cli2-bin.mjs:14:22 {
[errors]: [
Error: Cannot find module 'markdownlint-cli2-formatter-pretty'
Require stack:
- /usr/local/lib/node_modules/markdownlint-cli2/node_modules/markdownlint/lib/resolve-module.cjs
at Module._resolveFilename (node:internal/modules/cjs/loader:1476:15)

... [truncated 1098 characters] ...

node:internal/modules/esm/resolve:271:11)
at moduleResolve (node:internal/modules/esm/resolve:861:10)
at defaultResolve (node:internal/modules/esm/resolve:988:11)
at #cachedDefaultResolve (node:internal/modules/esm/loader:697:20)
at #resolveAndMaybeBlockOnLoaderThread (node:internal/modules/esm/loader:714:38)
at ModuleLoader.resolveSync (node:internal/modules/esm/loader:746:52)
at #resolve (node:internal/modules/esm/loader:679:17)
at ModuleLoader.getOrCreateModuleJob (node:internal/modules/esm/loader:599:35)
at node:internal/modules/esm/loader:628:32
at TracingChannel.tracePromise (node:diagnostics_channel:362:14) {
code: 'ERR_MODULE_NOT_FOUND',
url: 'file:///markdownlint-cli2-formatter-pretty'
}
]
}


Comment @coderabbitai help to get the list of available commands.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

🧹 Nitpick comments (1)
plugins/lvms/skills/run-integration-tests/SKILL.md (1)

341-353: 🎯 Functional Correctness | 🔵 Trivial | ⚖️ Poor tradeoff

Error handling in a separate section instead of inline with each step.

Per repo learnings, failure policies and edge-case rules must be co-located with the specific step they govern — a separate ## Error Handling table is not reliably re-consulted by the LLM mid-execution.

Move each row's action into a CRITICAL or Error Handling block immediately after the relevant step (e.g., the make deploy failure note directly after Step 1c's make deploy command, LVMCluster timeout note directly after the wait loop, etc.).

Based on learnings from ggiguash (PR #192): "co-locate failure policies and edge-case rules inline with the specific step that needs them... Do not move these rules to a separate top-level 'Edge Cases' (or similar) section."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@plugins/lvms/skills/run-integration-tests/SKILL.md` around lines 341 - 353,
The `run-integration-tests` skill keeps failure handling in a separate `## Error
Handling` table, but the review asks to co-locate each rule with the step it
governs. Move each action inline into the relevant step’s CRITICAL/Error
Handling block in the main procedure (for example, the `make deploy` guidance
near Step 1c, the LVMCluster timeout note near the wait loop, and the SSH/nohup
and JIRA notes near their respective steps), and remove the standalone table so
the rules stay attached to the matching step.

Source: Learnings

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@plugins/lvms/.claude-plugin/plugin.json`:
- Line 4: The plugin version was bumped in plugin.json but the corresponding
marketplace version was not updated. When you change the version in the
.claude-plugin metadata, make the same version bump in marketplace.json as well,
using the plugin.json and marketplace.json entries to keep both files in sync so
update detection works correctly.

In `@plugins/lvms/README.md`:
- Around line 57-63: The README Requirements section is missing the
prerequisites needed to use run-integration-tests. Update the Requirements
content in plugins/lvms/README.md to include the same prerequisite set described
in SKILL.md so users see them before invoking /lvms:run-integration-tests: Go
1.24 or newer, access to a TNF cluster with extra disks, and SSH access to the
hypervisor. Keep the wording aligned with the existing README structure and the
run-integration-tests command examples.

In `@plugins/lvms/skills/run-integration-tests/SKILL.md`:
- Line 44: The default SSH host prompt in the run-integration-tests skill is
using a real hardcoded IP, which should be replaced with a generic placeholder.
Update the prompt text in SKILL.md for the Hypervisor SSH host default so it no
longer references `ec2-user@52.29.221.136`, and use a neutral example value
instead. Keep the change localized to the prompt string so the surrounding
workflow remains unchanged.
- Line 6: The run-integration-tests skill is missing the MCP Jira comment tool
from its allowed-tools list, so the JIRA posting step in the workflow will not
be permitted. Update the skill’s allowed-tools declaration to include the MCP
Atlassian Jira commenting tool referenced by the workflow instruction that calls
mcp__mcp-atlassian__jira_add_comment, keeping the existing Bash, Read, and
AskUserQuestion entries intact.
- Around line 155-159: The CSI driver verification step currently says not to
proceed but does not explicitly stop on failure; update the
run-integration-tests skill text around the CSI check to add a clear failure
path and message. In the section that runs the topolvm CSI driver check, make it
explicit that if topolvm.io is absent the process must stop, report that the CSI
driver is not registered, and point to the vg-manager logs for troubleshooting;
ensure the wording matches the SKILL-GUIDELINES requirement for an explicit stop
condition.
- Around line 226-235: The Phase 2 log check is using a hardcoded /root path
that does not match the ec2-user home used earlier, so the reader in the
integration-test flow will never find the log. Update the log-read step in
run-integration-tests SKILL.md to use the same home-relative path as Phase 1
(the SSH_HOST/ec2-user setup and the lvms-mno.log reference) so the Python
snippet reads the existing file consistently.

---

Nitpick comments:
In `@plugins/lvms/skills/run-integration-tests/SKILL.md`:
- Around line 341-353: The `run-integration-tests` skill keeps failure handling
in a separate `## Error Handling` table, but the review asks to co-locate each
rule with the step it governs. Move each action inline into the relevant step’s
CRITICAL/Error Handling block in the main procedure (for example, the `make
deploy` guidance near Step 1c, the LVMCluster timeout note near the wait loop,
and the SSH/nohup and JIRA notes near their respective steps), and remove the
standalone table so the rules stay attached to the matching step.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 5000a9dc-5292-4baa-b4c3-6be578fc1773

📥 Commits

Reviewing files that changed from the base of the PR and between 6d5fd24 and 730e2fd.

📒 Files selected for processing (3)
  • plugins/lvms/.claude-plugin/plugin.json
  • plugins/lvms/README.md
  • plugins/lvms/skills/run-integration-tests/SKILL.md

Comment thread plugins/lvms/.claude-plugin/plugin.json
Comment thread plugins/lvms/README.md
Comment thread plugins/lvms/skills/run-integration-tests/SKILL.md Outdated
Comment thread plugins/lvms/skills/run-integration-tests/SKILL.md Outdated
Comment thread plugins/lvms/skills/run-integration-tests/SKILL.md Outdated
Comment thread plugins/lvms/skills/run-integration-tests/SKILL.md
lucaconsalvi and others added 2 commits June 25, 2026 14:45
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
- Add mcp__mcp-atlassian__jira_add_comment to allowed-tools
- Replace hardcoded IP with generic placeholder in SSH host prompt
- Fix /root/ path to ~/  in Phase 2 and Phase 3 log reads
- Add explicit STOP condition when topolvm.io CSI driver is missing
- Add Go 1.24+ and TNF cluster requirements to README

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
@coderabbitai coderabbitai Bot added the ready-for-human-review Indicates a PR has been reviewed by automated tools and is ready for human review label Jun 25, 2026

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
plugins/lvms/skills/run-integration-tests/SKILL.md (1)

346-358: 📐 Maintainability & Code Quality | 🟠 Major | ⚡ Quick win

Inline the failure handling with the relevant steps.

Per SKILL-GUIDELINES, failure policies and edge cases should live next to the step they govern. This top-level Error Handling table creates a second source of truth and will drift from the phase instructions.

Based on learnings, co-locate failure policies and edge-case rules inline with the specific step that needs them, and do not move them to a separate top-level Error Handling section.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@plugins/lvms/skills/run-integration-tests/SKILL.md` around lines 346 - 358,
The top-level Error Handling table in the integration-test skill should be
removed because it duplicates step-level policy and will drift from the
workflow. Inline each failure policy next to the relevant step in the existing
phase sections of SKILL.md, using the same symbols and step names already
present there (for example, the SSH setup/run steps, make deploy, make
integration-build, LVMCluster readiness checks, vg-manager log checks, and JIRA
reporting), and keep the retry/remediation guidance attached to the step it
governs rather than as a separate section.

Sources: Path instructions, Learnings

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@plugins/lvms/skills/run-integration-tests/SKILL.md`:
- Around line 346-358: The top-level Error Handling table in the
integration-test skill should be removed because it duplicates step-level policy
and will drift from the workflow. Inline each failure policy next to the
relevant step in the existing phase sections of SKILL.md, using the same symbols
and step names already present there (for example, the SSH setup/run steps, make
deploy, make integration-build, LVMCluster readiness checks, vg-manager log
checks, and JIRA reporting), and keep the retry/remediation guidance attached to
the step it governs rather than as a separate section.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: d697c8f8-6829-4571-8d99-2944000cee68

📥 Commits

Reviewing files that changed from the base of the PR and between e1814a5 and 6352d35.

📒 Files selected for processing (2)
  • plugins/lvms/README.md
  • plugins/lvms/skills/run-integration-tests/SKILL.md
✅ Files skipped from review due to trivial changes (1)
  • plugins/lvms/README.md

@lucaconsalvi lucaconsalvi marked this pull request as ready for review June 25, 2026 14:13
@openshift-ci openshift-ci Bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 25, 2026
Add language specifiers to bare fenced code blocks (MD040) and
blank lines around fences (MD031).

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
plugins/lvms/skills/run-integration-tests/SKILL.md (1)

232-241: 🩺 Stability & Availability | 🟡 Minor | ⚡ Quick win

Tighten the Phase 2 fallback.

except: turns every failure into "log not yet written", which hides permission/encoding issues and makes the progress state misleading.

Proposed fix
- except: print('log not yet written')
+ except FileNotFoundError:
+   print('log not yet written')
+ except Exception as e:
+   print(f'log read failed: {e}')
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@plugins/lvms/skills/run-integration-tests/SKILL.md` around lines 232 - 241,
The Phase 2 log check in the integration-test script uses a blanket except that
turns all failures into “log not yet written,” masking real errors. Update the
fallback in the ssh/python3 snippet to catch only the expected missing-file case
and let permission, decode, or other read failures surface with a clearer
message. Keep the fix localized to the log parsing block that reads
~/lvms-mno.log and counts result fields.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@plugins/lvms/skills/run-integration-tests/SKILL.md`:
- Around line 232-241: The Phase 2 log check in the integration-test script uses
a blanket except that turns all failures into “log not yet written,” masking
real errors. Update the fallback in the ssh/python3 snippet to catch only the
expected missing-file case and let permission, decode, or other read failures
surface with a clearer message. Keep the fix localized to the log parsing block
that reads ~/lvms-mno.log and counts result fields.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: b1485f83-d9d5-4848-ab6c-703eadc2ad74

📥 Commits

Reviewing files that changed from the base of the PR and between 6352d35 and 7e7d714.

📒 Files selected for processing (1)
  • plugins/lvms/skills/run-integration-tests/SKILL.md

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. ready-for-human-review Indicates a PR has been reviewed by automated tools and is ready for human review

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant