6 changes: 6 additions & 0 deletions .gitignore
@@ -16,3 +16,9 @@

# Lint output
/report.xml

# E2E tests
/e2e-tests/.env
/e2e-tests/mcp-reports/
/e2e-tests/bin/
/e2e-tests/**/*-out.json
4 changes: 4 additions & 0 deletions Makefile
@@ -57,6 +57,10 @@ helm-lint: ## Run helm lint for Helm chart
test: ## Run unit tests
	$(GOTEST) -v ./...

.PHONY: e2e-test
e2e-test: ## Run E2E tests
	@cd e2e-tests && ./scripts/run-tests.sh

.PHONY: test-coverage-and-junit
test-coverage-and-junit: ## Run unit tests with coverage and junit output
	go install github.com/jstemmer/go-junit-report/[email protected]
92 changes: 92 additions & 0 deletions e2e-tests/README.md
@@ -0,0 +1,92 @@
# StackRox MCP E2E Testing

End-to-end tests for the StackRox MCP server using [gevals](https://github.com/genmcp/gevals).

## Prerequisites

- Go 1.25+
- OpenAI API Key (for AI agent and LLM judge)
- StackRox API Token

## Setup

### 1. Build gevals

```bash
cd e2e-tests
./scripts/build-gevals.sh
```

### 2. Configure Environment

Create `.env` file:

```bash
OPENAI_API_KEY=<OpenAI Key>
STACKROX_MCP__CENTRAL__API_TOKEN=<StackRox API Token>
```
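
A quick pre-flight check can catch a missing variable before a test run starts. The following is a minimal Python sketch, not part of this PR: the variable names are taken from the `.env` example above, the helper name `missing_vars` is hypothetical, and it assumes the variables have already been exported into the environment (it does not parse `.env` itself).

```python
import os

# Variable names come from the .env example above.
REQUIRED_VARS = ("OPENAI_API_KEY", "STACKROX_MCP__CENTRAL__API_TOKEN")


def missing_vars(environ, required=REQUIRED_VARS):
    """Return the required variable names that are unset or empty in environ."""
    return [name for name in required if not environ.get(name)]


if __name__ == "__main__":
    missing = missing_vars(os.environ)
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("Environment looks complete")
```

Passing the environment in as a mapping keeps the check easy to exercise without touching the real process environment.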

## Running Tests

```bash
./scripts/run-tests.sh
Review comment (Collaborator):

As I see it, this script also builds gevals if it is not already built. In short, we can set the environment variables and call this script.

Can we add a make target that runs the e2e tests? As I understand it, it would more or less just call this script.

```

Results are saved to `gevals-stackrox-mcp-e2e-out.json`.

### View Results

```bash
# Summary
jq '.tasks[] | {name, passed}' gevals-stackrox-mcp-e2e-out.json

# Tool calls
jq '.tasks[].callHistory[] | {toolName, arguments}' gevals-stackrox-mcp-e2e-out.json
Review comment (Collaborator), on lines +41 to +44:

The jq commands do not work for me.

1. I get:
jq: error: Could not open file gevals-stackrox-mcp-e2e-out.json: No such file or directory

(We are in the e2e-tests directory, and the file is at gevals/gevals-stackrox-mcp-e2e-out.json.)

2. After fixing that:
jq: error (at gevals/gevals-stackrox-mcp-e2e-out.json:1359): Cannot index array with string "tasks"

(There is no JSON property tasks; the top level is a list of objects.)

3. After fixing that, I no longer get errors, but the results are wrong:
{
  "name": null,
  "passed": null
}

(The keys are taskName and taskPassed.)

4. After fixing that, I get reasonable results:
jq '.[] | {taskName, taskPassed}' gevals/gevals-stackrox-mcp-e2e-out.json

I didn't test jq '.tasks[].callHistory[]', but it probably has similar issues.

```
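
Following the review comment above, the summary step can also be sketched in Python. This is a minimal sketch under the assumptions reported there: the output is a top-level JSON array, each object carries `taskName` and `taskPassed`, and the file sits under `gevals/` when run from `e2e-tests`. The helper names `load_results` and `summarize` are hypothetical and not part of gevals.

```python
import json
from pathlib import Path

# Path and key names follow the review comment; adjust if the output format differs.
RESULTS_PATH = Path("gevals/gevals-stackrox-mcp-e2e-out.json")


def summarize(results):
    """Return (taskName, taskPassed) pairs from a top-level list of task results."""
    return [(task.get("taskName"), task.get("taskPassed")) for task in results]


def load_results(path):
    """Load the results file and check that it is a JSON array, not an object."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(data, list):
        raise ValueError("expected a top-level JSON array of task results")
    return data


if __name__ == "__main__":
    if RESULTS_PATH.exists():
        for name, passed in summarize(load_results(RESULTS_PATH)):
            print(f"{'PASS' if passed else 'FAIL'}  {name}")
    else:
        print(f"{RESULTS_PATH} not found; run the tests from e2e-tests first")
```

The explicit array check turns the "Cannot index array" class of mistake into a clear error message if the format changes again.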

## Test Cases

| Test | Description | Tool |
|------|-------------|------|
| `list-clusters` | List all clusters | `list_clusters` |
| `cve-detected-workloads` | CVE detected in deployments | `get_deployments_for_cve` |
| `cve-detected-clusters` | CVE detected in clusters | `get_clusters_with_orchestrator_cve` |
| `cve-nonexistent` | Handle non-existent CVE | `get_clusters_with_orchestrator_cve` |
| `cve-cluster-does-exist` | CVE with cluster filter | `get_clusters_with_orchestrator_cve` |
| `cve-cluster-does-not-exist` | CVE with cluster filter | `get_clusters_with_orchestrator_cve` |
| `cve-clusters-general` | General CVE query | `get_clusters_with_orchestrator_cve` |
| `cve-cluster-list` | CVE across clusters | `get_clusters_with_orchestrator_cve` |

## Configuration

- **`gevals/eval.yaml`**: Main test configuration, agent settings, assertions
- **`gevals/mcp-config.yaml`**: MCP server configuration
- **`gevals/tasks/*.yaml`**: Individual test task definitions

## How It Works

Gevals uses a proxy architecture to intercept MCP tool calls:

1. AI agent receives task prompt
2. Agent calls MCP tool
3. Gevals proxy intercepts and records the call
4. Call forwarded to StackRox MCP server
5. Server executes and returns result
6. Gevals validates assertions and response quality
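
The record-then-forward flow in steps 2 to 5 can be illustrated with a small Python sketch. It is not gevals code: `RecordingProxy`, `RecordedCall`, and `fake_server` are hypothetical names, and the server is replaced by a stub that returns canned data.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class RecordedCall:
    tool_name: str
    arguments: Dict[str, Any]
    result: Any = None


@dataclass
class RecordingProxy:
    """Hypothetical stand-in for the proxy: records each call, then forwards it."""

    backend: Callable[[str, Dict[str, Any]], Any]
    history: List[RecordedCall] = field(default_factory=list)

    def call_tool(self, tool_name, arguments):
        # Step 3: intercept and record the call before it reaches the server.
        record = RecordedCall(tool_name, dict(arguments))
        self.history.append(record)
        # Steps 4-5: forward to the server and keep the result with the record.
        record.result = self.backend(tool_name, arguments)
        return record.result


def fake_server(tool_name, arguments):
    """Stub for the StackRox MCP server; the cluster names are made-up sample data."""
    if tool_name == "list_clusters":
        return {"clusters": ["staging", "production"]}
    return {"error": f"unknown tool {tool_name}"}


proxy = RecordingProxy(backend=fake_server)
proxy.call_tool("list_clusters", {})
print([call.tool_name for call in proxy.history])
```

Because the history is kept separately from the server, assertions such as "which tool was called, with which arguments" can be checked afterwards without the agent or the server knowing about them.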

## Troubleshooting

**Tests fail - no tools called**
- Verify StackRox Central is accessible
- Check API token permissions

**Build errors**
```bash
go mod tidy
Review comment (Collaborator):

Can you please check this?

For me, go mod tidy adds:

	github.com/genmcp/gevals v0.0.1
	github.com/google/jsonschema-go v0.3.0

to the direct dependencies. I'm on go1.25.4.

./scripts/build-gevals.sh
```

## Further Reading

- [Gevals Documentation](https://github.com/genmcp/gevals)
- [StackRox MCP Server](../README.md)
101 changes: 101 additions & 0 deletions e2e-tests/gevals/eval.yaml
@@ -0,0 +1,101 @@
kind: Eval
metadata:
  name: "stackrox-mcp-e2e"
config:
  agent:
    type: "builtin.openai-agent"
    model: "gpt-4o"
  llmJudge:
    env:
      baseUrlKey: JUDGE_BASE_URL
      apiKeyKey: JUDGE_API_KEY
      modelNameKey: JUDGE_MODEL_NAME
  mcpConfigFile: mcp-config.yaml
  taskSets:
    # Test 1: List clusters
    - path: tasks/list-clusters.yaml
      assertions:
        toolsUsed:
          - server: stackrox-mcp
            toolPattern: "list_clusters"
        minToolCalls: 1
        maxToolCalls: 1

    # Test 2: CVE detected in workloads
    - path: tasks/cve-detected-workloads.yaml
      assertions:
Review comment (Collaborator):

Is it possible to define assertions in the tasks file? In this case, that would be tasks/cve-affecting-workloads.yaml.

        toolsUsed:
          - server: stackrox-mcp
            toolPattern: "get_deployments_for_cve"
            argumentsMatch:
              cveName: "CVE-2021-31805"
        minToolCalls: 1
        maxToolCalls: 1

    # Test 3: CVE detected in clusters - basic
    - path: tasks/cve-detected-clusters.yaml
      assertions:
        toolsUsed:
          - server: stackrox-mcp
            toolPattern: "get_clusters_with_orchestrator_cve"
            argumentsMatch:
              cveName: "CVE-2016-1000031"
        minToolCalls: 1
        maxToolCalls: 3
Review comment (Collaborator):

What does maxToolCalls mean? Can the tools defined in toolsUsed be called up to 3 times?


    # Test 4: Non-existent CVE
    # Expects 3 calls because "Is CVE detected in my clusters?" triggers comprehensive check
    # (orchestrator, deployments, nodes). The LLM cannot know beforehand if CVE exists.
    - path: tasks/cve-nonexistent.yaml
      assertions:
        toolsUsed:
          - server: stackrox-mcp
            toolPattern: "get_clusters_with_orchestrator_cve"
            argumentsMatch:
              cveName: "CVE-2099-00001"
        minToolCalls: 1
        maxToolCalls: 3

    # Test 5: CVE with specific cluster filter (does exist)
    - path: tasks/cve-cluster-does-exist.yaml
      assertions:
        toolsUsed:
          - server: stackrox-mcp
            toolPattern: "list_clusters"
          - server: stackrox-mcp
            toolPattern: "get_clusters_with_orchestrator_cve"
            argumentsMatch:
              cveName: "CVE-2016-1000031"
        minToolCalls: 1
        maxToolCalls: 2

    # Test 6: CVE with specific cluster filter (does not exist)
    - path: tasks/cve-cluster-does-not-exist.yaml
      assertions:
        toolsUsed:
          - server: stackrox-mcp
            toolPattern: "list_clusters"
        minToolCalls: 1
        maxToolCalls: 2
Review comment (Collaborator):

The LLM should not fetch the list of clusters twice.

Suggested change:
- maxToolCalls: 2
+ maxToolCalls: 1

    # Test 7: CVE detected in clusters - general
    - path: tasks/cve-clusters-general.yaml
      assertions:
        toolsUsed:
          - server: stackrox-mcp
            toolPattern: "get_clusters_with_orchestrator_cve"
            argumentsMatch:
              cveName: "CVE-2021-31805"
        minToolCalls: 1
        maxToolCalls: 5

    # Test 8: CVE check with cluster list reference
    - path: tasks/cve-cluster-list.yaml
      assertions:
        toolsUsed:
          - server: stackrox-mcp
            toolPattern: "get_clusters_with_orchestrator_cve"
            argumentsMatch:
              cveName: "CVE-2024-52577"
        minToolCalls: 1
        maxToolCalls: 5
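
To make the assertion fields above easier to reason about, here is a Python sketch of one possible reading of them. It is an assumption, not gevals' actual logic: it treats `toolPattern` as a regular expression over tool names, requires every `toolsUsed` entry (including its `argumentsMatch` values) to match at least one recorded call, and applies `minToolCalls` and `maxToolCalls` to the total number of calls in the task. The function name `check_assertions` is hypothetical, and the real semantics should be confirmed against the gevals documentation.

```python
import re


def check_assertions(calls, tools_used, min_calls, max_calls):
    """Check recorded calls against one ASSUMED reading of the assertion fields.

    calls: list of (tool_name, arguments) tuples.
    tools_used: list of dicts with "toolPattern" and optional "argumentsMatch".
    Returns a list of failure messages; an empty list means the check passed.
    """
    failures = []
    for expected in tools_used:
        pattern = re.compile(expected["toolPattern"])
        wanted = expected.get("argumentsMatch", {})
        matched = any(
            pattern.fullmatch(name) and all(args.get(k) == v for k, v in wanted.items())
            for name, args in calls
        )
        if not matched:
            failures.append(f"no call matched {expected['toolPattern']}")
    # Assumption: the bounds apply to the total number of tool calls in the task.
    if not min_calls <= len(calls) <= max_calls:
        failures.append(f"{len(calls)} calls outside [{min_calls}, {max_calls}]")
    return failures


# Values taken from Test 3 above.
expected = [{
    "toolPattern": "get_clusters_with_orchestrator_cve",
    "argumentsMatch": {"cveName": "CVE-2016-1000031"},
}]
calls = [("get_clusters_with_orchestrator_cve", {"cveName": "CVE-2016-1000031"})]
print(check_assertions(calls, expected, 1, 3))
```

Under this reading, a `maxToolCalls` of 3 would allow up to three calls in total for the task, which is the interpretation the comment on Test 4 appears to rely on.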