FEAT: Add partner integration tests for azure-ai-evaluation red team …#1533

Merged
hannahwestra25 merged 8 commits into microsoft:main from slister1001:partner-integration-tests
Apr 8, 2026

@slister1001
Contributor

…module

Add tests/partner_integration/azure_ai_evaluation/ with contract tests validating PyRIT API stability for the azure-ai-evaluation red team module, which depends on 45+ PyRIT imports across 14 files.

Test coverage includes:

  • PromptChatTarget interface contract (extended by 4 SDK classes)
  • CentralMemory/SQLiteMemory lifecycle (used in RedTeam.__init__)
  • Data models: Message, MessagePiece, Score, seed models, AttackResult
  • PromptConverter base + 19 specific converters importability
  • Scorer/TrueFalseScorer interface (extended by RAIServiceScorer)
  • Foundry scenario APIs: FoundryScenario, FoundryStrategy, DatasetConfiguration
  • Exception types and retry decorators
  • Import smoke tests for azure-ai-evaluation (skipped if not installed)

Also adds partner-integration-test target to Makefile.

All 84 tests pass with no Azure credentials required.
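
An importability check like the ones listed above can be sketched with the standard library alone. This is a minimal illustration, not the code from this PR: the `EXPECTED_SYMBOLS` list and the `find_missing_symbols` helper are hypothetical names, and stdlib modules stand in for the PyRIT converters so the sketch runs without PyRIT installed.

```python
import importlib

# Hypothetical (module, attribute) pairs. Stdlib names stand in for the
# real PyRIT symbols so this sketch runs anywhere.
EXPECTED_SYMBOLS = [
    ("json", "JSONDecoder"),
    ("collections", "OrderedDict"),
    ("pathlib", "Path"),
]


def find_missing_symbols(expected):
    """Return the (module, attribute) pairs that cannot be imported."""
    missing = []
    for module_name, attr_name in expected:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            # The module itself is gone or was renamed.
            missing.append((module_name, attr_name))
            continue
        if not hasattr(module, attr_name):
            # The module exists but the public name was removed or renamed.
            missing.append((module_name, attr_name))
    return missing


def test_expected_symbols_importable():
    # One failing assertion reports every missing symbol at once.
    assert find_missing_symbols(EXPECTED_SYMBOLS) == []
```

Collecting all missing names before asserting means a single upstream rename shows up as one readable failure rather than an import error that hides the rest of the list.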

Description

Tests and Documentation

slister1001 and others added 3 commits March 24, 2026 11:33
…module

Add tests/partner_integration/azure_ai_evaluation/ with contract tests
validating PyRIT API stability for the azure-ai-evaluation red team module,
which depends on 45+ PyRIT imports across 14 files.

Test coverage includes:
- PromptChatTarget interface contract (extended by 4 SDK classes)
- CentralMemory/SQLiteMemory lifecycle (used in RedTeam.__init__)
- Data models: Message, MessagePiece, Score, seed models, AttackResult
- PromptConverter base + 19 specific converters importability
- Scorer/TrueFalseScorer interface (extended by RAIServiceScorer)
- Foundry scenario APIs: FoundryScenario, FoundryStrategy, DatasetConfiguration
- Exception types and retry decorators
- Import smoke tests for azure-ai-evaluation (skipped if not installed)

Also adds partner-integration-test target to Makefile.

All 84 tests pass with no Azure credentials required.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Fix test_ascii_smuggler_converter_importable to test AsciiSmugglerConverter
  (was incorrectly testing AsciiArtConverter, duplicating parametrized coverage)
- Move module-level asyncio.run(initialize_pyrit_async) to session-scoped fixture
- Remove duplicate TestAttackModels (already covered in test_foundry_contract.py)
- Extract MinimalTarget to module-level helper (was defined 3x inline)
- Add docstring clarifying intentional private API imports in smoke tests

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@slister1001 slister1001 marked this pull request as ready for review March 24, 2026 18:07
…orts, add seed model structural tests

- Update PromptChatTarget to PromptTarget per PR microsoft#1532 deprecation
- Move ScenarioStrategy import to top-level in test_foundry_contract.py
- Add rationale for explicit inheritance checks in test_import_smoke.py
- Expand seed model tests with structural validation (value, data_type,
  harm_categories, role, metadata, SeedGroup composition)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Contributor

@hannahwestra25 hannahwestra25 left a comment

a few small comments; will set up the pipeline today!

- Rename test_foundry_contract.py → test_redteam_scenario_contract.py (reviewer F1)
- Add ScenarioStrategy base class test (reviewer F2 import ordering)
- Add inheritance check justification comments (reviewer F3)
- Expand seed model tests with prompt_group_id, sequence, context metadata
  patterns from PR #46151 tool context propagation (reviewer F4)
- Add PromptChatTarget transitional compatibility test (reviewer F5)
- Add MathPromptConverter to converter list + CharSwap naming note
- Add OpenAIChatTarget contract tests
- Create test_auth_contract.py for get_azure_openai_auth
- Create test_orchestrator_contract.py with try/except import pattern
- Expand test_memory_contract.py with 5 memory query method tests
  (get_scenario_results, add_scores_to_memory, get_message_pieces,
  get_prompt_request_pieces, get_conversation)
- Add MessagePiece.prompt_metadata field tests for context extraction

All 119 tests pass. ACA has zero direct PyRIT imports — SDK contract
tests provide transitive coverage.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
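
The try/except import pattern mentioned in the commit above generally looks like the following. This is a hedged sketch using `unittest` from the standard library rather than the PR's actual test code; `hypothetical_partner_sdk`, `SomeClass`, and `run_suite` are made-up names used only for illustration.

```python
import unittest

try:
    # Hypothetical optional dependency, assumed NOT to be installed here.
    import hypothetical_partner_sdk  # noqa: F401

    HAS_PARTNER_SDK = True
except ImportError:
    HAS_PARTNER_SDK = False


@unittest.skipUnless(HAS_PARTNER_SDK, "hypothetical_partner_sdk is not installed")
class TestPartnerContract(unittest.TestCase):
    def test_public_name_exists(self):
        # Only runs when the import above succeeded.
        self.assertTrue(hasattr(hypothetical_partner_sdk, "SomeClass"))


def run_suite():
    """Run the test case and return the raw result for inspection."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestPartnerContract)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

When the optional package is absent, the test is reported as skipped rather than failed, so the suite stays green on machines without the partner SDK.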
- Remove test_orchestrator_contract.py (orchestrators removed from PyRIT)
- Add _restore_central_memory fixture to prevent state leakage
- Remove redundant 'assert X is not None' after top-level imports;
  convert existence-only tests to local imports so they actually
  validate importability instead of being dead code

104 tests pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
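
The state-leakage concern behind a restore-style fixture can be illustrated as follows. This is a sketch under stated assumptions, not the `_restore_central_memory` implementation: `Registry` is a hypothetical stand-in for a process-wide singleton holder, and `restore_registry` is a made-up helper name.

```python
import contextlib


class Registry:
    """Hypothetical stand-in for a class that holds a process-wide singleton."""

    _instance = None

    @classmethod
    def set_instance(cls, instance):
        cls._instance = instance

    @classmethod
    def get_instance(cls):
        return cls._instance


@contextlib.contextmanager
def restore_registry():
    """Save the current singleton and put it back on exit, even on error."""
    saved = Registry.get_instance()
    try:
        yield
    finally:
        Registry.set_instance(saved)


def demo():
    """Show that a temporary override does not leak out of the block."""
    Registry.set_instance("original")
    with restore_registry():
        Registry.set_instance("temporary")
    return Registry.get_instance()
```

The same save, yield, restore shape can be written as a pytest fixture, since fixtures run the code after their `yield` as teardown.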
slister1001 and others added 2 commits April 8, 2026 11:22
- Fix _MinimalTarget.send_prompt_async signature to match PyRIT
- Rename test_memory_contract → test_sqlite_memory_contract
- Use sqlite_instance fixture for memory query tests (eliminate boilerplate)
- Add data_type/prompt_group_id/value assertions to context pattern test
- Remove is_objective pattern tests (deprecated PyRIT metadata pattern)
- Rename test_redteam_scenario_contract → test_foundry_scenario_contract

103 tests pass. Pre-commit (ruff-format, ruff-check, trailing-whitespace,
end-of-file-fixer) all pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@hannahwestra25 hannahwestra25 merged commit 25a220f into microsoft:main Apr 8, 2026
37 checks passed

2 participants