Vitest Integration
Run Agentest scenarios as vitest tests for IDE integration, describe/it blocks, and familiar test output.
Why Use Vitest Integration?
- IDE support: run individual scenarios from your editor (VS Code, WebStorm)
- Familiar output: vitest's test reporter, --watch, --reporter flags
- Custom assertions: use expect() on scenario results for fine-grained checks
- Unified test suite: agent tests alongside unit tests in the same vitest run
Quick Setup
defineSimSuite
The simplest way to run all scenarios as a vitest test:
// tests/agent.test.ts
import { defineSimSuite } from 'agentest/vitest'
defineSimSuite({
  agent: { name: 'my-agent', endpoint: 'http://localhost:3000/api/chat' },
})
npx vitest run
Output:
✓ Agentest > runs all scenarios (45s)
Test Files 1 passed (1)
Tests 1 passed (1)
defineSimSuite discovers all .sim.ts files, runs them, and fails the vitest test if any scenario fails, using the same pass/fail logic as the CLI.
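To make the "fail if any scenario fails" behaviour concrete, here is a minimal sketch of how per-scenario results could be collapsed into a single test failure. It is an illustration only, not Agentest's implementation: SummaryLike is a local stand-in that borrows field names from the ScenarioSummary interface documented further down this page, and describeFailures and assertAllScenariosPassed are hypothetical helper names.

```typescript
// Local stand-in for the fields used here; NOT imported from agentest.
interface SummaryLike {
  passed: boolean
  passedConversations: number
  totalConversations: number
  thresholdViolations: string[]
}

// Hypothetical helper: builds report lines for every failing scenario.
function describeFailures(results: Record<string, SummaryLike>): string[] {
  const lines: string[] = []
  for (const [name, summary] of Object.entries(results)) {
    if (summary.passed) continue
    lines.push(`Scenario "${name}" failed:`)
    for (const violation of summary.thresholdViolations) {
      lines.push(`  - ${violation}`)
    }
    lines.push(
      `  Conversations: ${summary.passedConversations}/${summary.totalConversations} passed`,
    )
  }
  return lines
}

// Hypothetical helper: throwing inside a test body is what makes the test fail.
function assertAllScenariosPassed(results: Record<string, SummaryLike>): void {
  const failures = describeFailures(results)
  if (failures.length > 0) {
    throw new Error(failures.join('\n'))
  }
}
```

Because everything is reported through one thrown error, a single failing scenario is enough to turn the whole "runs all scenarios" test red, while the message still names each scenario that failed.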
Options
defineSimSuite(config, {
  scenario: 'booking', // filter scenarios by name (substring match)
  timeout: 180_000, // per-test timeout in ms (default: 120_000)
  cwd: './packages/agent', // working directory for scenario discovery
})
Timeout: Agent simulations involve multiple LLM calls and can take 30-120 seconds per scenario. Set the timeout high enough to avoid false failures. The default is 120 seconds.
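One way to pick a timeout is to budget for the slow end of the 30-120 second range. The sketch below is a hypothetical helper, not part of Agentest. It assumes scenarios run one after another inside a single test, and the 1.5x safety factor is an arbitrary choice for illustration.

```typescript
// Hypothetical helper (not an agentest API): estimates a test timeout in ms.
function estimateTimeoutMs(
  scenarioCount: number,
  worstCaseSecondsPerScenario = 120, // upper end of the 30-120 s range
  safetyFactor = 1.5, // assumed headroom for slow LLM responses
): number {
  if (scenarioCount < 1) {
    throw new RangeError('scenarioCount must be at least 1')
  }
  // Assumes scenarios run sequentially, so their durations add up.
  return Math.ceil(scenarioCount * worstCaseSecondsPerScenario * safetyFactor * 1000)
}

console.log(estimateTimeoutMs(1)) // 180000
console.log(estimateTimeoutMs(3)) // 540000
```

Under these assumptions a single scenario lands on the 180_000 ms used in the options example above, and a three-scenario suite would need about nine minutes.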
Single Scenario Testing
For more granular control, use runScenario to test individual scenarios with custom assertions:
import { runScenario } from 'agentest/vitest'
import { defineConfig } from 'agentest'
import { describe, it, expect } from 'vitest'
const config = defineConfig({
  agent: { name: 'my-agent', endpoint: 'http://localhost:3000/api/chat' },
})
describe('booking agent', () => {
  it('completes the booking goal', async () => {
    const result = await runScenario(config, 'user books a morning slot')
    expect(result.passed).toBe(true)
  }, 120_000)
  it('handles errors without critical failures', async () => {
    const result = await runScenario(config, 'user tries unavailable slot')
    const criticalErrors = result.errors.filter(e => e.severity === 'critical')
    expect(criticalErrors).toHaveLength(0)
  }, 120_000)
  it('achieves high helpfulness', async () => {
    const result = await runScenario(config, 'user books a morning slot')
    expect(result.avgScores.helpfulness).toBeGreaterThan(3.5)
  }, 120_000)
})
What runScenario Returns
runScenario returns a ScenarioSummary with:
interface ScenarioSummary {
  passed: boolean
  totalConversations: number
  passedConversations: number
  failedConversations: number
  avgScores: Record<string, number> // { helpfulness: 4.2, coherence: 4.8, ... }
  errors: UniqueError[] // deduplicated errors with severity
  thresholdViolations: string[] // which thresholds were breached
}
Failure Output
When a scenario fails, vitest shows a clear error message:
✗ Agentest > runs all scenarios
AssertionError: Scenario "user books a morning slot" failed:
- Trajectory assertion failed in 1 conversation (missing: create_booking)
- helpfulness: 3.2 (threshold: 3.5)
- 1 critical error: Agent leaked API key in response
Conversations: 2/3 passed
Combining with the CLI
Vitest integration and the CLI are complementary:
| | CLI (npx agentest run) | Vitest (npx vitest run) |
|---|---|---|
| Best for | CI pipelines, quick iteration | IDE integration, mixed test suites |
| Reporters | Console, JSON, GitHub Actions | Vitest reporters |
| Watch mode | --watch flag | Vitest's built-in watch |
| Filtering | --scenario "name" | Vitest's -t (--testNamePattern) or .only |
| Comparison mode | Supported | Use defineSimSuite with comparison config |
You don't need to choose: use the CLI for CI and quick runs, and vitest for IDE-driven development.
Tips
- Always set explicit timeouts on agent tests. The default vitest timeout (5s) is far too short.
- Use describe.concurrent if you want vitest to run multiple scenario tests in parallel.
- Filter with .only during development: it.only('booking flow', ...) runs just one scenario.
- Keep agent tests in a separate directory (e.g., tests/agent/) if you want to run them separately from unit tests: npx vitest run tests/agent/.
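The timeout and directory tips can also be captured in a dedicated vitest config. This is a minimal sketch under two assumptions: the file name vitest.agent.config.ts is hypothetical, and agent tests are taken to live under tests/agent/ with a .test.ts suffix. include and testTimeout are standard vitest options.

```typescript
// vitest.agent.config.ts (hypothetical file name)
import { defineConfig } from 'vitest/config'

export default defineConfig({
  test: {
    // Only pick up agent tests (assumed location and suffix).
    include: ['tests/agent/**/*.test.ts'],
    // Raise the default per-test timeout from 5 s to 120 s.
    testTimeout: 120_000,
  },
})
```

Run it with npx vitest run --config vitest.agent.config.ts. A timeout passed as the third argument to it still overrides this default for that test.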
Next Steps
- Getting Started — CLI-based quick start
- Pass/Fail Logic — What makes scenarios pass or fail
- CLI Reference — CLI options and reporters