Agentic Coding Guide

Testing & QA

Verification feedback — unit tests, E2E tests, and agentic QA.

Tests are the agent's way of proving its own work. Without them, you're the verification step — and the loop only moves as fast as you do.

Test behavior, not implementation

LLM-generated tests tend to mirror the implementation — testing what the code does, not what the product requires. This produces tests that pass when they should fail.

  • Write tests that describe outcomes from the user's or system's perspective
  • Avoid asserting on internal state or implementation details — if behavior didn't change, tests shouldn't break
  • Ask the LLM "What would a QA engineer test here?" rather than "Write tests for this function"
  • Describe the contract (given these inputs, these outputs are guaranteed) and test that

Unit tests

Fast, isolated, the first feedback signal that runs in CI.

  • Test pure functions aggressively — easy to test and agents tend to generate them
  • Mock at the boundary — mock external services, not internal functions
  • One assertion per test — tests that assert many things are hard to diagnose when they fail
  • Test edge cases explicitly — agents often forget null, empty, and boundary cases
  • Name tests as specifications: it("returns empty array when user has no orders"), not it("works")

Coverage: high on business logic, lower on glue code. Use mutation testing (Stryker, mutmut) to verify tests actually catch bugs.
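For Stryker, a minimal config might look like the sketch below; the glob, test runner, and reporter choices are assumptions to adapt to your project:

```json
{
  "mutate": ["src/**/*.ts"],
  "testRunner": "vitest",
  "reporters": ["clear-text", "html"],
  "coverageAnalysis": "perTest"
}
```

A surviving mutant (a deliberate bug no test caught) points at exactly the assertion you're missing.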

E2E tests

Slower and more brittle, but catch a different class of bugs — full system integration.

  • Test critical paths only in CI — login, checkout, core workflows
  • Test from the user's perspective — click buttons, fill forms, read outcomes
  • Stable selectors — use data-testid, not CSS classes or XPath
  • Idempotent setup — each test sets up and tears down its own state
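These practices combine into a short Playwright sketch; the routes, test ids, and reset endpoint are assumptions for illustration, not a real app:

```typescript
import { test, expect } from "@playwright/test";

// Idempotent setup: each test resets its own state via a test-only API,
// so tests can run in any order or in parallel.
test.beforeEach(async ({ request }) => {
  await request.post("/api/test/reset-cart"); // hypothetical endpoint
});

// Critical path only: the checkout flow, driven from the user's perspective.
test("checkout shows the order confirmation", async ({ page }) => {
  await page.goto("/cart");
  // Stable selectors: data-testid survives CSS and markup refactors.
  await page.getByTestId("checkout-button").click();
  await page.getByTestId("card-number").fill("4242 4242 4242 4242");
  await page.getByTestId("pay-button").click();
  await expect(page.getByTestId("order-confirmation")).toBeVisible();
});
```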

Tools: Playwright (recommended), Cypress, Selenium

Agentic QA

Use an agent to QA features the way a human tester would — browse the UI, interact with it, check for regressions, inspect network traffic, and report findings.

Tools:

  • agent-browser CLI — headless browser the agent controls directly
  • Playwright MCP — full browser automation via MCP tool calls
  • Chrome DevTools MCP — console errors, network requests, performance profiles, DOM state
  • Maestro (mobile / React Native) — declarative YAML flows at the native view-hierarchy level; agents can author and iterate on flows directly

An agent QA workflow: open the feature, walk through the user flow, take screenshots at key steps, check console for errors, inspect network requests, report findings with specific failure details.
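That workflow can be handed to the agent as a single prompt; the feature, URL, and coupon code below are placeholders:

```text
QA the new coupon-code feature at http://localhost:3000/checkout:
1. Open the page and walk through the purchase flow as a new user.
2. Take a screenshot at each step.
3. Apply the coupon "SAVE10" and confirm the total updates.
4. Check the console for errors and inspect any failed network requests.
5. Report each issue with the step, expected vs. actual behavior,
   and the relevant screenshot or request.
```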

Where this beats traditional E2E scripts:

  • No brittle selectors — the agent finds elements by label or visual context
  • Exploratory testing — give the agent a feature description and let it try to break it
  • Natural language assertions — "confirm the total updates when the quantity changes" is the test
  • Debugging included — when something fails, the agent explains why

Agentic QA is the difference between "the tests pass" and "a human-equivalent tester tried to use it and here's what they found."
