What should a well-designed agentic test do?

  • May 4, 2026
  • 2 replies
  • 209 views

GiannisPap

 Take on the challenge for a chance to win! The lucky winner will walk away with a gift box from ShiftSync!🎁

👉 An AI agent is asked to test the Login flow on a neobank app.

It finds the username field, enters credentials, but then the OTP screen appears unexpectedly.
What should a well-designed agentic test do?
Drop your answers in the comments. 

🕒 Submissions: Submit within one week after the conference ends.

2 replies

  • Space Cadet
  • May 13, 2026

A well-designed agentic test should follow Detect → Classify → Handle → Report, with two critical rules:

Handling does not mean ignoring. The agent must flag the unexpected OTP screen and escalate the verdict rather than silently adapt, because the unexpected screen may itself be a bug.

Handling also requires a pre-defined handler. If none exists, the agent should suspend with full context rather than improvise. An agent that works around unknown states is no longer testing the app; it is testing itself.
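
As a rough illustration of the Detect → Classify → Handle → Report idea, here is a minimal Python sketch. Everything in it (`classify`, `step`, `HANDLERS`, the `Verdict` values, the keyword-based classifier) is a hypothetical stand-in and not part of any real testing framework; a real agent would classify screens in a far richer way.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Dict, List


class Verdict(Enum):
    PASS = "pass"
    PASS_WITH_DEVIATION = "pass_with_deviation"  # handled, but flagged
    SUSPENDED = "suspended"                      # no handler, run stopped


@dataclass
class Report:
    verdict: Verdict = Verdict.PASS
    notes: List[str] = field(default_factory=list)


# Hypothetical registry of pre-defined handlers, keyed by classified state.
HANDLERS: Dict[str, Callable[[], None]] = {}


def classify(screen_text: str) -> str:
    """Toy classifier (assumption): map raw screen text to a named state."""
    text = screen_text.lower()
    if "one-time" in text or "otp" in text:
        return "otp_challenge"
    if "welcome" in text:
        return "home"
    return "unknown"


def step(expected: str, screen_text: str, report: Report) -> bool:
    """Return True if the run may continue, False if it must suspend."""
    state = classify(screen_text)  # Detect + Classify
    if state == expected:
        return True
    # Report: every deviation is recorded, never silently absorbed.
    report.notes.append(f"expected '{expected}', saw '{state}'")
    handler = HANDLERS.get(state)
    if handler is None:
        # No pre-defined handler: suspend with context, do not improvise.
        report.verdict = Verdict.SUSPENDED
        return False
    handler()  # Handle through the pre-defined path only
    report.verdict = Verdict.PASS_WITH_DEVIATION  # escalate the verdict
    return True
```

With no handler registered, `step("home", "Enter the OTP we sent you", Report())` returns `False` and marks the run as suspended. Once a handler for `"otp_challenge"` is registered, the same call continues, but the verdict is still escalated so the deviation stays visible.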


سامان ذوالفقاریان

A well-designed agentic test should exhibit adaptability, state-awareness, and intelligent decision-making when encountering dynamic elements like an unexpected OTP screen. Instead of failing like a traditional rigid script, it should do the following:

Context-Aware Recognition: The agent must dynamically recognize the OTP screen as a multi-factor authentication (MFA) state shift, understanding that this is a functional security layer rather than an application crash or an element-not-found error.

Intelligent Pathway Resolution: A robust testing agent should be integrated with the test environment's infrastructure. It should know how to retrieve the OTP autonomously (by calling a mock API, checking a designated virtual SMS/email gateway, or querying a staging database) and then complete the flow.

Graceful Fallback & Semantic Logging: If the OTP retrieval pathway is not configured or needs manual intervention, the agent should not simply time out. It should pause, capture the full context and state history, and log a descriptive report (e.g., "MFA wall reached; awaiting verification token").

State Maintenance: It should preserve the session integrity so that once the token is provided (either via an automated mock or manual input), it can resume execution seamlessly without restarting the entire login lifecycle.
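
The retrieval, fallback, and resume behaviour described above could look roughly like the following Python sketch. The names `LoginSession`, `handle_otp`, `resume_with_token`, and the `fetch_otp` callback are assumptions made for illustration; the callback stands in for whatever mock API, virtual gateway, or staging database query the test environment actually provides, and submitting the token to the app is reduced to a flag.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class LoginSession:
    """Hypothetical container for the state the agent must preserve."""
    session_id: str
    history: List[str] = field(default_factory=list)
    paused: bool = False
    logged_in: bool = False


def resume_with_token(session: LoginSession, otp: str) -> str:
    """Continue the same session; the login flow is not restarted."""
    session.paused = False
    session.history.append("otp_submitted")
    # Stand-in for submitting `otp` to the app under test (assumption).
    session.logged_in = True
    return "login completed"


def handle_otp(
    session: LoginSession,
    fetch_otp: Optional[Callable[[str], Optional[str]]],
) -> str:
    """Try the configured retrieval path; otherwise pause with context."""
    session.history.append("otp_screen_reached")
    otp = fetch_otp(session.session_id) if fetch_otp else None
    if otp is None:
        # Fallback: pause instead of timing out, keeping the history.
        session.paused = True
        return "MFA wall reached; awaiting verification token"
    return resume_with_token(session, otp)
```

If no retrieval path is configured, `handle_otp(session, None)` pauses the session and returns the descriptive message. A later `resume_with_token(session, token)` call picks up from the same `LoginSession`, so the recorded history covers both the pause and the resumed step.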