
AI Agents in Testing: The 5W1H Framework [Part 2]

  • January 19, 2026
  • 7 replies
  • 172 views
parwalrahul

In the previous chapter, I introduced the 5W1H AI agent framework and explored What AI agents are, How they work in testing contexts, and Why they matter.
Now, we will dive deeper into this framework to understand When to use them, Where they fit, and Who provides agentic capabilities.

When to Use Which Level of Agent?

Agents come in various levels. Each level solves a different type of problem. Some follow fixed rules. Some help you think. Some run tasks on their own. You need to select the level that fits your workload, risk, and control needs.

This section will break down what each level does and when you should use it.

Level 1: Rule-Based Systems (Traditional Agents)

These are traditional automation agents that operate on predefined rules and logic. You can also call them “pre-LLM era” agents.

Built using: Programmable rules. Fixed conditions. If-else logic. 

Behavior:

  • Executes tasks exactly as programmed
  • No reasoning capabilities (as they are not built using LLMs or Gen AI)

Examples:

  • Auto-close JIRA tickets when status stays unchanged for a set duration
  • Rename automation screenshots using Test Script Name + Screen Name + Timestamp
  • Validate API responses against a given JSON schema (see the sketch after this list)
  • Move test-report files to the archive automatically after a given time
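
To make the contrast with later levels concrete, here is a minimal sketch of such a check in Python. It is an illustration only: the /users/1 endpoint is hypothetical, and it assumes the requests and jsonschema packages are installed.

```python
# A minimal sketch of a Level 1 rule-based check: pure if-else logic,
# no reasoning. The endpoint and schema are illustrative.
import requests
from jsonschema import ValidationError, validate

USER_SCHEMA = {
    "type": "object",
    "properties": {
        "id": {"type": "integer"},
        "email": {"type": "string"},
    },
    "required": ["id", "email"],
}

def check_user_endpoint(base_url: str) -> bool:
    """The response either matches the schema or it doesn't."""
    response = requests.get(f"{base_url}/users/1", timeout=10)
    response.raise_for_status()
    try:
        validate(instance=response.json(), schema=USER_SCHEMA)
        return True
    except ValidationError as err:
        print(f"Schema violation: {err.message}")
        return False
```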

Pros: Fast. Predictable. Reliable.

Cons: Cannot handle ambiguity. Can only work when rules are fully known upfront.

Autonomy: Low. Cannot decide anything beyond what is clearly defined.

Control: Low. They run strictly according to their predefined rules, so runtime human control is not required.

Level 2: Workflow Agents

This level provides reasoning support without giving up human control. These agents help you think, organize, and draft. They do not act independently. You stay in the loop at every step.

Built using: LLM-based reasoning. Human-driven approvals. Structured workflows.

Behavior:

  • Assists with analysis and thinking-related tasks
  • Produces drafts or offers insights on existing drafts
  • Works step-by-step with human supervision

Examples:

  • Summarizing requirement documents into testable points
  • Generating initial test ideas from user stories (a sketch follows this list)
  • Drafting bug reports with structured details
  • Parsing execution logs and identifying likely failure categories
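
As an illustration of the human-in-the-loop pattern, here is a minimal Python sketch using the OpenAI SDK as one possible LLM backend. The model name and prompt are assumptions; the essential part is that the draft is never persisted without explicit approval.

```python
# Sketch of a Level 2 workflow step: the LLM drafts, the human decides.
# Assumes the `openai` package is installed and OPENAI_API_KEY is set.
from openai import OpenAI

client = OpenAI()

def draft_test_ideas(requirement: str) -> str:
    """Produce a draft only; a tester reviews it before anything is saved."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumption: any chat-capable model works here
        messages=[
            {"role": "system", "content": "You are a test analyst. List testable points."},
            {"role": "user", "content": requirement},
        ],
    )
    return response.choices[0].message.content

draft = draft_test_ideas("Users can reset their password via an emailed link.")
print(draft)
if input("Accept this draft? [y/N] ").strip().lower() != "y":
    print("Draft rejected; nothing is persisted.")  # the human stays in control
```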

Pros: Low risk. Clear human oversight. Immediate productivity boost.

Cons: Cannot act independently. Requires frequent review.

Autonomy: Low. Decisions remain with the human-in-the-loop.

Control: High. Humans approve every meaningful outcome.

Level 3: Semi-Autonomous Agents

This level moves from assistance to execution. These agents can run tasks end to end. They make decisions within defined boundaries. You step in only at key moments.

Built using: LLM-based reasoning. Tool calling. Memory for context retention. Planning modules for task sequencing.

Behavior:

  • Runs multi-step tasks
  • Uses tools to gather and process data
  • Requests human approval at defined checkpoints

Examples:

  • Test reporting agent that collects execution results and highlights trends
  • A regression analysis agent that selects tests based on code changes (see the sketch after this list)
  • Requirement review agent that flags contradictions and unclear areas
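
To show the checkpoint mechanic behind the regression example, here is a rough Python sketch. The file-to-test mapping is a naive stand-in for the LLM reasoning a real agent would apply; the defined checkpoint is the approval prompt before execution.

```python
# Sketch of a Level 3 checkpoint: the agent selects tests on its own,
# but execution waits for human approval at a defined checkpoint.
import subprocess

def changed_files() -> list[str]:
    """List files touched by the latest commit."""
    out = subprocess.run(
        ["git", "diff", "--name-only", "HEAD~1"],
        capture_output=True, text=True, check=True,
    )
    return out.stdout.splitlines()

def select_tests_for_diff(files: list[str]) -> list[str]:
    # Naive stand-in: a real agent would reason about the diff with an LLM.
    return [f"tests/test_{path.rsplit('/', 1)[-1]}" for path in files if path.endswith(".py")]

selected = select_tests_for_diff(changed_files())
print("Agent proposes:", selected)
if input("Run these tests? [y/N] ").strip().lower() == "y":  # the checkpoint
    subprocess.run(["pytest", *selected], check=False)
```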

Pros: Higher impact than workflow agents. Saves substantial time on repeated tasks. Useful for well-understood, familiar problems.

Cons: Harder to debug when errors occur. Requires a stable tool and system setup. Needs clear expectations and strict boundaries.

Autonomy: High. Can make decisions within the assigned scope.

Control: Moderate. Human oversight exists at key decision points.

Level 4: Autonomous Agents

This is the highest level of autonomy. These agents work with minimal human involvement. They decide, act, and recover on their own. You intervene only when something goes wrong.

Built using: Full agent runtime. Tool orchestration across systems. Memory for long-running context. Multi-step planning with retries and recovery.

Behavior:

  • Works independently
  • Runs long and complex workflows
  • Escalates to humans only for exception cases

Examples:

  • Autonomous smoke testing of critical end-to-end flows (the retry-and-escalate pattern is sketched after this list)
  • Autonomous diagnosis of production issues and automatic ticket creation
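
The defining mechanic at this level is recovery with exception-only escalation. The sketch below illustrates that pattern; notify_on_call is a hypothetical stand-in for a real paging or ticketing integration.

```python
# Sketch of the Level 4 escalation pattern: retry with backoff, recover
# where possible, and involve a human only after every attempt fails.
import time

MAX_ATTEMPTS = 3

def notify_on_call(message: str) -> None:
    # Hypothetical stand-in for a paging or ticket-creation integration.
    print(f"ESCALATION: {message}")

def run_with_recovery(task, *args):
    """Run a task autonomously; escalate only on the exception path."""
    for attempt in range(1, MAX_ATTEMPTS + 1):
        try:
            return task(*args)
        except Exception as err:  # broad by design: recovery is the point
            print(f"Attempt {attempt} failed: {err}")
            time.sleep(2 ** attempt)  # back off before retrying
    notify_on_call("Autonomous flow failed after all retries; human review needed.")
    return None
```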

Pros: Scalable across systems and teams. Handles complex, interconnected tasks.

Cons: Hard to evaluate. High risk if boundaries are unclear. Sensitive to hallucinations and misinterpretations.

Autonomy: High. Can make decisions without human checkpoints.

Control: Low. Human involvement is reactive, triggered by exceptions rather than continuous.

Where Should You Add AI Agents First?

Most agent initiatives fail because implementers try to solve large, complex problems first. Agents work best on simple, well-understood problems, so start by identifying the low-hanging fruit.

Pro tip: Target tasks where repetition is highest and judgment is lowest.

Here are a few examples of such tasks from my day-to-day work. 

  • Requirement consistency reviews
  • Boundary analysis
  • Bulk test data generation
  • Bug reporting
  • Flaky test detection (see the sketch after this list)
  • Test execution summarization
  • Risk identification
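
To show how small a starting point can be, here is a sketch of rule-based flaky-test detection. The in-memory run data is an assumption; in practice it would come from your test reports.

```python
# Sketch: a test that has both passed and failed across recent runs is
# flagged as flaky. The hard-coded runs stand in for real report data.
from collections import defaultdict

runs = [
    {"test_login": "pass", "test_checkout": "fail"},
    {"test_login": "fail", "test_checkout": "fail"},
    {"test_login": "pass", "test_checkout": "fail"},
]

outcomes: defaultdict[str, set[str]] = defaultdict(set)
for run in runs:
    for test, result in run.items():
        outcomes[test].add(result)

flaky = sorted(t for t, seen in outcomes.items() if {"pass", "fail"} <= seen)
print("Flaky candidates:", flaky)  # -> ['test_login']
```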

This list is just for inspiration. Once you reflect on your own day-to-day work, you will find similar tasks in your areas.

Start with Level 1 (rule-based) agents, then gradually move to Level 2 (workflow) and Level 3 (semi-autonomous) agents. In my experience, Levels 2 and 3 deliver the best efficiency gains.

Who Provides Agentic Capabilities?

Agentic AI is a buzzing market, and new tools appear daily. In this section, we will look at some popular agentic AI tools at each level. Use this as a starting point and keep exploring as your understanding of agentic AI grows.

Pro tip: The ShiftSync platform encourages discussions about agentic AI for testers. Use it to share your questions and experiments with agentic AI tools.

Level 1: Rule-Based Systems (Traditional Agents)

  1. Tricentis Tosca: A model-based testing platform that executes deterministic workflows using reusable modules. Excellent for predictable, repeatable checks and structured automation pipelines.
  2. Tricentis Data Integrity: Validates large-scale data pipelines by comparing source and target systems, detecting mismatches, and enforcing transformation rules. Useful for ETL, BI, and analytics environments.
  3. Selenium: A popular browser automation framework that executes scripted actions. Reliable for fixed UI flows where selectors and behavior remain stable.
  4. GitLab Pipelines: Runs predefined CI/CD tasks with scripted logic. Useful for test orchestration, artifact movement, and automated build workflows.

Level 2: Workflow Agents

  1. Tricentis Copilot: Summarizes requirements, drafts test-relevant insights, and helps testers break down documents into actionable points. Perfect for requirement engineering.
  2. Tricentis qTest Insights (AI Summaries): Processes execution results and compresses large sets of logs, failures, and trends into readable analysis.
  3. Microsoft Copilot: An enterprise-level AI copilot for extracting insights, generating summaries, and supporting document-heavy work. It is general-purpose in nature.
  4. GitHub Copilot: Suggests code, improves readability, reduces boilerplate, and assists developers and SDETs during automation script creation.

Level 3: Semi-Autonomous Agents

  1. Tricentis Testim Intelligent Assist: Learns UI behavior, maintains locator stability, auto-heals selectors, and builds test flows based on observed patterns.
  2. Tricentis Neoload MCP: Supports performance testing through AI-assisted workload modeling, result analysis, and bottleneck detection.
  3. n8n with LLM Nodes: Enables multi-step automation powered by reasoning. You build workflows, and LLMs decide how to handle decision points or data interpretation.
  4. Playwright with Agent Runtime: Connects LLM reasoning to actual browser control, allowing multi-step UI actions that adjust dynamically to page behavior. The observe-decide-act loop behind this pattern is sketched below.
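
Here is a minimal sketch of that loop using Playwright's Python sync API. choose_action is a hypothetical stand-in for an LLM call; it simply stops, so the loop stays runnable as-is.

```python
# Sketch of the observe-decide-act loop behind "LLM + browser control".
# Assumes the `playwright` package and its browsers are installed.
from playwright.sync_api import sync_playwright

def choose_action(page_text: str) -> dict:
    # Hypothetical stand-in: a real agent would send page_text to an LLM
    # and parse its reply into e.g. {"type": "click", "selector": "text=Sign in"}.
    return {"type": "stop"}

with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    page.goto("https://example.com")
    while True:
        action = choose_action(page.inner_text("body"))  # observe -> decide
        if action["type"] == "click":
            page.click(action["selector"])               # act
        else:
            break
    browser.close()
```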

Level 4: Autonomous Agents

  1. Tricentis Vision AI: Executes UI tests using visual understanding rather than selectors. Capable of healing, adapting, and navigating UI changes independently.
  2. Tricentis Testim Autonomous: Runs adaptive UI tests that retry, heal, and self-correct during execution, significantly reducing manual maintenance.
  3. OpenAI Agents SDK: Provides infrastructure for building fully autonomous, tool-enabled agents that reason, plan, and act, with deep integration for custom workflows.

Conclusion

Agents don’t replace testers. They remove the boring, repetitive work that keeps testers from the work that actually matters. Once agents are built and deployed well, testers can focus on high-value activities such as risk analysis, product exploration and understanding, system modeling, and communication. Choosing the right type of agent for the right task is important, though. Agents can help you gain time without losing your judgment, and they can expand your reach.

AI agents are not just a technology disruption. For the testing world, they are a much-needed accelerant that multiplies expert human capability. The magic lies in using them with intention and control.

If you have any questions or doubts, the ShiftSync platform encourages you to start a discussion by creating a new post or reaching out to the author.


Check out Part 1 here

7 replies

Bharat2609
  • Ensign
  • January 20, 2026

@parwalrahul  Solid framework. Very usable for testers who want to experiment without breaking things.


Ankur
  • Ensign
  • January 20, 2026

@parwalrahul:

This article explains the difference between AI and agentic AI very well.

It also gives step-by-step guidance on using agentic AI, with an example at each step showing how anyone can apply it to their day-to-day testing tasks.

I have asked my entire team to read this article and start understanding the agentic AI world.


parwalrahul
  • Author
  • Chief Specialist
  • January 20, 2026

Thank you for the feedback, ​@Ankur.

We are also running a practical, hands-on community webinar that will cover this topic.

You can ask your team to also register for it:

AI VS Human: Find AI flaws, gaps, and quality debts together, Thu, 12 Feb. 2026 at 14:30, Europe/Berlin | ShiftSync Community


ujjwal.kumar.singh

When something ends well, everything feels right, and the conclusion in Part 2 feels the same. The quality of the content remains consistent across both parts, and the differences between AI and agentic AI are explained clearly, starting from the basics. This was just as good as the first part.


parwalrahul
  • Author
  • Chief Specialist
  • January 20, 2026

@parwalrahul  Solid framework. Very usable for testers who want to experiment without breaking things.

Thank you for the feedback, Bharat!

We will be going into the practical aspects of AI vs Humans in the upcoming webinar. Register for it here:

AI VS Human: Find AI flaws, gaps, and quality debts together, Thu, 12 Feb. 2026 at 14:30, Europe/Berlin | ShiftSync Community


parwalrahul
  • Author
  • Chief Specialist
  • January 20, 2026

When something ends well, everything feels right, and the conclusion in Part 2 feels the same. The quality of the content remains consistent across both parts, and the differences between AI and agentic AI are explained clearly, starting from the basics. This was just as good as the first part.


Thanks for the feedback, buddy!


Bharat2609
  • Ensign
  • January 22, 2026

@parwalrahul  Solid framework. Very usable for testers who want to experiment without breaking things.

Thank you for the feedback, Bharat!

We will be going into the practical aspects of AI vs Humans in the upcoming webinar. Register for it here:

AI VS Human: Find AI flaws, gaps, and quality debts together, Thu, 12 Feb. 2026 at 14:30, Europe/Berlin | ShiftSync Community

Registered and excited to learn more!