Challenge: What are the three biggest challenges for quality in the AI era?

  • March 17, 2026
  • 37 replies
  • 388 views

PolinaKr

Take on Filip’s challenge for a chance to win! The lucky winner will walk away with a gift box from us!🎁

What are the three biggest challenges for quality in the AI era?

Drop your answers in the comments below. 

37 replies

  • Apprentice
  • March 17, 2026
  1. Non-determinism — AI outputs vary across runs, making traditional assertion-based testing brittle. "Correct" is often subjective or probabilistic, requiring statistical validation and LLM-as-judge approaches.

  2. Observability & explainability — When a model produces a wrong answer, tracing why is hard. There's no stack trace for a bad inference, making root cause analysis and regression prevention fundamentally different from classical software.

  3. Data & prompt drift — Models degrade silently as real-world data distributions shift or prompts change. Continuous quality monitoring (evals, golden datasets, shadow testing) must replace one-time release testing.
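The statistical-validation idea above can be sketched in a few lines: instead of asserting one exact output, sample the model several times and assert a pass rate over a semantic predicate. This is a minimal illustration; `generate` is a hypothetical stand-in for a real (non-deterministic) model call, and the substring predicate is a deliberately naive placeholder for embedding similarity or an LLM-as-judge check.

```python
import random

def generate(prompt: str) -> str:
    """Hypothetical stand-in for a non-deterministic model call.
    Replace with your real LLM client."""
    return random.choice([
        "Paris is the capital of France.",
        "The capital of France is Paris.",
        "France's capital city is Paris.",
    ])

def passes(output: str) -> bool:
    """Semantic predicate instead of an exact-match assertion.
    A naive keyword check here; real setups use similarity or a judge model."""
    return "paris" in output.lower()

def pass_rate(prompt: str, n: int = 20) -> float:
    """Sample the model n times and measure how often the predicate holds."""
    return sum(passes(generate(prompt)) for _ in range(n)) / n

rate = pass_rate("What is the capital of France?")
assert rate >= 0.9, f"pass rate too low: {rate:.0%}"
```

The key shift is that the assertion is on the *distribution* of outputs, not on any single run.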


Agentic AI usage in Quality


  1. Understanding and trusting AI outputs
  2. Ensuring high-quality data as the foundation for quality
  3. Maintaining continuous validation as models evolve

All of these are significantly more complex than traditional deterministic software quality checks.

  • Ensign
  • March 17, 2026
  • Effective usage of AI in reducing automation test maintenance
  • Use of agents in e2e test scenario automation covering UI, API, and backend CLI
  • AI workflows for the e2e test cycle: from test case generation from PRDs, to automating and executing test cases, to finding the root cause of failures and logging defects

 

1. Data Quality & Bias

AI systems are only as good as the data they learn from.

 

Why it’s a challenge:

 

  • Incomplete, noisy, or outdated data → poor predictions
  • Hidden bias in training data → unfair or discriminatory outcomes
  • Data drift over time → models become less accurate

 

 

Example:

A hiring AI trained on historical data may unintentionally favor certain profiles if past hiring was biased.

 

 

2. Explainability & Transparency (Black Box Problem)

Many AI models (especially deep learning) don’t clearly explain why they made a decision.

 

Why it’s a challenge:

 

  • Hard to debug failures
  • Difficult to gain user trust
  • Regulatory pressure (especially in finance, healthcare)

 

3. Dynamic Behavior & Continuous Learning

Unlike traditional software, AI systems evolve over time.

 

Why it’s a challenge:

 

  • Models can change with new data (retraining)
  • Same input may produce different outputs over time
  • Hard to define “expected results” (no fixed oracle)

 

 

Example:

A recommendation engine today may behave differently next week after retraining.

 


vbank
  • Ensign
  • March 17, 2026
  1. Data Quality & Bias - AI systems are only as good as the data they’re trained on. Poor, incomplete, or biased datasets lead to unreliable outputs.
  2. Transparency & Explainability - Many AI models (especially deep learning) are “black boxes.” It’s difficult to explain why a model made a certain decision.
  3. Governance, Ethics & Security - AI introduces new risks: ethical misuse, regulatory non-compliance, and adversarial attacks.

  • Space Cadet
  • March 17, 2026

1️⃣ AI systems built on Machine Learning aren’t fully predictable, so testing shifts from checking exact results to evaluating reliability and behavior patterns.
2️⃣ Data quality and model drift can silently reduce performance over time, making continuous monitoring essential.
3️⃣ Systems powered by Generative AI introduce new risks around trust, safety, and explainability that QA must actively manage.


Dhrumil812
  • Ensign
  • March 17, 2026

The three biggest challenges for quality in the AI era are:


1. Believable mistakes – AI doesn’t fail loudly; it fails convincingly. It can give answers that sound right but aren’t, making issues harder to detect and the wrong output easier to trust.

2. Constantly changing behavior – Unlike traditional systems, AI isn’t stable. Small changes in data, prompts, or models can shift results, so quality isn’t a one-time check; it’s something that needs continuous monitoring.

3. Blurred accountability – When AI goes wrong, it’s unclear who’s responsible: the model, the data, or the developer. This makes ensuring fairness, reliability, and trust much more complex.

 

Quality in AI is no longer just about correctness—it’s about trust, adaptability, and responsibility.


  • Ensign
  • March 17, 2026
  1. Data Quality - AI models depend heavily on data. Ensuring clean, balanced, and well-labeled data is a major challenge.
  2. Black Box Problem - Many AI models, especially deep learning, are difficult to interpret. This makes it hard to understand decisions, debug issues, and build user trust.
  3. Testing Complexity - Quality now includes privacy and safety, dimensions that traditional testing of AI systems does not cover well.
  4. Security Risks - AI systems are vulnerable to adversarial attacks and data misuse.
  5. Scalability and Performance - AI systems require high computational resources and must perform efficiently at scale.
  6. Quality in the AI era is a continuous, multi-dimensional process that goes beyond testing. It requires managing data and monitoring models.

  • Space Cadet
  • March 17, 2026
  1. Hallucinations and Accuracy: AI models often generate information that is factually incorrect, nonsensical, or completely fabricated, making rigorous verification essential.

  2. Bias and Fairness: Models can inherit and amplify biases present in their training data, leading to unfair or discriminatory outcomes that must be actively identified and mitigated.

  3. Lack of Explainability (The "Black Box"): The decision making process of complex AI is often opaque, making it difficult to understand why an error occurred or to trust the output in high stakes scenarios.


  • Space Cadet
  • March 17, 2026

1. We're moving fast, but thinking slow
Yes, AI speeds up our work. But AI also switches off our brains, and that is not good while testing.
2. More tests, less confidence
Yes, AI creates 100 tests in seconds. But is this the correct set of 100 tests?
3. Nobody owns the bug anymore
Yes, AI created the bug, AI tested it, AI passed it. But if it fails, then who is responsible?


  • Ensign
  • March 17, 2026

From a practical QA perspective, three challenges I see are:

1. AI can create a false sense of coverage by generating many tests without ensuring meaningful validation.

2. It can amplify flakiness in already unstable systems, making failures harder to debug.

3. Over-reliance on AI risks reducing deep product understanding, which is critical for identifying real quality gaps.


  • Space Cadet
  • March 17, 2026

In the AI era, quality is no longer just about validating functionality — it’s about ensuring trust in systems that are inherently unpredictable.

First, AI systems are non-deterministic, so the same input can produce different outputs. This makes traditional “expected vs actual” validation insufficient, requiring us to define acceptable response boundaries instead.

Second, there’s the challenge of correctness vs plausibility — AI can generate outputs that sound convincing but are factually wrong, making validation much harder than in deterministic systems.

Third, the focus is shifting from just testing outputs to controlling behavior. Techniques like context engineering and structured system design are becoming essential to reduce unpredictability and ensure consistent, reliable results.

Ultimately, the biggest challenge is moving from testing features to building systems that are trustworthy at scale.
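The "acceptable response boundaries" idea above can be sketched as a simple validator: instead of comparing against one exact expected string, the test declares constraints that any valid response must satisfy. The helper name and the specific constraints below are illustrative assumptions, not something from the post.

```python
def within_boundaries(response: str,
                      must_contain: list[str],
                      must_not_contain: list[str],
                      max_words: int = 200) -> bool:
    """Accept any response inside the agreed boundaries,
    rather than comparing against one exact expected string."""
    text = response.lower()
    if len(response.split()) > max_words:
        return False  # too long: outside the boundary
    if any(term.lower() not in text for term in must_contain):
        return False  # a required fact or term is missing
    if any(term.lower() in text for term in must_not_contain):
        return False  # a forbidden phrase slipped in
    return True

ok = within_boundaries(
    "You can reset your password from the account settings page.",
    must_contain=["password", "settings"],
    must_not_contain=["I don't know"],
)
```

Boundary checks like this tolerate non-determinism in wording while still failing on responses that miss required content or include disallowed content.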


  • Space Cadet
  • March 17, 2026
  1. (Re-)defining quality in the era of AI Engineering
  2. Keeping up with the development of AI tools and effectively selecting/integrating useful ones
  3. Volume + velocity

  • Space Cadet
  • March 17, 2026

The 3 biggest challenges for quality in the AI era could be the following:

1. Data Integrity: Given the sheer volume of data fed to models and the scale of their training, it becomes difficult to verify and validate what the model has been given.

2. Biased Output: Output can’t be blindly trusted because the machine produces results on the basis of what was fed to it, so it becomes difficult to validate the output.

3. Accountability & Transparency: Quality becomes questionable because you work on large volumes of data at once, and if something fails it is difficult to pinpoint the single failing case, which puts accountability and transparency in question.


  1. Data quality
  2. Bias & Transparency 
  3. Regulations & compliance

  • Space Cadet
  • March 17, 2026
  • Speed over substance:
    AI makes it easy to produce fast, but quality suffers when we skip thinking, refining, and depth.
  • Sameness over originality:
    AI outputs tend to sound alike, making it harder to stand out with authentic, human ideas.
  • Confidence over truth:
    AI can sound right even when it’s wrong, so maintaining accuracy and trust becomes harder.

  • Space Cadet
  • March 17, 2026


You need someone who can test intelligence, behavior and unpredictability.





Practically, IMHO
- AI systems don’t produce consistent outputs for the same input, which makes traditional assertion-based testing ineffective. I approach this by focusing on intent-based validation and semantic checks instead of exact matches.

- There’s often no single correct answer in AI systems, making it difficult to define pass/fail criteria. I address this by using multi-dimensional evaluation looking at correctness, relevance, and factual accuracy along with human-in-the-loop validation where needed.

- System performance is heavily dependent on data, and it can degrade over time as data changes. I treat data as a first-class test artifact by versioning datasets, prompts, and models, and by continuously monitoring for drift and quality drops.
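Treating data as a first-class test artifact might look like this minimal sketch: a versioned golden set scored on every run, with a drift flag raised when quality drops below a recorded baseline. The golden set, the stub model, and the tolerance value are assumptions for illustration only.

```python
from statistics import mean

# Hypothetical golden dataset: (input, scoring function) pairs.
# In practice this would live in a versioned file alongside the
# prompt template and model version.
GOLDEN = [
    ("2+2", lambda out: out.strip() == "4"),
    ("capital of France", lambda out: "paris" in out.lower()),
]

def eval_run(model, baseline: float, tolerance: float = 0.05):
    """Score the current model on the golden set and flag drift
    if quality drops more than `tolerance` below the recorded baseline."""
    score = mean(1.0 if check(model(q)) else 0.0 for q, check in GOLDEN)
    drifted = score < baseline - tolerance
    return score, drifted

# Stand-in model for illustration; a real run would call the deployed model.
stub = {"2+2": "4", "capital of France": "Paris"}.get
score, drifted = eval_run(lambda q: stub(q, ""), baseline=1.0)
```

Running this on a schedule (rather than only at release time) is what turns one-time testing into the continuous monitoring the reply describes.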


  • Space Cadet
  • March 17, 2026

Three challenges I see:

  1. Every agent solution should be evaluated for hallucinations, toxicity, bias, and other such quality metrics. AI-infused testing is a skill in itself.
  2. The response may differ across models for the same set of steps, as reasoning ability varies from model to model.
  3. AI solutions generally work well for prototyping and POCs, but one needs to be mindful when building production-grade applications, where system constraints must also be considered.

  • Apprentice
  • March 17, 2026

Three Biggest Quality Challenges in the AI Era
 

1. Unpredictable Outputs
AI can give different answers for the same input, making testing and debugging difficult.

2. Hard to Measure Quality
There’s no single “correct” answer—quality depends on accuracy, relevance, and context.

3. Trust & Safety Issues
AI can be wrong, biased, or unsafe, so ensuring reliable and ethical behavior is a big challenge.


1) Data Quality & Bias 

  • Incomplete data leads to unreliable outputs

  • Biased datasets amplify unfair or discriminatory results

  • Outdated data produces irrelevant or incorrect insights

2) Consistency & Reliability at Scale

  • Same prompt, slightly different outputs

  • Edge cases can produce errors or hallucinations

  • Performance may degrade over time

3) Explainability

  • AI's black-box nature makes it hard to understand decision-making processes, eroding trust and complicating quality assurance.
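The "same prompt, slightly different outputs" concern above can be quantified with a simple agreement metric: sample the same prompt repeatedly and measure how often the most common answer recurs. This is an illustrative sketch, not a standard API.

```python
from collections import Counter

def agreement_rate(outputs: list[str]) -> float:
    """Fraction of samples matching the most common (normalized) output.
    1.0 means perfectly consistent; lower values mean flaky behavior."""
    normalized = [o.strip().lower() for o in outputs]
    most_common_count = Counter(normalized).most_common(1)[0][1]
    return most_common_count / len(normalized)

# Five samples of the same prompt: four agree after normalization.
samples = ["Yes", "yes", "Yes", "No", "yes"]
rate = agreement_rate(samples)  # 4/5 = 0.8
```

A team could track this rate per prompt over time and alert when consistency degrades, which turns "reliability at scale" into a measurable number.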

  • Space Cadet
  • March 17, 2026

3 biggest Quality Challenges in the AI Era

1. AI hallucination and drift

It gets frustrating at times to get the model to respond to prompts with the expected answer.

2. Training for Quality Engineers on Context Engineering

Proper training is needed for Quality Engineers on good prompting practices and on building these skills. In its absence, context can bloat, increasing token usage and costs.

3. Responses are nondeterministic. I find it difficult to tell whether a prompt is accurate, what its precision is, and what metrics can be used to assess a prompt.


  • Space Cadet
  • March 17, 2026
  1. How precisely are we able to write the rules in .md files?
  2. If the agent makes a mistake, we need to fix the rules wherever applicable and find time to retrain the agent.
  3. Need to prioritize tests based on risks (business, customer impact, quality/privacy standards etc)

  • Space Cadet
  • March 17, 2026

The three biggest challenges for quality in the AI era are ensuring fairness and eliminating bias, maintaining transparency and explainability, and safeguarding compliance and ethical standards. These issues directly affect trust, adoption, and the long-term sustainability of AI systems.