The Survival Guide for Software Testers
- Apr 25
- 5 min read

Will AI kill QA?
Prophecies about the “death of QA” are heard every time a new technology enters the market, but the Generative AI revolution of 2024-2025 is different from its predecessors. Tools like GitHub Copilot and ChatGPT allow developers to produce code at unprecedented speed, and sometimes even write the unit tests themselves.
Over the past decade, the role of QA has shifted from a focus on manual testing to a focus on automation, CI/CD, and shift-left. However, until recently, even in mature organizations, a significant portion of QA work still relied on manual human labor – writing scenarios, selecting test data, maintaining tests against changes, and investigating failures. The advent of LLMs and GenAI tools changes the equation, as it allows for “converting knowledge into language” (requirements, scripts, logs, documentation) into test products and code at a higher rate.
On the surface, it seems that the traditional role of the software tester – writing test scenarios and running them – is in danger of extinction. The QA Engineer is not only required to “write tests faster”, but to become a quality engineer who manages a socio-technical system in which some of the tests and code are generated by models, and therefore control mechanisms are required. As AI accelerates code writing (“Velocity”), it also creates “Technical Debt” of code that has not been thoroughly reviewed, and can introduce new types of failures into the system that we have not yet encountered.
The focus is in an evolving transformation process and is moving from a concept of Testing WITH AI (using tools to streamline work) to Testing OF AI.
From Deterministic Oracle to Probabilistic Chaos
For decades, software testing was based on the principle of the “Deterministic Oracle”.
Then came LLMs and GenAI, and suddenly, the world of QA is undergoing a paradigm shift: our oracle is no longer always deterministic. The same input can give different results and still be “correct”. Welcome to probabilistic chaos. Models are by their very nature probabilistic, not deterministic. The same prompt can yield answers that vary widely in length and sometimes in content. This shift requires QA professionals to abandon the binary Pass/Fail approach and adopt a Risk & Confidence Score approach.
The three components of the new role
Tasks that are accelerated
These are tasks where AI improves efficiency, but does not necessarily change the purpose of the work
Creating test scenarios and test plans from text (requirements, user stories, documentation).
Generating test code (Unit/Integration/E2E) from existing code or description. A common industrial example is using Copilot Chat to create unit tests.
Failure analysis: Summarizing logs, grouping flaky failures, suggesting root cause directions.
Maintenance: Refactoring, self-healing or local fixing of tests that broke due to UI/DOM changes.
Extended Responsibility
As test production becomes easy and fast, the bottleneck shifts to the ability to plan for quality and control AI products
Moving from quantitative metrics to risk metrics – in a situation where hundreds of tests can be produced “on paper”, the central question becomes Risk-Based Quality – what is the business/regulatory/operational risk of a component, and what is a test strategy that maximizes the detection of high-risk failures.
Oracle and Assumption Control – Oracles themselves may be “designed to pass”, and therefore not reveal bugs. Hence, QA must develop auditing capabilities – identifying weak assertions, tests that do not represent real use, or missing end cases.
Quality of Inputs (Requirements/Prompts/Context) – The quality of the product depends largely on the quality of the input: user story formulation, acceptance criteria, example data and prompts. Therefore, QA takes on broader responsibility in leading the standardization of test intent and acceptance documents.
New Responsibilities
This is the layer where QA becomes a key player in governance and risk control
Security in LLM systems – The main risks are – Prompt Injection and Insecure Output Handling. In practice, this creates a new set of tests – injection tests, output validation tests, permission tests around AI tools, and “context poisoning” tests.
Risk Management and Governance – A human-in-the-loop is required at critical points for verification and control: traceability, prompt versioning, and rules for using sensitive data.
The functional meaning for QA is a shift from “row-level” control to “system-level” control:
Defining goals: What is the purpose of the coverage (critical flows), what are the assumptions, and what risks must be tested.
Defining constraints: Which environments the agent is allowed/not allowed to access, which data is not allowed to be used, and which actions require human approval.
Product review: code review of created tests, with an emphasis on oracles, stability, and prevention of flaky tests.
Investigate “new” failures: not just product failures, but failures of the autonomous infrastructure (e.g., incorrect repair of a healer).
Here it becomes clear that QA has not “disappeared” – it has been transformed into a quality manager of a hybrid human-agent team.
The transition to “quality architect” is based on three key changes in daily work:
The engineer as architect
Until recently, “automation” was synonymous with writing Java/Python code inside frameworks like Selenium Playwright. Today, AI tools have the ability to generate these scripts in seconds.
The value of the new engineer is no longer measured by technical coding ability ("How to test"), but by strategic ability ("What to test"). The role now requires holistic test coverage planning, smart synthetic data management, and checking that the automation code written by the AI actually covers the business logic, not just the UI.
The Challenge of Non-Determinism (Semantic Testing)
How do you test a chatbot that doesn't have a single correct answer? In this case, new techniques come into play:
Semantic Similarity: Using algorithms (such as Cosine Similarity) to check the content of the answer – whether the content/meaning of the answer is similar to the desired result.
LLM-as-a-Judge: Using one AI model to review and score the answers of another model.
Property-Based Testing: Defining rules/rules that the answer must meet (“the answer must not contain swear words”, “the answer must be shorter than 100 words”) and testing them over thousands of iterations.
The Ethical Guardian (AI Ethics & Safety)
This is perhaps the most important role. QA becomes the last line of defense before exposing the organization to reputational or legal damage.
New AI testers are engaging in Red Teaming: active and sophisticated attempts to “break” the model, causing it to hallucinate, leak sensitive information, or exhibit racial/gender bias. This is an area that requires human critical thinking that no AI can replace, as of today.
New skills matrix
In light of the above analysis, QA engineers must perform reskilling. The following table summarizes the required transition:
Traditional skill | Required skill |
Automation Scripting | Code Review & Refactoring |
SQL / Test Data Creation | Prompt Engineering |
Functional Testing | AI Safety & Ethics |
Bug Reporting | Root Cause Analysis |
Siloed Work | Data Literacy |
Summary: From Gatekeepers to Guides
Artificial intelligence does not eliminate the need for testing; it intensifies it. As systems and developments make more extensive use of AI tools, the level of autonomy and complexity increases, and so does the risk of "silent" but devastating errors for the organization.
The QA engineers of the near future will not be those who sit in the corner and run tests. They will be architects of quality, who will understand the technology in depth, know how to harness AI to their advantage, but most importantly, know when not to trust it. Those who are able to make this transition will find that their role has become central, influential, and of critical significance to the organization.



Comments