While most engineering organizations have developed sophisticated processes for managing technical debt, there’s a parallel problem that receives far less attention but can be equally destructive: test debt. Unlike technical debt, which accumulates from the natural evolution of complex systems, test debt has a surprisingly simple root cause—we test too much stuff.

The symptoms are familiar to anyone who’s worked on a growing engineering team: automated test suites that are never fully green, flaky tests that pass and fail seemingly at random, and QA processes that teams no longer trust. Like technical debt, test debt demands continual interest payments through bugs in production, slower releases, and the erosion of confidence that should be the foundation of any reliable deployment process. The irony is that while teams struggle under the weight of hundreds or thousands of tests, the solution isn’t to write better tests—it’s to write fewer of them, and to focus ruthlessly on testing only what would actually block a release if it weren’t working.

The Root Causes of Overtesting

Cheap to Create, Expensive to Maintain

The first reason we create too many tests lies in how developers work. It’s significantly easier to create new tests than to fix broken ones—just like it’s easier to write new code than debug existing code. When you’re in flow state, having just implemented a feature, you have all the context loaded in your head. You understand exactly what the code is trying to accomplish and how all the pieces fit together.

Most organizations approach test creation episodically. Every few weeks or months, there’s a push to “improve test coverage,” and developers create tests in batches. During these focused sessions, they’ll write far too many tests, including low-stakes tests for edge cases and minor features—the equivalent of testing every back alley when you should be focused on the main thoroughfares.

This creates an asymmetric cost problem. Creating a test while you’re in context is cheap and fast. But months later, when that test starts failing because the UI changed or a minor workflow was updated, fixing it becomes expensive. You need to understand what the test was supposed to do, figure out what changed, and determine whether the failure indicates a real problem or just an outdated test assumption.

Misaligned Incentives

The second driver of overtesting is an incentives problem within QA teams. If you have a dedicated QA team, their stated job is to prevent bugs from reaching production. But what’s the actual job the company needs them to do?

Unless you’re building software for space missions or financial infrastructure, your goal isn’t zero bugs—it’s the right amount of bugs. This is a nuanced perspective that’s difficult for QA teams to adopt when they’re held accountable for any bug that slips through. The natural response is defensive: test everything, because any untested scenario that breaks in production reflects poorly on the QA process.

Without thoughtful leadership and close partnership between engineering and QA leaders, teams will inevitably push for more comprehensive testing. It’s simply a result of how the incentives are structured.

The Trust Problem: When Tests Become Noise

The fundamental issue with overtesting is that it destroys trust in your test suite. Quality assurance is ultimately about emotion—it’s about feeling assured that your application works as intended. When your test suite regularly produces false positives, that assurance evaporates.

Here’s what typically happens: most test failures don’t actually indicate broken functionality. In our experience, roughly 9 out of 10 test failures occur because the application changed in some way that broke the test, not because the underlying feature is broken. Maybe the sign-up flow got a new field, or a button moved, or the API response format changed slightly.
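
To make that concrete, here’s a minimal sketch in Python (the create_account helper and its response shape are entirely hypothetical) of how a test can go red when the application changes even though the feature still works. The brittle version pins the exact response shape, so any new field breaks it; the focused version asserts only what the feature actually promises.

```python
# Hypothetical example: a test for a sign-up flow, written two ways.
# The create_account helper stands in for a call to the application under test,
# which has just gained a new "referral_code" field.

def create_account(email: str) -> dict:
    return {"id": 42, "email": email, "status": "active", "referral_code": "XYZ123"}

# Brittle: pinned to the exact response shape. The new field makes this fail
# even though sign-up still works, a false alarm that trains the team
# to shrug at red builds.
def test_signup_brittle():
    assert create_account("a@example.com") == {
        "id": 42,
        "email": "a@example.com",
        "status": "active",
    }

# Focused: asserts only what would actually block a release, so new fields
# and cosmetic changes don't break it.
def test_signup_focused():
    account = create_account("a@example.com")
    assert account["email"] == "a@example.com"
    assert account["status"] == "active"
```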

Teams quickly learn this pattern, and failing tests get dismissed with “Oh, that works fine, the test just needs updating. We’ll fix it later.” But “later” rarely comes, and complacency sets in. Tests become background noise rather than reliable signals.

This is dangerous. We’ve seen countless organizations experience major outages in features that were supposedly covered by automated tests. How is this possible? Long before the outage, those tests had become flaky and were routinely ignored. The team had accepted that “tests break sometimes” and learned to work around unreliable signals. When the feature actually broke, the test failure looked identical to all the previous false alarms.

The Snowplow Principle: Test What Matters

Think about your application like a town after a snowfall. Which roads do you clear first? The main arterial routes that keep the town functioning. Next, you clear important side streets that connect to key locations. The tiny residential lanes might never get plowed at all—and that’s perfectly fine.

Your test suite should follow the same prioritization. Thoroughly test the main user flows that define your product’s core value. Test the critical paths that, if broken, would prevent you from shipping. But resist the urge to test every edge case and minor feature.

The key question to cut through the noise is simple: What would block a release if it weren’t working?
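
One lightweight way to encode that question directly in the suite is to tag release-blocking tests explicitly and have CI gate deploys only on those. The sketch below uses pytest markers; the release_blocking marker name and the placeholder tests are illustrative, not a standard convention.

```python
# pytest.ini (registers the illustrative marker so pytest doesn't warn about it):
# [pytest]
# markers =
#     release_blocking: failure of this test blocks the release

import pytest

# A main thoroughfare: if checkout is broken, we don't ship.
@pytest.mark.release_blocking
def test_checkout_happy_path():
    ...  # placeholder for the real end-to-end checkout flow

# A back alley: worth knowing about, not worth holding a release for.
def test_coupon_code_with_trailing_whitespace():
    ...  # placeholder
```

The release gate in CI then runs only `pytest -m release_blocking`; everything else can run on a schedule where a failure opens a ticket instead of stopping a deploy.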

This might seem like too high a bar, but consider this: we’ve almost never encountered a team suffering from QA problems because they had too few tests. The pattern is overwhelmingly the opposite—companies drowning in too many tests, with test failures routinely ignored and test suites considered untrustworthy.

Building Reliable Test Signal

Your goal should be for your test suite to serve as the definitive source of truth about whether your application is ready to ship. If tests pass, you should feel confident releasing. If tests fail, you shouldn’t release until they’re fixed—even if you suspect the test itself is broken rather than the underlying functionality.

This only works if you maintain an incredibly high bar for what gets tested. Every test in your suite should be worth fixing before release. If you find yourself routinely skipping or ignoring certain test failures, those tests shouldn’t exist.
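
One way to hold that bar is to make skipping visible and expensive. The sketch below assumes pytest’s standard --junitxml report; the script name and the zero-skip policy are illustrative. It fails the pipeline whenever a test is being skipped rather than fixed or deleted.

```python
# check_no_skips.py, an illustrative CI step run after `pytest --junitxml=report.xml`.
# A skipped test is a test the team has stopped trusting: it should be fixed
# before release or deleted, not quietly carried along.
import sys
import xml.etree.ElementTree as ET


def skipped_tests(report_path: str) -> list[str]:
    tree = ET.parse(report_path)
    skipped = []
    # JUnit XML: each <testcase> that was skipped has a <skipped> child element.
    for case in tree.getroot().iter("testcase"):
        if case.find("skipped") is not None:
            skipped.append(f"{case.get('classname')}::{case.get('name')}")
    return skipped


if __name__ == "__main__":
    report = sys.argv[1] if len(sys.argv) > 1 else "report.xml"
    offenders = skipped_tests(report)
    if offenders:
        print("Tests being skipped instead of fixed or deleted:")
        for name in offenders:
            print(f"  {name}")
        sys.exit(1)
    print("No skipped tests: every test in the suite is one worth fixing.")
```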

The complementary solution is to choose testing technology with lower maintenance costs. In the age of AI, look for tools that can automatically maintain tests as your application evolves and fail only when there’s truly a problem worth investigating.

The Path Forward

The solution to test debt isn’t more sophisticated testing strategies or better test maintenance processes—it’s disciplined restraint in what you choose to test. Focus on the main thoroughfares of your application. Test what matters. Ignore what doesn’t.

Your customers don’t care if every edge case is bulletproof. They care that your core product works reliably. Your engineering team doesn’t need perfect test coverage. They need trustworthy signals about when it’s safe to ship.

By testing less but testing better, you’ll build the assurance that quality processes are meant to provide: confidence that when your tests pass, your product is ready for the world.