I remember exactly the day everything went wrong right after a release. It was ten in the morning, we had just sent out a big marketing email, and the signup page of our site suddenly froze. I dove straight into the logs to look for errors and found nothing. All the tests were green, while real users saw nothing but an endless loading spinner. At that moment I realized: as long as I don’t test the product the way a person uses it, I’m not really testing, I’m just hoping for luck.
This article is the guide I badly needed back then. No complicated theory or obscure terms: just specific steps that truly help, a bit of real code, and the mistakes I made and never want to repeat.
An E2E test pretends to be a human: it opens the app in a real browser, clicks and types the way a user would, and checks that the screen shows what a user expects.
That’s it. Simple idea, huge impact.
Imagine a simple to-do web app. A user should be able to add a task and see it in the list.
```js
// e2e/add-task.test.js
import { test, expect } from '@playwright/test';

test('User adds a task', async ({ page }) => {
  await page.goto('https://todo.example');

  // Type the task and submit it the way a user would.
  await page.fill('[data-test=new-item]', 'Buy milk');
  await page.press('[data-test=new-item]', 'Enter');

  // The new task must appear at the end of the list.
  await expect(page.locator('[data-test=item-text]').last())
    .toHaveText('Buy milk');
});
```
This matters because if any part of the stack (API, database, front end) breaks, that last expectation fails. One red X beats a thousand green unit tests.
You can’t test every pixel on day one. Start with flows that cost money or reputation when they break.
In my teams we always ask one question: “If this fails in production, who loses sleep?” Anything that keeps someone up at night goes on the list; for us that meant signup and checkout. Write those down, nothing else. You’ll expand later.
Good tests need predictable ground to stand on. Here’s what works for me.
Spin up a dedicated database seeded with known records before each run.
```bash
# scripts/seed.sh
# Recreate the schema and load known records into the test database.
psql "$TEST_DB" < schema.sql
psql "$TEST_DB" < initial_data.sql
```
Now item `task_123` always exists, always says “Buy milk,” no surprises.
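To make “before each run” concrete, here is a minimal sketch that wires the seed script into Playwright’s `globalSetup` hook (the file path is my assumption, not part of the original setup):

```js
// e2e/global-setup.js
import { execSync } from 'node:child_process';

export default function globalSetup() {
  // Re-seed the dedicated test database so every run starts
  // from the same known records.
  execSync('bash scripts/seed.sh', { stdio: 'inherit' });
}
```

Point `globalSetup` at this file in `playwright.config.js` and the data resets before the suite starts.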
If your app fetches weather, stock prices, or maps, swap those endpoints for a stub that returns fixed data. Tests run faster and never break because the real service is down.
Tip: point the app at the stub URL with an environment variable so production stays untouched.
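A stub can be tiny. Here’s a sketch using Node’s built-in `http` module; the port, route, and payload shape are assumptions for illustration:

```js
// stubs/weather.js: always returns the same payload, so tests
// are deterministic and never depend on a live service.
import http from 'node:http';

http.createServer((req, res) => {
  res.writeHead(200, { 'Content-Type': 'application/json' });
  res.end(JSON.stringify({ city: 'Paris', tempC: 21 }));
}).listen(4001);
```

Start it before the suite and point the app at it, e.g. `WEATHER_API_URL=http://localhost:4001 npm run start:test` (the variable name is hypothetical).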
Live check before you ship
Keep stubs for every push, but run a nightly smoke job on the staging env that flips `USE_STUBS=false` and calls the real APIs with sandbox keys. One green run there is your final safety net.
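In application code the flag can be a one-line switch. A sketch, reusing the hypothetical endpoints from the stub example above:

```js
// Real sandbox API in the nightly smoke run, local stub everywhere else.
const weatherUrl =
  process.env.USE_STUBS === 'false'
    ? 'https://api.weather.example' // sandbox endpoint (assumed)
    : 'http://localhost:4001';      // the stub from earlier
```

The nightly job itself just flips the variable: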
```yaml
# .github/workflows/staging-smoke.yml
on:
  schedule:
    - cron: "0 3 * * *" # nightly
jobs:
  smoke:
    runs-on: ubuntu-latest
    env:
      USE_STUBS: "false"
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      - run: npx playwright install --with-deps
      - run: npx playwright test --config=e2e.smoke.ts
```
Attribute selectors like `data-test="checkout-button"` survive redesigns. CSS classes change often; `data-test` rarely does. Add the attributes in the codebase once and future-you will thank present-you.
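Playwright has first-class support for this pattern: by default its `getByTestId()` helper looks for `data-testid`, and one config line maps it to the `data-test` convention used here:

```js
// playwright.config.js
import { defineConfig } from '@playwright/test';

export default defineConfig({
  // Make page.getByTestId() read data-test instead of the
  // default data-testid.
  use: { testIdAttribute: 'data-test' },
});
```

After that, `page.getByTestId('checkout-button')` keeps finding the button no matter how the surrounding markup changes.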
Code tells computers what to do; clear naming tells humans why it matters.
```js
await page.fill('[data-test=origin]', 'Paris');       // Where the flight starts
await page.fill('[data-test=destination]', 'Lisbon'); // Where the flight ends
await page.click('[data-test=search]');               // Get available flights
```
After six months, you’ll still understand what each line means. Future teammates, too.
Engineers fear E2E because they think it slows CI. Here’s how I keep runs short:

- Never hard-code a pause like `sleep(3000)`. Wait for a specific element or event instead (see the sketch after this list).
- Run tests in parallel: `npx playwright test --workers=4`.
Even a medium suite drops from ten minutes to three on a laptop.
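A minimal before/after sketch of the first point, assuming a hypothetical `[data-test=results]` element:

```js
// Flaky: always waits three seconds, whether or not the UI is ready.
await page.waitForTimeout(3000);

// Robust: waits exactly as long as the element needs, then moves on.
await expect(page.locator('[data-test=results]')).toBeVisible();
```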
Automation that isn’t in CI/CD is a hobby. Add one extra step and treat a red test as a red build.
```yaml
# .github/workflows/ci.yml
jobs:
  e2e:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      - run: npx playwright install --with-deps
      - run: npm run start:test & # app in test mode, backgrounded
      - run: npx playwright test
```
Note: run the app in test mode, not production. Environment variables keep things separate.
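What “test mode” looks like in practice is a switch like this; the variable names are my assumptions:

```js
// config.js: pick the database by environment so a CI run
// can never touch production data.
const dbUrl =
  process.env.APP_ENV === 'test'
    ? process.env.TEST_DB        // the seeded test database from earlier
    : process.env.DATABASE_URL;  // production credentials stay out of CI

export default { dbUrl };
```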
Manual scripts can rot. Modern testing tools watch real user sessions and suggest flows to protect. I’ve used one on an e-commerce site: it noticed most people finished checkout with an express wallet and proposed that path as a test. We clicked “approve,” it generated the code—saved us hours.
What I like: the suggested flows come from real user behavior rather than guesses, and the generated test is a starting point you review instead of boilerplate you write from scratch.
AI doesn’t replace thinking, but it cuts the grunt work.
- Flow coverage: how many top flows are under test? Aim for 100% of the list you made earlier.
- Feedback time: minutes from commit to test result. Lower is better; under ten keeps momentum high.
- Escaped bugs: track bugs that reach users. Fewer incidents means the E2E suite is paying off.
- Flaky-test rate: if tests cry wolf too often, devs ignore them. Keep the noise low.
Collect numbers in your usual dashboard. A line that slopes up or down means more than a gut feeling in a sprint retro.
- A fixed `wait(5000)` hides slow code instead of fixing it. Wait for elements, not time.
- Add `data-test` attributes early and changes in layout won’t break the suite.
- When a test goes red, check the failure artifacts first: screenshot, trace, console output (see the config sketch after this list). Nine times out of ten this flow finds the cause within five minutes.
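Playwright can record those failure artifacts automatically; a minimal config sketch:

```js
// playwright.config.js: keep a trace and screenshot only when a test
// fails, so green runs stay fast and storage stays small.
import { defineConfig } from '@playwright/test';

export default defineConfig({
  use: {
    trace: 'retain-on-failure',
    screenshot: 'only-on-failure',
  },
});
```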
End-to-end tests aren’t exotic; they’re just the product taking a self-guided tour each night. Start small: pick the flows that cost money or reputation, write one test for each, and wire them into CI.
Do that, and the next release will feel less like roulette and more like science. Want to push even faster? Let an AI tool suggest new flows and update selectors for you. You’ll spend evenings building features instead of hunting flaky tests—and maybe even sleep through that marketing email blast. Get started with QA.tech for free!