How to E2E Test Sign-Up Flows With Real Emails (Without the Pain)

Stop mocking email in your E2E tests. Use real inboxes, real OTP delivery, and real waiters — without polluting your team inbox.

May 1, 2026 by catchotp team

End-to-end tests exist to catch the integration bugs that unit tests miss. The moment you mock email in those tests, you give up the integration coverage that justified writing them in the first place. The DKIM record, the From header, the transactional provider’s rate limit, the magic-link URL — none of it is exercised. The test passes; production does not.

This is a practical guide to E2E-testing sign-up flows with real emails: why mocking falls over, what “real-email” testing actually means, full Playwright and Cypress walkthroughs, the CI considerations that catch teams off guard, and the patterns that scale to a real-sized test suite.

Why mocking email fails

When you mock the email provider in an E2E test, the assertion you are making is “given that an email was sent, our app handles the OTP correctly.” That is a useful assertion. It is also one a unit test can make, which means you have moved an integration cost into your E2E suite without buying any integration coverage.

The bugs you do not catch when you mock:

Wrong DKIM signing. The transactional provider rotates a key, your DKIM record is stale, the email lands in spam in production. Mock-based tests pass.
Wrong From header in a new environment. Staging signs as staging-noreply@, production signs as noreply@. The latter is on a stricter SPF policy. Mock-based tests pass.
Magic-link URL points at staging. A misconfigured environment variable in the email template ships a magic link that 404s in production. Mock-based tests pass.
Transactional provider rate-limited. A burst of tests trips a per-second limit your provider quietly enforces. Mock-based tests pass.

Every one of these is a real outage we have seen. None of them shows up without a real receive-side.
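
Once a test holds a link that was really delivered, the staging-link failure above can be turned into a direct assertion by comparing the link's origin with the environment under test. A minimal sketch; `assertLinkOrigin` and the example URLs are hypothetical, not part of any SDK:

```typescript
// Hypothetical helper: fail when an emailed link points at a different
// environment than the one under test.
function assertLinkOrigin(linkUrl: string, expectedBaseUrl: string): URL {
  const link = new URL(linkUrl);
  const expected = new URL(expectedBaseUrl);
  if (link.origin !== expected.origin) {
    throw new Error(
      `Emailed link points at ${link.origin}, expected ${expected.origin}`,
    );
  }
  return link;
}

// Example values are made up for illustration.
const ok = assertLinkOrigin(
  'https://app.example.com/auth/magic?token=abc',
  'https://app.example.com',
);
console.log(ok.pathname); // "/auth/magic"
```

A mock never produces a link to check, so this assertion only has value on the real receive side.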

The shared-inbox alternative is not the answer either

The instinct after deciding to use real email is to point the signup at a shared QA mailbox: qa@yourcompany.com, monitored by a script. This solves the integration problem and creates four new ones.

Parallel tests step on each other. Two tests run at once, both wait for “the latest email,” and one of them gets the wrong code.
The inbox accumulates years of junk. Six months in, finding the right email by subject regex is unreliable.
Anyone with the password can read anyone’s verification codes. Including, occasionally, real customer signups that get misdirected.
Cleanup is its own ten-line script. Now you maintain that script.

What you actually want is one inbox per test, on a real domain, with a waiter that resolves the moment the email lands. That is the entire shape of a programmable-email API.
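
The first failure mode is easy to reproduce without a mail server. The sketch below is an in-memory simulation (the `Mail` type and `latestCode` helper are invented for illustration) of two parallel tests reading "the latest email" from one mailbox, compared with one mailbox per test:

```typescript
type Mail = { to: string; code: string };

// "Latest email" lookup, as a naive shared-inbox script would do it.
function latestCode(mailbox: Mail[]): string | undefined {
  return mailbox[mailbox.length - 1]?.code;
}

// Shared mailbox: test A and test B both sign up before either one reads.
const shared: Mail[] = [];
shared.push({ to: 'qa@yourcompany.com', code: '111111' }); // sent for test A
shared.push({ to: 'qa@yourcompany.com', code: '222222' }); // sent for test B
const codeSeenByA = latestCode(shared); // test A picks up B's code
const codeSeenByB = latestCode(shared);

// One mailbox per test: nothing to contend over.
const inboxA: Mail[] = [{ to: 'test-a@example.test', code: '111111' }];
const inboxB: Mail[] = [{ to: 'test-b@example.test', code: '222222' }];

console.log(codeSeenByA, codeSeenByB); // 222222 222222
console.log(latestCode(inboxA), latestCode(inboxB)); // 111111 222222
```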

What “real-email testing” means

Three properties define it:

Real internet-facing domain. The email crosses real DNS, real MX, real TLS, real spam filters. Self-hosted SMTP catchers like Mailpit are great for local development, but they do not exercise the production path.
Per-test inbox isolation. Every test gets its own address. Parallel tests do not contend. Cleanup is automatic via TTL.
Long-poll waiters. No sleep(N), no polling loop. The test blocks on a single HTTP request that resolves the moment the matching message arrives.

When you have all three, the test code shrinks dramatically — usually a fixture and a waitForOtp call.
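
To make the third property concrete, here is a minimal in-memory sketch of a waiter that resolves as soon as a matching message arrives, with no sleep and no polling loop. `InMemoryInbox` and its methods are hypothetical and only illustrate the idea; a hosted service would presumably do the equivalent over a long-lived HTTP request.

```typescript
type Message = { subject: string; body: string };
type Waiter = {
  match: (m: Message) => boolean;
  resolve: (m: Message) => void;
};

class InMemoryInbox {
  private messages: Message[] = [];
  private waiters: Waiter[] = [];

  // Called when mail lands: store it and wake every waiter it satisfies.
  deliver(message: Message): void {
    this.messages.push(message);
    const stillWaiting: Waiter[] = [];
    for (const w of this.waiters) {
      if (w.match(message)) w.resolve(message);
      else stillWaiting.push(w);
    }
    this.waiters = stillWaiting;
  }

  // Resolves immediately if a matching message is already stored, otherwise
  // as soon as one is delivered; rejects after timeoutMs.
  waitFor(match: (m: Message) => boolean, timeoutMs: number): Promise<Message> {
    const existing = this.messages.find(match);
    if (existing) return Promise.resolve(existing);
    return new Promise<Message>((resolve, reject) => {
      const timer = setTimeout(() => {
        this.waiters = this.waiters.filter((w) => w !== waiter);
        reject(new Error(`No matching message within ${timeoutMs} ms`));
      }, timeoutMs);
      const waiter: Waiter = {
        match,
        resolve: (m) => {
          clearTimeout(timer);
          resolve(m);
        },
      };
      this.waiters.push(waiter);
    });
  }
}

const demoInbox = new InMemoryInbox();
const pending = demoInbox.waitFor((m) => /verify/i.test(m.subject), 1_000);
demoInbox.deliver({ subject: 'Welcome!', body: 'Hi' }); // ignored by the waiter
demoInbox.deliver({ subject: 'Verify your email', body: 'Code: 482913' });
pending.then((m) => console.log(m.body)); // Code: 482913
```

The test never guesses how long delivery takes: it waits exactly as long as the message needs, up to the timeout.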

Playwright: a complete walkthrough

The pattern is a Playwright fixture that yields an inbox per test.

Step 1: define the fixture

// tests/fixtures/inbox.ts
import { test as base } from '@playwright/test';
import { CatchOTP, type Inbox } from '@catchotp/sdk';

export const test = base.extend<{ inbox: Inbox; otp: CatchOTP }>({
  otp: async ({}, use) => {
    const otp = new CatchOTP({ apiKey: process.env.CATCHOTP_KEY! });
    await use(otp);
  },
  inbox: async ({ otp }, use) => {
    const inbox = await otp.inboxes.create({ mode: 'ephemeral', ttlMinutes: 15 });
    await use(inbox);
    await otp.inboxes.delete(inbox.id).catch(() => {});
  },
});
export { expect } from '@playwright/test';

Step 2: write the OTP test

// tests/signup-otp.spec.ts
import { test, expect } from './fixtures/inbox';

test('email OTP signup', async ({ page, inbox, otp }) => {
  await page.goto('/signup');
  await page.getByLabel('Email').fill(inbox.address);
  await page.getByRole('button', { name: 'Continue' }).click();

  const code = await otp.inboxes.waitForOtp(inbox.id, { timeoutSeconds: 30 });

  await page.getByLabel('Verification code').fill(code);
  await page.getByRole('button', { name: 'Verify' }).click();
  await expect(page.getByRole('heading', { name: 'Welcome' })).toBeVisible();
});

Step 3: write the magic-link test

// tests/signup-magic-link.spec.ts
import { test, expect } from './fixtures/inbox';

test('magic-link signup', async ({ page, inbox, otp }) => {
  await page.goto('/signin');
  await page.getByLabel('Email').fill(inbox.address);
  await page.getByRole('button', { name: 'Send magic link' }).click();
  await expect(page.getByText('Check your inbox')).toBeVisible();

  const message = await otp.messages.waitFor(inbox.id, {
    subject: /sign in/i,
    timeoutSeconds: 30,
  });
  const link = message.links.find((l) => l.url.includes('/auth/magic'));
  expect(link).toBeDefined();

  await page.goto(link!.url);
  await expect(page.getByRole('heading', { name: 'Welcome back' })).toBeVisible();
});

Step 4: write the password-reset test

// tests/password-reset.spec.ts
import { test, expect } from './fixtures/inbox';

test('password reset', async ({ page, inbox, otp }) => {
  // Pre-existing user with this email; create via API, not UI
  await fetch(`${process.env.API_URL}/test/users`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ email: inbox.address, password: 'old-pwd' }),
  });

  await page.goto('/forgot-password');
  await page.getByLabel('Email').fill(inbox.address);
  await page.getByRole('button', { name: 'Send reset link' }).click();

  const message = await otp.messages.waitFor(inbox.id, {
    subject: /reset/i,
    timeoutSeconds: 30,
  });
  const resetUrl = message.links.find((l) => l.url.includes('/reset'))!.url;

  await page.goto(resetUrl);
  await page.getByLabel('New password').fill('new-pwd');
  await page.getByRole('button', { name: 'Set password' }).click();
  await expect(page.getByText('Password updated')).toBeVisible();
});

Three tests, one fixture, every one of them exercising the real email path.
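
The magic-link and reset tests both pick a URL out of `message.links`, and a changed email template would surface as an unhelpful undefined error. A small helper can make that failure readable. This is a sketch: `findLink` is a hypothetical name, and the `{ url: string }` link shape is assumed from the snippets above.

```typescript
type Link = { url: string };

// Return the first link whose URL contains `part`, or throw an error that
// lists what the email actually contained.
function findLink(links: Link[], part: string): string {
  const hit = links.find((l) => l.url.includes(part));
  if (!hit) {
    const seen = links.map((l) => l.url).join(', ') || '(no links)';
    throw new Error(`No link containing "${part}". Found: ${seen}`);
  }
  return hit.url;
}

// Example links are invented for illustration.
const links: Link[] = [
  { url: 'https://app.example.com/unsubscribe?u=1' },
  { url: 'https://app.example.com/reset?token=abc' },
];
console.log(findLink(links, '/reset')); // https://app.example.com/reset?token=abc
```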

Cypress: the equivalent setup

Cypress follows the same shape, using custom commands.

// cypress/support/commands.ts
import { CatchOTP } from '@catchotp/sdk';
const otp = new CatchOTP({ apiKey: Cypress.env('CATCHOTP_KEY') });

Cypress.Commands.add('createInbox', () => {
  return cy.wrap(otp.inboxes.create({ mode: 'ephemeral', ttlMinutes: 15 }));
});

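// cy.wrap() only waits defaultCommandTimeout (4 s by default) for a promise,
// so give it slightly longer than the 30 s the waiter itself may take.
const WAIT_TIMEOUT_MS = 35_000;

Cypress.Commands.add('waitForOtp', (inboxId: string) => {
  return cy.wrap(otp.inboxes.waitForOtp(inboxId, { timeoutSeconds: 30 }), {
    timeout: WAIT_TIMEOUT_MS,
  });
});

Cypress.Commands.add('waitForLink', (inboxId: string, hostnamePart: string) => {
  return cy
    .wrap(otp.messages.waitFor(inboxId, { timeoutSeconds: 30 }), { timeout: WAIT_TIMEOUT_MS })
    .then((msg) => msg.links.find((l) => l.url.includes(hostnamePart))!.url);
});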

// cypress/e2e/signup.cy.ts
describe('signup', () => {
  it('completes via OTP', () => {
    cy.createInbox().then((inbox) => {
      cy.visit('/signup');
      cy.get('input[name="email"]').type(inbox.address);
      cy.contains('Continue').click();

      cy.waitForOtp(inbox.id).then((code) => {
        cy.get('input[name="code"]').type(code);
        cy.contains('Verify').click();
        cy.contains('Welcome').should('be.visible');
      });
    });
  });
});

Same pattern, same primitives. Pick whichever runner you already use.

CI considerations that catch teams off guard

1. Concurrency limits

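If you run twenty parallel test workers and each creates an inbox, you need an inbox cap of at least twenty. Free tiers usually cap at five, Pro at fifty. The cheap interim fix is to put email-using tests in their own Playwright project and cap that project's worker count below your inbox limit (per-project workers needs a recent Playwright release; on older versions, set the top-level workers option instead).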

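// playwright.config.ts
import { defineConfig } from '@playwright/test';

export default defineConfig({
  projects: [
    { name: 'unit', testMatch: /unit/ },
    { name: 'e2e-no-email', testMatch: /e2e-no-email/, fullyParallel: true },
    // At most four email tests run at once, so at most four inboxes are live.
    { name: 'e2e-email', testMatch: /signup|magic|reset/, fullyParallel: false, workers: 4 },
  ],
});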

2. Timeouts

The default Playwright test timeout is 30 seconds. A test that waits 30 seconds for an OTP can blow the test timeout if the page navigation also takes a few seconds. Bump the test timeout for email-using tests:

test.setTimeout(60_000);
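
Rather than hard-coding the number, the test timeout can be derived from the waits a test performs, so the two values cannot drift apart. A sketch of the arithmetic, with a hypothetical helper name and an assumed 15-second allowance for navigation and assertions:

```typescript
// Sum of every email wait in the test plus a fixed allowance for page
// navigation, clicks, and assertions. The 15 s default is an assumption.
function emailTestTimeoutMs(
  emailWaitsSeconds: number[],
  overheadSeconds = 15,
): number {
  const waits = emailWaitsSeconds.reduce((sum, s) => sum + s, 0);
  return (waits + overheadSeconds) * 1_000;
}

console.log(emailTestTimeoutMs([30])); // 45000
console.log(emailTestTimeoutMs([30, 30])); // 75000
```

In a spec file the result would be passed straight through, for example `test.setTimeout(emailTestTimeoutMs([30]))`.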

3. CI secrets

The API key goes in your CI provider’s secret store. Per-pipeline scoped keys are best practice: one key for staging E2E, a different key for nightly regression. A leak in one does not cascade.

4. Environment variables

The fixture reads CATCHOTP_KEY from process.env. In Cypress, that has to come through Cypress.env(), which means setting it in cypress.config.ts:

// cypress.config.ts
import { defineConfig } from 'cypress';

export default defineConfig({
  e2e: {
    env: {
      CATCHOTP_KEY: process.env.CATCHOTP_KEY,
    },
  },
});
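
A missing key tends to surface as a confusing authentication error deep inside a test. A fail-fast check at config load time gives a clearer message. The helper below is a sketch: `requireEnv` is a hypothetical name, and it takes the environment as an argument so the example runs without any real secrets.

```typescript
type Env = Record<string, string | undefined>;

// Return the variable's value, or throw a message that names the variable.
function requireEnv(name: string, env: Env): string {
  const value = env[name];
  if (value === undefined || value.trim() === '') {
    throw new Error(`${name} is not set; add it to your CI secret store`);
  }
  return value;
}

// Illustrative values only; in a config file you would pass process.env.
const fakeEnv: Env = { CATCHOTP_KEY: 'test-key-123' };
console.log(requireEnv('CATCHOTP_KEY', fakeEnv)); // test-key-123
```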

5. Retries

If a test runner retries on failure (Playwright’s retries: 2, for example), each retry creates a fresh inbox. That is correct behavior — every retry should be hermetic — but it counts toward your message quota. Plan for it.
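
The worst case is simple arithmetic: every attempt creates one inbox and triggers at least one email, and a test with retries: 2 can run up to three times. A sketch, with a hypothetical function name:

```typescript
// Upper bound on test attempts in one CI run: each test may run once plus
// once per retry. Each attempt creates an inbox and receives mail.
function worstCaseAttempts(emailTests: number, retries: number): number {
  return emailTests * (1 + retries);
}

console.log(worstCaseAttempts(40, 2)); // 120
```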

Patterns that scale

A few patterns we have seen work well at a hundred-test scale.

Per-suite shared inbox for state-heavy flows

For tests that exercise long-running state (a 24-hour expiry, a billing-cycle webhook), use a persistent inbox per suite rather than ephemeral per test. Create the inbox in globalSetup, reuse across tests, delete in globalTeardown.
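
A reused inbox still holds mail from earlier tests, so a subject match alone can return a stale message. One way to guard against that is to ignore anything received before the current test started. The sketch below works on a plain array; the `receivedAt` field and the `newestSince` helper are assumptions for illustration, not documented SDK features.

```typescript
type StoredMessage = { subject: string; receivedAt: Date };

// Newest message matching `subject` that arrived at or after `since`.
function newestSince(
  messages: StoredMessage[],
  subject: RegExp,
  since: Date,
): StoredMessage | undefined {
  return messages
    .filter(
      (m) => subject.test(m.subject) && m.receivedAt.getTime() >= since.getTime(),
    )
    .sort((a, b) => b.receivedAt.getTime() - a.receivedAt.getTime())[0];
}

const testStart = new Date('2026-05-01T10:00:00Z');
const stored: StoredMessage[] = [
  { subject: 'Your trial expires soon', receivedAt: new Date('2026-05-01T09:00:00Z') },
  { subject: 'Your trial expires soon', receivedAt: new Date('2026-05-01T10:00:05Z') },
];
console.log(newestSince(stored, /expires/i, testStart)?.receivedAt.toISOString());
// 2026-05-01T10:00:05.000Z
```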

Use subject filters aggressively

When a single inbox might receive multiple emails, filter by subject regex to make sure you wait for the right one:

const verifyEmail = await otp.messages.waitFor(inbox.id, {
  subject: /verify your email/i,
  timeoutSeconds: 30,
});

Webhooks for long-running suites

For QA suites that run for hours, webhooks can be more efficient than long-poll: the test scheduler suspends until the webhook fires. catchotp Pro and above support per-inbox webhooks signed with a shared secret.
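
Whatever scheme the provider uses, the receiving side should verify the signature before trusting a webhook. The sketch below assumes an HMAC-SHA256 hex digest of the raw request body. That scheme, the payload fields, and the helper names are all assumptions for illustration, so check the provider's webhook documentation for the real header and algorithm.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto';

// Assumed scheme: signature = hex(HMAC-SHA256(secret, rawBody)).
function sign(rawBody: string, secret: string): string {
  return createHmac('sha256', secret).update(rawBody).digest('hex');
}

// Constant-time comparison; timingSafeEqual throws on unequal lengths,
// so lengths are checked first.
function isValidSignature(rawBody: string, signature: string, secret: string): boolean {
  const expected = Buffer.from(sign(rawBody, secret), 'hex');
  const received = Buffer.from(signature, 'hex');
  return expected.length === received.length && timingSafeEqual(expected, received);
}

// Made-up payload and secret.
const secret = 'example-shared-secret';
const body = JSON.stringify({ inboxId: 'inb_123', event: 'message.received' });
console.log(isValidSignature(body, sign(body, secret), secret)); // true
console.log(isValidSignature(body + ' ', sign(body, secret), secret)); // false
```

Verifying against the raw body, before any JSON parsing, matters because re-serializing the payload can change whitespace or key order and invalidate the digest.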

Failure debugging

Attach the inbox URL to test failure output. When the test fails at three in the morning, the on-call engineer can open the catchotp UI and see exactly what mail arrived and what got parsed.

test.afterEach(async ({ inbox }, testInfo) => {
  // Covers 'failed' and 'timedOut'; a stalled OTP wait usually ends as a timeout.
  if (testInfo.status !== testInfo.expectedStatus) {
    await testInfo.attach('catchotp inbox', {
      body: `https://app.catchotp.com/inboxes/${inbox.id}`,
      contentType: 'text/plain',
    });
  }
});

Wire it up this afternoon. Free tier covers most CI pipelines. Start free or browse the E2E testing use case for more patterns.

Related reading

How to Test OTP Flows in 2026
Programmable Email vs Disposable Email
The E2E testing use case and the QA automation use case cover framework-specific patterns.

The honest takeaway: real-email E2E testing is dramatically less effort than the workarounds you have been running, once you have the right primitive. A fixture and a waiter is the whole story. The hardest part is deleting the sleep(10).
