Edward Needham


En Punto

Switching to Ephemeral Preview Environments

2025-12-30

The last blog post was left on a bit of a cliffhanger: even though we had solved the tooling issues (Playwright, Mock Service Worker), we could already see problems with scale and our testing infrastructure.

If you want to read the previous post in this series, click here.

TL;DR

We moved manual testing before merge by spinning up a per-PR preview environment (app + database branch), instead of queueing everything behind a long-lived staging environment. It removes the deployment bottleneck, but it comes with new maintenance (especially around seed data).

What broke at scale

As a reminder, we made the tradeoff to dockerise the end-to-end (E2E) testing infrastructure so the environment behaves like a mirror of production, allowing E2E tests to run as if they were a real user. There were issues with our approach; it was a tradeoff after all.

Previously, every time we created a new PR we triggered unit and integration tests. We could also trigger E2E tests if we wanted. The app would deploy and it could be manually tested too.

In that solution, manual testing came last in our workflow, after the code had already been merged.

This creates several problems. First, tests don’t necessarily capture the intended behaviour—only what the person writing them thinks the behaviour should be. Second, we had to wait for tests to run and pass before deploying, and in our flow merging was a prerequisite for deployment. Then we had to wait for manual testing, which could fail. During that time, development waits.

So we had two issues: workflow order and a long-lived staging environment. If all deployments go to the same place, there’s going to be a queue.

The change: ephemeral previews

Ephemeral environments have been our solution, and they solve both issues:


Pull Request Opened (opened, synchronize, reopened)
→ Fail Fast: Backend Tests (Go unit & integration tests), Frontend Tests (Vitest unit & component tests)
→ Deploy: Deploy Preview (Neon DB + Fly.io + Vercel)
→ Validate: E2E Tests (Playwright against preview), QA Test (manual testing on preview)
→ Merge: Ready to Merge

Here’s the gist:

Before: PR → tests → merge → deploy → manual testing (too late) → fix forward
After: PR → tests → deploy preview → manual testing (before merge) → merge → cleanup

We now deploy earlier (the Deploy stage in the diagram), not last. We still run unit and integration tests; the difference is that we have a deployment we can manually test before we merge the code. We have removed the long-lived staging environment. Because the preview infrastructure is the same shape as production, code is merged into main once everything works on the preview.

The frontend is deployed on Vercel, which automatically creates an independent preview on every deploy. The API is deployed on Fly.io, so the workflow creates a new app for each PR. The workflow also integrates with the Neon GitHub app to create a new database branch for every PR.
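To make the "one app and one database branch per PR" idea concrete, here is a minimal sketch of deriving the preview resource names from the PR number. The naming scheme (`api-pr-…`, `preview/pr-…`) and the `previewResources` helper are assumptions for illustration, not the names we actually use.

```typescript
// Hypothetical naming scheme: the prefixes below are illustrative only.
interface PreviewResources {
  flyApp: string; // name of the per-PR API app on Fly.io
  neonBranch: string; // name of the per-PR database branch on Neon
}

// Derive deterministic names so the deploy and cleanup workflows
// can both find the same resources from nothing but the PR number.
function previewResources(prNumber: number): PreviewResources {
  if (!Number.isInteger(prNumber) || prNumber <= 0) {
    throw new Error(`invalid PR number: ${prNumber}`);
  }
  return {
    flyApp: `api-pr-${prNumber}`,
    neonBranch: `preview/pr-${prNumber}`,
  };
}

console.log(previewResources(42));
```

Deterministic names matter mostly for teardown: the cleanup step does not need to store any state, it can recompute what to delete.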

When a PR is merged into main, a cleanup workflow removes the preview app from Fly.io and deletes the preview database branch on Neon.

That means a PR results in a preview frontend, a preview API app, and a dedicated preview database branch, so changes can be validated in isolation. The preview sticks around until the PR is merged (or closed), at which point it’s torn down.
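The lifecycle described above can be sketched as a mapping from pull request event actions to what happens to the preview. The action names are the ones GitHub sends for pull request events (`closed` covers both merged and unmerged PRs); the `previewStepFor` function and the `PreviewStep` type are hypothetical, a sketch rather than our actual workflow code.

```typescript
// `action` values GitHub sends on pull_request events that matter here.
type PullRequestAction = "opened" | "synchronize" | "reopened" | "closed";

// Hypothetical names for the two things that can happen to a preview.
type PreviewStep = "deploy" | "teardown";

// Opening, updating, or reopening a PR (re)deploys its preview;
// closing it, whether merged or not, tears the preview down.
function previewStepFor(action: PullRequestAction): PreviewStep {
  switch (action) {
    case "opened":
    case "synchronize":
    case "reopened":
      return "deploy";
    case "closed":
      return "teardown";
  }
}

console.log(previewStepFor("synchronize"));
console.log(previewStepFor("closed"));
```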

Because every change now has its own deployment, code can continue to be developed rather than waiting for E2E or manual tests to pass behind a staging queue.

We have also had to adapt our seeder, because we don't want to run the full seeder every time a new PR creates a preview database branch. Instead, we run a very small seeder with just enough data for the manual tests relevant to the code changes. This has a side benefit: it decreases the compute and storage we need in Neon, which reduces costs.
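One way to picture a lightweight seeder is as a small core data set plus optional sets that are only included when a PR touches the matching area. This is a sketch under assumptions: the area names, tables, rows, and the `selectSeeds` helper are all made up for illustration, and the real seeder may choose its data differently.

```typescript
// Hypothetical seed data: area names, tables, and rows are illustrative only.
type SeedRow = Record<string, string | number>;
type SeedSet = { table: string; rows: SeedRow[] };

const seedSets: Record<string, SeedSet[]> = {
  core: [{ table: "users", rows: [{ id: 1, email: "qa@example.com" }] }],
  billing: [{ table: "invoices", rows: [{ id: 1, user_id: 1, total_cents: 1200 }] }],
};

// Always include the core set, then add only the areas the PR touches.
// Unknown areas are ignored rather than treated as errors.
function selectSeeds(areas: string[]): SeedSet[] {
  const wanted: string[] = ["core"];
  for (const area of areas) {
    if (wanted.indexOf(area) === -1) wanted.push(area);
  }
  const selected: SeedSet[] = [];
  for (const area of wanted) {
    selected.push(...(seedSets[area] ?? []));
  }
  return selected;
}

console.log(selectSeeds(["billing"]).map((set) => set.table));
```

The tradeoff is the one the post describes: each set has to be kept in step with what the app expects, but a preview branch only ever holds the rows a tester actually needs.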

Is this new workflow perfect? No. Seed data is now part of the cost: we need to keep the lightweight seeder aligned with what we expect to see in the app for each change. In return, we've removed the deployment bottleneck: every PR gets its own preview, so we can validate behaviour before merging to production. Software engineering is a game of tradeoffs.