Tabby E2E Test Results

Six priority invariants for the user-perceived quality bar. Each recording is the real test running under Xvfb + Playwright.

6 of 6 passing · recorded 2026-04-24

How to read this page

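Each test targets one promise the app makes to the user, something we'd be upset to regress. Each section has the question the test answers, the scenario it runs, what to watch for in the recording, and what a failure would look like.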

The videos are silent, short (5–50 seconds), and unedited. They show the real Chromium window the test drives. The first part of each video is test setup (opening pages, capturing thumbnails); the actual invariant moment is usually in the last few seconds where Tabby’s grid is visible. For sub-100 ms invariants like no-blip, the relevant transition is too fast for the eye to see — the test output is the real proof.

01

no-blip

Pass 15.5 s

When I pinch-out on any page to open Tabby, is there ever a blank flash before the new tab shows content?

Scenario

  1. Open a page with a solid red background.
  2. Wait for Tabby to capture its thumbnail.
  3. Dispatch a ctrl+wheel pinch-out gesture.
  4. Tabby opens a new tab and runs its zoom-entrance animation.

What to watch for

The screen should stay visually filled throughout the transition. The red page's screenshot persists into the new Tabby tab as the pre-overlay, then springs down into its card — no white or dark flash in between.

Failure would look like

A ~50 ms flash of Tabby's background color (grey/dark) between the source page going away and the zoom-entrance starting. The test asserts that every requestAnimationFrame (rAF) tick from the tabbyReady handshake onward has a covering element.
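
The per-frame assertion can be sketched roughly as below. This is a minimal illustration, not Tabby's actual test code: `sampleFrames`, `firstUncoveredFrame`, and the injected `raf` and `isCovered` callbacks are hypothetical names. In a real page, `raf` would be `requestAnimationFrame` and `isCovered` would probe whether an opaque element fills the viewport.

```typescript
// Hypothetical per-frame sample: timestamp plus whether something covered the viewport.
interface FrameSample {
  t: number;
  covered: boolean;
}

// Shape of requestAnimationFrame, reduced to what the sketch needs.
type Raf = (cb: (t: number) => void) => void;

// Record one sample per animation frame until `frames` samples exist.
// `raf` and `isCovered` are injected so the sketch also runs outside a browser.
function sampleFrames(
  raf: Raf,
  isCovered: () => boolean,
  frames: number,
  done: (samples: FrameSample[]) => void,
): void {
  const samples: FrameSample[] = [];
  const tick = (t: number): void => {
    samples.push({ t, covered: isCovered() });
    if (samples.length >= frames) {
      done(samples);
    } else {
      raf(tick);
    }
  };
  raf(tick);
}

// Return the first frame with nothing covering the viewport, or null if every frame was covered.
function firstUncoveredFrame(samples: FrameSample[]): FrameSample | null {
  return samples.find((s) => !s.covered) ?? null;
}
```

A test built this way would start sampling at the handshake and fail if `firstUncoveredFrame` returns anything other than `null`.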

02

never-flash

Pass 46.8 s

When I drag a tab card from one Chrome window into another inside Tabby, does the card list ever collapse to empty during the rebuild?

Scenario

  1. Open 4 coloured test pages across two Chrome windows.
  2. Open Tabby in grid layout — two window sections are visible.
  3. Drive a real pointer-driven drag (mousedown, jog past dnd-kit's 8 px activation, move to target, mouseup) from a card in window B onto a card in window A.
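
The drag in step 3 might be driven along these lines. This is a sketch under assumptions: `planDrag`, `performDrag`, the 2 px margin, and the step counts are hypothetical, and `MouseLike` is a structural subset of Playwright's `page.mouse` (`move`, `down`, `up`) so the example needs no imports.

```typescript
interface Point {
  x: number;
  y: number;
}

type DragStep =
  | { kind: "down" }
  | { kind: "up" }
  | { kind: "move"; x: number; y: number; steps: number };

// Structural subset of Playwright's `page.mouse`.
interface MouseLike {
  move(x: number, y: number, options?: { steps?: number }): Promise<void>;
  down(): Promise<void>;
  up(): Promise<void>;
}

// Build the gesture: hover the source, press, jog past the activation
// distance, travel to the target, release.
function planDrag(from: Point, to: Point, activationPx = 8): DragStep[] {
  const jog = activationPx + 2; // small margin past the threshold (assumed value)
  return [
    { kind: "move", x: from.x, y: from.y, steps: 1 },
    { kind: "down" },
    { kind: "move", x: from.x + jog, y: from.y, steps: 5 },
    { kind: "move", x: to.x, y: to.y, steps: 20 },
    { kind: "up" },
  ];
}

// Replay the plan against a mouse, one step at a time.
async function performDrag(mouse: MouseLike, plan: DragStep[]): Promise<void> {
  for (const step of plan) {
    if (step.kind === "down") await mouse.down();
    else if (step.kind === "up") await mouse.up();
    else await mouse.move(step.x, step.y, { steps: step.steps });
  }
}
```

In a Playwright test this would be called as `await performDrag(page.mouse, planDrag(sourceCenter, targetCenter))`, with the two centers read from the cards' bounding boxes.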

What to watch for

The card count holds steady through the drag — you can see the card lift, the window sections reflow, and the card settle into its new section. No blank grid at any point.

Failure would look like

A single frame where both window sections contain zero cards — the "rebuild off-screen, fade in" invariant is broken and we're rendering an empty state mid-transition. A MutationObserver plus a 20 ms poll records over 1700 samples; if any reports zero cards, the test fails.
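
The sampling described above could be collected with something like the following. It is a minimal sketch: `CardCountLog` and its methods are hypothetical names, and the browser wiring is shown only in comments because it depends on Tabby's DOM, which is not documented here.

```typescript
// One observation of how many cards are currently rendered.
interface CountSample {
  t: number;
  source: "mutation" | "poll";
  cards: number;
}

class CardCountLog {
  readonly samples: CountSample[] = [];

  // Append one observation; `t` defaults to the current wall-clock time.
  record(source: CountSample["source"], cards: number, t: number = Date.now()): void {
    this.samples.push({ t, source, cards });
  }

  // Every sample in which the grid was empty; the test passes only if this is [].
  emptySamples(): CountSample[] {
    return this.samples.filter((s) => s.cards === 0);
  }
}

// Assumed in-page wiring (the `countCards` helper and `gridRoot` are hypothetical):
//   const log = new CardCountLog();
//   new MutationObserver(() => log.record("mutation", countCards()))
//     .observe(gridRoot, { childList: true, subtree: true });
//   setInterval(() => log.record("poll", countCards()), 20);
```

Using both sources is deliberate: the observer catches DOM changes as they happen, while the poll still produces samples during stretches where nothing mutates.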

03

never-lose-thumb

Pass 45.8 s

After a cross-window drag, does the moved tab still show its captured thumbnail — or does it revert to a favicon placeholder?

Scenario

  1. Three test pages across two Chrome windows. All thumbnails captured.
  2. Remember the red card's <img src> (a data-URL).
  3. Real pointer-driven drag of the red card from window B onto window A.
  4. Re-read the red card's <img src> after the UI settles.

What to watch for

The red card arrives in window A's section still displaying a red image. The IndexedDB (IDB) row for the red URL still holds a healthy data-URL (> 1 KB) under the new tab's ID.
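
A "healthy" thumbnail check could look roughly like this. The function name is hypothetical, and reading "> 1 KB" as "more than 1024 characters of data-URL string" is an assumption; the real test may measure decoded bytes instead.

```typescript
// Threshold taken from the "> 1 KB" wording above; the unit is assumed to be string length.
const MIN_THUMB_CHARS = 1024;

// True when `src` is a data-URL large enough to plausibly hold a real screenshot,
// false for missing values, ordinary URLs (e.g. a favicon), or tiny payloads.
function isHealthyThumbnail(src: string | null | undefined): boolean {
  if (!src || !src.startsWith("data:")) return false;
  return src.length > MIN_THUMB_CHARS;
}
```

The same predicate can be applied twice: once to the card's `<img src>` after the drag, and once to the value read back from the IDB row.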

Failure would look like

The card lands in its new section but shows a grey placeholder with just the favicon. This is the failure mode the README's "never lose thumbnails — heal from cache or getThumbnail retry" invariant protects against.

04

no-scroll-jump

Pass 46.1 s

If the grid re-renders because of an external Chrome event (tab moved, activated, etc.) while I'm scrolled halfway down, does the viewport snap back to the top?

Scenario

  1. 14 tabs in Tabby's grid at a 600 px viewport — vertical scroll is forced.
  2. Dismiss the welcome overlay, switch to grid view.
  3. Scroll to y = 662 (60% of max scroll).
  4. Fire chrome.tabs.move to shuffle tab order — triggers the debounced refresh in Tabby without changing focus.

What to watch for

Scroll position stays at 662 throughout the re-render. The cards reflow but the viewport doesn't jump. A 30 ms scroll logger confirms zero drift — all 51 samples read y=662.
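
The zero-drift check reduces to a small computation over the logged samples. A minimal sketch, with `maxScrollDrift` as a hypothetical helper name:

```typescript
// Largest absolute deviation of any logged scroll position from the expected one.
// Zero means the viewport never moved during the re-render.
function maxScrollDrift(samples: number[], expected: number): number {
  let worst = 0;
  for (const y of samples) {
    worst = Math.max(worst, Math.abs(y - expected));
  }
  return worst;
}
```

With the run described above, all 51 samples equal 662, so the drift is 0; a single sample at 0 would make it 662 and fail the test.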

Failure would look like

Scroll snaps to 0 or near-0 because scrollHeight momentarily contracts during the re-render and the browser clamps. The user would see the grid "jump to the top" every time a tab is opened or moved elsewhere in Chrome.
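
The clamping behind this failure can be modelled in a few lines. This is an illustrative model of how a scroll offset is bounded by the content height, not Tabby code; `clampScrollTop` is a hypothetical name and the heights in the usage note are made-up values.

```typescript
// A scroll offset can never exceed scrollHeight - clientHeight, so if the content
// shrinks, the offset is pulled down with it.
function clampScrollTop(scrollTop: number, scrollHeight: number, clientHeight: number): number {
  const max = Math.max(0, scrollHeight - clientHeight);
  return Math.min(Math.max(0, scrollTop), max);
}
```

For a 600 px viewport scrolled to 662: with an assumed full content height of 1703 px the offset stays at 662, but if the content briefly contracts to 600 px during the re-render the offset clamps to 0, and it stays there once the content grows back.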

05

navigate-recapture

Pass 5.4 s

When a tab navigates to a new URL, does Tabby capture a fresh thumbnail — or does the old one stick around?

Scenario

  1. Open a red page, wait for Tabby to capture it.
  2. Navigate the same tab to a teal page.
  3. Re-activate the tab (triggers the capture path).
  4. Read IDB and fingerprint the stored data-URL.

What to watch for

After the navigation, the stored thumbnail matches the new teal content, not the old red. The test fingerprints the thumbnail using three 64-char slices from the middle of the base64 payload, so this is a content comparison, not a metadata comparison.

Failure would look like

Tab shows teal content, but Tabby's card shows a red thumbnail. Users would see stale visuals for any page they've navigated away from. Note that the naive approach of comparing data-URL prefixes doesn't work: the JPEG header at the start of the payload is identical across these thumbnails.
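
The fingerprinting idea can be sketched as follows. The name `fingerprint` is hypothetical, and the slice offsets (one quarter, one half, and three quarters of the way through the payload) are an assumption; the text only says the three slices come from the middle.

```typescript
// Fingerprint a data-URL by sampling its base64 payload away from the header,
// where different images actually differ.
function fingerprint(dataUrl: string, sliceLen = 64): string {
  const comma = dataUrl.indexOf(",");
  const payload = comma >= 0 ? dataUrl.slice(comma + 1) : dataUrl;
  // Assumed offsets: 1/4, 1/2 and 3/4 of the payload length.
  return [0.25, 0.5, 0.75]
    .map((fraction) => {
      const start = Math.floor(payload.length * fraction);
      return payload.slice(start, start + sliceLen);
    })
    .join("|");
}
```

Two thumbnails that share the same leading bytes still produce different fingerprints as long as their pixel data differs, which is exactly the case a prefix comparison misses.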

06

oob-flow

Pass 6.4 s

On a fresh install, does the welcome → refresh → congratulations → confetti → grid sequence complete end-to-end without getting stuck?

Scenario

  1. Fresh Chromium profile — Tabby auto-opens its NTP on install.
  2. Welcome dialog appears. Three test tabs opened in the background.
  3. Click "Refresh Thumbnails" → refresh popup window opens with progress UI.
  4. Popup closes when done → congratulations overlay with capture stats.
  5. Click "Start Using Tabby" → confetti plays → grid is populated.

What to watch for

Every stage reaches its successor without timing out. The final grid renders with cards present. Assertions are structural — they verify the flow, not the capture success count.
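
A structural check of this kind might be expressed as below. It is a sketch only: the stage names are taken from the question above, while `OOB_STAGES`, `firstStuckStage`, and the idea of logging observed stages as strings are assumptions about how such a test could be organised.

```typescript
// Expected order of the out-of-box flow.
const OOB_STAGES = ["welcome", "refresh", "congratulations", "confetti", "grid"] as const;
type OobStage = (typeof OOB_STAGES)[number];

// Walk the observed stage log in order and return the first expected stage that
// never appeared after its predecessor, or null if the whole flow completed.
function firstStuckStage(observed: readonly string[]): OobStage | null {
  let cursor = 0;
  for (const stage of OOB_STAGES) {
    const idx = observed.indexOf(stage, cursor);
    if (idx === -1) return stage;
    cursor = idx + 1;
  }
  return null;
}
```

Because the check only looks at which stages were reached and in what order, it stays green even when the capture count is 0 of N.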

Failure would look like

Stuck on any stage — e.g., refresh popup never closes, congratulations never appears, or confetti plays but grid never populates. Note: under Xvfb the capture count reads 0/N because captureVisibleTab needs real OS window focus; that's not a regression and is covered by navigate-recapture separately.