# Team QA Playbook — make sure it actually works

**TrueBalance × TrueCredits Hackathon 2026 · class: `playbook`**

---

## TL;DR

Your app demos fine on **your** laptop — then a judge opens the URL and it spins forever, shows a blank screen, or 500s on the second click. You don't need more features. You need a **20-minute QA pass**: drive the real journey, break it on purpose, paint the failure states.

**Prime directive — test the user journey through the live app, not just your API.**
A `200 OK` does **not** mean the screen rendered, the button fired, the result is correct, or the page survives a refresh. The bug that loses you the demo lives *between* "the API returned 200" and "the screen shows the right thing."

Feed this page to your Claude session and say:

> **"Read this QA playbook and run a self-QA pass on my app at `<my-url>` — start with the Quick pass. Find what breaks before the judge does."**

> 🤖 **Reading this as an AI / Claude session?** The **Quick pass (§A)** below is enough for most hackathon apps — run those 6 checks first. Only drop into the **Full handbook (§B)** when a specific check needs depth (emulator/adb commands, backward-compat specifics, cross-platform decisions, the gating rules). Don't load §B unless a check calls for it.

---

# §A · Quick pass — the 20-minute version

The whole deal: **judge the journey → run the 6 checks → 60-second pre-demo check.** Each check is ~3 minutes, and each is something a judge *will* do to you.

### The one rule

A green `curl` is necessary, not sufficient. Open your **real deployed URL** (`https://<your-slug>.hackathon.afinit.dev`) in a fresh browser and actually walk the demo flow end to end. If you only tested in Postman / `curl`, you have **not** tested your app.

> If the URL shows the **"warming up"** page, your app isn't answering on port `8080` yet — that's a deploy problem. Fix it via the deploy playbook (`/playbooks/deployment`) first.

### The 6 checks

1. **Cold-start as a stranger** — open your URL in an **incognito window**: no cached login, no `localStorage`, no warmed-up backend. Does the first screen load in a few seconds, or hang / blank / throw? Judges arrive cold; your laptop is warm. This is the #1 way live demos die.
2. **Walk the golden path twice** — run your headline flow start → finish. Then do it **again without reloading**. Stateful bugs (double-submit, "already exists", stale cache, a counter that never resets) surface on run #2 — exactly when you're demoing live.
3. **Happy + one ugly input per step** — for every field/button, try the happy value **and** one bad one: empty, huge, emoji, wrong type, the browser **Back** button, a double-click. The app should reject it *gracefully*, never white-screen or 500. "What if the judge types nothing and hits submit?"
4. **Every screen has empty / loading / error** — no spinner that spins forever, no blank where data failed, no raw stack trace on screen. Show a **loading** state, an **empty** state ("nothing here yet"), and a **human** error message. *(This is design playbook move #6 — borrow `--tb-text-sub` for placeholders, `--tb-error` for failures.)*
5. **Mobile + desktop** — judges open links on a phone. Resize to ~**390px** wide (or open it on your phone): does the layout hold, are buttons tappable, does anything run off-screen? At minimum, your **demo path** must survive both widths.
6. **Refresh & deeplink survival** — hit **F5** in the middle of the flow, and paste your "share this" URL into a new tab. Does state survive, or does the user land on a broken / empty screen? Cold-loadable URLs are exactly what a judge clicks.

> **Got real data wired in?** Re-run the golden path against a **fresh** record, not the one row you've hand-tested all day. The demo always uses a new user.
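Check 3 can be partly scripted at the API layer before you repeat it by hand in the browser. The sketch below is only a starting point: `ugly_values` and `classify_status` are hypothetical helper names, the four bad inputs are an arbitrary starter set, and the status-to-verdict mapping is an assumed convention (4xx counts as a graceful reject, 5xx as a crash), not something the platform enforces.

```bash
# ugly_values: print one bad input per line (hypothetical starter set for check 3).
ugly_values() {
  echo ''                                        # empty
  head -c 5000 /dev/zero | tr '\0' 'A'; echo     # huge: 5000 characters
  echo '🙂🔥'                                     # emoji
  echo 'not-a-number'                            # wrong type for a numeric field
}

# classify_status HTTP_STATUS: turn a status code into a verdict (assumed convention).
classify_status() {
  case $1 in
    4??)     echo "graceful-reject" ;;           # the app said no politely
    5??)     echo "server-crash" ;;              # this is the one a judge will find
    2??|3??) echo "accepted-check-manually" ;;   # a bad input got through; look at the result
    *)       echo "no-response" ;;               # curl reports 000 when nothing answered
  esac
}

# Example loop (hypothetical endpoint path and field name; replace with your own):
#   ugly_values | while IFS= read -r v; do
#     code=$(curl -s -o /dev/null -w '%{http_code}' --data-urlencode "amount=$v" "$URL/apply")
#     echo "$(classify_status "$code")  input_length=${#v}"
#   done
```

A clean run here only clears the API. The same inputs still need to go through the real form, because a 4xx that the frontend renders as a white screen is still a failed check.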

### 60-second pre-demo check ✅

- ✅ Opened in a **fresh incognito window** — first screen loads, no blank / hang?
- ✅ Golden path runs **twice in a row** without a reload?
- ✅ Every screen shows **loading / empty / error** instead of a spinner-of-death or a blank?
- ✅ One **bad input** per form is handled (no white-screen, no 500)?
- ✅ Works at **mobile width** and **survives a refresh**?

Five ✅ = demo-proof. Nothing loses a room faster than a blank screen on the judge's first click.

---

# §B · Full handbook — the 9-card engineering self-QA

> The complete version, for when a check needs depth. Source: the internal **AI Self-QA** handbook (engineer-grade, modeled on the TFS Deployment & Change cards). At the hackathon **you are the QA** — the gate isn't "hand off to QA," it's *"before a judge clicks it."* Use the cards your build actually touches; a static web demo can skip the adb/prod-ticket bits, a team rebuilding the real loan journey will want all of them.
>
> Each card is **Goal / Trigger / Self-check / AI prompt / Pass criteria** — designed so you (or an AI assistant) answer it in minutes.

## Card 1 — API smoke (happy + 1 error)

| | |
|---|---|
| **Goal** | At least the happy path and one error path of every endpoint you touched return the correct status + body shape. |
| **Trigger** | After deploying to your box. |
| **Self-check** | Hit every endpoint your app exposes or calls with `curl`, Postman, or a REST client; check the status code and the result field, and assert on at least one value in the body. |
| **If you build on the loan/repayment journey** | The existing Postman collections cover the full TB/TC loan journey + CMS repayment end to end (shared in `#all_eng_internal`) — reuse them instead of writing your own. Otherwise, smoke **your** endpoints. |
| **Trap** | A `200` with `data.result: "FAIL"` or `data.code: 50001` is **NOT** a pass — inspect the wrapped body, not just the HTTP status. |
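The trap above can be turned into a small check that looks past the HTTP status. This is a minimal sketch, assuming the wrapped body is JSON carrying the `result` and `code` fields named in the Trap row; `smoke_verdict` is a hypothetical helper name, and the `grep` match is deliberately crude (it does not verify that the fields sit under `data`). A real JSON parser is the better tool if you have one installed.

```bash
# smoke_verdict HTTP_STATUS BODY -> prints PASS, or FAIL:<reason>
smoke_verdict() {
  local status=$1 body=$2
  case $status in
    2??) ;;                                      # transport-level success, keep checking
    *)   echo "FAIL:http-$status"; return 1 ;;
  esac
  # A 2xx can still wrap a failure, so inspect the body as well.
  if printf '%s' "$body" | grep -Eq '"result"[[:space:]]*:[[:space:]]*"FAIL"'; then
    echo "FAIL:result"; return 1
  fi
  if printf '%s' "$body" | grep -Eq '"code"[[:space:]]*:[[:space:]]*50001([^0-9]|$)'; then
    echo "FAIL:code-50001"; return 1
  fi
  echo "PASS"
}

# Example (URL is a placeholder):
#   status=$(curl -s -o /tmp/body.json -w '%{http_code}' "$URL")
#   smoke_verdict "$status" "$(cat /tmp/body.json)"
```

The function only rules out the two failure shapes the card names. It says nothing about whether the body content is correct, so the "at least one assertion on body content" step still applies.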

## Card 2 — E2E on the app *(critical — do not skip)*

| | |
|---|---|
| **Goal** | Drive the *user* journey through the actual app on a device/emulator/browser. API-only passes do not count. |
| **Trigger** | After API smoke, before you show anyone. |
| **Self-check** | Cold-start the app, sign in as the test user, walk through every screen affected by your change. Screenshot each step. For deeplinks / push: trigger them from outside the app and confirm they land on the right screen. |
| **Why mandatory** | API responses can be perfect while the app silently mis-renders, mis-navigates, caches stale data, breaks the back-stack, or fires the wrong analytics event. Most "the API was fine in Postman" stories end in a broken demo. |
| **AI prompt** | "Given screen X's intent and these screenshots, list visible mismatches (copy, alignment, missing button, wrong currency format)." |
| **Coverage minimum** | All NEW screens + every screen one hop away from a modified screen. |

> **Web app (most hackathon teams)?** Your "device" is a browser — skip the adb/emulator how-tos below; cold-load your deployed URL in incognito (Quick-pass check 1) and walk it. The how-tos are for teams running the **real TB/TC Android build**.

### Card 2.a — Launch an emulator (Android builds only)

```bash
# 1. List AVDs configured locally
"$ANDROID_HOME/emulator/emulator" -list-avds        # -> Medium_Phone_API_36.0, Pixel_7_API_32, ...

# 2. Boot one in the background (no audio, no simulated network delay, full network speed)
"$ANDROID_HOME/emulator/emulator" -avd Medium_Phone_API_36.0 -no-audio -netdelay none -netspeed full > /tmp/emu.log 2>&1 &

# 3. Wait until boot completes (-e targets the running emulator, not a USB device)
adb -e wait-for-device
until [ "$(adb -e shell getprop sys.boot_completed 2>/dev/null | tr -d '\r')" = "1" ]; do sleep 3; done

# 4. Confirm
adb devices                                          # -> emulator-5554   device
```

### Card 2.b — Install your build via adb (Android builds only)

```bash
adb devices                                          # 0. list devices / emulators
adb uninstall com.balancehero.true_balance           # 1. (optional) clear dirty state; this wipes app data
adb install -r -d -t -g path/to/app-debug.apk        # 2. -r replace existing app (keeps data), -d allow downgrade, -t allow test APKs, -g grant all runtime perms
adb shell dumpsys package com.balancehero.true_balance | grep -E "versionName|versionCode"   # 3. verify
adb shell am force-stop com.balancehero.true_balance                                          # 4a. stop any running instance so the launch is cold
adb shell monkey -p com.balancehero.true_balance -c android.intent.category.LAUNCHER 1        # 4b. launch
adb logcat -v time | tee /tmp/app.log                # 5. stream logs while testing (blocks; run it in a second terminal)
adb exec-out screencap -p > screen.png               # 6. screenshot any time
```

> Multi-module builds (AAB / split APKs): use `bundletool install-apks`, or share a universal `.apk` from CI.
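Once the log from step 5 has been captured, a quick scan for crash markers saves scrolling through it by eye. The sketch below assumes the log was saved to `/tmp/app.log` as in the commands above; `scan_app_log` is a hypothetical helper name. The two patterns are the strings Android normally logs for an uncaught exception (`FATAL EXCEPTION`) and an unresponsive app (`ANR in`). A clean scan does not prove the screens rendered correctly.

```bash
# scan_app_log [FILE]: print crash/ANR lines with their line numbers.
# Returns 0 when the log is clean, 1 when any marker was found.
scan_app_log() {
  local log=${1:-/tmp/app.log}
  if grep -nE 'FATAL EXCEPTION|ANR in ' "$log"; then
    return 1        # matching lines were printed by grep above
  fi
  return 0
}

# Example:
#   scan_app_log /tmp/app.log && echo "no crash markers in log"
```

Pair it with the screenshots from the self-check: the log tells you the app did not crash, and the screenshots tell you it showed the right thing.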

## Card 3 — Cross-platform coverage

| | |
|---|---|
| **Goal** | Confirm whether the change needs to run on platforms beyond the one you build on — Android, iOS, Web. |
| **Trigger** | Pre-demo / pre-PR. |
| **Self-check** | List every surface (Android, iOS, Web) that consumes the modified endpoint / screen / SDK. For each, decide: (a) needs an identical test here; (b) inherits behavior, no extra test; (c) explicitly out of scope. |
| **AI prompt** | "Given my diff, list every client (Android / iOS / Web) that calls the modified module or endpoint, and for each say whether a platform-specific E2E run is required." |
| **Pass if** | Every required platform has Card 2 (E2E) done. |

## Card 4 — Impact on other features

| | |
|---|---|
| **Goal** | Identify and validate any feature your change can affect, even if it wasn't in the original requirement. |
| **Trigger** | After the primary flow passes Card 2. |
| **Self-check** | For each touched function / endpoint / shared resource, list other features that call it or depend on its output, and walk through each in the app. |
| **AI prompt** | "List all features that depend on {function/endpoint/shared resource}. For each, describe the user journey and how to spot-check it." |
| **Pass if** | No new failures in any dependent feature. |

## Card 5 — Backward compatibility

| | |
|---|---|
| **Goal** | Existing clients (older app versions, existing API consumers, existing DB rows) keep working after this change. |
| **Trigger** | Any change to a public API contract, persisted schema, shared model, or config key. |
| **Self-check** | • **API contract**: no existing field removed, renamed, or retyped; new fields are optional with safe defaults; enum values may be added but never removed or reordered. • **Old app versions**: behavior degrades gracefully on older `versionCode`s still live. • **DB**: schema changes are additive (no `DROP`, no rename, no new `NOT NULL` column without a default); migrations are reversible. • **Config / flags**: new keys default to existing behavior. • **Payloads**: deserializers tolerate unknown fields. |
| **AI prompt** | "Given the diff in {file/migration/proto}, list every backward-compatibility risk: removed/renamed/retyped fields, narrowed enums, dropped columns, non-defaulted NOT NULL adds, behavior changes for unset config." |
| **Pass if** | Old client + new server works; old server + new client works during rollout; old DB rows still parse. |
| **Hackathon note** | Mostly irrelevant for a 2-day throwaway — matters only if you extend the *real* TB/TC stage backend that existing clients hit. |
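For the API-contract bullet, the removed-or-renamed case can be spot-checked by diffing field names between the old and the new response. This is a rough sketch under stated assumptions: you have already dumped the field names of each response into a text file, one name per line (how you extract them is up to you), and `removed_fields` is a hypothetical helper name. It catches removals and renames only, not retyped fields or narrowed enums.

```bash
# removed_fields OLD_FIELDS NEW_FIELDS
# Prints every field name that appears in OLD_FIELDS but not in NEW_FIELDS.
# Empty output means nothing an old client relies on has disappeared.
removed_fields() {
  # -v invert, -x whole-line match, -F fixed strings, -f read patterns from file
  grep -vxFf "$2" "$1" || true   # grep exits 1 when nothing was removed; that is the good case
}

# Example (file names are placeholders):
#   removed_fields old_fields.txt new_fields.txt
```

A renamed field shows up here as a removal, which is the correct reading from an old client's point of view.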

## Card 6 — Regression sanity (5-minute golden path)

| | |
|---|---|
| **Goal** | The 3-5 most critical journeys (sign-in, apply, disburse, repay, foreclose — or *your* app's equivalent) still work end to end. |
| **Trigger** | Before merge / before you demo. |
| **Self-check** | Run the existing E2E suite against your branch, or do a manual app run if your change touches the journey. |
| **Pass if** | Zero new failures. Known flakes documented. |

## Card 7 — PR review *(gate, if you're merging into a shared repo)*

| | |
|---|---|
| **Goal** | The PR is reviewed + approved by at least one peer / AI reviewer before it's relied on. |
| **Trigger** | After Cards 1-6 are ✅. |
| **Self-check** | (a) PR description lists covered behaviors + Cards 1-6 evidence; (b) ≥1 human approval; (c) AI code review run (`/ultrareview` or equivalent) — every CRITICAL finding addressed. |
| **Pass if** | ≥1 approval, no unresolved CRITICAL findings, CI green. |
| **Hackathon note** | Skip if you're on a throwaway team repo; keep it if you're sending a PR into a real `TechSquad/*` repo. |

## Card 8 — Production-hot-path readiness

| | |
|---|---|
| **Goal** | Extra guardrails before anything touches real users / prod. |
| **Trigger** | Only if deploying to prod (not a normal hackathon path). |
| **Self-check** | (a) feature flag default OFF; (b) one-line rollback documented; (c) on-call paged; (d) dashboard URL in the deploy ticket; (e) BCM/TCM/SIM tickets attached. |
| **Reference** | TFS "Deployment Ticket (BCM/TCM) Approval Workflow" — `#all_eng` handbook. |
| **Hackathon note** | Not part of the hackathon — your box is torn down after the event. Here for completeness / when this playbook is reused for real work. |

## Card 9 — QA / demo handover note

| | |
|---|---|
| **Goal** | Whoever picks this up next (QA, a teammate, or *you* at the demo) shouldn't have to re-discover what's tested. |
| **Trigger** | When you call it done. |
| **Self-check** | Write one note listing: which behaviors you covered, which devices/envs/platforms you tested, screenshots from Card 2, and what you intentionally did **not** cover. At the hackathon this is your **demo run-sheet** (the exact happy path you'll click on stage). |
| **AI prompt** | "From my self-QA notes, write a 6-bullet handover / demo run-sheet." |

---

## Why Card 2 ("E2E on the app") is the hardest non-negotiable

Most incidents traceable to a code change pass API testing but fail user-app testing. Common patterns:

- API returns `200`, app shows a blank screen because the response shape changed.
- API rejects with `code 50001` but the action silently succeeded on the backend.
- Deeplink resolves to the wrong screen after a cold start.
- Form validation works in stage WebView but fails on a real device.
- A copy change passes review but doesn't pluralize correctly in Hindi.

Running through the actual app surfaces these in ~10 minutes. Skipping it just shifts the cost to the demo, in front of the judges.

---

*Pairs with: deploy playbook (`/playbooks/deployment`, get it live first) and design playbook (`/playbooks/design`, make it look intentional). Ship → paint → prove it works.*