§ CS-88/cursor-regression-loop-resolved-for-healthtech

Healthtech (outpatient clinical workflow) · Cursor · Break-the-Loop Refactor

Cursor regression loop fix — 11 bugs/week to 1 and shipped to 500 clinicians

A Cursor regression-loop fix for Charthealth, a four-person healthtech team building chart-review tooling for outpatient clinicians. Every Cursor prompt broke something else, architectural memory dropped past file seven, CI on the 312-test Vitest suite was green only about 40% of the time (and then by coincidence), and PHI was being logged to Sentry unredacted. The Break-the-Loop Refactor codified the architecture in ESLint + TypeScript rules, replaced the suite with 88 deterministic tests, fixed the PHI handling with a Sentry beforeSend hook, and moved audit logging to a Postgres trigger. User-reported bugs: 11/week → 1/week. The product shipped to 500 clinicians across nine clinics within the following month.


Updated April 12, 2026 · 11 min read · by Hyder Shah · client: Charthealth (name changed)

§ CS-88.1/headline-numbers

User-reported bugs per week: 1
Test suite size / reliability: 88 tests, 100% green CI
PHI fields in Sentry events: 0
Audit log coverage: 4 of 4 (DB trigger)

§ CS-88.2/client-context

About the healthtech (outpatient clinical workflow) client

Charthealth (name changed) is a healthtech team building outpatient clinical-workflow software. At the start of the engagement it was pre-revenue with 3 pilot clinics signed; it has since grown to 500 clinicians across 9 clinics. They built their product with Cursor and shipped it to pilot users before discovering that the generated scaffolding masked a set of production-grade failures. The engagement that followed was scoped as a Break-the-Loop Refactor ($3,999 fixed fee).

Stack before

Cursor (agent mode) · Next.js (ad-hoc) · Supabase · Ad-hoc Vitest (failing)

Stack after

Cursor (scoped prompts) · Next.js (feature-sliced) · Supabase (RLS + audit log) · Vitest + Playwright (CI-gated) · Datadog

§ CS-88.3/day-zero-autopsy

Audit findings on day zero

What the first production-readiness pass uncovered before a single line of code was changed. Each finding is a specific Cursor failure mode we’ve seen repeat across engagements.

F01
The fix-one-break-another loop
The lead developer described it exactly as the Medium review does: "The filter worked, but the table stopped loading. I asked it to fix the table, and the filter disappeared." Every agent-mode run destabilised something elsewhere. The team was shipping one step forward and one step sideways for six straight weeks. The frustration compounded because each individual prompt felt like progress — Cursor was visibly making changes and the changes typically did fix the requested bug — but the regression rate meant net velocity was effectively zero. Several team members independently considered quitting and reported the issue as the primary reason.
F02
Architectural memory loss past file seven
Cursor's agent, when given a multi-file task, would forget the conventions established in the first few files by the time it reached the seventh. The auth middleware pattern set in file two was silently re-invented in file nine — with a subtly different permissions check.
F03
Fragile test coverage (no real signal)
There were 312 Vitest tests. 47% were flaky, 29% asserted on implementation details that changed every prompt, and 11% were disabled outright. CI was green roughly 40% of the time by coincidence.
F04
HIPAA-adjacent security gaps
PHI fields (patient name, date of birth, diagnosis code) were logged to Sentry without redaction. The audit log table existed but was only written from one of four write paths. A clinician's session token was being stored in localStorage instead of an httpOnly cookie.
F05
No way to onboard a second engineer
The founder wanted to hire. Three candidates quit the take-home after reading the codebase. The files that were most Cursor-regenerated had the least consistent patterns.
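
For finding F04, the general shape of a redaction layer can be sketched as a function that walks an error event and masks values stored under known PHI keys before the event leaves the app. This is a minimal sketch under stated assumptions, not the code from the engagement: the key names in `PHI_KEYS` and the `[redacted]` placeholder are hypothetical, and the function is written without importing the Sentry SDK so it stays self-contained. It is shaped so that it could be passed as the `beforeSend` option to `Sentry.init`.

```typescript
// Hypothetical PHI key names; a real list would come from the team's data-handling policy.
const PHI_KEYS = new Set(["patient_name", "date_of_birth", "diagnosis_code"]);
const REDACTED = "[redacted]";

// Recursively copy a JSON-like value, masking anything stored under a PHI key.
function scrubPhi(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(scrubPhi);
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [key, inner] of Object.entries(value as Record<string, unknown>)) {
      out[key] = PHI_KEYS.has(key) ? REDACTED : scrubPhi(inner);
    }
    return out;
  }
  return value;
}

// Takes an event and returns the scrubbed copy that should be sent instead.
function beforeSend<T extends object>(event: T): T {
  return scrubPhi(event) as T;
}

// Wiring (not executed here, assumes the Sentry SDK is installed):
//   Sentry.init({ dsn: "...", beforeSend });
```

Key-based masking only covers structured fields. PHI embedded in free-text messages or stack-trace strings would need separate handling, which this sketch does not attempt.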

§ CS-88.4/root-cause-analysis

Root cause of the Cursor failure mode

Cursor's agent mode is a powerful local optimiser with no global memory. Given a file, it will make the file better; given a codebase, it will make each file better in a slightly different direction.

The causal chain: agent mode rewrites without a style/architecture contract → every rewrite introduces small drift → tests are also agent-written, so they drift with the code instead of anchoring it → regressions become invisible until a user reports them → the team prompts Cursor to fix the regression, which drifts something else.

Breaking the loop isn't about Cursor; it's about giving Cursor an anchor it can't rewrite: a codified architecture, a test suite that asserts behaviour not implementation, and prompt discipline that scopes changes.

The healthtech context made the loop especially expensive: every regression in clinical software is potentially a patient-safety incident, so each bug had to be fully investigated, root-caused, and documented before it could be triaged, even when the actual user impact was cosmetic. The team was spending three days of investigation work per real bug, and Cursor was generating new bugs faster than they could be closed. The break-even arithmetic was no longer working in favour of agent mode.
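
To make "asserts behaviour not implementation" concrete, here is a small illustrative sketch. The `Patient` type, the clinic filter, and the sample data are all hypothetical and not taken from the Charthealth codebase. The check looks only at what the filter returns for a given input, so two different internal implementations pass it unchanged. An implementation-detail test (for example, spying on whether `Array.prototype.filter` was called) would break the moment an agent rewrote the internals.

```typescript
interface Patient {
  id: string;
  name: string;
  clinicId: string;
}

type ClinicFilter = (patients: Patient[], clinicId: string) => Patient[];

// Implementation A: uses Array.prototype.filter.
const filterByClinic: ClinicFilter = (patients, clinicId) =>
  patients.filter((p) => p.clinicId === clinicId);

// Implementation B: same behaviour, different internals (e.g. after an agent rewrite).
const filterByClinicLoop: ClinicFilter = (patients, clinicId) => {
  const result: Patient[] = [];
  for (const p of patients) {
    if (p.clinicId === clinicId) result.push(p);
  }
  return result;
};

// Behaviour-level check: depends only on observable output for a fixed input.
function showsOnlyRequestedClinic(filter: ClinicFilter): boolean {
  const patients: Patient[] = [
    { id: "p1", name: "A", clinicId: "north" },
    { id: "p2", name: "B", clinicId: "south" },
    { id: "p3", name: "C", clinicId: "north" },
  ];
  const visibleIds = filter(patients, "north").map((p) => p.id).sort();
  return visibleIds.join(",") === "p1,p3";
}
```

In a Vitest suite the same idea would be an `expect` on the returned IDs. The plain boolean check is used here only to keep the sketch free of dependencies.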

§ CS-88.5/remediation

How we fixed the Cursor rescue stack

Each step below is one remediation workstream from the engagement. In cases where the underlying data includes before/after code vignettes, those render inline; otherwise we describe the change in prose.

01
Codified the architecture in a 3-page ARCHITECTURE.md and a set of ESLint + TypeScript rules that fail the build on violations. The conventions are now machine-enforceable, so Cursor either produces compliant code or produces code that won't merge.
02
Reorganised the codebase into feature slices (patients/, encounters/, audit/, auth/) with explicit boundaries. Each slice exposes a typed public API; cross-slice imports outside that API are ESLint errors.
03
Deleted the 312-test suite and rewrote 88 tests from scratch — Vitest for pure logic, Playwright for the four clinical workflows that must never regress (chart open, note sign, prescription send, audit-export). All 88 are deterministic, all run in CI, all block merge on failure.
04
Fixed the PHI logging: added a redaction layer in the Sentry beforeSend hook, scrubbed 6 months of historical events via the Sentry API, and moved session tokens to httpOnly, Secure cookies with SameSite=Lax.
05
Made the audit log a Postgres trigger, not an application-layer call. Every insert/update/delete on a PHI table writes to audit_events automatically. Verified by a pgTAP test that attempts writes through all four paths.
06
Wrote a 'Cursor playbook' for the team: scoped prompts (one file or one feature slice at a time), required test-first agent runs, and a PR template that fails if the architecture doc is violated. The team now uses Cursor without the regression tax.
07
Paired with the lead dev for a week on real tickets to lock the new workflow in. The next two engineering hires passed the take-home on the cleaned repo.
08
Wrote a HIPAA-aligned data-handling policy in plain language for the team's runbook: what counts as PHI, where PHI is allowed to live, what to do when PHI accidentally appears in logs (the answer is a documented redaction script that runs against historical Sentry events plus an updated beforeSend hook), and which pieces of Charthealth's stack require a Business Associate Agreement. The clinical lead now has a single document she can hand to a partner clinic's compliance officer.
09
Added an automated PR check that runs `tsc`, ESLint with the architecture rules, both the Vitest and Playwright suites, and a custom script that fails if a PR touches a PHI table without also updating the audit log mapping. The check runs in under three minutes; the team's merge velocity went up because reviewers stopped having to verify these things manually on every PR.
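
The slice-boundary rule from step 02 could be expressed with ESLint's built-in `no-restricted-imports` rule. The fragment below is a hedged sketch, not the configuration used in the engagement. It assumes slices live under `src/features/<slice>/` and are imported through an `@/features/<slice>` path alias; both the directory layout and the alias are assumptions. Parser setup for TypeScript files is omitted.

```typescript
// Assumed layout: src/features/patients, src/features/encounters, src/features/audit, src/features/auth.
// Assumed alias: "@/features/<slice>" resolves to that slice's public index file.
const sliceBoundaryConfig = [
  {
    files: ["src/**/*.ts", "src/**/*.tsx"],
    rules: {
      "no-restricted-imports": [
        "error",
        {
          patterns: [
            {
              // Blocks deep imports such as "@/features/patients/internal/db"
              // while still allowing "@/features/patients".
              group: ["@/features/*/*"],
              message:
                "Import from the slice's public API (e.g. '@/features/patients'), not its internals.",
            },
          ],
        },
      ],
    },
  },
];

// In an ESLint flat config file, this array would be the default export.
```

With a pattern like this, code inside a slice would need relative imports for its own internals, since the alias form is blocked everywhere. That trade-off keeps the rule to a single pattern instead of one override per slice.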

§ CS-88.6/founder-quote

Sample client perspective — composite, not an individual testimonial

“Cursor was giving us speed we couldn't cash in. Every fix undid something. Afterbuild Labs didn't tell us to stop using Cursor — they gave us rails so Cursor couldn't drift the codebase underneath us. We shipped to five hundred clinicians the month after.”

Dr. Priya Varma · Clinical lead & co-founder, Charthealth

§ CS-88.7/outcome-delta

Outcome after the resolved rescue

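The metrics below compare the state at the start of the engagement with the state after it. Audit log coverage was verified with a pgTAP test across all four write paths; as the retrospective notes, the 'before' bug rate is the founder's recollection rather than a measured baseline.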

Before / after — Healthtech (outpatient clinical workflow)

Metric	Before	After
User-reported bugs per week	11	1
Test suite size / reliability	312 tests, ~40% green CI	88 tests, 100% green CI
PHI fields in Sentry events	4 (names, DOB, dx codes)	0
Audit log coverage	1 of 4 write paths	4 of 4 (DB trigger)
Clinicians using the app	12 (pilot)	500 across 9 clinics
Take-home completion rate (new hires)	0 of 3	2 of 2
Avg time to fix a real bug	~2 days (loop)	~3 hours

§ CS-88.8/engineer-note

Engineer retrospective — technical lesson

“We'd write the Cursor playbook first, not last. The team was still prompting in old habits through week one and generated two more regressions we then had to unwind. The lesson generalises: in a regression-loop rescue, behavior change has to come before architecture change, otherwise the architecture you just built starts drifting on day two.”

Hyder Shah · Founder, Afterbuild Labs

→ We'd involve a HIPAA compliance reviewer earlier. We caught the PHI-in-Sentry issue ourselves, but we'd have preferred an external second set of eyes before shipping to nine clinics. A formal review on day three would have surfaced two additional minor issues (a cookie attribute and an audit-log retention setting) that we instead caught on day twelve.
→ We'd measure the bug-per-week baseline for two weeks before touching code. We have before/after numbers, but the 'before' is the founder's recollection; a cleaner measurement would have been stronger proof. For the next regression-loop engagement we now require a baseline-measurement period before the project clock starts.
→ We'd ship the architecture documentation as a Loom walkthrough alongside the Markdown file. The team reads docs unevenly; a 12-minute video tour of the new feature-slice boundaries would have onboarded the contractor and the two new hires faster than the written doc alone did.

§ CS-88.9/replicate-this-rescue

How to replicate this Cursor rescue

The same engagement path runs across every healthtech (outpatient clinical workflow) rescue we take on. Start with the diagnostic, then route into the service tier that matches the breakage surface.

service

AI-Generated Code Cleanup →

The Break-the-Loop Refactor engagement this project used.

service

AI App Rescue →

For broader breakage beyond the regression loop.

service

Auth, Database & Integrations →

RLS, audit logs, session hygiene for regulated apps.

service

Ongoing Maintenance →

Keep the rails in place as your team grows.

Tool page / Hire Cursor developers →

§ CS-88.10/related-rescues

Similar healthtech (outpatient clinical workflow) rescues

Browse the full archive of Cursor and adjacent AI-builder rescue write-ups.

See all case studies →

§ CS-88.11/industry-deep-dive

Related industry deep-dive

vertical · healthtech (outpatient clinical workflow)

Read more healthtech rescue patterns →

PHI-in-Sentry exposure, incomplete audit logs, and session-token hygiene are the recurring healthtech failure modes this regression-loop rescue surfaced. The vertical page walks the HIPAA-aligned production-readiness checklist we apply on every clinical workflow — from BAA scoping to audit-trail triggers and PHI-safe logging.

Next step

Got a broken Cursor app that looks like this one?

Send the repo. We'll tell you what it takes to ship — in 48 hours, fixed fee. Free diagnostic, no obligation.

Book free diagnostic →
