_system / plans · companion to cloud-run-migration-sketch

Cloud Run + Google Workspace — Beneficiary Care subsystem migration sketch

Research-level draft · 2026-05-28 · trial-shape approved 2026-05-29 · code-complete 2026-05-29.

Target GCP project: ‹gcp-project› · Workspace: phamj.com (paid) · Subsystem: Area 1, per beneficiary-care-subplan.md

✔ Built — trial shape (2026-05-29)

The trial-shape subsystem in this sketch is now code-complete (Cloud Run “Đợt 3”). The B1–B7 care tools and the dashboard were built 2026-05-28; the cloud lift landed 2026-05-29:

What remains is operational, not code: the administrator runs the deploy script, and the D-zero pilot (one school, a handful of real children, walked end-to-end) is still the unlock for any real cloud commit. Two tuning calls stay open (the auto-convert confidence threshold; whether to enable the Drive change-push for intake). The flows below are unchanged from the design; this banner records that they are now implemented.

⚠ Trial-stage override (Editor review, 2026-05-29)

This sketch was drafted 2026-05-28 against an earlier set of assumptions (two Shared Drives, a separate care service account, five user roles, an IAP allowlist synced from Drive membership, a full triage dashboard). The Editor's 2026-05-29 review — the same review that produced cloud-run-migration-v1 for the media line — collapses the Area-1 trial to a much smaller shape. Where this banner and a flow conflict, the banner wins. The legacy text below is kept struck-through for the record of what changed and why, and reverts when the Foundation grows past one operator.

  1. One Workspace, one Drive, one folder. No second Shared Drive. Care lives in a restricted folder 10_Ho-so-thu-huong/ on the single existing Foundation Shared Drive (phamj.com); the access boundary is folder-level permission, not a separate Drive. (Overrides Delta 1, §5 Layer 1's second Drive, the hmt-private-drive-id secret.)
  2. One service account. Reuse ‹foundation-service-account›; no hmt-care-sa, no care@ impersonation subject for the trial. A separate care SA is the documented exit when Care reaches production. (Overrides §5 Layer 1.)
  3. One administrator. One person holds every hat (coordinator = editor = safeguarding = principal = trustee). The five-role model, the audience filter, the per-role audit pages, and the Apps-Group routing are all deferred. (Overrides §5 Layer 3, §6.5 of the subplan, Flow D's "separate person.")
  4. Login is a simple user/password. For a single local operator, the dashboard sits behind one login, not a Drive-membership→IAP-allowlist sync job. (Overrides §5 Layer 3, iap-allowlist-sync-job.)
  5. Intake auto-converts (Flows A & B). Drop → auto-sanitise → auto-convert the raw file into a Markdown internal-memory entry (raw + md kept together) → register. No per-note human commit. A human review fires only when the system cannot make sense of the input (unreadable, un-extractable, low quality). This is a deliberate Phase-A relaxation scoped to the internal store: Phase-A stays absolute at the output boundary (nothing about a child leaves the workspace without /approve). Mirrors the gate-8 single-administrator relaxation in qc-enforcement.md/safeguarding.md §3a. (Overrides Flow A steps b–g, Flow B's review-before-commit, §11's "no automated profile write.")
  6. Watchlist is an alert, not a triage console (Flow D). The monthly job runs and sends the administrator a notification; the four-action dashboard queue is deferred.
  7. Reports render reader-friendly HTML (Flow E). A styled HTML page is the thing the administrator reads; Markdown + PDF are kept underneath as the regenerable audit copy.
  8. Sponsorship view parked (Flow F). Deferred until the Area-3 build; out of trial scope.
  9. Benefit ledger (Flow C) stands as written, scope to expand later.

1. Why a separate sketch from the media-line plan

The media-line cloud-migration sketch (cloud-run-migration-sketch) handles the Area-2 storytelling pipeline. This document handles Area 1 — Beneficiary Care. They share a GCP project, a Workspace, and the same auth pattern, but the workloads, data sensitivity, user roles, and cadence are different enough that one document covering both would obscure both.

The contrasts that matter:

The good news: every cloud primitive used in the media-line plan applies here too. So this sketch concentrates on what is different and treats the shared infrastructure (Artifact Registry, Secret Manager, Cloud Logging, Workload Identity Federation) by reference.

2. Where Area 1 is today

Two simultaneous baselines, per the subplan:

Everything that runs today runs on the operator's Windows box, against an in-repo placeholder directory. No real children's data has entered the system yet. Real data has never been on the operator's machine and is not intended to be — the operational store is the restricted 10_Ho-so-thu-huong/ folder on the one Foundation Shared Drive, separate from this repo, gitignored in the placeholder. The data files live on Drive when provisioned; the same tools drive both surfaces via path flags.

That last point matters for the migration. The cloud move is not "lift the laptop into Cloud Run." It is "give the existing tools a real home with the access controls the subplan already specifies." The architectural seams are clean enough to do this without a rewrite.

3. Six things that make this subsystem different

If you have read the media-line sketch, here is the delta in one screen.

Delta 1 — A restricted folder, not a second Drive (trial). Every child's record lives in the folder 10_Ho-so-thu-huong/ on the one Foundation Shared Drive, gated by folder-level permission to the administrator. The existing service account (hmt-media-ingest) reaches it via workload identity — no second Drive, no second SA, no hmt-private-drive-id secret. (Team-scale target: a separate private Drive + a dedicated hmt-care-sa, restored when Care reaches production / a second person joins.)
Delta 2 — The SQLite (so-dang-ky.sqlite) is a projection, not a source. The canonical record is the markdown profile + ledger files in the 10_Ho-so-thu-huong/ folder on the Drive. The SQLite is a denormalised projection rebuilt from those files, kept in sync at write time by the convert/record tools. It lives in GCS via gcsfuse, separately from the markdown source. Loss of the SQLite is operationally annoying but not data loss; rebuild from the markdown.
Delta 3 — The dashboard is a real service. Server-rendered FastAPI + HTMX. Trial: one simple user/password login; one administrator view, no per-role audience filter. Inline data entry that writes back through the same tools the cron jobs use. (Team-scale target: Google-Workspace login via Cloud IAP, allowlist synced from Drive membership, role-aware views — deferred.)
Delta 4 — Cron dominates, push is a small fraction. Nightly index rebuild, nightly consistency check, monthly watchlist, daily keyword watchlist, monthly Area-4 emission, quarterly archive. The only push surface is Drive change notifications on Tai-lieu-vao/ — and even those are convenience (a Coordinator drop does not have to be picked up within seconds; Monday-morning review is the natural cadence).
Delta 5 — PDF + OCR + structured extraction is a first-class workload. School documents (hoc-ba, so-diem, chuyen-can, hop-phu-huynh) arrive as PDFs, get fed to Claude Opus as document input, return Pydantic-validated structured fields. Trial: auto-commit on a clean, in-bounds extract to the school_report table + a Block-3 note; out-of-bounds values and multi-child sheets flag a human (Flow B). No equivalent in the media line.
Delta 6 — Phase A here lives at the OUTPUT boundary. Auto-convert intake (Flows A & B) does write profile entries and benefit/school-report rows unattended — the internal beneficiary memory is not an egress. The bright line is at the place data actually leaves: nothing about a child reaches a public or external surface without an explicit human /approve. Two backstops keep auto-write honest: flag-on-failure (bad input flags a human, never a half-baked entry) and the watchword carve-out (a safety signal still surfaces same-day). The Stream-C leakage check (gate 8.5) and the gate-8 dignity sign-off stay manual. (Team-scale target: a per-note human commit; deferred.)
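
Delta 2 is the one with the clearest mechanical shape, so a minimal sketch of "rebuild the projection from the markdown" may help. It assumes each Block-3 entry opens with a `## YYYY-MM-DD` heading and that the projection has a three-column `profile_entry` table; both the heading format and the column names are illustrative assumptions, not the real schema.

```python
import re
import sqlite3
from pathlib import Path

# Assumed entry format: a "## YYYY-MM-DD" heading opens each Block-3 entry.
ENTRY_HEADING = re.compile(r"^## (\d{4}-\d{2}-\d{2})[ \t]*$", re.MULTILINE)


def rebuild_projection(profile_dir: Path, db: sqlite3.Connection) -> int:
    """Drop and refill profile_entry from the markdown profiles; return the row count."""
    db.execute("DROP TABLE IF EXISTS profile_entry")
    db.execute(
        "CREATE TABLE profile_entry (journey_id TEXT, entry_date TEXT, body TEXT)"
    )
    rows = []
    for path in sorted(profile_dir.glob("journey-*.md")):
        text = path.read_text(encoding="utf-8")
        # With one capture group, split() yields [preamble, date, body, date, body, ...].
        parts = ENTRY_HEADING.split(text)
        for entry_date, body in zip(parts[1::2], parts[2::2]):
            rows.append((path.stem, entry_date, body.strip()))
    db.executemany("INSERT INTO profile_entry VALUES (?, ?, ?)", rows)
    db.commit()
    return len(rows)
```

Because the function drops and refills the table, running it twice gives the same result, which is the property that makes losing the SQLite file an annoyance rather than data loss.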

4. Target architecture on ‹gcp-project›

Trial shape (2026-05-29): one Workspace, one Shared Drive. Care is not a second Drive — it is a restricted folder (10_Ho-so-thu-huong/) on the single existing Foundation Shared Drive, gated by folder-level permission. Same project, same region, one service account, additive on top of the media-line components.

┌──────────────────────────────────────────────────────────────────┐
│ Google Workspace (phamj.com): ONE Shared Drive                   │
│ HMT Foundation Shared Drive (id ‹shared-drive-id›)               │
│                                                                  │
│ MEDIA folders (existing)       ▶ CARE folder ◀ ★                 │
│ - 01_Tai-lieu-thuc-dia/        (10_Ho-so-thu-huong/, restricted  │
│ - 03_Tai-san-thuong-hieu/      folder-level permission; only     │
│ - 08_Phieu-thong-tin-bai-dang/ the administrator)                │
│                                - 10_Ho-so-thu-huong/Ho-so/       │
│                                - 10_Ho-so-thu-huong/So-phuc-loi/ │
│                                - 10_Ho-so-thu-huong/Tai-lieu-    │
│                                  vao//                           │
│                                - 10_Ho-so-thu-huong/Bao-cao/     │
│                                - 10_Ho-so-thu-huong/Chi-muc/     │
│                                - 10_Ho-so-thu-huong/Danh-sach-   │
│                                  an-toan/                        │
│                                - 10_Ho-so-thu-huong/Goc-nhin-    │
│                                  nha-tai-tro/ (parked, Area-3)   │
│                                                                  │
└──────────────────────────────┬───────────────────────────────────┘
                               │ Drive API (ONE SA: hmt-media-ingest,
                               │ workload identity, no key download)
                               ▼
┌──────────────────────────────────────────────────────────────────┐
│ GCP project: ‹gcp-project›   region: asia-southeast1             │
│                                                                  │
│ ┌─ Cloud Run service ─────────────────────────────┐              │
│ │ hmt-media-webhook (existing, from media plan)   │              │
│ │ POST /drive-events   POST /chat-events          │              │
│ │ GET /review/*                                   │              │
│ └─────────────────────────────────────────────────┘              │
│                                                                  │
│ ┌─ Cloud Run service ─────────────────────────────┐ ★            │
│ │ hmt-care-dashboard (NEW: FastAPI + HTMX)        │              │
│ │ /  /child/  /school/                            │              │
│ │ /watchlist  /reports  /record  /benefit         │              │
│ │ TRIAL: one simple user/password login           │              │
│ │ (no IAP allowlist, no per-role filter)          │              │
│ │ min-instances=1  max-instances=3                │              │
│ └──────┬──────────────────────────────────────────┘              │
│        │ enqueue (Jobs Executions API)                           │
│        ▼                                                         │
│ ┌─ Cloud Run jobs ────────────────────────────────┐              │
│ │ intake-convert-job ★ (auto-convert, A & B)      │              │
│ │ school-doc-extract-job ★                        │              │
│ │ index-rebuild-job ★                             │              │
│ │ consistency-check-job ★                         │              │
│ │ watchlist-monthly-job ★ (alert, Flow D)         │              │
│ │ watchlist-keyword-job ★                         │              │
│ │ report-child/cohort/school-job ★ (HTML, E)      │              │
│ │ area4-emit-job ★                                │              │
│ │ sponsorship-render-job (parked, Flow F)         │              │
│ │ iap-allowlist-sync-job (DEFERRED: one login)    │              │
│ │ drive-membership-audit-job (DEFERRED)           │              │
│ └──────┬──────────────────────────────────────────┘              │
│        │ gcsfuse mount                                           │
│        ▼                                                         │
│ ┌─ GCS buckets ───────────────────────────────────┐              │
│ │ hmt-bundles (media; existing)                   │              │
│ │ ‹gcp-project›-corpus (Brand; existing)          │              │
│ │ ‹gcp-project›-care-db (so-dang-ky.sqlite,       │ ★            │
│ │                        care cache, render       │              │
│ │                        cache)                   │              │
│ └─────────────────────────────────────────────────┘              │
│                                                                  │
│ Secret Manager: anthropic-api-key  google-chat-token             │
│ (ONE drive id reused; no separate hmt-private-drive-id)          │
│                                                                  │
│ Cloud Scheduler:                                                 │
│   nightly: index-rebuild (02:00), consistency-check (02:30),     │
│            care-cache-refresh (04:00)                            │
│   daily:   watchlist-keyword (07:00)                             │
│   monthly: watchlist-full (1st @ 06:00),                         │
│            area4-emit (28th @ 06:00)                             │
│ (Trial: auto-convert MAY write the internal store; NO job ever   │
│ pushes a child to a public/external surface; see §11)            │
│                                                                  │
│ Artifact Registry: hmt-pipeline (one image, many entrypoints)    │
│                                                                  │
│ Dashboard auth (TRIAL): one simple user/password login           │
│ (IAP + Drive-membership allowlist = team-scale, deferred)        │
│                                                                  │
│ Cloud Logging: structured JSON; per-role audit subset for the    │
│ Safeguarding lead's monthly review                               │
│                                                                  │
└──────────────────────────────────────────────────────────────────┘
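
The Cloud Scheduler block of the diagram translates directly into five-field cron expressions. The mapping below is a sketch: the job names follow the diagram (with `watchlist-full` taken to mean `watchlist-monthly-job`, which is an assumption), and the time zone is not stated in this document, so it would have to be set on each scheduler job. The small `is_due` helper is hypothetical and only understands `*` and plain integers, which is all this table uses.

```python
from datetime import datetime

# Five-field cron: minute hour day-of-month month day-of-week.
CARE_SCHEDULE = {
    "index-rebuild-job": "0 2 * * *",       # nightly, 02:00
    "consistency-check-job": "30 2 * * *",  # nightly, 02:30
    "care-cache-refresh-job": "0 4 * * *",  # nightly, 04:00
    "watchlist-keyword-job": "0 7 * * *",   # daily, 07:00
    "watchlist-monthly-job": "0 6 1 * *",   # 1st of the month, 06:00
    "area4-emit-job": "0 6 28 * *",         # 28th of the month, 06:00
}


def is_due(expr: str, now: datetime) -> bool:
    """Check a cron expression against a timestamp ('*' and integers only)."""
    minute, hour, day_of_month, month, _day_of_week = expr.split()
    pairs = [
        (minute, now.minute),
        (hour, now.hour),
        (day_of_month, now.day),
        (month, now.month),
    ]
    # Day-of-week is "*" for every job in this table, so it is not evaluated.
    return all(field == "*" or int(field) == value for field, value in pairs)
```

Checking `is_due(CARE_SCHEDULE["area4-emit-job"], datetime(2026, 5, 28, 6, 0))` is a quick way to confirm that an expression says what the diagram says.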

The dashboard is the centre of operational gravity for Area 1. Trustees and Principals never touch Claude Code; the dashboard is their entire interface. The Coordinator uses both Drive (for raw drops + reading approved profiles) and the dashboard (for promotion, school-doc review, benefit entry, sponsorship-view generation). The Safeguarding lead lives in the dashboard's /watchlist + child profile pages.

5. Auth model: one Drive folder, one SA, one login (trial)

Trial shape (2026-05-29). The three-layer model below is the team-scale target, kept for the record. For the one-administrator trial it collapses hard: (Layer 1) no separate hmt-care-sa — reuse hmt-media-ingest via workload identity, same as the media line; (no second Drive) Care is the restricted folder 10_Ho-so-thu-huong/ on the one Foundation Shared Drive, gated by folder-level permission, so there is no care@ subject and no hmt-private-drive-id secret; (Layer 3) one simple user/password login on the dashboard, no IAP allowlist sync, no Apps-Group role routing, no audience filter (one administrator sees everything). Each piece below lights up when the Foundation grows past one operator — the seams are left in place so it is additive, not a rewrite.
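
For the "one simple user/password login", a sketch of the credential check using only the Python standard library. It assumes the salt and the derived digest are stored somewhere the service can read (Secret Manager would be the natural home, but that is not stated here); the function names and the iteration count are illustrative.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 200_000  # illustrative work factor; tune for the deployment


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Derive a PBKDF2-HMAC-SHA256 digest; generate a fresh salt if none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_login(
    user: str, password: str, expected_user: str, salt: bytes, expected_digest: bytes
) -> bool:
    """Constant-time comparison of both the user name and the derived digest."""
    _, candidate = hash_password(password, salt)
    user_ok = hmac.compare_digest(user.encode("utf-8"), expected_user.encode("utf-8"))
    password_ok = hmac.compare_digest(candidate, expected_digest)
    return user_ok and password_ok
```

Only the salt and digest need to be stored; the plain password never has to live in the image or in an environment variable.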

Trial auth (current)

Team-scale three-layer model (deferred — restored when a second person / production Care arrives)

The three-layer model from the media-line sketch is the eventual target. What it adds at team scale is a tighter access boundary on the care folder and a role filter inside the dashboard.

Layer 1 — GCP IAM

One additional SA: ‹care-service-account› — runs the dashboard + care jobs. Roles: Secret Manager Accessor, Storage Object Admin (scoped to the care bucket only), Logging Writer, Run Invoker on the care jobs. Rationale: blast-radius separation from the media-line SA.

Layer 2 — Workspace Domain-Wide Delegation

One additional DWD entry, same six scopes from tools/gworkspace_check.py, a dedicated impersonation subject (e.g. care@) added as Content Manager of a separate private Drive; pipeline@ not added to it. This is how the two pipelines isolate at the Workspace layer at team scale.

Layer 3 — Dashboard IAP + role filter

Cloud IAP gates the dashboard; allowlist synced daily from the private Drive's member list; roles from a role_assignments table (coordinator | safeguarding | principal | trustee | sponsor | editor); the §6.5 audience filter applied per response. The filter is a narrative filter, not access control; the Drive membership is the real boundary, with a penetration check at hardening.

6. End-to-end walkthroughs (six flows)

The media-line sketch had one canonical flow (an event from upload to post). Area 1 has six flows. For the trial they collapse onto one operator (the administrator wears every hat), and Flows A and B auto-convert with no per-note human commit — see the override banner at the top. Each walkthrough below carries a trial-shape note; the role tags (Coordinator / Safeguarding / Trustee / Principal) are the team-scale labels, all held by the one administrator for now.

Flow A — Administrator Prose drop → auto-converted memory entry

Trial shape (2026-05-29). The administrator's only deliberate act is the drop. Everything after is automatic: sanitise → convert → register, with no per-note review or commit click. The system reaches back for a human only when it cannot make sense of the input. The original team-scale click-through, listed under "Legacy team-scale flow" at the end of this flow, is kept for the record and for the day a second person joins.

The everyday flow. The administrator visits THCS Tân Khánh, writes up notes about journey-0042's reading progress, then drops the file when back at a laptop.

  1. Drop. They upload 2026-05-18__truong-1__tham-truong.md (a note, a .docx, or a direct text paste — not necessarily Markdown) into 10_Ho-so-thu-huong/Tai-lieu-vao/2026-W21/journey-0042/ on the one Foundation Shared Drive. Drive UI; phone or laptop.
  2. Auto-pickup. The scheduled intake-convert-job (and, if wired, a Drive changes.watch nudge) finds the new drop. No "optional push vs Monday review" distinction any more — the schedule run is the pickup.
  3. Auto-sanitise. The job runs media_sanitise.py on the artefact (EXIF/GPS strip for images; no-op for text). Safeguarding measure 1 still mandatory.
  4. Auto-convert. The job invokes the beneficiary-recorder agent (Claude Opus, ANTHROPIC_API_KEY from Secret Manager): raw drop in, a structured Block-3 memory entry in the canonical shape out. It infers the petal anchor (default petal 1 for a school visit), writes the entry to Block 3 of the profile via Drive API, and writes the matching profile_entry row. The raw drop is kept beside the entry; the entry's *(file gốc: ...)* line points back to it.
  5. Auto-register + validate. beneficiary_validate.py runs on the updated profile. On a clean pass the entry is live in the internal memory — done, no human touched it.
  6. Flag-on-failure (the only human path). If conversion fails its confidence/quality bar — the text is unreadable, can't be extracted to a coherent entry, the file won't open, OCR is garbage, or the agent returns low confidence — the job does not write a half-baked entry. It files a review request (10_Ho-so-thu-huong/Tai-lieu-vao/2026-W21/_needs-review.md + a notification) describing what it couldn't process, and leaves the raw drop in place for the administrator to fix or re-drop. Quiet on success; loud only on genuine failure.
  7. Safeguarding carve-out (kept). Independently of conversion quality, if the entry text trips a watchword (tu-khoa-canh-bao.yaml: đánh, bỏ học, bạo hành…), the entry is still written to memory but is also surfaced to the administrator (the safeguarding hat) the same day — see Flow D. Auto-write does not silence a safety signal.

Why auto-write is allowed here. The internal beneficiary memory is not an egress; writing a progress note to a private profile sends nothing outside the workspace. Phase A stays absolute at the output boundary (a Facebook post, a sponsor email, a website figure — all still need /approve). This mirrors the gate-8 single-administrator relaxation: the discipline moves to where data actually leaves, not to every internal keystroke. Reverts to a review-before-commit gate when a second person joins.
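
The write / flag / alert logic of steps 5 to 7 can be sketched as one pure function. This is an illustration under assumptions: the `Conversion` and `IntakeDecision` shapes are hypothetical, the 0.8 threshold is a placeholder for the still-open tuning call, and the watchword tuple holds only the three examples named above (the real list lives in tu-khoa-canh-bao.yaml). The sketch reads step 7 as "check watchwords on whatever text exists, regardless of confidence".

```python
import unicodedata
from dataclasses import dataclass
from typing import Optional

WATCHWORDS = ("đánh", "bỏ học", "bạo hành")  # examples only; not the full list


@dataclass
class Conversion:
    entry_text: Optional[str]  # None when no coherent entry could be produced
    confidence: float          # assumed 0.0-1.0 score from the convert step


@dataclass
class IntakeDecision:
    write_entry: bool
    needs_review: bool
    same_day_alert: bool


def _norm(text: str) -> str:
    # Normalise composed/decomposed Vietnamese diacritics before matching.
    return unicodedata.normalize("NFC", text).lower()


def decide(conv: Conversion, threshold: float = 0.8) -> IntakeDecision:
    has_text = bool(conv.entry_text and conv.entry_text.strip())
    # The safeguarding check is independent of conversion quality.
    alert = has_text and any(_norm(w) in _norm(conv.entry_text) for w in WATCHWORDS)
    # Never write a half-baked entry: below the bar, flag a human instead.
    write = has_text and conv.confidence >= threshold
    return IntakeDecision(write_entry=write, needs_review=not write, same_day_alert=alert)
```

Plain substring matching over-triggers (for example "đánh giá", "to evaluate", contains "đánh"), which errs on the side of surfacing a signal rather than missing one.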

Legacy team-scale flow (struck; reverts when the team grows)
  1. Drop into the private Shared Drive; optional Drive changes.watch push increments a counter.
  2. Review. Monday morning the Coordinator opens /intake; IAP signs her in; dashboard reads the week's drops grouped by journey-id.
  3. One-line summary appended to _review.md; picks "promote".
  4. Promote launches promote-job; beneficiary-recorder drafts a Block-3 entry.
  5. Review the draft side-by-side with the raw drop; edit; confirm petal; "commit".
  6. Commit via POST /record/promote/<journey-id>; beneficiary_promote.py writes profile + SQL row + re-validates.

Flow B — Administrator School PDF → auto-extracted, flag-on-failure

Trial shape (2026-05-29). Same auto-convert principle as Flow A: drop the PDF, the system extracts and writes both the school_report row and the Block-3 note unattended. The mandatory side-by-side human verification is relaxed to flag-on-failure. The one nuance: a clean-looking-but-wrong OCR (a grade 7.5 read as 75) is the failure mode auto-convert can't fully catch, so the extractor leans on validation bounds (see below) and the confirmed_by field records auto when no human verified — making it queryable later if a number looks off.

The term-2 học bạ for journey-0042 arrives as a PDF.

  1. Drop. The administrator uploads 2026-04-10__hoc-ba-hk2.pdf into Tai-lieu-vao/2026-W15/journey-0042/. The __hoc-ba token routes it to the structured extractor.
  2. Auto-extract. school-doc-extract-job (scheduled, same pickup as Flow A) pulls the PDF via Drive API, sends it to Claude Opus as a document input with the hoc-ba Pydantic schema, gets back the structured record (gpa_overall, class_rank, subject_grades[], teacher_comment, attendance_pct, conduct_rating).
  3. Validation bounds (the auto safety net). Pydantic enforces ranges that catch the obvious OCR slips: a 10-point-scale grade must be 0–10 (so 75 fails and flags), attendance_pct 0–100, class_rank ≤ class_size, conduct_rating in the fixed enum. A value outside bounds → the record is held and flagged, never committed.
  4. Auto-commit on clean extract. If the record validates and the model's confidence is high, the job writes one school_report row per child (SQLite, confirmed_by = auto), composes a short Block-3 note keyed to the fields, writes it to the profile via Drive API, links the two via school_report.profile_entry_id.
  5. Flag-on-failure. If extraction fails, a bound trips, confidence is low, or the document is a multi-child sheet (class-wide ranking/attendance — ambiguous which row maps to which HMT child), the job files a review request to _needs-review.md + notifies the administrator, with the PDF and the draft fields attached for a quick fix. The multi-child split stays a human decision because mis-attribution is worse than delay.
  6. Audit artefact. The PDF stays in Tai-lieu-vao/; the note footer and the row's source_pdf column both point to it.

The same flow handles so-diem, chuyen-can, hop-phu-huynh with their per-type schemas + bounds. Pydantic models live in tools/school_doc_extract.py.
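
To make the step-3 bounds concrete, here is a sketch that uses a plain dataclass as a stand-in for the Pydantic model, so that it runs on the standard library alone. The field names follow the hoc-ba record above; the flat list of subject grades and the conduct-rating values are placeholders, since neither the real `subject_grades[]` shape nor the enum is given in this document.

```python
from dataclasses import dataclass, field
from typing import List

CONDUCT_RATINGS = {"tot", "kha", "dat", "chua-dat"}  # hypothetical enum values


@dataclass
class HocBaRecord:
    gpa_overall: float
    class_rank: int
    class_size: int
    attendance_pct: float
    conduct_rating: str
    subject_grades: List[float] = field(default_factory=list)  # simplified shape


def bound_violations(rec: HocBaRecord) -> List[str]:
    """Return every bound the record breaks; an empty list means it may auto-commit."""
    problems = []
    grades = [("gpa_overall", rec.gpa_overall)]
    grades += [(f"subject_grades[{i}]", g) for i, g in enumerate(rec.subject_grades)]
    for name, grade in grades:
        if not 0 <= grade <= 10:
            problems.append(f"{name}={grade} is outside 0-10")
    if not 0 <= rec.attendance_pct <= 100:
        problems.append(f"attendance_pct={rec.attendance_pct} is outside 0-100")
    if not 1 <= rec.class_rank <= rec.class_size:
        problems.append(f"class_rank={rec.class_rank} exceeds class_size={rec.class_size}")
    if rec.conduct_rating not in CONDUCT_RATINGS:
        problems.append(f"conduct_rating={rec.conduct_rating!r} is not in the enum")
    return problems
```

A 7.5 misread as 75 produces one violation and the record is held. A 7.5 misread as 1.5 passes every bound, which is why `confirmed_by = auto` stays queryable.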

Legacy team-scale flow (struck; mandatory side-by-side review)

Drop → click "extract" on the dashboard → side-by-side PDF + editable fields → scan for OCR slips → multi-child tabs reviewed one by one → click "commit" (POST /record/extract), confirmed_by = the Coordinator's name.

Flow C — Administrator Benefit delivered → ledger row

Trial shape (2026-05-29): stands as written, scope to expand later. This flow is a small typed form, not an intake conversion, so it stays human-entered: recording a benefit is the deliberate act, and the funding-source / amount / petal fields are exactly where a typo costs an Area-4 trust failure. The only edits: care.hmtfoundation.org.vn → the trial login URL, the path prefix 01_ → 10_ (10_Ho-so-thu-huong/), and the operator is the administrator.

A scholarship cheque is handed over. The ledger records it the same day, because Area-4 transparency depends on the ledger being complete in real time.

  1. From the child profile page (trial dashboard, /child/journey-0042), click "record benefit".
  2. Fill the form: date, category (scholarship), petal (1), amount in VND (2,500,000), funding source (tet-2026-quy-cong-dong; dropdown validated against nguon-tai-tro.yaml), delivered by, handover signed by (Mẹ — Nguyễn Thị Hoa). Optionally upload the signed receipt PDF.
  3. Commit. Dashboard calls POST /benefit/record. Server-side: writes a markdown ledger entry into 10_Ho-so-thu-huong/So-phuc-loi/journey-0042/2026-04-12__hoc-bong-hoc-ky-2.md (Drive API), uploads the receipt into the same folder, writes a benefit row to SQLite with benefit_id = ben-2026-0042-003 (auto-sequenced). benefit_record.py does the validation and the writes (B3 surface).
  4. Done. Returns to the child profile page; the ledger panel shows the new row. The Area-4 monthly report draft picks it up automatically on the 28th.
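
The auto-sequenced `benefit_id` in step 3 might be derived as in the sketch below. It assumes the id reads `ben-<year>-<journey number>-<three-digit sequence>` and that the sequence counts per child per year; both are inferred from the single example `ben-2026-0042-003` and are not confirmed by the subplan.

```python
import re
from typing import Iterable

BENEFIT_ID = re.compile(r"^ben-(\d{4})-(\d{4})-(\d{3})$")


def next_benefit_id(year: int, journey_id: str, existing_ids: Iterable[str]) -> str:
    """Return the next id for this child and year, given the ids already recorded."""
    number = journey_id.split("-")[-1]  # "journey-0042" -> "0042"
    prefix = f"ben-{year}-{number}-"
    used = [
        int(match.group(3))
        for match in map(BENEFIT_ID.match, existing_ids)
        if match and match.group(0).startswith(prefix)
    ]
    return f"{prefix}{max(used, default=0) + 1:03d}"
```

In a real write path the lookup and the insert would need to share one SQLite transaction so that two submissions cannot claim the same sequence number.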

Flow D — Administrator Monthly watchlist → alert notification

Trial shape (2026-05-29): an alert, not a triage console. The monthly signal job runs exactly as designed (§7 of the subplan — absence, followup-overdue, grade-drop, attendance-dip, keyword, etc.). What changes for one operator: the output is a notification the administrator reads, not a four-action dashboard queue with per-item dismiss/escalate/re-surface buttons. The administrator (wearing the safeguarding hat) reads the alert and acts in the ordinary intake flow. The queryable watchlist_item rows and the action buttons return when a second person takes the safeguarding hat.

The 1st of the month, after the schedule runs:

  1. Job runs. watchlist-monthly-job reads the registry + profiles, applies the §7 signal rules, writes 10_Ho-so-thu-huong/Danh-sach-an-toan/<YYYY-MM>__watchlist.md.
  2. Alert. The job sends the administrator one notification summarising the run: "Watchlist 2026-06: 7 signals — 2 absence, 1 keyword ('bỏ học', journey-0058), 1 grade-drop (journey-0042, HK2 GPA −1.2), …", with a link to the full markdown. That markdown is the readable queue for the trial; the administrator scans it and follows up on whatever warrants it through the normal flow.
  3. Daily keyword pass. A narrower watchlist-keyword-job sends an alert only on a non-empty day — the genuinely-can't-wait signals. Most days: no notification at all (silence = clear).

The job is fully automated; the administrator's judgement is the action. Nothing auto-resolves a signal — an alert stays until the administrator has dealt with it, the same way the gate-8 dignity check is never auto-set.
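
A sketch of the alert line itself, assuming each signal arrives as a `(kind, journey_id)` pair; that tuple shape and the function name are hypothetical. Returning `None` on an empty run models the daily keyword pass's "silence = clear" behaviour; whether the monthly job also stays silent on an empty month is not stated here.

```python
from collections import Counter
from typing import List, Optional, Tuple


def watchlist_alert(period: str, signals: List[Tuple[str, str]]) -> Optional[str]:
    """Summarise a run as one notification line, or None when there is nothing to say."""
    if not signals:
        return None
    counts = Counter(kind for kind, _journey_id in signals)
    parts = ", ".join(f"{n} {kind}" for kind, n in sorted(counts.items()))
    return f"Watchlist {period}: {len(signals)} signals ({parts})"
```

For example, two absence signals and one keyword signal for `2026-06` yield `Watchlist 2026-06: 3 signals (2 absence, 1 keyword)`.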

Legacy team-scale flow (struck; four-action triage by a separate safeguarding lead)

A separate Safeguarding lead opens /watchlist/<YYYY-MM>; each signal is a card with four buttons (Dismiss / Request Coordinator follow-up / Open escalation / Re-surface next month); each action writes a watchlist_item row. Restored when the team grows past one operator.

Flow E — Administrator Report generation → reader-friendly HTML

Trial shape (2026-05-29): the deliverable is a styled HTML page. The administrator (wearing the trustee / principal hat) reads a brand-styled HTML report — the same CSS family as these plan docs, readable on a phone, with the charts inline (8-petal coverage, grade-trend, benefits by month). Markdown + PDF are kept underneath as the regenerable audit copy, not as the thing anyone reads. The five-role audience filter collapses to one view (the administrator sees everything); the filter machinery returns with a second role.

End of quarter. The administrator opens the dashboard.

  1. Lands on the trial dashboard (one login). Sees the cohort summary, school list, every child by name + journey-id.
  2. Clicks Reports → "Per-cohort report". Picks programme Cùng em tiến bước, cohort year 2026, window "trailing 3 months". Generate.
  3. report-cohort-job launches. Reads the SQL (child, benefit, profile_entry, school_report for the cohort), calls the report-writer agent (Claude Opus) for the narrative, then renders every artefact from one run: a reader-friendly HTML page (the primary; brand-styled, charts inline), markdown and PDF as the audit/archive copies, and an HXL-CSV for the financial cuts.
  4. All artefacts save to 10_Ho-so-thu-huong/Bao-cao/2026/2026-06-30__cohort__cung-em-tien-buoc-2026.{html,md,pdf}. The dashboard opens the HTML directly. ~10s SQL + ~15–30s Claude narrative.
  5. Per-school and per-child reports are the same backbone, different SQL view + template; the administrator picks the kind from the same Reports screen.

Reports are deterministic-given-state — same data, same SQL output; only the Claude narrative drifts slightly between runs. The rendered HTML+md+PDF are written to the Drive at generation time for audit.
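
The output naming in step 4 can be reproduced with a small helper. This is a sketch: `slugify` and `report_paths` are hypothetical names, and the pattern is inferred from the one example path above.

```python
import unicodedata
from datetime import date
from pathlib import PurePosixPath
from typing import Dict


def slugify(name: str) -> str:
    """Lower-case, strip Vietnamese diacritics, and join words with hyphens."""
    lowered = name.lower().replace("đ", "d")  # "đ" has no Unicode decomposition
    decomposed = unicodedata.normalize("NFD", lowered)
    plain = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return "-".join(plain.split())


def report_paths(run_date: date, kind: str, programme: str, cohort_year: int) -> Dict[str, str]:
    """Build the html/md/pdf paths for one report run under Bao-cao/<YYYY>/."""
    stem = f"{run_date.isoformat()}__{kind}__{slugify(programme)}-{cohort_year}"
    base = PurePosixPath("10_Ho-so-thu-huong/Bao-cao") / str(run_date.year) / stem
    return {ext: f"{base}.{ext}" for ext in ("html", "md", "pdf")}
```

Deriving all three paths from one stem keeps the HTML the administrator reads and its audit copies side by side in the Drive folder.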

Flow F — Sponsorship quarterly view

Trial shape (2026-05-29): parked. The sponsorship view is deferred with the rest of the Area-3 build and is out of trial scope. The data model already supports it (it is a filtered read of the existing profile + ledger; nothing new to store), so this lights up later without rework. The original walkthrough is kept below for when Area 3 starts.

Deferred flow (Area-3 build)

End of quarter. Ông Nam sponsors journey-0042. The family has given named consent for first-name + photo sharing with this one sponsor (consent_observed.sponsor_named_share: given). The Coordinator generates the quarterly view.

  1. From the child profile, click Sponsorship view → pick ông Nam's sponsor id.
  2. sponsorship-render-job launches. Reads the profile + the benefits ledger filtered to funding_source = "2026-ong-nam", applies the sponsor audience filter (no sensitive context, names per consent_observed), produces a quarterly markdown narrative.
  3. Output lands in 01_Ho-so-thu-huong/Goc-nhin-nha-tai-tro/ong-nam/2026-Q2__journey-0042.md. The dashboard returns the file; the Coordinator reviews it, then personally sends it to ông Nam via a separate email step (Area-3 build).
  4. No auto-send. Same Phase-A discipline as Area 2's manual posting step.

Where the family declined named disclosure, the same renderer produces a journey-NNNN-addressed view (no name, no photo). Same depth of progress material; only the identifier differs.
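
The named versus journey-addressed choice reduces to one consent check. The sketch below is hypothetical in its function name and return shape; only the `sponsor_named_share: given` value comes from the text above. Any other value, including a missing one, falls back to the anonymous form.

```python
from typing import Dict, Optional, Union


def sponsor_identity(
    journey_id: str, first_name: str, sponsor_named_share: Optional[str]
) -> Dict[str, Union[str, bool]]:
    """Pick how the child is addressed in a sponsor view, failing closed."""
    if sponsor_named_share == "given":
        return {"label": first_name, "include_photo": True}
    # Declined, withdrawn, unknown, or missing consent: journey id only, no photo.
    return {"label": journey_id, "include_photo": False}
```

Failing closed means a typo or an unrecorded consent state can only make the view less identifying, never more.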

7. Service + job mapping

| Subplan tool / surface | Cloud Run home | Notes |
|---|---|---|
| tools/beneficiary_registry_migrate.py (B1, exists) | one-shot job; run during C2 bootstrap | Schema lives in GCS-mounted SQLite. Idempotent. |
| tools/beneficiary_validate.py (B1, exists) | called in-process by promote-job and the dashboard | Unchanged. |
| tools/beneficiary_promote.py (B2-scaffold, exists) | intake-convert-job (was promote-job) | Trial: scheduled + auto. Reads new drops, runs beneficiary-recorder (real AnthropicPromoteEngine, key in Secret Manager), auto-writes the Block-3 entry + profile_entry row. Flags to _needs-review.md on low confidence instead of writing. The "human clicks promote" path is the legacy form. |
| tools/school_doc_extract.py (B2-full, not yet) | school-doc-extract-job | Trial: auto-commit on a clean, in-bounds extract (confirmed_by = auto); flags multi-child sheets + out-of-bounds values for a human. Claude Opus document input; Pydantic + range bounds per doc type. |
| tools/benefit_record.py (B3, not yet) | called in-process by the dashboard | Validates funding_source, category, petal. Writes Drive markdown + SQL row. |
| tools/benefit_report.py (B3) | area4-emit-job | Monthly. Emits markdown + HXL-CSV (later IATI XML). |
| tools/beneficiary_index_rebuild.py (B4) | index-rebuild-job | Nightly. Three indexes + cohort dashboard. |
| tools/beneficiary_consistency_check.py (B4) | consistency-check-job | Nightly. Drive-vs-SQL drift report; alerts on findings. |
| tools/report_child.py (B5) | report-child-job | On-demand from dashboard. |
| tools/report_cohort.py (B5) | report-cohort-job | On-demand. |
| tools/report_school.py (B5) | report-school-job | On-demand. |
| tools/safeguarding_watchlist.py (B6) | watchlist-monthly-job + watchlist-keyword-job | Cron. Writes Drive markdown + SQL rows; pings Safeguarding via Chat. |
| tools/sponsorship_view_render.py (Area-3) | sponsorship-render-job | Deferred until Area-3 starts. |
| app/ (the FastAPI dashboard, B7) | hmt-care-dashboard service | Server-rendered HTMX. Trial: one simple user/password login (not IAP + Drive-membership allowlist). One administrator view; no per-role audience filter. |
| NEW tools/profile_store.py | library; in every care job | Hides Drive API behind a path-shaped interface. The seam that lets existing tools keep their filesystem-style calls. |
| NEW tools/iap_allowlist_sync.py | iap-allowlist-sync-job | Trial: deferred. One simple login replaces the IAP allowlist sync. Returns with a second user / role layer. |
| NEW tools/drive_membership_audit.py | drive-membership-audit-job | Trial: deferred (one operator, one folder). Returns when the Drive folder is shared beyond the administrator. |
| NEW tools/care_cache_refresh.py | care-cache-refresh-job | Nightly. Pulls profile markdown from Drive into a GCS read-through cache so dashboard reads are fast. |
| tools/journey_leakage_check.py (subplan §10) | called from media-line qc-job | The Stream-C gate 8.5. Lives in the media-line image; takes a profile path in the Care folder as input. |
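
The "path-shaped interface" of tools/profile_store.py could look like the sketch below. It is an assumption about the seam, not the actual module: the `ProfileStore` protocol and its two methods are hypothetical, and only a local-filesystem backend is shown. A Drive-backed class would implement the same two methods over the Drive API.

```python
from pathlib import Path
from typing import Protocol


class ProfileStore(Protocol):
    """What a care tool needs from storage, whether backed by disk or by Drive."""

    def read_text(self, rel_path: str) -> str: ...

    def append_text(self, rel_path: str, text: str) -> None: ...


class LocalProfileStore:
    """Filesystem backend, e.g. for the in-repo placeholder directory."""

    def __init__(self, root: Path) -> None:
        self.root = root.resolve()

    def _resolve(self, rel_path: str) -> Path:
        target = (self.root / rel_path).resolve()
        if self.root not in target.parents:
            raise ValueError(f"{rel_path!r} escapes the care folder")
        return target

    def read_text(self, rel_path: str) -> str:
        return self._resolve(rel_path).read_text(encoding="utf-8")

    def append_text(self, rel_path: str, text: str) -> None:
        target = self._resolve(rel_path)
        target.parent.mkdir(parents=True, exist_ok=True)
        # Append-only by construction: existing content is never rewritten.
        with target.open("a", encoding="utf-8") as handle:
            handle.write(text)
```

Exposing only read and append mirrors the append-only discipline the profiles rely on, and keeps tool code identical across the laptop and Cloud Run surfaces.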

8. Where data lives (Drive vs GCS vs SQLite)

| Surface | Lives in | Why |
| --- | --- | --- |
| Profile markdown (journey-NNNN.md) | Care folder on the Foundation Shared Drive (canonical) | Humans browse and (occasionally) edit directly via Drive UI. Append-only is enforced by tool discipline + Drive revision history. |
| Benefits ledger (markdown + receipts) | Care folder on the Foundation Shared Drive (canonical) | Same reason. Receipt PDFs/photos live in the per-child ledger folder. |
| Intake drops (Tai-lieu-vao/) | Care folder on the Foundation Shared Drive | Where the Coordinator drops raw materials. EXIF/GPS strip is enforced at promotion time by media_sanitise.py --verify. |
| Reports (Bao-cao/<YYYY>/) | Care folder on the Foundation Shared Drive | Generated by Cloud Run jobs, written back via Drive API. The Drive copy is the audit record. |
| Watchlist (Danh-sach-an-toan/) | Care folder on the Foundation Shared Drive | Cron job writes there; Safeguarding lead reads from there or via dashboard. |
| Sponsorship views (Goc-nhin-nha-tai-tro/) | Care folder on the Foundation Shared Drive | Coordinator copies into email outside the workspace. |
| Indexes (Chi-muc/) | Care folder on the Foundation Shared Drive | Regenerable; the Drive copy lets a Coordinator open the index offline. |
| SQLite (so-dang-ky.sqlite) | GCS ‹gcp-project›-care-db, mounted via gcsfuse | Projection of the markdown. Performance need: dashboard queries cannot wait for hundreds of Drive API calls. Loss is annoying, not data loss. |
| Profile/ledger read-through cache | GCS ‹gcp-project›-care-db (separate prefix) | Nightly mirror of profile markdown so dashboard reads are sub-second. Stale-tolerant; the dashboard offers a "refresh" action that re-reads from Drive on demand. |
| School-doc PDFs (post-extraction) | Care folder (unchanged) + GCS render cache | The Drive copy is the audit artefact; GCS holds a flattened-for-rendering version while the Coordinator is reviewing. |
| Secrets | Secret Manager (shared with media line) | No new drive-id secret; the one Shared Drive id is reused. |
| Logs | Cloud Logging | Care jobs tagged area=care for filtering; Safeguarding-touching reads (any access to a child profile by a non-Coordinator role) get a dedicated audit log sink. |
Why not put the markdown on GCS too? Two reasons. (1) Drive is where humans live: trustees and principals need to be able to read a profile from the Google Docs app on their phone without going through the dashboard. (2) Drive's revision history is the audit trail; GCS object versioning is shallower for append-only text. Drive is therefore the canonical store, and GCS is only a performance cache.
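
The division of labour described above (Drive canonical, GCS a stale-tolerant cache with an on-demand refresh) can be sketched in a few lines. This is a minimal illustration, not the care_cache_refresh.py implementation: the class name, the fetch_canonical callable standing in for a Drive read, and the plain dict standing in for the GCS prefix are all assumptions.

```python
from typing import Callable, Dict, Iterable


class ReadThroughCache:
    """Serves cached text; goes to the canonical store only on a miss or refresh."""

    def __init__(self, fetch_canonical: Callable[[str], str]) -> None:
        self._fetch = fetch_canonical      # stand-in for "read this path from Drive"
        self._cache: Dict[str, str] = {}   # stand-in for the GCS cache prefix

    def get(self, path: str) -> str:
        # Fast path: answer from the cache, even if Drive has changed since.
        if path not in self._cache:
            self._cache[path] = self._fetch(path)
        return self._cache[path]

    def refresh(self, path: str) -> str:
        # The dashboard's "refresh" action: re-read one path from canonical.
        self._cache[path] = self._fetch(path)
        return self._cache[path]

    def mirror(self, paths: Iterable[str]) -> int:
        # What a nightly job would do: re-read every known path.
        count = 0
        for path in paths:
            self.refresh(path)
            count += 1
        return count
```

A stale read is acceptable here because the cache is never the record: anything written goes to Drive first, and the cache can be rebuilt from it at any time.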

9. When things run

Cron (Cloud Scheduler — the dominant pattern here)

Push-triggered (event-driven, optional)

On-demand (dashboard-initiated)

Never (Phase-A invariant for Area 1)

10. The dashboard service (B7 expanded)

The dashboard is the most substantial new build. The subplan (§B7) gave it 5–8 days; that estimate is pre-cloud and assumes a localhost FastAPI. On Cloud Run behind one simple login, plan ~10 days for the first cut, ~3 more for polish after the pilot.

Trial shape (2026-05-29). The Role(s) column and the audience filter below are the team-scale design, kept for the record. In the one-administrator trial there is one role — the administrator sees every page and every field. The promote/extract write endpoints are not click-driven either (intake auto-converts; see Flows A/B); the dashboard's write surface is benefit-record + report generation + reading the watchlist alert. The /audit/* pages are deferred (one viewer).

Pages and views (Role column = team-scale target; trial = one administrator sees all)

| Path | Role(s) | What it shows |
| --- | --- | --- |
| GET / | any | Home: cohort summary, today's intake count, watchlist count for safeguarding role only. |
| GET /search?q=<name> | any | Real-name lookup; reads by-child/_search.md from the read-through cache. Returns journey-id + school. |
| GET /child/<journey-id> | any, audience-filtered | Full profile: Block 1 (audience-filtered), Block 2 programme history, Block 3 progress notes, benefits ledger, school-report grade-trend mini-chart, sibling cluster, followups outstanding. |
| GET /school/<slug> | coord, sg, principal, trustee, editor | School page: every child in this school, grouped by programme, with status + last-update + 8-petal coverage miniature. |
| GET /cohort/<programme>/<year> | coord, sg, trustee, editor | Cohort page: count, school spread, gender mix, age range, status distribution. |
| GET /watchlist | safeguarding only | Current month's triage queue. Action buttons per item. |
| GET /watchlist/<YYYY-MM> | safeguarding only | Historical watchlist snapshots. |
| GET /reports | coord, sg, principal, trustee, editor | Three generators: per-child, per-cohort, per-school. Parameter pickers. |
| GET /intake | coordinator only | Current week's raw drops grouped by journey-id. Review and promote buttons. |
| POST /record/promote/<journey-id> | coordinator only | Commits a draft profile entry. Launches promote-job if not already drafted. |
| POST /record/extract/<file> | coordinator only | Launches school-doc-extract-job; returns extracted fields for side-by-side review. |
| POST /benefit/record | coordinator only | Writes the ledger entry + SQL row. |
| POST /watchlist/<item-id>/action | safeguarding only | Triage action (dismiss / followup / escalate / re-surface). |
| POST /sponsorship/<sponsor-id>/render | coordinator only | Launches sponsorship-render-job. |
| POST /report/<kind> | any (audience-filtered output) | Launches the relevant report job; returns when complete. |
| GET /audit/me | any | The viewer's own audit trail (what they have accessed in the last 90 days). Trust-by-transparency. |
| GET /audit/role/<role> | safeguarding only | Cross-user audit; what each Coordinator / Principal / Trustee has accessed. |
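
For the team-scale target, the Role(s) column amounts to a lookup from (method, route) to a set of permitted roles. The sketch below shows one way that lookup could be expressed, using a few rows from the table. It is an illustration under assumptions: the short role keys follow the table's own abbreviations (with "sg" read as safeguarding and "coord" as coordinator), "any" is read as all five roles, and the deny-by-default behaviour for unlisted routes is a choice made for this example, not something the sketch specifies.

```python
from typing import Dict, FrozenSet, Tuple

ALL_ROLES: FrozenSet[str] = frozenset({"coord", "sg", "principal", "trustee", "editor"})

# A subset of the table, keyed by (HTTP method, route template).
ROUTE_ROLES: Dict[Tuple[str, str], FrozenSet[str]] = {
    ("GET", "/"): ALL_ROLES,
    ("GET", "/school/<slug>"): ALL_ROLES,
    ("GET", "/cohort/<programme>/<year>"): ALL_ROLES - {"principal"},
    ("GET", "/watchlist"): frozenset({"sg"}),
    ("GET", "/intake"): frozenset({"coord"}),
    ("POST", "/benefit/record"): frozenset({"coord"}),
    ("POST", "/watchlist/<item-id>/action"): frozenset({"sg"}),
}


def is_allowed(role: str, method: str, route: str, trial_single_admin: bool = False) -> bool:
    """Return True if the role may call the route.

    In the one-administrator trial every page is visible to the single
    logged-in user, so the role table is bypassed entirely.
    """
    if trial_single_admin:
        return True
    # Unlisted routes are denied (an assumption of this sketch).
    return role in ROUTE_ROLES.get((method, route), frozenset())
```

Keeping the table as data rather than scattering checks through the handlers means the trial can ship with trial_single_admin=True and the team-scale rules can return later without touching the page code.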

Stack choices

Audit logging

Every read of a child profile by a non-Coordinator role writes one line to the audit log:

{ "ts": "2026-06-15T08:32:14+07:00",
  "viewer": "bac.binh@hmtfoundation.org.vn",
  "role": "trustee",
  "path": "/child/journey-0042",
  "fields_visible": "trustee_filter",
  "ip": "..." }

The Safeguarding lead can see this audit at /audit/role/trustee. Trust-by-transparency — the people whose data is in the system can be shown that access is recorded.
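
A minimal sketch of how such a line could be produced, assuming the field names from the sample above. The function name, the "coordinator" role string, and the fixed +07:00 offset (taken from the sample timestamp) are assumptions; where the line is shipped (stdout, a Cloud Logging sink) is left out.

```python
import json
from datetime import datetime, timedelta, timezone
from typing import Optional

ICT = timezone(timedelta(hours=7))  # offset used by the sample timestamp


def audit_line(
    viewer: str,
    role: str,
    path: str,
    fields_visible: str,
    ip: str,
    now: Optional[datetime] = None,
) -> Optional[str]:
    """Build one JSON audit line, or None when no line is required."""
    # Only reads by non-Coordinator roles are audited.
    if role == "coordinator":
        return None
    ts = (now or datetime.now(ICT)).isoformat(timespec="seconds")
    record = {
        "ts": ts,
        "viewer": viewer,
        "role": role,
        "path": path,
        "fields_visible": fields_visible,
        "ip": ip,
    }
    # ensure_ascii=False keeps Vietnamese names readable in the log.
    return json.dumps(record, ensure_ascii=False)
```

Returning the line rather than writing it keeps the function easy to test; the caller decides where the line goes.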

11. Phase-A invariants for this subsystem

Inherit the media-line invariants (no channel-publisher cron, gate-8 manual, etc.). On top of those:

Trial reconciliation (2026-05-29): Phase A lives at the OUTPUT boundary, not at every internal write. The auto-convert intake (Flows A & B) does write profile entries and school_report rows unattended — that is the point of the relaxation. This is consistent with Phase A because the internal beneficiary memory is not an egress: nothing about a child leaves the workspace by a note being auto-written to a private profile. The bright line is unchanged at the place data actually leaves — a Facebook post, a sponsor email, a website figure — all still gated by /approve. The legacy "no automated profile write" invariant below is the team-scale form, restored when a second person joins.

12. Migration plan, in phases

Assumes the media-line cloud migration is done first (it provides the project, the Artifact Registry, Secret Manager, the base image, IAM patterns). If Area 1 goes first, fold the media-line C0–C2 into the front of this plan.

D0 — Pre-flight (1 day) — trial

D1 — The Drive seam (2 days)

D2 — Bucket + migration on the existing SA (1 day) — trial

D3 — School-doc extraction (3 days)

D4 — Cron jobs + watchlist (4 days)

D5 — Reports (4 days)

D6 — The dashboard, first cut (8 days)

D7 — Allowlist sync + audit + domain (3 days)

D-zero — Operational pilot, parallel

Per subplan §B-zero: one school, one Coordinator, five real children, hand-rolled profiles + five weeks of intake, walked end-to-end on the pre-cloud setup first. Do this before D3 lands. Findings feed the dashboard's first cut. The pilot proves the operational discipline, not the cloud infrastructure.

Total: ~26 working days of build (the sum of D0–D7 as listed above), plus the pilot's operational time. Most of it is the dashboard (D6) and the watchlist/reports/extraction trio (D3–D5). Pure cloud plumbing is ~7 days (D0–D2 + D7).

13. Cost estimate

Steady state, six partner schools, ~150 children active across the cohort, ~30 benefit rows/month, ~15 progress notes/week, monthly watchlist + daily keyword pass:

| Line | Estimate / month | Note |
| --- | --- | --- |
| Cloud Run service hmt-care-dashboard | ~$10–15 | Same shape as media webhook. Single administrator, so min-instances can be 0 for the trial (cold start is acceptable for one user) or 1 for snappiness. |
| Cloud Run jobs (care side) | ~$3 | Nightly + daily + monthly + auto-convert. Sub-cent per invocation. |
| GCS storage (care) | ~$1 | SQLite + cache + render staging; small. |
| Login (trial) | $0 | One simple user/password. (Team-scale Cloud IAP is also $0 for Workspace accounts.) |
| Cloud Logging | <$1 | Standard job logs. (Per-role audit sink deferred: one viewer.) |
| Cloud Scheduler | <$1 | ~10 entries. |
| Anthropic API (care-side) | $15–40 | Promote (small prompt) + school-doc extract (PDF document input) + report-writer narratives + watchlist keyword tag. Dominated by school-doc extract during heavy school months (May/Dec for hoc-ba). |
| Care-side total | ~$35–60 | |

On top of the media-line ~$20–25/month + ~$10–30 Anthropic, the all-Foundation monthly is ~$65–115. Order of magnitude smaller than any SaaS case-management tool, and the Foundation owns the code and the data.
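
The all-Foundation figure is a sum of three ranges quoted above (care-side total, media-line infrastructure, media-line Anthropic). A short sketch of that arithmetic, with variable names chosen for this example only:

```python
from typing import Iterable, Tuple

Range = Tuple[int, int]  # (low, high) in USD per month


def sum_ranges(ranges: Iterable[Range]) -> Range:
    """Add low ends together and high ends together."""
    items = list(ranges)
    return (sum(low for low, _ in items), sum(high for _, high in items))


care_total: Range = (35, 60)        # care-side total from the table
media_infra: Range = (20, 25)       # media-line infrastructure
media_anthropic: Range = (10, 30)   # media-line Anthropic usage

all_foundation = sum_ranges([care_total, media_infra, media_anthropic])
print(all_foundation)  # (65, 115)
```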

14. Risks and rollback

Risks specific to Beneficiary Care

Rollback per phase

15. Open questions for the Editor

Resolved by the 2026-05-29 Editor review (see the override banner): one Workspace + one Shared Drive (Care = restricted folder, no second Drive); one SA (hmt-media-ingest, no care@, no hmt-care-sa); one administrator (no five-role model, no role_assignments table, no audience filter); one simple user/password login (no IAP allowlist sync); dashboard on the default Cloud Run URL for the trial; Flows A & B auto-convert; Flow D is an alert; Flow E renders HTML; Flow F parked. The questions below are the ones still genuinely open.

  1. Auto-convert failure threshold. How aggressive should "low confidence → flag a human" be? Too loose risks a bad entry in memory; too tight defeats the labour-saving. Tune during the pilot.
  2. Alert channel for Flow D + the failure flag. Email? A dashboard banner? Zalo/Telegram? Not yet pinned for the Care side.
  3. Drive change push for intake. Optional — the scheduled convert run works without it. Recommend leaving it off at launch; turn on later if the lag annoys.
  4. Anthropic budget cap. The care-side adds $15–40/month at steady state, spiking in school-doc-heavy months. Confirm headroom on top of the media-line cap. (Carried from the original list.)
  5. When does this happen?
    • Subplan order: B1 done, B2-scaffold done. The natural next builds are B2-full (school-doc extract) → B3 (benefits) → B4 (indexes) → B5 (reports) → B6 (watchlist) → B7 (dashboard).
    • The D-zero pilot is the unlock for any cloud commit. Recommend: D-zero on the local pre-cloud setup first, then choose between (a) finishing B2–B7 on local, then a single cloud push, or (b) D2–D4 cloud plumbing in parallel so the dashboard ships natively cloud-hosted. The (b) path is faster to a usable trustee dashboard; the (a) path is lower-risk.
    Editor's call.
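
For open question 1, the trade-off can be made concrete with a small sketch. Nothing here proposes a value: the threshold is exactly what the pilot is meant to tune. The DraftEntry type, the assumption that confidence is a number in [0, 1], and the two outcome labels are hypothetical names for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class DraftEntry:
    journey_id: str
    confidence: float  # assumed to lie in [0, 1]


def route(entry: DraftEntry, threshold: float) -> str:
    """Auto-write confident entries; flag the rest for a human."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    return "auto_write" if entry.confidence >= threshold else "flag_human"


def flag_rate(entries: Sequence[DraftEntry], threshold: float) -> float:
    """Share of entries a human would have to review at this threshold."""
    if not entries:
        return 0.0
    flagged = sum(1 for e in entries if route(e, threshold) == "flag_human")
    return flagged / len(entries)
```

Running flag_rate over the pilot's real entries at a few candidate thresholds would show how much review work each setting creates, which is the number the tuning decision needs.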

16. Glossary delta

Read the media-line glossary first. New here:

Care folder (10_Ho-so-thu-huong/)
Trial: a permission-restricted folder on the one Foundation Shared Drive holding every child's record. Folder-level permission to the administrator is the access boundary. (Team-scale target: a separate private Shared Drive with its own membership list; deferred.)
Audience filter (§6.5 of the subplan) — deferred
A render-time narrative filter that limits what a report or profile page includes based on the viewer's role. Not used in the one-administrator trial (one viewer sees everything); returns with a second role.
Cloud IAP allowlist sync — deferred
Team-scale: a daily job that reads the Drive's member list and writes it to the dashboard's IAP allowlist. The trial uses one simple user/password login instead.
Read-through cache
A GCS mirror of the canonical markdown on Drive, refreshed nightly. Lets the dashboard answer queries in <100 ms instead of waiting on Drive API. Stale-tolerant.
Audit log sink — deferred
Team-scale: a Cloud Logging sink capturing every read of a child profile by a non-Coordinator role, for the Safeguarding lead's cross-user review. Not needed in the one-administrator trial (one viewer); Drive revision history already records writes.
Profile store
A small library (tools/profile_store.py) that hides Drive API calls behind a path-shaped interface. Lets the existing tools keep their filesystem-style reads / writes without rewriting to a Google API client.
Gate 8.5
The journey-leakage check (tools/journey_leakage_check.py), specific to Stream-C pieces sourced from a real profile. Asserts no Block-1 identifying field survives into the published draft.
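
The assertion behind Gate 8.5 can be illustrated with a deliberately simple check. This is not the logic of tools/journey_leakage_check.py, which is not shown in this sketch. It assumes Block 1 is available as a mapping of field name to value (the field names below are hypothetical) and uses case-insensitive substring matching, which would miss paraphrases and spelling variants.

```python
from typing import Dict, List


def leaked_fields(block1: Dict[str, str], draft: str) -> List[str]:
    """Return the names of Block-1 fields whose value appears in the draft."""
    haystack = draft.casefold()
    return sorted(
        name
        for name, value in block1.items()
        if value.strip() and value.strip().casefold() in haystack
    )


def assert_no_leak(block1: Dict[str, str], draft: str) -> None:
    """Raise if any identifying value survives into the draft."""
    leaked = leaked_fields(block1, draft)
    if leaked:
        raise ValueError(f"Block-1 fields leaked into draft: {leaked}")


if __name__ == "__main__":
    # Placeholder values, not real data.
    block1 = {"real_name": "Nguyen Van A", "village": "Example Hamlet"}
    assert_no_leak(block1, "A student in the programme improved this term.")
```

Reporting field names rather than the matched values keeps the identifying data itself out of the QC job's logs.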