Skip to main content

EasySLR — Article Review Workspace

A self-contained slice of a systematic-literature-review tool: users work inside organizations → projects, import articles from a PubMed-style Excel file, and review them in a table-driven workflow (search, sort, filter, decisions, notes, bulk actions, export).

Engineering decisions are documented as ADRs in /docs. This README is the orientation; the ADRs are the depth.


Quick start

Prerequisites: Node 20+, pnpm, Docker (for local Postgres).

pnpm install
cp .env.example .env                 # AUTH_SECRET already works for local; or: npx auth secret
docker compose up -d                 # Postgres 16 on host port 5433
pnpm prisma migrate dev              # apply migrations + generate client (+ runs the seed)
pnpm db:seed                         # (idempotent) ensure demo data exists
pnpm dev                             # http://localhost:3000

The local DB uses host port 5433 to avoid clashing with any Postgres already on 5432.

…or run the whole thing in Docker

The app is containerized too (Dockerfile), so you can bring up app and database together:

docker compose up -d --build      # app on http://localhost:3000, Postgres alongside it

The container runs prisma migrate deploy + seed on startup and connects to the DB over the compose network (db:5432). AUTH_SECRET is a throwaway local value in docker-compose.yml — inject a real one from a secrets manager in production. (For a smaller image, switch to Next's output: "standalone" multi-stage build; the single-stage image here favors reliability and clarity.)

Demo accounts (all password password123)

EmailRoleCan
owner@demo.testOWNERimport, undo, manage + everything below
reviewer@demo.testREVIEWERread + write review decisions
viewer@demo.testVIEWERread only (writes are blocked server-side)

All three are in the Demo Research Lab org / Telehealth Systematic Review project. A copy of the provided sample_article_import.xlsx lives at test/fixtures/ — import it as owner.

Scripts

pnpm devrun the app
pnpm testVitest (validation, parse failures, access, idempotency, conflict)
pnpm typecheck / pnpm lintstrict TS / ESLint
pnpm buildproduction build
pnpm db:studioPrisma Studio

Architecture

Stack: Next.js 15 (App Router) · TypeScript · tRPC v11 · Prisma 6 / PostgreSQL · NextAuth v5 (Credentials + JWT) · Tailwind v4 · Vitest. Scaffolded with create-t3-app.

The one rule that shapes everything: business logic — import validation, dedup, authorization — lives in pure functions under src/server/services/, free of tRPC/React/Next. That's what makes it readable, unit-testable against the real sample file, and reusable.

HTTP / UI  →  tRPC router (thin)  →  service (pure logic)  →  Prisma  →  Postgres
                    ↑ authz middleware re-derives access from the DB on every call
src/
├─ app/                      App Router pages (login, orgs, projects/[id]) — thin
├─ server/
│  ├─ auth/                  NextAuth Credentials config + requireSession gate
│  ├─ api/
│  │  ├─ trpc.ts             procedures + the authorization ladder
│  │  └─ routers/            org · project · import · article · review · export · health
│  └─ services/
│     ├─ access.ts           requireOrg/Project/ArticleAccess, assertCanWrite/Owner
│     └─ import/             mapHeaders · parse · normalize · validate · classify  (pure, most-tested)
├─ lib/                      review vocab, csv, slug
└─ components/               table, badges, header
prisma/   schema · migrations · seed
docs/     ADRs · bug ledger · scope ledger · AI usage
test/     pipeline · units · integration (+ fixture)

Domain model & authorization

Organization → Project → Article → Review, with OrganizationMembership and ProjectMembership join tables. See prisma/schema.prisma.

  • Article (immutable import data) and Review (mutable workflow state) are separate tables — the import path only writes Article, the review path only writes Review. (ADR-0004)
  • Authorization is enforced server-side, not via hidden UI. A ladder of tRPC procedures (protected → org → project → article) re-derives the caller's membership from the DB on every request and attaches the verified entity to context; writes additionally gate on role; absent-or-forbidden both return NOT_FOUND so IDs can't be probed. (ADR-0003)
  • Review keyed (articleId, reviewerId) with a unique-per-article constraint today → dual independent (PRISMA) screening is a one-line constraint change later. Reviewer FK is onDelete: Restrict so a departing reviewer's decisions never vanish.

Import validation choices (the centerpiece)

The provided sample_article_import.xlsx is effectively a validation test harness — every odd row is a deliberate edge case. The pipeline (mapHeaders → parse → normalize → validate → classify) handles each with an explicit, defensible policy. Full reasoning in ADR-0001 and ADR-0002.

Sample rowWhat it testsHow we handle it
Blank Titleincomplete recordReject (title + ≥1 identifier required)
DOI collides w/ another row, different PMIDis it a duplicate?Import + flag possibleDupDoi — never silently deleted
Year "Twenty twenty" / 2035bad metadataImport, year = null, flag suspectYear, keep original text
Duplicate PMIDtrue duplicateHard-skip the later occurrence
Blank PMID but valid DOI + titlepartial identityImport (the "≥1 identifier" rule earns its keep)
" 38910023 ", " DOI:10.1000/NQ... "messy but validNormalized → imported

Key decision — PMID is the only hard dedup key. Two different PMIDs sharing a DOI is a data-quality signal, not a confirmed duplicate, so DOI/title collisions are imported and flagged for a human, never auto-deleted. (Postgres treats NULLs as distinct in unique indexes, which gives the (projectId, pmid) constraint correct "skip duplicate non-null PMIDs, allow many null-PMID rows" behaviour for free.)

Two-phase, tamper-safe import: one pure pipeline, exposed via import.process({ dryRun }). dryRun: true previews (no writes); dryRun: false re-runs the same code and commits. The server always parses the file — the client never sends row data — so preview and commit can't disagree, and re-importing the same file is naturally idempotent (createMany … skipDuplicates).


Review workflow

After import, the table is the workspace: server-side search / sort / filter / paginate (the client never holds the whole dataset — ADR-0005), data-quality flag badges, and a detail drawer to set a decision (Include / Exclude / Maybe) with an exclusion reason (PRISMA), priority, notes, and labels. Bulk actions apply a decision to selected rows; CSV export downloads the (filtered) set; undo import removes a batch (guarded if any of its articles are already reviewed). Concurrent edits use optimistic concurrency (updatedAt) → a stale write gets CONFLICT, not a silent clobber. Loading / empty / no-results / error states are all handled.


Tests

pnpm test — focused on behaviour that matters (26 tests):

  • import.pipeline — the real sample_article_import.xlsx end-to-end; asserts every edge case above + summary counts.
  • import.units — normalize / validate / mapHeaders / dedup.
  • api.access (integration, real DB) — non-member → NOT_FOUND, VIEWER write → FORBIDDEN, import idempotency, review CONFLICT.

Verified end-to-end against a running production build: demo login sets a session and a protected tRPC call returns correctly scoped data.


Tradeoffs & known gaps

Honest list with severity/status in docs/bugs/BUGS.md; deferred scope in docs/decisions/scope-ledger.md. Highlights:

  • Author search is limited — authors are a raw "Rao A; Chen L" string; firstAuthor + title search mitigate.
  • No decision history / audit log — a single mutable Review row; an append-only event log is the planned add.
  • Single shared decision (not dual-reviewer) — modeled for it, not built.
  • JWT can outlive a revoked membership — mitigated by the per-call DB membership check; data access is cut immediately.
  • Offset pagination — fine at this scale; keyset is the seam.
  • No-PMID rows can't be hard-deduped — they re-import on a second run (flagged); a faithful consequence of the PMID-only dedup rule.

Deployment

Free path — Vercel + Neon (recommended)

The repo is deploy-ready for Vercel (app) + Neon (free serverless Postgres) — no VPC, no cost.

  1. Neon → create a free project → copy the pooled connection string and the direct one.
  2. Vercel → import this GitHub repo → set env vars:
    • DATABASE_URL = Neon pooled url (append ?sslmode=require if not present)
    • DIRECT_URL = Neon direct url
    • AUTH_SECRET = output of npx auth secret
    • AUTH_TRUST_HOST = true
  3. Migrate + seed the Neon DB once (from your machine, with the Neon URLs in .env):
    pnpm prisma migrate deploy && pnpm db:seed
    
  4. Vercel builds and gives you a public URL. Sign in with owner@demo.test / password123.

Serverless specifics already handled in the repo: Prisma generates the Linux engine (binaryTargets = ["native", "rhel-openssl-3.0.x"]); migrations use directUrl (bypassing the pooler) while the app uses the pooled DATABASE_URL; and next.config.js bundles the Prisma engine into the serverless functions.

AWS alternative — SST

For the AWS-preferred path: SST (OpenNext) → Lambda + CloudFront; Aurora Serverless v2 behind RDS Proxy (serverless-Postgres connection-pooling); secrets in SST Config/SSM; prisma migrate deploy via an sst tunnel; S3 presigned upload + SQS worker as the seam for large imports; CloudWatch logs.


AI usage

Full disclosure in docs/decisions/ai-usage.md. In short: built with an agentic Claude coding assistant for design, scaffolding, code, and docs. I personally verified the data model and authorization paths, profiled the sample file row-by-row to confirm each edge case lands correctly, and confirmed migrations/typecheck/tests/build all pass. One example I changed: an early AI-proposed plan deduplicated on DOI as well as PMID — I rejected it because the sample's row 5 is a distinct article that DOI-dedup would silently delete, and changed the design to PMID-hard / DOI-soft-flag.

Approximate time spent

~10 focused hours (design + adversarial design review, implementation, tests, docs).

What I'd improve next

Dual independent screening + conflict resolution (the model already supports it); an append-only decision audit log; author normalization for real author-level filtering; keyset pagination + pg_trgm/tsvector search at scale; and an actual SST deployment with RDS Proxy.