Architecture Overview

Diffy is built as a monorepo with two deployable applications and several shared packages.

System Architecture

┌─────────────────────────────────────────────────────────────────────────┐
│                              User Browsers                              │
└─────────────────────────────────────────────────────────────────────────┘
                                     │
                                     ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                               CDN / Edge                                │
│                                (Vercel)                                 │
└─────────────────────────────────────────────────────────────────────────┘
                                     │
              ┌──────────────────────┴──────────────────────┐
              ▼                                             ▼
┌───────────────────────────┐                 ┌───────────────────────────┐
│      Web Application      │                 │        Public API         │
│    (Next.js on Vercel)    │                 │     (Hono on Railway)     │
│                           │                 │                           │
│  - Dashboard UI           │                 │  - REST endpoints         │
│  - Internal API routes    │                 │  - External developers    │
│  - Cron triggers          │                 │  - API keys               │
│  - Slack OAuth            │                 │                           │
└─────────────┬─────────────┘                 └─────────────┬─────────────┘
              │                                             │
              └──────────────────────┬──────────────────────┘
                                     │
                                     ▼
                         ┌───────────────────────┐
                         │   PostgreSQL (Neon)   │
                         │                       │
                         │  - Domains            │
                         │  - Pages              │
                         │  - Snapshots          │
                         │  - Changes            │
                         │  - Users/Orgs         │
                         └───────────────────────┘

Applications

Web (apps/web)

The main user-facing application, built with Next.js 14 (App Router).

Responsibilities:

  • Dashboard UI for viewing domains, changes, and alerts
  • Internal API routes (/api/v1/*) for dashboard operations
  • Cron job triggers (/api/cron/*)
  • OAuth flows (Clerk, Slack)

Deployment: Vercel

API (apps/api)

Public REST API for external developers, built with Hono.

Responsibilities:

  • External API access
  • API key authentication
  • Rate limiting per API key

Deployment: Railway
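
The source does not say how per-key rate limiting is implemented. As a minimal sketch, assuming a fixed-window counter held in memory and keyed by API key, it could look like the following. The class name `FixedWindowLimiter`, the limit, and the window length are all hypothetical, and a deployment with more than one instance would need shared storage instead of a local map.

```typescript
// Hypothetical fixed-window rate limiter keyed by API key.
// Assumption: state lives in process memory; this is not Diffy's actual implementation.
type WindowState = { windowStart: number; count: number };

class FixedWindowLimiter {
  private readonly windows = new Map<string, WindowState>();
  private readonly limit: number; // max requests per window (illustrative)
  private readonly windowMs: number; // window length in milliseconds (illustrative)

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the request is allowed, false if the key is over its limit.
  allow(apiKey: string, now: number = Date.now()): boolean {
    const state = this.windows.get(apiKey);
    if (state === undefined || now - state.windowStart >= this.windowMs) {
      // First request for this key, or the previous window has expired.
      this.windows.set(apiKey, { windowStart: now, count: 1 });
      return true;
    }
    if (state.count >= this.limit) {
      return false;
    }
    state.count += 1;
    return true;
  }
}
```

A request handler would call `allow(apiKey)` after authenticating the key and respond with HTTP 429 when it returns false. Passing `now` explicitly keeps the limiter easy to test.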

Packages

Package         Purpose
@diffy/db       Drizzle ORM schema and database access
@diffy/crawler  Web crawling and diff detection logic
@diffy/alerts   Notification system (email, Slack)
@diffy/core     Shared business logic
@diffy/ui       Shared React components

Data Flow

Adding a Domain

1. User submits domain → Web API
2. Insert domain (status: pending) → Database
3. Enqueue discovery job → Redis Queue
4. Worker picks up job → Crawler Worker
5. Discover pages → Playwright
6. Insert pages → Database
7. Update domain (status: active) → Database
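
The steps above can be sketched as a small state machine. This is an illustration under stated assumptions, not the project's code: plain arrays stand in for the database and the Redis queue, the `discover` callback stands in for the Playwright step, everything runs synchronously for brevity, and all type and function names are hypothetical.

```typescript
// Hypothetical in-memory stand-ins for the database tables and the Redis queue.
type DomainStatus = "pending" | "active";
type Domain = { id: number; hostname: string; status: DomainStatus };
type Page = { domainId: number; url: string };
type DiscoveryJob = { domainId: number };

const domains: Domain[] = [];
const pages: Page[] = [];
const discoveryQueue: DiscoveryJob[] = [];

// Steps 1-3: insert the domain as pending, then enqueue a discovery job.
function addDomain(hostname: string): Domain {
  const domain: Domain = { id: domains.length + 1, hostname, status: "pending" };
  domains.push(domain);
  discoveryQueue.push({ domainId: domain.id });
  return domain;
}

// Steps 4-7: the worker takes one job, discovers pages, stores them,
// and marks the domain active. Returns false when there was nothing to do.
function runDiscoveryWorker(discover: (hostname: string) => string[]): boolean {
  const job = discoveryQueue.shift();
  if (job === undefined) return false; // queue is empty
  const domain = domains.find((d) => d.id === job.domainId);
  if (domain === undefined) return false; // job refers to an unknown domain
  for (const url of discover(domain.hostname)) {
    pages.push({ domainId: domain.id, url });
  }
  domain.status = "active";
  return true;
}
```

The domain stays `pending` until discovery has finished, so the dashboard can tell a newly added domain apart from one that is ready to be crawled.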

Scheduled Crawling

1. Cron trigger fires → Vercel Cron
2. Mark domains needing crawl → Database
3. Worker polls for work → Crawler Worker
4. Crawl each page → Playwright
5. Compare content hash → Worker
6. If changed:
   a. Save new snapshot → Database
   b. Record change → Database
   c. Send alerts → Email/Slack
7. Update last crawl time → Database
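
Steps 5 and 6 can be illustrated with a short sketch. The source does not name the hash function, so FNV-1a (32-bit) is used here purely as a dependency-free stand-in; a production crawler would more likely use a cryptographic hash such as SHA-256. The in-memory map and array, the `sendAlert` callback, and the choice to treat the first crawl as a baseline rather than a change are all assumptions.

```typescript
// FNV-1a (32-bit) over UTF-16 code units. A stand-in only:
// the source does not say which hash Diffy uses.
function contentHash(content: string): string {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < content.length; i++) {
    hash ^= content.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // FNV prime, with 32-bit wraparound
  }
  return (hash >>> 0).toString(16);
}

type Snapshot = { pageUrl: string; hash: string; content: string };
type Change = { pageUrl: string; previousHash: string; newHash: string };

// Hypothetical in-memory stand-ins for the Snapshots and Changes tables.
const latestSnapshots = new Map<string, Snapshot>();
const changes: Change[] = [];

// Compares the fresh hash with the stored one. On a difference it saves a new
// snapshot, records the change, and sends an alert. Returns true if a change
// was recorded. Assumption: the first crawl only stores a baseline snapshot.
function processCrawlResult(
  pageUrl: string,
  content: string,
  sendAlert: (change: Change) => void,
): boolean {
  const newHash = contentHash(content);
  const previous = latestSnapshots.get(pageUrl);
  if (previous !== undefined && previous.hash === newHash) {
    return false; // unchanged: nothing to save or alert on
  }
  latestSnapshots.set(pageUrl, { pageUrl, hash: newHash, content });
  if (previous === undefined) {
    return false; // baseline snapshot, no change to report
  }
  const change: Change = { pageUrl, previousHash: previous.hash, newHash };
  changes.push(change);
  sendAlert(change);
  return true;
}
```

Comparing hashes instead of full page bodies keeps the common "nothing changed" path cheap, since only a short string has to be read from storage and compared.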

Security Model

Authentication

  • Users: Clerk handles user auth (OAuth, email/password)
  • Organizations: Multi-tenant with org-based data isolation
  • API: Session cookies for the web application, API keys for external clients

Authorization

  • Users can only access their organization's data
  • All database queries filter by orgId
  • Row-level security is enforced at the application layer
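
One way to make "all queries filter by orgId" hard to forget is to hand request handlers a data-access object that is already bound to the caller's organization. The sketch below assumes that approach and uses an in-memory array in place of the real Drizzle schema in @diffy/db, which is not shown here; `DomainRow` and `OrgScopedDomains` are hypothetical names.

```typescript
// Hypothetical row shape; the real schema is not shown in this document.
type DomainRow = { id: number; orgId: string; hostname: string };

// Wraps a table so that every read is filtered by the caller's orgId.
class OrgScopedDomains {
  private readonly rows: DomainRow[];
  private readonly orgId: string;

  constructor(rows: DomainRow[], orgId: string) {
    this.rows = rows;
    this.orgId = orgId;
  }

  // Lists only the rows that belong to the bound organization.
  list(): DomainRow[] {
    return this.rows.filter((row) => row.orgId === this.orgId);
  }

  // Returns undefined for rows in other organizations, as if they did not exist.
  findById(id: number): DomainRow | undefined {
    return this.rows.find((row) => row.id === id && row.orgId === this.orgId);
  }
}
```

Because the organization is fixed at construction time, typically from the authenticated session or API key, individual handlers never pass an `orgId` themselves and cannot accidentally omit the filter.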