Clip Pipeline Wiki

// LLM knowledge base for the portrait-captions + pipeline-viz project

How to use this wiki (for humans and LLMs)

Every concept in this project has one node. Each node has a stable #id, a type badge, a one-paragraph description, and chips linking to what it depends on and what depends on it. Open any link in a new tab — anchor navigation preserves context. When an LLM needs to reason about this project, load this page as context rather than scanning the whole repo: it's the distilled graph of what exists, what it's for, and how it connects.
                                  ┌───────────────┐
                                  │   ApertureDB  │ ← catalog · dedupe · metrics
                                  └───┬───┬───┬───┘
                                      ▲   ▲   ▲
┌──────┐ ┌──────┐ ┌───────┐ ┌──────┐ │ ┌──────┐ │ ┌─────────┐ │ ┌──────┐
│source│→│select│→│extract│→│trans.│→│→│captn.│→│→│composite│→│→│drive │
└──────┘ └──────┘ └───────┘ └──────┘   └──────┘   └─────────┘   └──────┘
 yt-dlp   twelve   ffmpeg    whisper    PIL       moviepy+x264   gws drive

 STATIONS in the 3D dashboard @ http://100.82.244.127:5173 (claw, Tailscale)

Stations

source · station

Acquire the long-form recording (Twitch VOD, podcast, local .mp4) via yt-dlp or manual copy. First node in the pipeline.

select · station

Pick the top 5 moments. LLM rank over whisper transcript, or TwelveLabs semantic search. Output: moments/{slug}.json.

extract · station

ffmpeg filter chain: face 260×260 @ (1370,790) → 1080×1080 top, screen 1340×1080 → 1080×840 bottom, vstack to 1080×1920.
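The crop/scale/vstack chain above can be sketched as a filter-graph builder. This is a minimal sketch, assuming a single input stream and a screen-crop origin of (0,0) — the wiki only pins the face-crop origin; the function name is mine:

```python
def portrait_filter(screen_x: int = 0, screen_y: int = 0) -> str:
    """Build the -filter_complex graph for the 1080x1920 portrait frame."""
    # face cam: 260x260 crop at (1370,790), upscaled to the 1080x1080 top pane
    face = "[0:v]crop=260:260:1370:790,scale=1080:1080[face]"
    # screen: 1340x1080 crop (origin assumed), downscaled to the 1080x840 bottom pane
    screen = f"[0:v]crop=1340:1080:{screen_x}:{screen_y},scale=1080:840[screen]"
    # 1080 + 840 panes stack into the full 1920-tall portrait frame
    return f"{face};{screen};[face][screen]vstack=inputs=2[v]"
```

The string would be passed as `ffmpeg -i in.mp4 -filter_complex "<graph>" -map "[v]" out.mp4`.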

transcribe · station

openai-whisper with word timestamps. Small on first pass (real-time); medium on iteration (~3× slower, better proper nouns). Output: _transcripts/{slug}.json.
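Downstream stations consume the word timestamps as a flat cue list. A hedged sketch of that flattening step, assuming the openai-whisper result shape when `word_timestamps=True` is set (segments, each carrying a `words` list with `start`/`end`); the helper name is mine:

```python
def flatten_words(result: dict) -> list:
    """Flatten an openai-whisper transcribe() result into one word-cue list."""
    words = []
    for seg in result.get("segments", []):
        for w in seg.get("words", []):
            # whisper pads words with leading spaces; strip for caption use
            words.append({"word": w["word"].strip(),
                          "start": w["start"], "end": w["end"]})
    return words
```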

captions · station

PIL rasterizes each cue to transparent PNG. Georgia Bold Italic 60pt, #F2D21B with 4px black stroke. Chunk ≤5w / ≤2.5s / ≤34ch.
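The ≤5w / ≤2.5s / ≤34ch rule can be expressed as a greedy chunker over word cues. A sketch, assuming word dicts with `word`/`start`/`end` keys and that the three limits are checked jointly:

```python
def chunk_cues(words, max_words=5, max_secs=2.5, max_chars=34):
    """Greedily group word cues into caption chunks under all three limits."""
    chunks, cur = [], []
    for w in words:
        trial = cur + [w]
        text = " ".join(t["word"] for t in trial)
        too_big = (len(trial) > max_words
                   or trial[-1]["end"] - trial[0]["start"] > max_secs
                   or len(text) > max_chars)
        if cur and too_big:
            chunks.append(cur)  # close the current chunk at the limit
            cur = [w]           # start a new one with the overflowing word
        else:
            cur = trial
    if cur:
        chunks.append(cur)
    return chunks
```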

composite · station

moviepy + libx264 CRF 20. Layers portrait clip + cover bar (y=1700, opacity 1.0) + caption PNG. Not Remotion — iCloud FUSE blocks SSR.
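The layer ordering can be written down as a plain spec (the actual moviepy calls are omitted). A sketch under assumptions: bottom-to-top order and the caption's anchor position are mine; the cover-bar geometry and the hard-coded 1.0 opacity come from the wiki:

```python
W, H = 1080, 1920
COVER_BAR_Y, COVER_BAR_H = 1700, 220

def layer_stack(clip_path: str, caption_png: str) -> list:
    """Bottom-to-top layer spec fed to the moviepy composite (order assumed)."""
    return [
        {"kind": "video", "src": clip_path, "pos": (0, 0)},
        # pitfall node: opacity must be exactly 1.0 or v1 captions ghost through
        {"kind": "bar", "pos": (0, COVER_BAR_Y), "size": (W, COVER_BAR_H),
         "opacity": 1.0},
        {"kind": "png", "src": caption_png, "pos": ("center", COVER_BAR_Y)},
    ]
```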

drive · station

Google Drive upload via gws CLI. First pass: drive files create. Iteration: drive files update --upload preserves fileId + webViewLink.
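A sketch of the iteration-path command builder. The subcommand and `--upload` flag are taken from the wiki; exact argument order beyond that is an assumption, and the helper name is mine:

```python
import os

def drive_replace_cmd(file_id: str, local_path: str) -> list:
    """gws iteration upload: update in place so fileId + webViewLink survive."""
    # gws refuses paths outside cwd, so run this from the clip's own directory,
    # e.g. subprocess.run(cmd, cwd=os.path.dirname(local_path), check=True)
    return ["gws", "drive", "files", "update", file_id,
            "--upload", os.path.basename(local_path)]
```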

apertureDb · station (sidecar)

Vector + metadata catalog. Every clip gets a row with embedding + transcript + drive link. Enables dedupe and cross-session search. Bidirectional: select queries, composite writes, drive backfills.

Services

vite (dashboard) · service

Three.js dashboard on claw:5173. Renders the 8 stations + billboards carrying real first-frame stills along connection curves.

ws.ts · service

Node + chokidar file watchers on claw:8787. Broadcasts PipelineEvent JSON over WebSocket. Exposes /thumb?clip=&t= for on-demand ffmpeg stills.

catalog.py · service

Python HTTP bridge inside docker (VM:8788 → tunneled to claw:48788). Endpoints: /catalog, /backfill, /schedule, /schedule/list, /schedule/update, /assist/meta.

aperturedb (docker) · service

aperturedata/aperturedb-community on claw (amd64 under QEMU). Ports 55555 (client) + 8788 (catalog shares ns). Community credentials admin/admin.

colima · service

Docker runtime on claw via a Linux VM. 4 CPU / 6 GiB / 60 GB disk. Started with --mount /Volumes/claw-fast:w after NVMe migration.

tunnel.sh (launchd) · service

SSH reverse forward claw host → colima VM. Maps claw:45555 → aperturedb and claw:48788 → catalog. Needed because lima guestagent doesn't auto-forward.

youtube_poster.py · service

Launchd-ticked worker (60s) that drains ScheduledPost where platform=youtube AND scheduled_at <= now. Uploads via Data API v3 with resumable MediaFileUpload.
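The drain predicate can be sketched as a filter over ScheduledPost rows. Field names come from the ScheduledPost entity below; the `"queued"` status value and ISO-format `scheduled_at` strings are assumptions:

```python
from datetime import datetime

def due_posts(queue: list, now: datetime) -> list:
    """Rows one worker tick should drain: youtube + scheduled_at <= now."""
    return [p for p in queue
            if p["platform"] == "youtube"
            and p.get("status") == "queued"  # status value assumed
            and datetime.fromisoformat(p["scheduled_at"]) <= now]
```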

credentials.env · service · config

Gitignored env file at pipeline-viz/server/credentials.env. YT_CLIENT_ID, YT_REFRESH_TOKEN, META_APP_ID, META_LONG_LIVED_USER_TOKEN, INSIGHTS_POLL_MINUTES. Template at credentials.example.env.

ApertureDB entities

Session · entity

One per long-form recording. session_id, source_url, long_form_path, duration_s, ingested_at.

Clip · entity

One per rendered clip. clip_id, clip_code, batch_code, slug, start_s/end_s, grade, moment_score, caption_text, hook_text, script_text, cta_text, principle, drive_url, drive_file_id, layout, status, path, and meta_url/youtube_url/tiktok_url.

Speaker · entity

Known speakers with color assignment. Seeded: jordaaan #f2d21b, colin #ff8c00, steven #00e676. Linked to Clip via FEATURES connection.

Batch · entity

One per theme. batch_id, name, principle_tag, cta_url, source_sheet_tab. Groups clips for insights-per-batch.

Metric · entity

One row per platform per clip per snapshot. clip_id, platform, captured_at, views, likes, comments, shares, saves, watch_time_s, ctr.

ScheduledPost · entity

Queue for posts. post_id, clip_code, platform, mode (auto|assist), scheduled_at, status, caption, hashtags, thumbnail_path, result_url, error.

clip_marengo_v1 · descriptor set

1024-d cosine-similarity HNSW index. TwelveLabs Marengo embeddings per clip. Used for dedupe (sim ≥ 0.92 = likely duplicate) and cross-session semantic search.
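The sim ≥ 0.92 dedupe gate reduces to a cosine-similarity check. A minimal pure-Python sketch (in production the comparison runs inside the HNSW index, not client-side; the function name is mine):

```python
import math

def is_duplicate(a, b, threshold: float = 0.92) -> bool:
    """Cosine similarity of two embeddings against the dedupe threshold."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = (math.sqrt(sum(x * x for x in a))
            * math.sqrt(sum(y * y for y in b)))
    return dot / norm >= threshold
```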

Events

source_acquired · event

Fires when a file lands in workspace/source/*.mp4. Billboard travels source → select.

moments_selected · event

Fires when a JSON lands in workspace/moments/*.json. Billboard travels select → extract.

segment_extracted · event

Fires per portrait clip created by ffmpeg. Not yet wired; planned on clips/{session}-portrait-ff/*.mp4 raw-drop.

transcribe_complete · event

Fires on workspace/_transcripts/*.json add. Billboard extract → transcribe.

captions_rendered · event

Reserved for PIL caption PNG emission. Not yet emitted — planned on caption_pngs/ sentinel.

composite_complete · event

Fires on clips/{session}-portrait-ff/*-ff.mp4. Billboard captions → composite. Triggers /catalog POST → creates Clip row.

drive_upload_complete · event

Fires on *-ff.replaced sentinel. Reads sibling *.replaced.json if present for drive_url/drive_file_id. Backfills the Clip row.

catalog_write · event

Emitted twice per clip: once after composite (Clip created), once after drive upload (Drive link backfilled). Billboard travels composite/drive → apertureDb.

dedupe_hit · event

Reserved for dedupe gate after Select. Fires when proposed moment's embedding matches an existing Clip at cosine ≥ 0.92. Not yet wired.

Conventions

clip_code · convention

Short ID scheme: {BATCH}-{NN}{variant}, e.g. CAPI-03a. 2–4 letter batch tag, zero-padded sequence, single variant letter. Easy to say aloud, natural ordering, grep-friendly.
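A sketch of a validator for the scheme, assuming the sequence is always zero-padded to at least two digits and the variant letter is mandatory and lowercase (the wiki's one example, CAPI-03a, fits but doesn't pin these down):

```python
import re

# {BATCH}-{NN}{variant}: 2-4 letter batch tag, padded sequence, variant letter
CLIP_CODE = re.compile(r"^([A-Z]{2,4})-(\d{2,})([a-z])$")

def parse_clip_code(code: str):
    """Split a clip_code like CAPI-03a into (batch, sequence, variant)."""
    m = CLIP_CODE.match(code)
    if not m:
        raise ValueError(f"not a clip_code: {code}")
    return m.group(1), int(m.group(2)), m.group(3)
```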

batch_code · convention

2–4 letter theme tag (CAPI, CRM, GA4, CM2, ATTR). Inspired by Content Matrix tab names without replicating its full column scheme.

Layout A (default) · convention

Face 1080×1080 top, screen 1080×840 bottom, 1080×220 cover bar at y=1700. Alternates B (pure screen), C (blurred bg), D (floating card) render after layout A completes.

layout-d-default-alternate · convention · feedback

After layout A completes, always render top-3 ranked clips in layout D as alternates. No ask.

speaker-aware captions · convention

Multi-speaker clips get color-coded captions per speaker. Mandatory user-verification checkpoint on speaker→identity mapping (LLM bootstrap is unreliable).

AI Salon speakers · convention

Wide-shot: left = jordaaan yellow #f2d21b, right = colin neon navy blue #3d5afe. Apply for any AI Salon clip.

HICAM opt-in only · convention

HICAM pipeline triggers only on multi-ISO Drive folder or explicit "HICAM" in user request. YouTube/Twitch URLs always go through standard portrait flow, never HICAM.

cover-bar opacity = 1.0 · convention · pitfall

Cover bar opacity must be exactly 1.0. At 0.9, landscape v1 captions ghost through visibly. Hard-coded, no knob.

replace, never re-upload · convention · pitfall

gws drive files create mints a new fileId and breaks every share link. Always use files update --upload for iteration. Always cd first — gws refuses paths outside cwd.

pipeline-viz/workspace/ · filesystem

Canonical watch tree on claw. Layout: source/ · moments/ · _transcripts/ · clips/{session}-portrait-ff/*-ff.mp4. After NVMe migration, symlinked to /Volumes/claw-fast/pipeline-viz/workspace/.

Commands

enqueue YouTube post · command

curl -X POST http://localhost:48788/schedule -H 'content-type: application/json' -d '{"clip_code":"CAPI-03a","platform":"youtube","scheduled_at":"...","caption":"...","hashtags":"#shorts"}'

Meta assist payload · command

curl -X POST http://localhost:48788/assist/meta -d '{"clip_code":"CAPI-03a"}' → returns caption + hashtags + thumbnail_path + instagram://camera deep link for manual posting.

claw-fast-migrate.sh · command

~/pipeline-viz/server/claw-fast-migrate.sh {status|phase1..phase7|verify|all}. Phased migration: format → tree → workspace → thumbs → aperturedb → remotion cache → nightly backup.

health probes · command

curl http://localhost:8787/health (ws.ts) · curl http://localhost:48788/health (catalog) · launchctl list | grep pipeline-viz (agents).

on-demand thumbnail · command

GET http://claw:8787/thumb?clip=<abs-path-to-mp4>&t=<seconds> → ffmpeg-extracted JPEG, cached by clip+timestamp. Used by the dashboard billboards.
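A sketch of building that request URL with proper query escaping (absolute paths contain slashes that must be percent-encoded); the helper name and default host are assumptions:

```python
from urllib.parse import urlencode

def thumb_url(clip_path: str, t: float, host: str = "claw:8787") -> str:
    """URL for the on-demand ffmpeg still endpoint (cached by clip+timestamp)."""
    return f"http://{host}/thumb?" + urlencode({"clip": clip_path, "t": t})
```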

find clip in ApertureDB · command

docker run --rm --network container:aperturedb python:3.12-slim sh -c 'pip install -q aperturedb && python -c "..."' — one-liner FindEntity by clip_code. Used when the catalog HTTP bridge isn't enough.

Machines (Tailscale cluster)

claw · machine · render worker

M4 Mac mini @ 100.82.244.127, user claw. Dedicated heavy-lift / 24-7 worker. Runs vite, ws.ts, docker (aperturedb + catalog), launchd agents (tunnel, youtube_poster).

jordan · machine · primary

M4 Mac mini @ 100.86.248.8, user jordaaan. Primary workstation. iCloud-hosted project repo. Renders offloaded to claw via ssh + rsync.

mbp · machine · portable

MacBook Pro @ 100.116.140.93, user supabowl. Portable secondary. Potential future co-renderer reading source VODs from claw over Tailscale.

Claude Code skills

portrait-foreignfilm-clips · skill

End-to-end first pass (stages 1–7). scripts/transcribe.py · scripts/render_portrait_ff.py · scripts/upload_clips.sh.

caption-quality-boost · skill

Iteration loop (stages 4b → 6 → 8). Re-transcribe with medium, re-render, in-place Drive replace preserving webViewLink.

caption-refinement · skill

Autoresearch loop using eval_caption_timing. Iterate until movie-quality thresholds are met on onset/offset/phrasing/duration/gaps.

hicam · skill · multi-iso

Multi-ISO podcast clipping with ISO-grade + per-speaker routing. Opt-in only; standard portrait flow never uses it.

best-clips · skill

Rank long-form stream moments for short-form clipping based on coding activity, compile-like screen motion, transcript hints.

twelvelabs-clip-pipeline · skill

TwelveLabs MCP (search, analyze, status) + REST (upload) + ffmpeg (extract, crop) + mlx-whisper (transcribe) + Remotion SSR.

Guides (CF Pages)

portrait-captions-guide · guide

Canonical 8-stage workflow. Source of truth for pipeline design. Contains #dashboard and #posting sections.

posthog-autoresearch-guide · guide

Analytics for agent loops: queries, logs, evals, traces. Bridges to Token Machine.

nvme-upgrade-guide · guide · runbook

TB4 + 2 TB NVMe on claw. Phase-by-phase migration, rollback, pitfalls. Orange.

organized-ai-hub · guide · index

The meta-index of every deployed CF Pages project under bluehighlightedtext.com. Organized AI's front door.

tailnet-guide · guide

Tailnet topology and ACLs for the jordan/mbp/claw cluster. Referenced by NVMe guide's Tailscale impact section.

token-machine-guide · guide

Autonomous token routing, cost tracking, efficiency optimization. PostHog-guide has a bridge section.

Reference memories (auto-loaded per session)

reference_portrait_captions_guide · memory · ref

Mirror of the canonical 8-stage workflow for in-session recall.

reference_aperturedb_claw · memory · ref

ApertureDB on claw: container state, port-forward caveat, recreate commands.

reference_content_matrix · memory · ref

Hook# / Script# / CTA Letter / Principle scheme from the Google Sheet. Source of the clip_code naming inspiration.

reference_local_only_architecture · memory · ref

Cluster-native processing (claw + jordan + mbp) with Ollama replacing Claude API, NFS-shared pool/, file-based queue, Drive as opt-in distribution.

feedback_layout_d_default_alternate · memory · feedback

After layout A completes, always render top-3 in layout D as alternates. No ask.

iCloud FUSE breaks Remotion · memory · gotcha

ETIMEDOUT / Error-70 when Node/Chromium reads iCloud Drive files. Stage project to /tmp/remotion-render-staging/. Why this pipeline uses moviepy, not Remotion.

insights cadence · convention

Poll metrics every 15 min for first 24h after post, then every 60 min. Configured via INSIGHTS_POLL_MINUTES and INSIGHTS_FIRST_24H_POLL_MINUTES.
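The cadence rule amounts to a one-branch lookup. A sketch, with the two env-configured intervals passed in as defaults (the function name is mine):

```python
def poll_interval_minutes(hours_since_post: float,
                          first24: int = 15, steady: int = 60) -> int:
    """Metric-poll cadence: dense for the first 24 h after posting, hourly after."""
    return first24 if hours_since_post < 24 else steady
```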