Two portable skills. One repo. music-video shorts · Korean job-board digest · $0 runtime · EN + KO
Skill #1 (music-video): a music track becomes a 9:16 vertical short — phrase-aware cuts, variable-speed mood pacing, onset-aligned glitch micro-edits, vintage lo-fi shaders.
Skill #2 (job-hunt): a single seed keyword like "Problem Solver" expands to the full family of equivalent titles (FDE / Applied AI Engineer / Generalist / …) and produces a deduplicated digest of matching Korean job postings.
Mechanical stages stay local (aubio, whisper.cpp, ffmpeg, jq). Creative stages opt into Claude (Tier-1) — covered by the operator's existing subscription, no incremental USD. The system audits itself on every commit.
5-second slice of the first uploaded music-video short — Suno-generated lo-fi jazz track, B-roll from Pexels, render + shader pass by local ffmpeg.
60+ mission outputs
0 runtime API tokens
3 audit layers (L1+L2+L3)
5 mission types
2 portable skills (agentskills.io)
EN+KO dual track from day 1
MIT license
How it works
The scaffold is general-purpose. Short-form video is the v1 mission domain — chosen because the deliverable is visually verifiable and the failure modes are quick to catch. Current focus is the music-video mission (music-as-sole-audio, beat-aligned cuts, onset-aligned glitch micro-edits, vintage lo-fi shaders) — operator-validated as the production format after eight narration-driven trials. The earlier faceless-short mission (narration-driven) and the v1 highlight / shorts-batch missions remain in the tree as alternate paths.
Pipeline (music-video mission)
Beat extraction. aubiotrack finds real beats; sub-beat noise is rejected. Cuts land every Nth beat (default 12, about one cut per 7.6 s at 95 BPM).
Phrase alignment. aubioonset detects drum hits. Each clip gets a setpts speed by mood: slow scenes 0.55×, ambient 0.70×, active 0.80×, natural 1.00×, so the music drives the visual pace.
B-roll. Mood-keyword Pexels Videos API fetch; per-window selection. Demo mode bundles CC-BY Blender open-movie clips for zero-key first-touch.
Glitch micro-edits. 0.2 s reverse + 0.2 s forward jump-cut on detected drum onsets, but only on clips classified as static-camera so the frame doesn't shake during the glitch.
Vintage lo-fi shaders. Film grain + vignette + Gaussian zoom-pulse + phrase-aware pond ripple + halation bloom. All pure ffmpeg filter graphs — no GLSL, no external renderer.
Render + QA. ffmpeg 9:16 screen-fill, mission-level retry on failure.
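A minimal sketch of three of the steps above: picking cut points from beat timestamps, turning a mood into a setpts expression, and building a 9:16 screen-fill filter. The function names, the MOOD_SPEED table, and the 1080×1920 output size are assumptions made for illustration; the repo's actual scripts may be organized differently. The beat list here is synthetic; in the pipeline it would come from aubiotrack.

```python
from typing import List

# Speed factors quoted in the pipeline description. The dictionary itself
# is an illustrative assumption, not the repo's data structure.
MOOD_SPEED = {"slow": 0.55, "ambient": 0.70, "active": 0.80, "natural": 1.00}


def cut_times(beats: List[float], every_nth: int = 12) -> List[float]:
    """Keep every Nth beat timestamp (in seconds) as a cut point."""
    return beats[::every_nth]


def setpts_filter(mood: str) -> str:
    """Build an ffmpeg setpts expression for the mood's playback speed.

    Dividing PTS by a factor below 1 stretches the timestamps,
    which slows playback.
    """
    return f"setpts=PTS/{MOOD_SPEED[mood]:.2f}"


def screen_fill_filter(width: int = 1080, height: int = 1920) -> str:
    """Scale until the frame is fully covered, then crop the overflow."""
    return (
        f"scale={width}:{height}:force_original_aspect_ratio=increase,"
        f"crop={width}:{height}"
    )


if __name__ == "__main__":
    bpm = 95
    beats = [i * 60.0 / bpm for i in range(49)]  # synthetic, evenly spaced beats
    print([round(t, 2) for t in cut_times(beats)])
    print(setpts_filter("slow") + "," + screen_fill_filter())
```

At 95 BPM one beat lasts 60 / 95 ≈ 0.63 s, so twelve beats span roughly 7.6 s between cuts.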
Three-layer reactive audit
L1 — post-commit hook. Drift-risk commits (anything under agents/, .claude/agents/, config/, CLAUDE.md, the operator contract) fire audit-run.sh contract within ~30 s.
L2 — 15-min mission-anomaly poll. New blocker files or QA-FAIL bursts trigger a focused audit. No-op (zero tokens) when nothing's wrong.
L3 — daily 03:00 baseline. launchd fires the full sweep. Catches anything L1 + L2 missed.
The pattern is Reactor + Hook (files as events), not Observer — subagents in this repo aren't long-running observables.
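The L1 and L2 triggers can be sketched as two small pure functions: one decides whether a commit touches a drift-risk path, the other reports only anomaly files that were not seen on an earlier poll. The function names and the example file names are hypothetical. The operator contract's path is not given in the text, so it is left out of the sketch.

```python
from typing import Iterable, List, Set

# Prefixes and files named in the L1 description. The operator contract
# is omitted because its path is not stated.
DRIFT_PREFIXES = ("agents/", ".claude/agents/", "config/")
DRIFT_FILES = {"CLAUDE.md"}


def is_drift_risk(path: str) -> bool:
    """True when a repo-relative path falls under a drift-risk location."""
    return path in DRIFT_FILES or path.startswith(DRIFT_PREFIXES)


def should_audit(changed_paths: Iterable[str]) -> bool:
    """L1-style check: audit when any changed path is drift-risk."""
    return any(is_drift_risk(p) for p in changed_paths)


def new_anomalies(seen: Set[str], present: Iterable[str]) -> List[str]:
    """L2-style poll step: return only files not seen on an earlier poll."""
    return sorted(set(present) - seen)


if __name__ == "__main__":
    print(should_audit(["README.md", "config/routing.yaml"]))
    print(new_anomalies({"m1.blocker"}, ["m1.blocker", "m2.blocker"]))
```

In a post-commit hook the changed paths could come from `git diff-tree --no-commit-id --name-only -r HEAD`. An empty result from `new_anomalies` is the no-op case: nothing is dispatched, so no tokens are spent.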
Cost-routing rule
The architectural lesson from a real failure: applying "Tier 2 (local) = default" to every pipeline stage produces a quality ceiling.
Mechanical, high-volume stages (transcribe, render, fetch, beat-detect) — local. Token cost would be ruinous at scale.
One-shot creative stages (script hook, factual framing, mood-keyword extraction) — Claude. ~500 tokens per call, operationally negligible against the existing subscription quota, and quality compounds over the next 60 seconds of viewing.
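The routing rule above can be written as an explicit lookup. This is a sketch under stated assumptions: the stage slugs, the Tier enum, and the route function are illustrative names, not the repo's configuration format.

```python
from enum import Enum


class Tier(Enum):
    CLAUDE = 1  # Tier 1: creative one-shot calls on the existing subscription
    LOCAL = 2   # Tier 2: local tools such as ffmpeg, aubio, whisper.cpp


# Stage slugs are hypothetical spellings of the stages listed in the text.
MECHANICAL = {"transcribe", "render", "fetch", "beat-detect"}
CREATIVE = {"script-hook", "factual-framing", "mood-keyword-extraction"}


def route(stage: str) -> Tier:
    """Return the tier for a stage, refusing to guess for unknown stages."""
    if stage in CREATIVE:
        return Tier.CLAUDE
    if stage in MECHANICAL:
        return Tier.LOCAL
    raise ValueError(f"no routing decision recorded for stage {stage!r}")


if __name__ == "__main__":
    for name in ("render", "script-hook"):
        print(name, route(name).name)
```

Raising on an unknown stage, instead of silently defaulting to local, forces each new stage to get an explicit tier decision.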
Pipelines are packaged as portable Skills following the open agentskills.io standard. Two skills ship as of v0.4.0: Skill #1 (music-video, missions-routed) and Skill #2 (job-hunt, standalone).
A skill written once can target multiple compatible runtimes (Claude Code, Cursor, Goose, Gemini CLI, OpenAI Codex, etc.). Skill #1 was verified via a 12/12 Hermes drop-in interop test.
Skill #2 (job-hunt) has a different shape: standalone (no missions-routed pipeline), v2 short-keyword UX, agentskills.io-compliant. Pass --seed "Problem Solver" and the orchestrator expands it to a 24-synonym role family (Forward Deployed Engineer / Applied AI Engineer / Generalist / Founding Engineer / …) before fetching from Korean job boards: 사람인 (Saramin) / 잡코리아 (JobKorea) / 원티드 (Wanted) / 프로그래머스 (Programmers). Four enrichment utilities (fit-score / cover-letter / company-research / interview-prep) ship as scaffolds, each gated behind a single env-flag flip.
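A minimal sketch of seed expansion followed by deduplication. The ROLE_FAMILIES table holds only the titles quoted above, not the full 24-synonym family, and the (company, title) dedup key is an assumption; the skill's actual matching rule is not described here.

```python
from typing import Dict, Iterable, List

# Only the titles quoted in the text; the real family is larger.
ROLE_FAMILIES: Dict[str, List[str]] = {
    "problem solver": [
        "Forward Deployed Engineer",
        "Applied AI Engineer",
        "Generalist",
        "Founding Engineer",
    ],
}


def expand_seed(seed: str) -> List[str]:
    """Return the seed itself plus any known equivalent titles."""
    cleaned = seed.strip()
    return [cleaned] + ROLE_FAMILIES.get(cleaned.lower(), [])


def dedupe(postings: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep the first posting per (company, title), ignoring case and padding."""
    seen = set()
    unique = []
    for posting in postings:
        key = (posting["company"].strip().lower(), posting["title"].strip().lower())
        if key not in seen:
            seen.add(key)
            unique.append(posting)
    return unique


if __name__ == "__main__":
    print(expand_seed("Problem Solver"))
    sample = [
        {"company": "Acme", "title": "Generalist", "source": "board-a"},
        {"company": "acme ", "title": "generalist", "source": "board-b"},
    ]
    print(len(dedupe(sample)))
```

Deduplicating after the fetch matters because the same posting can match several synonyms and appear on more than one board.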
5 source plugins — all mock-fallback by default; live HTTP per-plugin behind JH_SOURCE_LIVE=1.
4 enrichment utilities — per-posting Claude calls behind JH_TOOL_LIVE=1.
63 tests — smoke + edge-case + JSON-schema validation. EN + KO walkthroughs.
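The mock-by-default behavior can be sketched as a single gate on the JH_SOURCE_LIVE flag named above. The fetch function and its callable parameters are hypothetical; the real plugin interface may differ.

```python
import os
from typing import Callable, Dict, List, Mapping

Posting = Dict[str, str]
Fetcher = Callable[[str], List[Posting]]


def fetch(
    keyword: str,
    live_fetch: Fetcher,
    mock_fetch: Fetcher,
    env: Mapping[str, str] = os.environ,
) -> List[Posting]:
    """Use the live fetcher only when JH_SOURCE_LIVE is exactly "1"."""
    if env.get("JH_SOURCE_LIVE") == "1":
        return live_fetch(keyword)
    return mock_fetch(keyword)


if __name__ == "__main__":
    def mock(keyword: str) -> List[Posting]:
        return [{"title": keyword, "source": "mock"}]

    def live(keyword: str) -> List[Posting]:
        raise RuntimeError("live HTTP is not wired up in this sketch")

    # An empty environment keeps the call on the mock path.
    print(fetch("Generalist", live, mock, env={}))
```

Passing the environment as a parameter keeps the gate testable without mutating os.environ.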
The earlier faceless-short mission still lives in the tree. Topic prompt in, narrated 60-second short out: Sonnet drafts the hook + factual framing, Kokoro-ONNX (English) or macOS Yuna (Korean) synthesizes voice, whisper.cpp transcribes for caption timing, Pexels B-roll selected per-window from caption keywords, ffmpeg burns single-line captions. Preserved for topic-driven content; not the current production format.
Frames from the earlier faceless-short trials, retained as visual evidence of the narration-era pipeline.
Hittites EN · Hydrogen EN · AutoTune EN · Hittites KO
Faceless-era scorecard (historical)
Self-evaluation across five retention-mapping axes (Hook, Visual sync, Readability, Factual coherence, Production polish), assigned by Claude during the faceless-short iteration — preserved as the structured progress signal from the v4 → v5 → v6 sequence that preceded the music-video pivot. The music-video mission uses platform watch-time data instead of per-dimension scoring; per-video metrics live under docs/pilots/.
Operator-intervention trend
A multi-agent system that needs constant human steering hasn't actually replaced the work it was meant to. Honest tracking of how much the operator stays in the loop per day:
Try it
Zero-account demo path — no Pexels key, no Suno account, no .env edits:
The demo path uses bundled CC-BY Blender open-movie B-roll plus a CC-BY-4.0 demo track — a working music-video short with no external signups. Full setup (mood-keyword Pexels catalog + custom Suno tracks) is documented as the advanced path in the README Quick start.
macOS first. The core pipeline (whisper.cpp + ollama + ffmpeg + aubio) is Linux-compatible; the launchd schedulers and Yuna TTS are macOS-only.