Two portable skills. One repo.
music-video shorts · Korean job-board digest · $0 runtime · EN + KO

Skill #1 (music-video): a music track becomes a 9:16 vertical short — phrase-aware cuts, variable-speed mood pacing, onset-aligned glitch micro-edits, vintage lo-fi shaders. Skill #2 (job-hunt): a single seed keyword like "Problem Solver" expands to the full family of equivalent titles (FDE / Applied AI Engineer / Generalist / …) and produces a deduplicated digest of matching Korean job postings. Mechanical stages stay local (aubio, whisper.cpp, ffmpeg, jq). Creative stages opt into Claude (Tier-1) — covered by the operator's existing subscription, no incremental USD. The system audits itself on every commit.

View on GitHub See the output How it works
5-second animated preview of the music-video pipeline output — Velvet Turntable lo-fi jazz track, jazz/vintage Pexels B-roll, phrase-aware pond ripple + halation shader combo, 9:16 vertical short, mid-climax window (25-30 s) showing pond surface displacement and warm halation bloom over smoky lounge interior.
5-second slice of the first uploaded music-video short — Suno-generated lo-fi jazz track, B-roll from Pexels, render + shader pass by local ffmpeg.

How it works

The scaffold is general-purpose. Short-form video is the v1 mission domain — chosen because the deliverable is visually verifiable and the failure modes are quick to catch. Current focus is the music-video mission (music-as-sole-audio, beat-aligned cuts, onset-aligned glitch micro-edits, vintage lo-fi shaders) — operator-validated as the production format after eight narration-driven trials. The earlier faceless-short mission (narration-driven) and the v1 highlight / shorts-batch missions remain in the tree as alternate paths.

Pipeline (music-video mission)

  1. Beat extraction. aubiotrack finds real beats; sub-beat noise rejected. Cuts land every Nth beat (default 12 — about one cut per 7.5 s at 95 BPM).
  2. Phrase alignment. aubioonset detects drum hits. Variable per-clip setpts by mood: slow scenes 0.55×, ambient 0.70×, active 0.80×, natural 1.00× — the music drives the visual pace.
  3. B-roll. Mood-keyword Pexels Videos API fetch; per-window selection. Demo mode bundles CC-BY Blender open-movie clips for zero-key first-touch.
  4. Glitch micro-edits. 0.2 s reverse + 0.2 s forward jump-cut on detected drum onsets, but only on clips classified as static-camera so the frame doesn't shake during the glitch.
  5. Vintage lo-fi shaders. Film grain + vignette + Gaussian zoom-pulse + phrase-aware pond ripple + halation bloom. All pure ffmpeg filter graphs — no GLSL, no external renderer.
  6. Render + QA. ffmpeg 9:16 screen-fill, mission-level retry on failure.

Three-layer reactive audit

  • L1 — post-commit hook. Drift-risk commits (anything under agents/, .claude/agents/, config/, CLAUDE.md, the operator contract) fire audit-run.sh contract within ~30 s.
  • L2 — 15-min mission-anomaly poll. New blocker files or QA-FAIL bursts trigger a focused audit. No-op (zero tokens) when nothing's wrong.
  • L3 — daily 03:00 baseline. launchd fires the full sweep. Catches anything L1 + L2 missed.

The pattern is Reactor + Hook (files as events), not Observer — subagents in this repo aren't long-running observables.

Cost-routing rule

The architectural lesson from a real failure: applying "Tier 2 (local) = default" to every pipeline stage produces a quality ceiling.

  • Mechanical, high-volume stages (transcribe, render, fetch, beat-detect) — local. Token cost would be ruinous at scale.
  • One-shot creative stages (script hook, factual framing, mood-keyword extraction) — Claude. ~500 tokens per call, operationally negligible against the existing subscription quota, and quality compounds over the next 60 seconds of viewing.

Full reasoning in docs/cost-model.md.

Skills layer (agentskills.io-compliant)

Pipelines are packaged as portable Skills following the open agentskills.io standard. Two skills ship as of v0.4.0: Skill #1 (music-video, missions-routed) and Skill #2 (job-hunt, standalone). A skill written once can target multiple compatible runtimes (Claude Code, Cursor, Goose, Gemini CLI, OpenAI Codex, etc.) — Skill #1 verified via a 12/12 Hermes drop-in interop test.

Spec: skills/music-video/ · skills/job-hunt/

Skill #2 — job-hunt (short-keyword KR job-board digest)

A separate-shape skill: standalone (no missions-routed pipeline), v2 short-keyword UX, agentskills.io-compliant. Pass --seed "Problem Solver" and the orchestrator expands to a 24-synonym role family (Forward Deployed Engineer / Applied AI Engineer / Generalist / Founding Engineer / …) before fetching from KR job boards (사람인 / 잡코리아 / 원티드 / 프로그래머스). 4 enrichment utilities (fit-score / cover-letter / company-research / interview-prep) ship as scaffolds, each gated behind a single env-flag flip.

  • 5 source plugins — all mock-fallback by default; live HTTP per-plugin behind JH_SOURCE_LIVE=1.
  • 4 enrichment utilities — per-posting Claude calls behind JH_TOOL_LIVE=1.
  • 63 tests — smoke + edge-case + JSON-schema validation. EN + KO walkthroughs.

Quickstart: skills/job-hunt/scripts/run.sh --seed "Problem Solver" --dry-run. Reference digest at docs/samples/job-hunt-digest-mock.md.

Alternate path — faceless-short (narration era)

The earlier faceless-short mission still lives in the tree. Topic prompt in, narrated 60-second short out: Sonnet drafts the hook + factual framing, Kokoro-ONNX (English) or macOS Yuna (Korean) synthesizes voice, whisper.cpp transcribes for caption timing, Pexels B-roll selected per-window from caption keywords, ffmpeg burns single-line captions. Preserved for topic-driven content; not the current production format.

Pipeline detail in the README Sample output section.

Evidence — what the pipeline actually produces

Frames captured from rendered mp4s. Full mp4s live under records/missions/ (gitignored — each ~ 25–50 MB).

Mid-climax frame from the first uploaded music-video short — 9:16 vertical, vintage jazz interior with amber lamp, vinyl record motif, warm halation bloom around bright sources and subtle pond-surface ripple displacement across the whole frame, Pexels-licensed B-roll under a Suno-generated lo-fi jazz track.
music-video Velvet1 · jazz combo — first uploaded short, t = 30 s. Phrase-aware pond ripple + halation shader combo on Suno lo-fi jazz track.

Faceless-short gallery (historical — narration era)

Frames from the earlier faceless-short trials, retained as visual evidence of the narration-era pipeline.

Faceless-era scorecard (historical)

Self-evaluation across five retention-mapping axes (Hook, Visual sync, Readability, Factual coherence, Production polish), assigned by Claude during the faceless-short iteration — preserved as the structured progress signal from the v4 → v5 → v6 sequence that preceded the music-video pivot. The music-video mission uses platform watch-time data instead of per-dimension scoring; per-video metrics live under docs/pilots/.

Stacked bar chart: 13 pilot rows from Hittites EN v4 26/50 (lowest) up to AutoTune EN 45/50 (highest), split across the 5 dimensions with colors. Music-trial topics cluster around 40-45.

Operator-intervention trend

A multi-agent system that needs constant human steering hasn't actually replaced the work it was meant to. Honest tracking of how much the operator stays in the loop per day:

Daily commit count from 2026-05-14 through 2026-05-20, stacked by initiator (agent-autonomous blue vs user-initiated red), with a black line showing user-initiated percentage. The user-initiated share peaks on 2026-05-17 at 69% (manual pivot day) and trends back down as overnight autonomy work resumes.

Try it

Zero-account demo path — no Pexels key, no Suno account, no .env edits:

git clone --depth 1 https://github.com/MelonS/MelonS-Agents.git
cd MelonS-Agents
./scripts/bootstrap.sh                # verifies tools, fetches models
MUSIC_VIDEO_DEMO_MODE=1 \
  ./agents/missions/music-video/run.sh demo

The demo path uses bundled CC-BY Blender open-movie B-roll plus a CC-BY-4.0 demo track — a working music-video short with no external signups. Full setup (mood-keyword Pexels catalog + custom Suno tracks) is documented as the advanced path in the README Quick start.

macOS first. Linux compatible for the core pipeline (whisper.cpp + ollama + ffmpeg + aubio); macOS-only for launchd schedulers and Yuna TTS.