Cut Content
This one is different. Cut Content is simultaneously a video production pipeline under development, a YouTube channel concept, and a personal research project. It is here because it is representative of how personal work gets approached — not because it is finished.
| Client | Personal project |
| Role | Interaction Designer — Pipeline Design, Development |
| Deliverable | Human-in-the-loop video production pipeline + YouTube channel |
| Venue | In development |
The Concept
Cut Content is a history channel built around deliberate omissions — the B-sides, alternate takes, and deleted scenes of the official record. The name borrows at once from music (bootlegs, outtakes), gaming (cut content, beta builds), and film (the deleted scene). The common thread: things that exist, that are interesting, that don't get mentioned because they don't fit a convenient narrative.
The subject matter spans wherever that thread leads — music history, political history, cultural history, niche events from any era. Lou Reed's trans partner Rachel in the 1970s NYC underground. The Anabaptist takeover of Münster in 1534. Bob Dylan's bootleg demo sessions. The selection criterion is personal: topics worth knowing about, approached because they're genuinely interesting rather than algorithmically convenient.
The intended audience is primarily one person. The channel being built is the channel that would be watched — content produced for personal interest and shared in the hope that others find the same threads worth pulling.
Example topics
Lou Reed’s partner Rachel — the trans woman at the centre of 1970s NYC underground culture who largely disappeared from the official story.
The Münster Rebellion — the Anabaptist theocratic takeover of a German city in 1534–35, one of the strangest episodes of the Reformation.
Dylan’s Basement Tapes — the bootleg sessions that circulated for years before any official release, influencing a generation of musicians who, officially, had never heard them.
The Pipeline
The channel requires a production pipeline. That pipeline is being built deliberately rather than assembled from off-the-shelf automation tools — because the off-the-shelf tools optimise for volume and speed at the cost of the thing that makes the content worth making.
01 — Research
Web search + Wikipedia → structured context doc. Grounds the LLM in facts before any scripting begins.
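The research stage ends in a structured context doc the LLM is grounded on. A minimal sketch of how that doc might be assembled — the `Source` type and function names are hypothetical, not the actual pipeline code:

```python
from dataclasses import dataclass

@dataclass
class Source:
    """One research source pulled during the web/Wikipedia pass."""
    title: str
    url: str
    extract: str

def build_context_doc(topic: str, sources: list[Source]) -> str:
    """Assemble the structured context doc for the scripting stage.

    Every claim the model is allowed to use appears here with its
    source URL attached, so the script can later be checked back
    against the research.
    """
    lines = [f"# Research context: {topic}", ""]
    for src in sources:
        lines.append(f"## {src.title}")
        lines.append(f"Source: {src.url}")
        lines.append(src.extract.strip())
        lines.append("")
    return "\n".join(lines)
```

Keeping the doc as plain markdown means the human review gate can read and edit it directly before any scripting begins.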
02 — Script
LLM draft against the research context. Human reviews, edits, and writes the final version.
03 — Visual Style
ComfyUI workflow template selected per era. Per-segment image prompts written and edited.
04 — Image Generation
Submitted to a remote GPU machine via ComfyUI. Images arrive async to Cloudflare R2 while editing continues.
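ComfyUI exposes an HTTP API for queueing workflow graphs, which is what makes the fire-and-forget submission above possible. A sketch of the submission call, assuming a hypothetical host address — the `/prompt` endpoint and `client_id` field are ComfyUI's, the rest is illustrative:

```python
import json
import uuid
from urllib import request

COMFY_HOST = "http://gpu-box:8188"  # hypothetical address of the remote GPU machine

def build_payload(workflow: dict, client_id: str) -> dict:
    """Wrap an API-format workflow graph in the JSON body that
    ComfyUI's /prompt endpoint expects; client_id ties queued
    jobs back to this session for later polling."""
    return {"prompt": workflow, "client_id": client_id}

def queue_prompt(workflow: dict, host: str = COMFY_HOST) -> dict:
    """Queue the workflow remotely and return ComfyUI's response,
    which includes a prompt_id. The call returns immediately —
    finished images sync to R2 while editing continues."""
    body = json.dumps(build_payload(workflow, str(uuid.uuid4()))).encode("utf-8")
    req = request.Request(
        f"{host}/prompt",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)
```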
05 — Narration
Local TTS generation (Kokoro / Fish Speech). Playback and approval before continuing.
06 — Compose + Publish
MoviePy stitch, Whisper subtitles, preview, Selenium upload to YouTube Studio.
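The subtitle step in the compose stage is a straightforward transform: Whisper returns timestamped segments, and SRT wants numbered blocks with `HH:MM:SS,mmm` ranges. A self-contained sketch of that conversion (function names are illustrative):

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time offset as an SRT timestamp, e.g. 00:01:01,500."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def segments_to_srt(segments) -> str:
    """Convert Whisper-style segments (dicts with 'start', 'end',
    'text') into a complete SRT document."""
    blocks = []
    for i, seg in enumerate(segments, start=1):
        blocks.append(
            f"{i}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    return "\n".join(blocks)
```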
The pipeline runs locally through a Gradio GUI with a mandatory human review gate between each stage. Nothing advances automatically. The LLM drafts; the human decides what stays.
The Design Decision
Automated video pipelines exist. Most of them produce recognisable output — generic narration, stock-looking visuals, scripting that could have been about anything. The problem isn’t the tooling; it’s the absence of editorial intent at every stage.
The approach here mirrors how AI-assisted coding gets used in the rest of this work: the model drafts, the human decides. The LLM is fast at research synthesis and generating a first pass at structure. It is not good at caring about whether the Lou Reed story gets told accurately or whether the Münster section has the right tone. That part requires a person.
The aesthetic strategy follows the same logic: different eras get different ComfyUI workflow templates and LoRAs — grainy 35mm Kodachrome for 1970s NYC, woodcut engraving for Reformation-era subjects. The visual language is chosen per topic, not applied uniformly. Style as editorial decision rather than default setting.
Stack
GUI — Gradio (localhost, human review gates)
LLM — Ollama (local) + Claude API (quality drafts)
Images — ComfyUI (remote GPU) + Flux.1-dev + LoRAs → Cloudflare R2
TTS — Kokoro / Fish Speech (local)
Video — MoviePy + Whisper subtitles
Upload — Selenium (YouTube Studio)
Where It Is
Currently in early development. The architecture is defined; the build begins by forking MoneyPrinterV2 — an existing open-source pipeline that already handles MoviePy composition, Whisper subtitle generation, and Selenium upload. The slop-optimised defaults are being stripped out, and the research, scripting, and image generation stages rebuilt from scratch around the editorial approach described above.
The channel launches when the first video meets the standard. No timeline other than that.