Workflow · Practical · 60–90 min · ChatGPT · ElevenLabs · CapCut · Pictory
How to create a faceless YouTube video with AI
Faceless is one of the most popular formats for starting a channel without appearing on camera. The trick is in the pipeline, not in any single tool. This recipe ties 4 tools (ChatGPT, ElevenLabs, Pictory, CapCut) into a flow that produces an 8-10 minute video in about 90 minutes of work.
What this recipe solves
You want to start a channel, but you don't want to be on camera, you don't have a team, and you don't know how to edit. The pipeline below removes all 3 blockers.
5-step pipeline
Pick a niche — "evergreen + searched"
Faceless works in niches where authority comes from content, not the creator. Good fits: history docs, basic finance, mysteries, pop science, top-10s, biographies, oddly satisfying.
Bad fits: niches that depend on charisma (comedy, lifestyle, vlog).
Test: find 5 videos with more than 500k views in your niche idea. If all 5 show a face, skip the niche. If at least half don't, go.
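The test above can be written as a tiny decision rule. This is only a sketch: the function name `niche_passes` is made up for illustration, the flags are labelled by hand after watching the videos, and reading "half" as "at least half" is an assumption.

```python
def niche_passes(shows_face: list[bool]) -> bool:
    """Apply the faceless test to a hand-labelled sample of top videos.

    shows_face holds one flag per video with >500k views:
    True if the creator appears on camera, False if the video is faceless.
    """
    if not shows_face:
        raise ValueError("label at least one video before deciding")
    faceless = sum(1 for face in shows_face if not face)
    # Assumption: "half don't" is read as "at least half are faceless".
    return faceless * 2 >= len(shows_face)


# Hypothetical sample: 3 of the 5 top videos are faceless.
print(niche_passes([True, False, False, True, False]))
```

With 5 videos, the rule needs 3 or more faceless ones to pass.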
Write the script (HVC structure)
HVC = Hook (first 5s) + Value (body) + CTA (call to action). The first 5 seconds drive 80% of retention.
Paste this into ChatGPT:
Generate the voice (ElevenLabs)
For faceless, pick a Narration voice — authoritative, clear. Stability 50-60%, Style 20%. Adam, Bill, Charlotte work well.
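If you generate the voice through the ElevenLabs API rather than the web app, the sliders above map roughly onto a request body like the one sketched here. It assumes the text-to-speech endpoint accepts `stability` and `style` as 0-1 fractions inside a `voice_settings` object. The `similarity_boost` value, the helper name `build_tts_payload`, and the sample sentence are placeholders that are not part of this recipe, so check the current API reference before relying on them. The sketch only builds the body and does not send a request.

```python
import json


def build_tts_payload(text: str, stability_pct: int = 55, style_pct: int = 20) -> dict:
    """Build a JSON-serialisable body for one narration chunk."""
    for name, value in (("stability", stability_pct), ("style", style_pct)):
        if not 0 <= value <= 100:
            raise ValueError(f"{name} must be a percentage between 0 and 100")
    return {
        "text": text,
        "voice_settings": {
            "stability": stability_pct / 100,  # 50-60% in this recipe
            "style": style_pct / 100,          # 20% in this recipe
            "similarity_boost": 0.75,          # assumption: not specified in the recipe
        },
    }


# Hypothetical narration line, for illustration only.
payload = build_tts_payload("Today we follow the trail of these letters.")
print(json.dumps(payload, indent=2))
```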
Split the script into chunks of about 500 characters before generating. The narration pauses and breathes more naturally when ElevenLabs processes one short chunk at a time.
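One way to do the splitting without cutting sentences in half is sketched below. The function name `split_script` and the sentence-boundary rule (split after `.`, `!` or `?`) are assumptions. The recipe only asks for 500-character chunks, so adjust the rule to your script.

```python
import re


def split_script(script: str, max_chars: int = 500) -> list[str]:
    """Split a script into chunks of at most max_chars, breaking between sentences."""
    # Split after ., ! or ? when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", script.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
            continue
        if current:
            chunks.append(current)
        # Fallback: a single sentence longer than the limit is hard-wrapped.
        while len(sentence) > max_chars:
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        current = sentence
    if current:
        chunks.append(current)
    return chunks


# Hypothetical script made of short repeated sentences.
demo_chunks = split_script("First sentence. " * 60)
print(len(demo_chunks), [len(chunk) for chunk in demo_chunks])
```

Each returned chunk can then be pasted or sent to the voice generator on its own, in order.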
Visuals: 3 strategies
- Stock footage (Pictory, Storyblocks) — fastest, less unique
- AI images + Ken Burns (Midjourney + CapCut zoom) — balanced
- AI short clips (Veo 5s clips chained) — most cinematic, slowest
Starter channel: use stock footage or AI images with Ken Burns. Save AI short clips until you have 5 published videos.
Edit in CapCut
- Track 1: voice
- Track 2: visuals (1 every 4-6s, with Ken Burns micro-zoom)
- Track 3: ambient music at -22 dB (Epidemic Sound, Artlist)
- Track 4: occasional SFX (whoosh on transitions)
- Auto-captions, then proofread
Export 1080p, 30fps. Build the thumbnail separately (Photopea or Canva).
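To plan the visuals track before opening CapCut, it helps to know how many images or clips the voice track needs. The sketch below works out that range from the 4-6 second pacing, and converts the -22 dB music level into a linear multiplier, assuming the usual amplitude convention (20 · log10). The function names are illustrative, not CapCut features.

```python
import math


def visuals_needed(duration_s: float, min_hold_s: float = 4, max_hold_s: float = 6) -> tuple[int, int]:
    """Return (fewest, most) visuals needed to cover the voice track."""
    fewest = math.ceil(duration_s / max_hold_s)  # every visual held for 6 s
    most = math.ceil(duration_s / min_hold_s)    # every visual held for 4 s
    return fewest, most


def db_to_gain(db: float) -> float:
    """Convert a decibel level to a linear amplitude multiplier."""
    return 10 ** (db / 20)


for minutes in (8, 10):
    low, high = visuals_needed(minutes * 60)
    print(f"{minutes} min video: {low}-{high} visuals")

print(f"-22 dB is about {db_to_gain(-22):.3f}x amplitude")
```

An 8-minute video therefore needs roughly 80-120 visuals and a 10-minute video roughly 100-150, which is worth knowing before you choose between stock footage and generated images.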
Hook example (HVC structure)
[promise]: "Today we follow the trail of these letters. What they uncovered shouldn't exist."
Minimum stack (tools and cost)
- ChatGPT Plus — $20/mo (script)
- ElevenLabs Starter — $5/mo (voice)
- Pictory or Storyblocks — $25/mo (stock)
- CapCut — free (edit)
- Music — Epidemic Sound $12/mo
Total ~$62/mo. Enough for 4-8 videos a month.
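The total and the resulting cost per video can be checked with a few lines. The prices are the ones listed above. The dictionary and function names are illustrative only.

```python
# Monthly prices in USD, copied from the list above.
STACK_USD_PER_MONTH = {
    "ChatGPT Plus": 20,
    "ElevenLabs Starter": 5,
    "Pictory or Storyblocks": 25,
    "CapCut": 0,
    "Epidemic Sound": 12,
}


def cost_per_video(videos_per_month: int) -> float:
    """Spread the monthly stack cost over the videos produced that month."""
    if videos_per_month <= 0:
        raise ValueError("videos_per_month must be positive")
    return sum(STACK_USD_PER_MONTH.values()) / videos_per_month


print(f"Total: ${sum(STACK_USD_PER_MONTH.values())}/mo")
for videos in (4, 8):
    print(f"{videos} videos/mo: ${cost_per_video(videos):.2f} per video")
```

At 4 videos a month the stack costs $15.50 per video, and at 8 videos it drops to $7.75.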
Common mistakes
- Robotic voice. Stability above 70% kills expression. Keep it 50-60%.
- A visual every 2 seconds. Viewers get nauseous. 1 visual every 4-6s is the sweet spot.
- Hook with no promise. The hook intrigues — but the second line must promise concrete value, not "come find out."
- Skipping the thumbnail. 50% of CTR is the thumbnail. Spend 30 minutes on it.
- Publishing before having 5. The YouTube algorithm needs signal, so line up 5 videos in a series before you evaluate how the channel is doing.
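The list above can double as a pre-publish checklist. The sketch below encodes each mistake as a simple check. The `VideoPlan` fields and the `preflight` function are hypothetical names, and you fill in the values by hand for each video.

```python
from dataclasses import dataclass


@dataclass
class VideoPlan:
    stability_pct: int         # ElevenLabs Stability setting, in percent
    seconds_per_visual: float  # average hold time per visual
    hook_has_promise: bool     # second line promises concrete value
    thumbnail_minutes: int     # time spent on the thumbnail
    videos_lined_up: int       # videos ready in the series


def preflight(plan: VideoPlan) -> list[str]:
    """Return one warning per common mistake found in the plan."""
    warnings = []
    if plan.stability_pct > 70:
        warnings.append("Stability above 70%: the voice may sound robotic")
    if not 4 <= plan.seconds_per_visual <= 6:
        warnings.append("Visual pace is outside the 4-6 s sweet spot")
    if not plan.hook_has_promise:
        warnings.append("Hook has no concrete promise")
    if plan.thumbnail_minutes < 30:
        warnings.append("Less than 30 minutes spent on the thumbnail")
    if plan.videos_lined_up < 5:
        warnings.append("Fewer than 5 videos lined up in the series")
    return warnings


# Hypothetical plan with two problems: stability too high, visuals too fast.
for warning in preflight(VideoPlan(80, 2, True, 30, 5)):
    print(warning)
```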
Related recipes
Once you have 5 published, jump to Module 4 of the course: Scale — production at volume with AI.