
How to Auto-Generate YouTube Shorts with AI and n8n in 2026

Step-by-step guide to building an automated YouTube Shorts pipeline using n8n, AI tools, and ffpipe for video processing.

ffpipe Team

February 3, 2026 · Updated Apr 14, 2026

YouTube Shorts automation is the process of using AI and workflow tools to automatically extract, resize, and publish short-form vertical video clips from long-form content. By combining n8n, OpenAI Whisper for transcription, GPT-4 for clip selection, and ffpipe for video processing, creators can generate 5–10 Shorts from a single 30-minute video in under an hour — without manual editing.

Key Takeaways

AI-powered pipeline: Whisper transcribes → GPT-4 finds best moments → ffpipe crops to 9:16

Total cost: under $5/month for one video per week

Processing runs in parallel — 10 clips take the same time as 1

Review via Slack before publishing to maintain quality control

Why YouTube Shorts automation matters

YouTube Shorts are short vertical videos (YouTube allows up to three minutes, though this guide targets clips of 5–10 seconds) that drive massive engagement. If you’re a creator, educator, or marketer, YouTube wants you posting Shorts constantly. But manually cutting clips from long-form content, finding the best moments, resizing them, and uploading them is exhausting work.

Imagine this: you publish a 30-minute video. AI automatically finds the best 5–10 second moments, turns each into a 9:16 vertical Short, generates a thumbnail, and queues them for upload. You spend 10 minutes reviewing, not 2 hours editing.

This is possible right now with n8n, OpenAI, and ffpipe. Let’s build it.

The pipeline: from long-form to Shorts

Here’s how the automation works:

Trigger: New video uploaded (YouTube, Google Drive, or webhook)
Transcription: Extract audio and transcribe to text using OpenAI Whisper
AI clip selection: Use GPT-4 to find the most engaging 5–10 second excerpts from the transcript
Video extraction: ffpipe trims each moment from the original video
Resize: ffpipe converts from original aspect ratio to 9:16 vertical
Thumbnail: ffpipe extracts a frame to use as thumbnail
Queue: Save Shorts to Google Drive or Airtable for review/upload

The entire pipeline runs automatically. You just publish the original video and check back in an hour.

Building the n8n workflow

Step 1: Trigger and transcription

YouTube/Google Drive Trigger
  → HTTP Request (get video metadata)
  → ffpipe (extract audio to WAV)
  → OpenAI (Whisper: transcribe audio)
  → JSON Parse (organize transcript with timestamps)

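Use OpenAI’s Whisper API and request the verbose_json response format. It returns the full transcript plus start and end timestamps for each segment (roughly a phrase or sentence). This is critical: you need to know exactly when the good moments happen. The API caps uploads at 25 MB, so compress long recordings (for example, to mono MP3) or split them before sending.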
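As a rough illustration of the "JSON Parse" step: a verbose_json transcription response includes a `segments` list whose entries carry `start`, `end`, and `text`. The sketch below flattens such a response into timestamped lines that are easy to hand to GPT-4. The function name, the line format, and the sample data are assumptions made for illustration, not part of any n8n node.

```python
from typing import Any


def format_transcript(whisper_response: dict[str, Any]) -> str:
    """Turn Whisper verbose_json segments into '[start-end] text' lines."""
    lines = []
    for seg in whisper_response.get("segments", []):
        text = seg["text"].strip()
        if not text:
            continue  # skip segments with no text (e.g. silence)
        lines.append(f"[{seg['start']:.1f}-{seg['end']:.1f}] {text}")
    return "\n".join(lines)


# Hypothetical sample shaped like a verbose_json response
sample = {
    "text": "Welcome back. AI adoption is up sharply.",
    "segments": [
        {"start": 0.0, "end": 2.4, "text": " Welcome back."},
        {"start": 2.4, "end": 6.1, "text": " AI adoption is up sharply."},
    ],
}
print(format_transcript(sample))
```

Keeping the timestamps inline with the text means the model can quote them back directly when it picks clips.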

Step 2: AI-powered clip selection

Send the transcript to GPT-4 with a prompt:

Analyze this transcript and identify the 5 most engaging, quotable moments
that would work as 5-10 second YouTube Shorts clips. For each moment:

1. Return the start and end timestamps
2. Explain why it's engaging
3. Suggest a short hook text for the thumbnail

Format as a JSON array of objects with the keys start_seconds, end_seconds, reason, and hook.

GPT-4 will return something like:

[
  {
    "start_seconds": 120,
    "end_seconds": 128,
    "reason": "Surprising stat about AI adoption",
    "hook": "AI adoption up 340% YoY"
  },
  {
    "start_seconds": 456,
    "end_seconds": 464,
    "reason": "Practical tip with immediate value",
    "hook": "Save 10 hours weekly with this trick"
  }
]
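Model output is not guaranteed to be well formed, so it may be worth validating the reply before any video processing starts. The following is a minimal sketch, assuming the reply uses the `start_seconds` and `end_seconds` keys shown above; the function name and the default length bounds are illustrative choices.

```python
import json


def parse_clips(raw: str, video_duration: float,
                min_len: float = 5.0, max_len: float = 10.0) -> list[dict]:
    """Parse the model's JSON reply and keep only well-formed clips."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return []  # treat unparseable output as "no clips found"
    if not isinstance(data, list):
        return []

    valid = []
    for clip in data:
        if not isinstance(clip, dict):
            continue
        start, end = clip.get("start_seconds"), clip.get("end_seconds")
        if not isinstance(start, (int, float)) or not isinstance(end, (int, float)):
            continue
        length = end - start
        # Keep clips that sit inside the video and inside the requested length range
        if 0 <= start < end <= video_duration and min_len <= length <= max_len:
            valid.append(clip)
    return valid
```

In n8n this logic could live in a Code node placed right after the OpenAI node, so that downstream nodes only ever see usable timestamps.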

Step 3: Extract and resize with ffpipe

For each clip returned by GPT-4, create a parallel ffpipe call:

Loop over GPT clips
  → ffpipe (trim video: start 120s, duration 8s)
  → ffpipe (resize to 9:16 vertical for YouTube Shorts)
  → ffpipe (generate thumbnail from the frame at 2s, quality 5)
  → Merge results
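The trim step takes a start time and a duration, while GPT-4 returns start and end timestamps, so a small conversion is needed in between. Here is a sketch of that conversion; the output key names are hypothetical and are not ffpipe's actual parameter names.

```python
def trim_params(clip: dict, thumbnail_offset: float = 2.0) -> dict:
    """Convert a start/end pair into start + duration, plus a thumbnail time."""
    start = float(clip["start_seconds"])
    duration = float(clip["end_seconds"]) - start
    if duration <= 0:
        raise ValueError("end_seconds must be greater than start_seconds")
    # If the clip is shorter than the requested offset, fall back to its midpoint
    thumb_at = thumbnail_offset if thumbnail_offset < duration else duration / 2
    return {"start": start, "duration": duration, "thumbnail_at": thumb_at}
```

For the first example clip (120s to 128s) this yields a start of 120, a duration of 8, and a thumbnail taken 2 seconds into the clip.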

Use the social media resize preset:

Input: Original video URL
Preset: social_media_resize
Options: { "platform": "youtube_shorts", "aspect_ratio": "9:16" }

This handles color space conversion, letterboxing, and aspect ratio adjustment automatically.
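To get a feel for what a 9:16 conversion involves, here is the geometry of a simple center crop, one common way to turn landscape footage vertical. This is only an illustration of the arithmetic and not ffpipe's implementation; the preset may letterbox or reframe differently.

```python
def center_crop_9_16(src_w: int, src_h: int) -> dict:
    """Largest centered 9:16 window that fits inside a src_w x src_h frame."""
    target = 9 / 16
    if src_w / src_h > target:
        # Source is wider than 9:16: keep full height, trim the sides
        crop_h = src_h
        crop_w = int(src_h * target)
    else:
        # Source is 9:16 or narrower: keep full width, trim top and bottom
        crop_w = src_w
        crop_h = int(src_w / target)
    # Many encoders expect even dimensions, so round down to even numbers
    crop_w -= crop_w % 2
    crop_h -= crop_h % 2
    return {
        "width": crop_w,
        "height": crop_h,
        "x": (src_w - crop_w) // 2,
        "y": (src_h - crop_h) // 2,
    }
```

A 1920×1080 source keeps only a 606-pixel-wide strip from the middle of the frame, which is why off-center speakers are worth checking during review.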

Step 4: Save and queue

Loop over processed Shorts
  → Google Drive Upload (save Short_1.mp4, Short_2.mp4, etc.)
  → Airtable Create Row (log Short details: title, thumbnail URL, hook text)
  → Slack Notify (send links for review)
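The Slack notification is easier to act on when it lists every Short with its hook and link in one message. A minimal sketch of building that text, assuming each processed Short is a dict with hypothetical `hook` and `url` keys:

```python
def review_message(shorts: list[dict]) -> str:
    """Build a plain-text Slack message listing each Short for review."""
    lines = [f"{len(shorts)} Shorts ready for review:"]
    for i, short in enumerate(shorts, start=1):
        lines.append(f"{i}. Short_{i}.mp4 | {short['hook']} | {short['url']}")
    return "\n".join(lines)
```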

Handling edge cases

What if the video is too short? Add a length check at the start. If the video is under 3 minutes, skip the automation and send yourself an alert so you can handle it manually.

What if no engaging moments are found? GPT-4 can come up short if the content is too dry or unstructured. Add error handling: if fewer than 3 clips are returned, send a notification so you can select clips manually.

What if a clip contains watermarks or logos you want to remove? ffpipe doesn’t have built-in watermark removal, but you can use a separate tool (e.g., Runway, Remove.bg for video) or add a warning to your review queue.
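The first two checks above amount to a small routing decision. Sketched in Python with hypothetical branch names, it might look like this (in n8n the same logic would typically sit in an IF or Switch node):

```python
from typing import Optional

MIN_VIDEO_SECONDS = 3 * 60  # videos shorter than this skip the automation
MIN_CLIPS = 3               # fewer clips than this triggers manual selection


def next_step(video_seconds: float, clips: Optional[list] = None) -> str:
    """Decide whether the workflow continues or hands off to a human."""
    if video_seconds < MIN_VIDEO_SECONDS:
        return "skip_and_alert"
    if clips is not None and len(clips) < MIN_CLIPS:
        return "manual_selection"
    return "continue"
```

Passing `clips=None` covers the early check, before GPT-4 has run; passing the clip list covers the later one.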

Cost breakdown

Assuming you’re processing one 30-minute video per week:

OpenAI Whisper: ~$0.006 per minute = $0.18/week
OpenAI GPT-4: ~$0.02 for clip analysis = negligible
ffpipe: ~2 minutes of processing per video (trim, resize, thumbnail) = $0.02/week (Starter plan, $19/month for 2,000 min includes plenty of headroom)
n8n: Free (or your current plan if you already use it)

Total: roughly $0.22 per video in usage-based costs, which comes to under $5/month for one video per week. Scale to daily videos and you’re still under $100/month.
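The arithmetic behind these figures can be reproduced in a few lines. This sketch uses the rates listed above and assumes ffpipe usage is costed pro rata from the Starter plan ($19 for 2,000 minutes); if you pay the flat plan fee regardless of usage, that fee is your real floor.

```python
WHISPER_PER_MIN = 0.006     # USD per audio minute
GPT_PER_VIDEO = 0.02        # USD per clip analysis
FFPIPE_PER_MIN = 19 / 2000  # Starter plan price spread over its 2,000 minutes


def cost_per_video(video_minutes: float = 30, processing_minutes: float = 2) -> float:
    whisper = video_minutes * WHISPER_PER_MIN
    ffpipe = processing_minutes * FFPIPE_PER_MIN
    return whisper + GPT_PER_VIDEO + ffpipe


def monthly_cost(videos_per_week: float) -> float:
    weeks_per_month = 52 / 12
    return cost_per_video() * videos_per_week * weeks_per_month


print(f"per video: ${cost_per_video():.2f}")
print(f"weekly cadence: ${monthly_cost(1):.2f}/month")
print(f"daily cadence: ${monthly_cost(7):.2f}/month")
```

Both cadences land well inside the stated ceilings, so there is room for longer videos or extra thumbnail frames.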

Example: workflow structure

Here’s a simplified view of what you’re building in n8n:

Workflow: Auto-Generate Shorts
├─ Trigger: YouTube notification
├─ Extract metadata (title, channel, duration)
├─ ffpipe: Extract audio
├─ OpenAI: Transcribe
├─ OpenAI: Analyze & find clips
├─ Loop: For each clip found
│  ├─ ffpipe: Trim to clip duration
│  ├─ ffpipe: Resize to 9:16
│  ├─ ffpipe: Generate thumbnail
│  └─ Save results
├─ Merge & deduplicate
└─ Notify: Send review links via Slack

The workflow runs in parallel where possible (all ffpipe calls for different clips run simultaneously), so even processing 10 clips takes only as long as the slowest single clip.
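The "slowest single clip" behavior is the usual property of running independent jobs concurrently. The sketch below demonstrates it with Python's asyncio; `process_clip` is a hypothetical stand-in that just sleeps and makes no real ffpipe calls.

```python
import asyncio
import time


async def process_clip(clip_id: int, seconds: float) -> int:
    """Hypothetical stand-in for one clip's trim + resize + thumbnail calls."""
    await asyncio.sleep(seconds)  # simulates waiting on a remote job
    return clip_id


async def process_all(durations: list[float]) -> tuple[list[int], float]:
    started = time.perf_counter()
    # gather() starts every job at once and waits for all of them
    results = await asyncio.gather(
        *(process_clip(i, d) for i, d in enumerate(durations))
    )
    return list(results), time.perf_counter() - started


if __name__ == "__main__":
    ids, elapsed = asyncio.run(process_all([0.1] * 9 + [0.3]))
    print(ids, f"{elapsed:.2f}s")  # elapsed tracks the slowest job, not the sum
```

Whether your n8n workflow actually achieves this depends on how the loop is configured and on any concurrency limits of the services involved.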

Publishing the Shorts

After review in Slack, you have two options:

Option 1: YouTube Data API. Use n8n’s YouTube node to upload directly to your channel. Set all Shorts to “Unlisted” until you’re confident in the selection, then make them public batch-by-batch.

Option 2: Manual review & upload. Download the Shorts from Google Drive, watch them, then upload through YouTube Studio with custom titles and hooks.

Most creators choose Option 2 initially (more control), then switch to Option 1 after a few weeks, once they trust the AI’s selections.
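For Option 1, n8n's YouTube node exposes the title and privacy setting as node parameters. Underneath, a YouTube Data API v3 upload carries them in `snippet` and `status.privacyStatus`. The sketch below shows that request body; using the hook text as the title is an assumption made for illustration.

```python
MAX_TITLE_LENGTH = 100  # YouTube's limit for video titles


def upload_metadata(hook: str, description: str = "") -> dict:
    """Build a videos.insert request body that keeps the Short unlisted."""
    title = hook.strip()
    if len(title) > MAX_TITLE_LENGTH:
        # Truncate and mark the cut so the title still fits the limit
        title = title[: MAX_TITLE_LENGTH - 3].rstrip() + "..."
    return {
        "snippet": {"title": title, "description": description},
        "status": {"privacyStatus": "unlisted"},
    }
```

Switching a reviewed Short to public later only requires updating `privacyStatus`, so the upload itself never needs to be repeated.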

Optimizing your pipeline

Run weekly, not daily: Process one long-form video per week, extract 5–10 Shorts, and stagger uploads across the week.
A/B test hooks: Compare which hooks get more clicks in YouTube Analytics. Feed learnings back to your GPT prompt.
Use consistent music: Add royalty-free background music to every Short using a separate ffpipe call. A consistent sound can help retention, so check the effect in your own analytics.
Vary thumbnails: Don’t reuse the same frame. Extract 3 different frames per clip and A/B test which one gets more clicks.
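Staggering uploads across the week is easy to compute up front. Here is a minimal sketch, assuming you want evenly spaced slots starting from a chosen first slot; the function name and the seven-day default are illustrative.

```python
from datetime import datetime, timedelta


def stagger_uploads(count: int, first_slot: datetime, days: int = 7) -> list[datetime]:
    """Spread `count` uploads evenly across `days`, starting at first_slot."""
    if count <= 0:
        return []
    gap = timedelta(days=days) / count
    return [first_slot + i * gap for i in range(count)]
```

With seven Shorts this produces one slot per day at the same time of day; with ten, a slot roughly every 17 hours. The resulting timestamps can be stored in the Airtable row alongside each Short.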

Getting started

Create a free ffpipe account
Set up n8n (free tier is sufficient)
Connect OpenAI API
Build the workflow above step-by-step
Test with one video first

The entire setup takes 2–3 hours if you’ve built n8n workflows before, or a full day if you’re new to automation.

Ready to automate Shorts creation? Start free →

Frequently asked questions

How long does it take to generate Shorts from a 30-minute video?

The full pipeline (transcription, AI analysis, video extraction, and resizing) takes approximately 30–60 minutes. Because clips are processed in parallel, extracting 10 clips takes roughly the same time as extracting 1.

What does the AI-powered Shorts pipeline cost?

For one 30-minute video per week: OpenAI Whisper costs ~$0.18/week, GPT-4 analysis ~$0.02, and ffpipe processing ~$0.02. Total: under $5/month. Scaling to daily videos stays under $100/month.

Can the AI reliably find good clip moments?

GPT-4 identifies engaging moments based on surprising stats, practical tips, and quotable statements in the transcript. It works well for structured content but may underperform on abstract or highly visual content. Add error handling to alert you if fewer than 3 clips are returned.

Do I need to manually upload the Shorts to YouTube?

No. You can upload directly via the YouTube Data API through n8n’s YouTube node. Most creators start with manual upload (for quality control), then switch to automated publishing after gaining confidence.

Glossary

YouTube Shorts: Short vertical videos (up to three minutes long) published on YouTube, displayed in a dedicated Shorts feed.
9:16 aspect ratio: Vertical video format standard for Shorts, TikTok, and Instagram Reels (1080×1920 pixels).
Whisper API: OpenAI’s speech-to-text model, which transcribes audio and can return segment-level or word-level timestamps.
Clip extraction: Trimming a specific time range from a longer video to create a standalone short clip.