AI SMART CROP
Turn landscape podcasts, interviews, and streams into vertical clips with AI face tracking that follows the active speaker frame by frame.
DeepSkim uses a neural network (S3FD) to detect every face in every frame, then TalkNet audio-visual analysis to determine who is speaking. The 9:16 crop window automatically follows the active speaker with smooth, Gaussian-filtered motion — no jumpiness, no manual keyframing.
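The Gaussian smoothing idea can be illustrated in isolation. Below is a minimal sketch (not DeepSkim's actual code): per-frame face x-centers are convolved with a normalized Gaussian kernel, so when the detected speaker position jumps, the crop window glides instead of snapping. The function name `smooth_crop_centers` and the sigma value are illustrative assumptions.

```python
import numpy as np

def smooth_crop_centers(centers, sigma=5.0):
    """Gaussian-filter per-frame crop x-centers (illustrative sketch).

    A jump in the raw face position becomes a smooth glide in the
    returned trajectory, which is what drives the 9:16 crop window.
    """
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    kernel /= kernel.sum()  # normalize so values are a weighted average
    # Edge-pad so the start/end of the clip aren't pulled toward zero.
    padded = np.pad(np.asarray(centers, dtype=float), radius, mode="edge")
    return np.convolve(padded, kernel, mode="valid")

# Example: the detected speaker jumps from x=400 to x=1200 mid-clip.
raw = np.array([400.0] * 30 + [1200.0] * 30)
smooth = smooth_crop_centers(raw, sigma=5.0)
```

The smoothed trajectory stays the same length as the input, starts near 400, ends near 1200, and rises monotonically through the transition — no jump cut in the crop.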
When two people are talking — like in an interview or debate — DeepSkim automatically switches to a split-screen layout showing both speakers. It detects speaker turns and transitions smoothly between single-speaker and split views.
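Composing a split-screen frame is conceptually simple: crop one panel per speaker and stack them vertically into the 9:16 output. The sketch below shows that idea with plain NumPy slicing; the function names, panel dimensions, and the assumption that the source frame is larger than each panel are all illustrative, not DeepSkim's implementation.

```python
import numpy as np

def crop_panel(frame, cx, cy, w, h):
    """Crop a w*h window centered on (cx, cy), clamped to frame bounds."""
    H, W = frame.shape[:2]
    x0 = int(np.clip(cx - w // 2, 0, W - w))
    y0 = int(np.clip(cy - h // 2, 0, H - h))
    return frame[y0:y0 + h, x0:x0 + w]

def split_screen(frame, face_a, face_b, out_w=540, out_h=960):
    """Stack two half-height panels: speaker A on top, speaker B below."""
    half = out_h // 2
    top = crop_panel(frame, *face_a, out_w, half)
    bottom = crop_panel(frame, *face_b, out_w, half)
    return np.vstack([top, bottom])

# A 1080p landscape frame with one speaker on each side.
frame = np.zeros((1080, 1920, 3), dtype=np.uint8)
out = split_screen(frame, face_a=(480, 540), face_b=(1440, 540))
```

The result is a 960x540 vertical frame with both speakers visible at once; a real pipeline would also resize each panel and animate the transition between layouts.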
When there are no faces on screen — like during screenshares, slides, or B-roll — DeepSkim uses a 4-tier fallback pipeline (scene detection, object detection, saliency mapping, optical flow) to keep the crop centered on the most interesting part of the frame.
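A tiered fallback like this amounts to running detectors in priority order and taking the first one that returns a result. Here is a minimal sketch of that dispatch logic with stub detectors standing in for scene detection, object detection, saliency mapping, and optical flow — the tier names and stub outputs are hypothetical, not DeepSkim's real detectors.

```python
import numpy as np

def pick_crop_center(frame, tiers):
    """Try each fallback tier in priority order; the first hit wins."""
    for name, detect in tiers:
        hit = detect(frame)
        if hit is not None:
            return name, hit
    # Last resort: the geometric center of the frame.
    h, w = frame.shape[:2]
    return "center", (w // 2, h // 2)

# Stub tiers for illustration; real detectors return a point of interest
# or None when they find nothing usable in the frame.
tiers = [
    ("scene",    lambda f: None),           # no fresh scene cut
    ("object",   lambda f: None),           # no confident object box
    ("saliency", lambda f: (1200, 400)),    # saliency map peak
    ("flow",     lambda f: (960, 540)),     # never reached here
]

frame = np.zeros((1080, 1920, 3), dtype=np.uint8)
tier, center = pick_crop_center(frame, tiers)
```

On this frame the first two tiers miss, so the saliency tier supplies the crop center, and the optical-flow tier is never consulted — exactly the short-circuit behavior a priority pipeline wants.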