Vercel · Headline · 2026. 04. 24. 17:16

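Video Generation with AI Gateway: Features and How to Use Them

Summary

AI Gateway now supports video generation, which lets you produce photorealistic, high-quality cinematic video. Beyond a plain text prompt, a request can include motion cues and audio direction for finer control over the result. This guide covers several advanced video generation modes, including text-to-video, image animation, start/end frame definition, and character consistency, and developers can integrate them programmatically through AI SDK 6.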

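Key points

  • AI Gateway supports photorealistic video generation, currently in beta for Pro and Enterprise plans and paid AI Gateway users.
  • Video generation goes beyond simple prompts: motion cues and audio direction allow fine-grained control.
  • Beyond text-to-video, the main capabilities include image animation, start/end state definition (Before/After), and character consistency (Reference-to-video).
  • Developers can integrate video generation in code through AI SDK 6, or compare and experiment with video models without writing code in the AI Gateway Playground.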

Video Generation with AI Gateway

AI Gateway now supports video generation, so you can create cinematic videos with photorealistic quality and synchronized audio, and generate personalized content with consistent identity, all through AI SDK 6. Video generation is in beta and currently available for Pro and Enterprise plans and paid AI Gateway users.

Video models require more than just describing what you want. Unlike image generation, video prompts can include motion cues (camera movement, object actions, timing) and optionally audio direction. Each provider exposes different capabilities through providerOptions that unlock fundamentally different generation modes. See the [providerOptions] documentation for model-specific options.
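As a rough illustration of the prompt structure described above, the sketch below assembles a scene description, motion cues, and optional audio direction into one text prompt. The helper name buildVideoPrompt, its field names, and the "Camera:" / "Action:" / "Timing:" / "Audio:" labels are all assumptions made for this example; nothing here is part of AI Gateway or the AI SDK, and models may respond better to a different phrasing.

```typescript
// Hypothetical helper: none of these names come from AI Gateway or the AI SDK.
interface VideoPromptParts {
  scene: string;            // what should be visible
  cameraMovement?: string;  // motion cue: how the camera moves
  objectActions?: string[]; // motion cue: what the subjects do, in order
  timing?: string;          // motion cue: pacing of the shot
  audio?: string;           // optional audio direction
}

function buildVideoPrompt(parts: VideoPromptParts): string {
  const sentences: string[] = [parts.scene.trim()];
  if (parts.cameraMovement) {
    sentences.push(`Camera: ${parts.cameraMovement.trim()}`);
  }
  if (parts.objectActions && parts.objectActions.length > 0) {
    const actions = parts.objectActions.map((action) => action.trim());
    sentences.push(`Action: ${actions.join(", then ")}`);
  }
  if (parts.timing) {
    sentences.push(`Timing: ${parts.timing.trim()}`);
  }
  if (parts.audio) {
    sentences.push(`Audio: ${parts.audio.trim()}`);
  }
  // Strip trailing periods so joining with ". " never produces "..".
  return sentences.map((s) => s.replace(/\.+$/, "")).join(". ") + ".";
}

const examplePrompt = buildVideoPrompt({
  scene: "A barista pours latte art in a sunlit cafe",
  cameraMovement: "slow push-in toward the cup",
  objectActions: ["milk swirls into a leaf pattern", "steam rises"],
  timing: "single continuous shot",
  audio: "soft cafe ambience",
});
console.log(examplePrompt);
```

Keeping the cues as separate fields makes it easy to drop the audio direction for models that do not generate sound.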

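AI Gateway initially supports 4 types of video generation: text-to-video, image-to-video, first and last frame, and reference-to-video. Some models also support video editing. The capabilities that each model creator currently offers on AI Gateway are listed in the table near the end of this article. Each mode is described below with examples: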

  • Text-to-video: Describe what you want, get a video. The model handles visuals, motion, and optionally audio. Great for hyperrealistic, production-quality footage from a simple text prompt.
    • Example: Programmatic video at scale. Generate videos on demand for your app, platform, or content pipeline. No licensing fees or production required, just prompts and outputs. This example uses [klingai/kling-v2.6-t2v] to generate video from a text prompt with a specified aspect ratio and duration.
    • Example: Creative content generation. Turn a simple prompt into polished video clips for social media, ads, or storytelling with natural motion and cinematic quality. Given a very specific and descriptive prompt, [google/veo-3.1-generate-001] generates video with rich detail and the exact desired motion.
  • Image-to-video: Provide a starting image and animate it. Control the initial composition, then let the model generate motion.
    • Example: Animate product images. Turn existing product photos into interactive videos. Pass an image URL and a motion description in the prompt, and [klingai/kling-v2.6-i2v] animates the product image.
    • Example: Animated illustrations. Bring static artwork to life with subtle motion. Perfect for thematic content or marketing at scale.
    • Example: Lifestyle and product photography. Add subtle motion to food, beverage, or lifestyle shots for social content. Here, a picture of coffee is turned into a more interactive video, with lighting direction and minute details.
  • First and last frame: Define the start and end states, and the model generates a seamless transition between them.
    • Example: Before/after reveals. Outfit swaps, product comparisons, changes over time. Upload two images, get a seamless transition. The start and end states are defined with two images that are used in the prompt and provider options. In this example, [klingai/kling-v3.0-i2v] lets you define the start frame in image and the end frame in lastFrameImage. The model generates the transition between them.
  • Reference-to-video: Provide reference videos or images of a person/character, and the model extracts their appearance and voice to generate new scenes starring them with consistent identity. In this example, 2 reference images of dogs are used to generate the final video. Using [alibaba/wan-v2.6-r2v-flash], you can instruct the model to use the people/characters within the prompt. Wan suggests using [character1], [character2], etc. in the prompt for multi-reference-to-video to get the best results.
  • Video editing: Transform existing videos with style transfer. Provide a video URL and describe the transformation you want. The model applies the new style while preserving the original motion. Here, [xai/grok-imagine-video] takes a source video from a previous generation and restyles it as a watercolor.
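To make the difference between the first three modes concrete, here is a minimal sketch that maps each mode to a request object. The model IDs and the names image, lastFrameImage, and providerOptions come from the article; everything else (the VideoJob and VideoRequest types, toVideoRequest, the aspectRatio and duration fields, and the "klingai" provider key) is an assumption for illustration and may not match the real SDK call. Reference-to-video and video editing are left out because the article does not say how reference media or source videos are passed.

```typescript
// Sketch only. Model IDs, image, lastFrameImage and providerOptions appear in
// the article; every other name here is an assumption made for illustration.
type VideoJob =
  | { mode: "text-to-video"; text: string; aspectRatio?: string; durationSeconds?: number }
  | { mode: "image-to-video"; imageUrl: string; text?: string }
  | { mode: "first-and-last-frame"; startImageUrl: string; endImageUrl: string; text?: string };

interface VideoRequest {
  model: string;
  prompt: string | { image: string; text?: string };
  aspectRatio?: string;
  duration?: number;
  providerOptions?: Record<string, Record<string, unknown>>;
}

function toVideoRequest(job: VideoJob): VideoRequest {
  switch (job.mode) {
    case "text-to-video":
      // Text only: the model decides visuals, motion, and optionally audio.
      return {
        model: "klingai/kling-v2.6-t2v",
        prompt: job.text,
        aspectRatio: job.aspectRatio,
        duration: job.durationSeconds,
      };
    case "image-to-video":
      // The image fixes the initial composition; the text describes the motion.
      return {
        model: "klingai/kling-v2.6-i2v",
        prompt: { image: job.imageUrl, text: job.text },
      };
    case "first-and-last-frame":
      // Start frame goes in the prompt, end frame in provider options
      // (the "klingai" key is a guess at the provider namespace).
      return {
        model: "klingai/kling-v3.0-i2v",
        prompt: { image: job.startImageUrl, text: job.text },
        providerOptions: { klingai: { lastFrameImage: job.endImageUrl } },
      };
  }
}

const exampleRequest = toVideoRequest({
  mode: "first-and-last-frame",
  startImageUrl: "https://example.com/before.png",
  endImageUrl: "https://example.com/after.png",
  text: "Smooth reveal from the old design to the new one",
});
console.log(JSON.stringify(exampleRequest, null, 2));
```

Modeling the modes as a discriminated union means the compiler rejects, for instance, a first-and-last-frame job that is missing its end image.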

For more examples and detailed configuration options for video models, check out the [Video Generation Documentation]. You can also find simple getting started scripts with the [AI SDK 6].

Video Generation Quick Start

  • Two ways to get started:
    1. AI SDK 6: Generate videos programmatically with the same interface you use for text and images. One API, one authentication flow, one observability dashboard across your entire AI pipeline.
    2. AI Gateway Playground: Experiment with video models with no code in the configurable playground that's embedded in each model page. Compare providers, tweak prompts, and download results without writing code. To access, click any video generation model in the [AI Gateway Playground] model list.
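The provider comparison that the Playground offers can also be approximated in code. The sketch below runs one prompt against several model IDs concurrently and records which calls succeed. The GenerateVideo type is a hypothetical stand-in for the SDK's video generation call, whose real name, arguments, and return shape are not given in this article; a fake generator is injected so the example runs without network access or credentials.

```typescript
// Hypothetical stand-in for the SDK's video generation call; the real function
// name, arguments, and return shape are not given in this article.
type GenerateVideo = (request: { model: string; prompt: string }) => Promise<Uint8Array>;

interface ComparisonRow {
  model: string;
  ok: boolean;
  bytes: number;
  error?: string;
}

async function compareModels(
  generate: GenerateVideo,
  models: string[],
  promptText: string,
): Promise<ComparisonRow[]> {
  // Run all models concurrently; one failure must not discard the other results.
  return Promise.all(
    models.map(async (model): Promise<ComparisonRow> => {
      try {
        const video = await generate({ model, prompt: promptText });
        return { model, ok: true, bytes: video.byteLength };
      } catch (err) {
        return { model, ok: false, bytes: 0, error: String(err) };
      }
    }),
  );
}

// Fake generator: simulates one provider failing and the others returning 1 KiB.
const fakeGenerate: GenerateVideo = async ({ model }) => {
  if (model === "xai/grok-imagine-video") {
    throw new Error("simulated failure");
  }
  return new Uint8Array(1024);
};

compareModels(
  fakeGenerate,
  ["klingai/kling-v2.6-t2v", "xai/grok-imagine-video"],
  "A paper boat drifting down a rainy street",
).then((rows) => console.log(rows));
```

Catching errors per model, instead of letting Promise.all reject, keeps the successful generations when a single provider fails.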

AI Gateway playground model list:

  • Grok Imagine from xAI is fast and great at instruction following.
  • Wan from Alibaba specializes in reference-based generation and multi-shot storytelling, with the ability to preserve identity across scenes.
  • Kling excels at image-to-video and native audio. The new 3.0 models support multishot video with automatic scene transitions.
  • Veo from Google delivers high visual fidelity and physics realism, with native audio generation and cinematic lighting.

Types of video generation:

| Type | Inputs | Description | Example use cases |
| --- | --- | --- | --- |
| Text-to-video | Text prompt | Describe a scene, get a video | Ad creative, explainer videos, social content |
| Image-to-video | Image, text prompt optional | Animate a still image with motion | Product showcases, logo reveals, photo animation |
| First and last frame | 2 images, text prompt optional | Define start and end states, model fills in between | Before/after reveals, time-lapse, transitions |
| Reference-to-video | Images or videos | Extract a character from reference images or videos and place them in new scenes | Spokesperson content, consistent brand characters |

Capabilities by model creator:

| Model creator | Capabilities |
| --- | --- |
| xAI | Text-to-video, image-to-video, video editing, audio |
| Wan | Text-to-video, image-to-video, reference-to-video, audio |
| Kling | Text-to-video, image-to-video, first and last frame, audio |
| Veo | Text-to-video, image-to-video, audio |
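
If you need to pick a model creator by capability in code, the table above can be expressed as a small lookup. This is only a sketch: the data is copied from the table, while the Capability type, the capabilitiesByCreator map, and creatorsSupporting are names invented for this example and are not part of any SDK.

```typescript
// Data copied from the capability table; type and function names are illustrative.
type Capability =
  | "text-to-video"
  | "image-to-video"
  | "first-and-last-frame"
  | "reference-to-video"
  | "video-editing"
  | "audio";

const capabilitiesByCreator: Record<string, Capability[]> = {
  xAI: ["text-to-video", "image-to-video", "video-editing", "audio"],
  Wan: ["text-to-video", "image-to-video", "reference-to-video", "audio"],
  Kling: ["text-to-video", "image-to-video", "first-and-last-frame", "audio"],
  Veo: ["text-to-video", "image-to-video", "audio"],
};

// Returns the creators, in table order, whose models list the given capability.
function creatorsSupporting(capability: Capability): string[] {
  return Object.keys(capabilitiesByCreator).filter((creator) =>
    capabilitiesByCreator[creator].some((c) => c === capability),
  );
}

console.log(creatorsSupporting("first-and-last-frame"));
console.log(creatorsSupporting("reference-to-video"));
```

Because the feature is in beta, the table may change; treat the map as a snapshot of this article.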

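AI-generated content

This content was automatically summarized, translated, and analyzed by AI from the original article by Vercel AI. Copyright belongs to the original author; please check the original article for accurate details.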
