Seedance 2.0 Prompting Guide

The Seedance 2.0 series models natively support joint audio and video generation, with strong semantic understanding and multimodal interaction capabilities. This guide covers prompting methods and techniques for Seedance 2.0 (also applicable to Seedance 2.0 fast) to help you efficiently generate high-quality videos that meet your specific needs.

NOTE

All visual (image, video) and audio assets presented in this guide were generated by the official Seedance/Seedream series visual generation models.

01 General Principles

1.1 Basic Formula for Text Instructions

Seedance 2.0 closely follows natural-language instructions. You can flexibly combine the following elements based on your requirements:

[Subject + Action]

The logical foundation of the instruction, identifying who is doing what.

[Scene + Atmosphere]

Describes the spatial background, lighting details, or a specific visual style to define the overall tone of the frame.

[Cinematography + Sound Effects]

Uses advanced instructions such as camera movements or ambient sound effects to enrich the final audiovisual expression.
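
The three elements above can be assembled programmatically. The following is a minimal sketch, assuming each element is written as a short phrase; the function name `build_prompt` and its parameters are hypothetical and not part of any Seedance API.

```python
def build_prompt(subject_action, scene_atmosphere="", camera_sound=""):
    """Join the non-empty elements into one natural-language prompt."""
    if not subject_action.strip():
        raise ValueError("subject_action is required: it says who is doing what")
    parts = [subject_action, scene_atmosphere, camera_sound]
    # Drop empty elements and trailing periods so the pieces join cleanly.
    cleaned = [p.strip().rstrip(".") for p in parts if p.strip()]
    return ". ".join(cleaned) + "."


prompt = build_prompt(
    "A golden retriever chases a red ball",
    "Sunlit beach at dusk, warm and nostalgic",
    "Slow tracking shot with the sound of waves",
)
print(prompt)
```

Only the subject and action are mandatory here, which mirrors the guide's description of that element as the logical foundation of the instruction.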

1.2 Multimodal Reference Control

Beyond text descriptions, you can use reference assets to lock in the look you want for your visuals. Seedance 2.0 supports deep referencing of images, audio, and video.

Explicit Referencing

Clearly specify the reference object, such as "Refer to the composition of Image 1" or "Refer to the action in Video 2."

Feature Integration

The model automatically extracts core features from the reference assets and combines them with your text description, so the output stays predictable and faithful to the references while leaving room for creativity.

02 Text Generation

Seedance 2.0 can render on-screen text in T2V (Text-to-Video), I2V (Image-to-Video), R2V (Reference-to-Video), and V2V (Video-to-Video) scenarios.

The model matches text style and color to the context automatically, but you can also specify the color, style, appearance method (how the text enters the frame), timing, and position in the prompt. For best results, use common characters and avoid obscure characters or special symbols.

2.1 Slogans

Prompt Template:

[Text Content] + [Timing] + [Position] + [Appearance Method], [Text Features (Color, Style)]
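
As a rough illustration, the template above can be filled in by a small helper. This sketch makes two assumptions that are not from the guide: the function names are hypothetical, and "special symbols" is approximated as Unicode symbol, control, and private-use characters, which is only one possible reading of the advice to avoid obscure characters.

```python
import unicodedata


def unusual_characters(text):
    """Return characters whose Unicode category is a symbol (S*) or other (C*).

    Assumption: this is a heuristic stand-in for "special symbols";
    letters, digits, spaces, and punctuation in any script pass through.
    """
    return [ch for ch in text if unicodedata.category(ch)[0] in "SC"]


def slogan_prompt(text, timing, position, appearance, features=""):
    """Fill the slogan template: content, timing, position, appearance, features."""
    flagged = unusual_characters(text)
    if flagged:
        raise ValueError(f"consider replacing these characters: {flagged}")
    prompt = f'The text "{text}" appears {timing} {position}, {appearance}'
    if features:
        prompt += f", {features}"
    return prompt + "."


print(slogan_prompt(
    "Happiness is in Seedance",
    "as the screen blurs",
    "in the center",
    "fading in",
    "white handwritten lettering",
))
```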

NOTE

Seedance 2.0 matches text styles to the context. If the text must be rendered exactly as designed (for example, a brand logo), refer to the Logo Reference example in 3.2 Multi-Image Reference.

Example:

[Prompt]: Hand-drawn manga style, three people sitting together eating the fried chicken from Image 1, atmosphere is friendly and happy. The screen gradually blurs, and the text "Happiness is in Seedance" appears in the center.

2.2 Subtitles

Prompt Template:

Subtitles appear at the bottom of the screen, the content is "...", subtitles must be perfectly synchronized with the audio rhythm.

Examples:

Example 1: [Prompt]: Generate a video with a voiceover. A deep, calm male voice says: "在宏大的宇宙中，我们的世界不过是一个短暂的瞬间。然而，在其中，生命不顾一切地繁荣。" The scene should transition slowly from night to dawn, stars fading, sun rising behind mountains. Subtitles appear at the bottom following the lines.

(The voiceover line is in Chinese. In English: "In the vast universe, our world is but a fleeting moment. Yet within it, life flourishes against all odds.")

Example 2: [Prompt]: The two people in the image are chatting in an office. The woman speaks first, saying: "你每次卡点到，是不是很享受这种刚刚好的感觉？" The man smiles and responds: "我有我的节奏。" The dialogue is casual and natural, and corresponding subtitles appear at the bottom.

(The dialogue is in Chinese. In English, the woman says: "You always arrive right on the dot. Do you enjoy that just-in-time feeling?" The man answers: "I have my own rhythm.")

2.3 Speech Bubbles

Prompt Template:

[Character] says: "...", a speech bubble appears around the character as they speak, with the lines written inside.
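
The subtitle template in 2.2 and the speech-bubble template above differ only in their fixed wording, so both can be filled by small helpers. This is a sketch with hypothetical function names; rejecting double quotes inside the spoken line is an assumption made here so the quoted line stays unambiguous in the prompt.

```python
def _check_line(line):
    # Assumption: a double quote inside the line would end the quotation early.
    if '"' in line:
        raise ValueError("remove double quotes from the spoken line")
    return line.strip()


def subtitle_prompt(content):
    """Fill the subtitle template from section 2.2."""
    return (
        f'Subtitles appear at the bottom of the screen, the content is "{_check_line(content)}", '
        "subtitles must be perfectly synchronized with the audio rhythm."
    )


def speech_bubble_prompt(character, line):
    """Fill the speech-bubble template from section 2.3."""
    return (
        f'{character} says: "{_check_line(line)}", a speech bubble appears around '
        "the character as they speak, with the lines written inside."
    )


print(subtitle_prompt("I have my own rhythm"))
print(speech_bubble_prompt("The girl", "This is the real deal!"))
```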

Example:

[Prompt]: Refer to the girl in Image 1 and Image 2. She is in a strawberry garden, picks a strawberry, takes a bite, and smiles, saying: "This is the real deal!" A speech bubble appears around her with the lines written inside.

03 Image Reference

Seedance 2.0 supports multi-view subject reference as well as multi-image reference for scenes and storyboards. In prompts, refer to the uploaded images as Image 1, Image 2, ..., Image n so the model can identify each one accurately.

3.1 Multi-view Subject Reference

Prompt Template:

Refer to/Extract/Combine the [Subject] in [Image n], generate [Scene Description], keep [Subject] features consistent.
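
A minimal sketch of filling this template for several views of one subject. The helper names are hypothetical, and the "Image 1, 2, and 3" shorthand mirrors the wording used in this section's examples rather than a documented syntax.

```python
def image_list(indices):
    """Format [1, 2, 3] as 'Image 1, 2, and 3'."""
    if not indices:
        raise ValueError("at least one image index is required")
    nums = [str(i) for i in indices]
    if len(nums) == 1:
        return f"Image {nums[0]}"
    if len(nums) == 2:
        return f"Image {nums[0]} and {nums[1]}"
    return "Image " + ", ".join(nums[:-1]) + f", and {nums[-1]}"


def subject_reference_prompt(subject, image_indices, scene, verb="Refer to"):
    """Fill the multi-view subject reference template."""
    if verb not in ("Refer to", "Extract", "Combine"):
        raise ValueError("verb must be one of the three template verbs")
    return (
        f"{verb} the {subject} in {image_list(image_indices)}, "
        f"generate {scene}, keep {subject} features consistent."
    )


print(subject_reference_prompt(
    "woman", [1, 2, 3], "a scene of her eating cake in a coffee shop"
))
```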

Examples:

3C Digital (consumer electronics): Extract the camera from Image 1, 2, and 3, change the background to white. The camera is on a white table. The shot starts as a close-up of the camera, then slowly rotates around it to show the front, side, and back.

Home Goods: Warm home setting, mid-shot of the thermos from the reference image. The camera zooms in to a close-up. A hand enters the frame, naturally holds and picks up the thermos, with a slight rotation to show details.

Characters: Refer to the woman in Image 1, 2, and 3, generate a scene of her eating cake in a coffee shop.

3.2 Multi-Image Reference

Prompt Template:

Refer to/Extract/Combine [Referenced Element] in [Image n], generate [Scene Description], keep [Referenced Element] features consistent.

Examples:

Logo Reference: Cyberpunk 3D sci-fi animation style. Referencing the girl in Image 2, show her releasing a silver floating lamp with a hologram. Zoom out to show the sky full of lamps. The screen blurs, then the logo from Image 1 appears.

Multi-Subject Reference: Refer to the cat and dog in the images. In a cozy apartment, the dog is eating. The cat walks over, touches the dog with a paw. The dog stops eating, and the cat snuggles next to it. Warm tones.

Multi-Element Reference: The scene is set in the restaurant shown in Image 4, where people are coming and going. The girl in Image 1, wearing the same outfit as in Image 2, is tidying up items on the counter. The boy in Image 3 is a customer who approaches the girl, wanting to ask for her contact information. The logo in Image 5 is always displayed in the lower right corner of the frame.

Multi-Panel Storyboard Reference: Refer to the storyboard in the image to generate an intense fight scene. The storyboard compositions in the image should appear in sequence, followed by the intense fight between the two characters.

Storyboard Reference: Follow the storyboard composition in Image 3. A girl is waiting for her father to cook. She says: "아빠, 배고파요! 밥 다 됐어요" Her appearance follows Image 1. Pan right to the composition of Image 4; the father's appearance follows Image 2. He answers: "이제 진짜 금방이야."

(The dialogue is in Korean. Roughly, the girl says: "Dad, I'm hungry! Is the meal ready?" The father answers: "It's really almost ready now.")

04 Video Reference

Seedance 2.0 supports video references. When using them, clearly specify what to generate and which aspect of which video to reference.

If the order of the videos matters, upload them in that order. In prompts, refer to them as Video 1, Video 2, ..., Video n for accurate reference.
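
Because prompts address assets by position, it can help to check that every Image n or Video n label in a prompt matches an uploaded asset before submitting. The following is a sketch with hypothetical helper names; the regular expression is an assumption about how labels are written (including the "Image 1, 2, and 3" shorthand) and can misread a number that follows a label for another reason, such as "Video 1, 3 seconds in".

```python
import re

# Matches "Image 1", "video 2", and shorthand lists such as "Image 1, 2, and 3".
_LABEL = re.compile(
    r"\b(Image|Video)\s+(\d+(?:(?:\s*,\s*(?:and\s+)?|\s+and\s+)\d+)*)",
    re.IGNORECASE,
)


def referenced_assets(prompt):
    """Collect the image and video indices mentioned in a prompt."""
    found = {"image": set(), "video": set()}
    for kind, numbers in _LABEL.findall(prompt):
        found[kind.lower()].update(int(n) for n in re.findall(r"\d+", numbers))
    return found


def missing_assets(prompt, n_images=0, n_videos=0):
    """List labels that point past the number of uploaded assets."""
    limits = {"image": n_images, "video": n_videos}
    problems = [
        (kind, n)
        for kind, nums in referenced_assets(prompt).items()
        for n in nums
        if n < 1 or n > limits[kind]
    ]
    return [f"{kind.capitalize()} {n}" for kind, n in sorted(problems)]


prompt = "Refer to the camera movement of Video 1. Use the skyscraper in Image 1."
print(missing_assets(prompt, n_images=1, n_videos=0))
```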

4.1 Motion Reference

Prompt Template:

Refer to the [Motion Description] of [Video n], generate [Scene Description], keep motion details consistent.

Example (Marketing): [Prompt]: Refer to the galloping form of the horse in Video 1, generate a golden horse running across a prairie, then freeze its magnificent running posture into a horse-shaped gold pendant.

4.2 Cinematography Reference

Prompt Template:

Refer to the [Camera Movement] of [Video n], generate [Scene Description], keep cinematography consistent.

Example:

[Prompt]: Refer to the camera movement of Video 1, make a concept video for a tech park. Use the skyscraper in Image 1 as the visual center, using a first-person dive to emphasize the futuristic feel.

4.3 Special Effects Reference

Prompt Template:

Refer to the [Effect Description] of [Video n], generate [Scene Description], keep effects consistent.
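
The templates in 4.1 to 4.3 share one shape and differ only in what is referenced and what must stay consistent, so a single helper can cover all three. This is a sketch; the function name and the `kind` keys are hypothetical.

```python
# What each template asks the model to keep consistent (taken from 4.1 to 4.3).
KEEP_CONSISTENT = {
    "motion": "motion details",
    "camera": "cinematography",
    "effects": "effects",
}


def video_reference_prompt(kind, description, video_index, scene):
    """Fill the motion, cinematography, or special-effects reference template."""
    if kind not in KEEP_CONSISTENT:
        raise ValueError(f"kind must be one of {sorted(KEEP_CONSISTENT)}")
    if video_index < 1:
        raise ValueError("videos are numbered from 1")
    return (
        f"Refer to the {description} of Video {video_index}, "
        f"generate {scene}, keep {KEEP_CONSISTENT[kind]} consistent."
    )


print(video_reference_prompt(
    "motion", "galloping motion", 1, "a golden horse running across a prairie"
))
```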

Example:

[Prompt]: Refer to the golden particle effect in Video 1, have the character in Image 2 play the flute while surrounded by the same particle effects.

05 Video Editing

Seedance 2.0 supports adding, deleting, or modifying elements, extending video duration, and track completion.

If the order of the videos matters, upload them in that order. In prompts, refer to them as Video 1, Video 2, ..., Video n for accurate reference.

5.1 Element Add/Delete/Modify

Prompt Templates:

1. Add: At [Time Position] + [Space Position] of [Video n], add [Desired Element].
2. Delete: Delete [Element] from [Video n], keep other content unchanged.
3. Modify: Replace [Element to Change] in [Video n] with [Ideal Element].
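
The three edit templates above can be sketched as one helper each. The function names are hypothetical. Requiring at least one position for an addition is an assumption made here, since the template does not say what to write when both are omitted.

```python
def add_element(video_index, element, time_position="", space_position=""):
    """Fill the Add template; time and space positions are joined if both are given."""
    where = ", ".join(p for p in (time_position, space_position) if p)
    if not where:
        raise ValueError("give a time position, a space position, or both")
    return f"At {where} of Video {video_index}, add {element}."


def delete_element(video_index, element):
    """Fill the Delete template."""
    return f"Delete {element} from Video {video_index}, keep other content unchanged."


def replace_element(video_index, old, new):
    """Fill the Modify template."""
    return f"Replace {old} in Video {video_index} with {new}."


print(add_element(1, "snacks such as fried chicken and pizza",
                  space_position="the countertop"))
print(delete_element(1, "the tools on the desk"))
print(replace_element(1, "the perfume", "the face cream in Image 1"))
```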

Examples:

Example (Add): [Prompt]: Add snacks such as fried chicken and pizza to the countertop in Video 1.

Example (Delete): [Prompt]: Remove all other parts and tools from the desktop in Video 1, keeping the desktop clean and tidy, so that only the item they are holding remains.

Example (Modify): [Prompt]: Replace the perfume in Video 1 with the face cream in Image 1, keeping the action and camera movement the same.

5.2 Video Extension

Prompt Template:

Extend [Video n] forward/backward + [Description of extension]

Generate content before/after [Video n] + [Description of extension]
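
A small sketch of the second template form, which states explicitly whether the new content comes before or after the source video. The function name and the `where` parameter are hypothetical.

```python
def extension_prompt(video_index, description, where="after"):
    """Fill the 'Generate content before/after [Video n]' template."""
    if where not in ("before", "after"):
        raise ValueError("where must be 'before' or 'after'")
    return f"Generate content {where} Video {video_index}, {description.strip()}"


print(extension_prompt(1, "the two late men run toward the group.", "after"))
```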

NOTE

The model automatically handles the transition/stitching between segments.

Examples:

Example (Extend backward): [Prompt]: Generate content after Video 1, the two late men run toward them, and the five people finally meet and chat amicably.

Example (Extend forward): [Prompt]: Extend the video forward, cutting to an over-the-shoulder shot of the man in white. He says, "It's not that bad. You're just stressed. Everyone goes through this; you just need to keep going."

5.3 Track Completion (In-filling)

Prompt Template:

[Video 1] + [Transition Description] + followed by [Video 2] + [Transition Description] + followed by [Video 3]

NOTE

Supports up to 3 video inputs, with a total duration not exceeding 15 seconds. During generation, the model automatically extracts the connection between the first and last videos and retains only the segments needed for compositing.
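
The limits in this note can be checked before the prompt is assembled. The sketch below assumes the 15-second limit applies to the combined length of the uploaded clips and that at least two clips are needed for in-filling; both are readings of the note rather than documented behavior, and the function name is hypothetical.

```python
MAX_VIDEOS = 3            # from the note above
MAX_TOTAL_SECONDS = 15.0  # from the note above


def track_completion_prompt(durations, transitions):
    """Build an in-filling prompt.

    durations: length in seconds of each uploaded clip, in upload order.
    transitions: one description for each gap between consecutive clips.
    """
    if not 2 <= len(durations) <= MAX_VIDEOS:
        raise ValueError(f"expected 2 to {MAX_VIDEOS} videos")
    if sum(durations) > MAX_TOTAL_SECONDS:
        raise ValueError(f"total duration exceeds {MAX_TOTAL_SECONDS} seconds")
    if len(transitions) != len(durations) - 1:
        raise ValueError("need exactly one transition between each pair of videos")
    parts = ["Video 1"]
    for index, text in enumerate(transitions, start=2):
        parts.append(f"{text.strip().rstrip('.')}, followed by Video {index}")
    return ", ".join(parts) + "."


print(track_completion_prompt([4.0, 5.0], ["a gust of wind carries the leaf away"]))
```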

Example:

[Prompt]: Video 1, the moment the leaf hits the ground, it triggers a golden particle effect. A gust of wind blows, transitioning to Video 2.