What you’ll learn
- Ship Seedance 1.5 Pro via GPTImage2’s async Task API (create → status → results).
- Use `callback_url` safely (HTTPS-only, retry semantics, and failure handling).
- Use one endpoint for text-to-video, image-to-video, and first-last-frame.
- Prevent double-processing with idempotency keys and retry budgets.
- Re-host outputs before links expire (24-hour retention window).
- Model “cost per successful output” without relying on fragile pricing snapshots.
- Compare integration tradeoffs vs Veo 3.1 and Kling O1 (shape-based, not price-based).
Quickstart checklist (copy/paste into your ticket)
- Implement `POST /v1/videos/generations` to create tasks
- Implement `GET /v1/tasks/{task_id}` polling
- Implement the `callback_url` endpoint (2xx fast + async processing)
- Add an idempotency key per user intent ("Generate" click)
- Re-host video assets within 24 hours
- Instrument: p50/p95 latency, policy failure rate, retry rate, attempts-per-success

GPTImage2 API contract (what you actually ship)
GPTImage2 exposes Seedance 1.5 Pro as an asynchronous generation flow:
- Create a task → receive `task_id`
- Either poll `GET /v1/tasks/{task_id}` or receive a callback at `callback_url`
- On completion, retrieve the `results` URLs and download immediately (links expire)
Endpoint overview
Create task
POST https://api.gptimage2/v1/videos/generations
Query task
GET https://api.gptimage2/v1/tasks/{task_id}
Retention
- Output links are valid for 24 hours. Re-host promptly.
One endpoint, three modes (mode detection via image_urls)
GPTImage2 infers the mode from the length of image_urls:
- 0 images → text-to-video
- 1 image → image-to-video
- 2 images → first-last-frame (first frame + last frame guidance)
Constraints
- Max 2 images per request
- Each image ≤ 10MB
- Formats: jpg/jpeg/png/webp
- URLs must be directly accessible by the server
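The mode inference and image constraints above can be mirrored client-side so invalid requests fail fast before an API call. A minimal sketch, assuming the limits listed above; the helper name and the extension check are ours, not part of any GPTImage2 SDK:

```python
# Sketch: name the mode GPTImage2 will infer from image_urls, and
# reject requests that violate the documented constraints.
# Note: checking the URL suffix is a heuristic; the server validates
# the actual file format and size.

ALLOWED_EXTENSIONS = (".jpg", ".jpeg", ".png", ".webp")

def infer_mode(image_urls: list[str]) -> str:
    """Return the generation mode implied by the number of images."""
    if len(image_urls) > 2:
        raise ValueError("GPTImage2 accepts at most 2 images per request")
    for url in image_urls:
        if not url.lower().endswith(ALLOWED_EXTENSIONS):
            raise ValueError(f"unsupported image format: {url}")
    return {
        0: "text-to-video",
        1: "image-to-video",
        2: "first-last-frame",
    }[len(image_urls)]
```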
Request fields that matter in production
These are the parameters you’ll actually operationalize:
- `model`: use `"seedance-1.5-pro"`
- `prompt` (required): up to 2000 tokens
- `duration`: default 5s; supported 4–12s. Operational note: billing scales with duration; treat duration as a first-class budget lever.
- `quality`: `480p` or `720p` (default `720p`)
- `aspect_ratio`: `16:9`, `9:16`, `1:1`, `4:3`, `3:4`, `21:9`, `adaptive` (default `16:9`)
- `generate_audio`: boolean (default `true`). Tip: put dialogue in double quotes to improve spoken lines.
- `callback_url`: HTTPS-only callback URL for completed/failed/cancelled tasks (recommended)
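Enforcing these ranges at your application edge turns billing mistakes into validation errors. A minimal sketch; the field names mirror the docs, but the builder itself is our illustration:

```python
# Sketch: build a create-task payload, enforcing the documented limits
# (duration 4-12s, two quality tiers, HTTPS-only callback_url).

VALID_QUALITY = {"480p", "720p"}
VALID_ASPECT = {"16:9", "9:16", "1:1", "4:3", "3:4", "21:9", "adaptive"}

def build_payload(prompt, duration=5, quality="720p",
                  aspect_ratio="16:9", generate_audio=True,
                  callback_url=None):
    if not 4 <= duration <= 12:
        raise ValueError("duration must be 4-12 seconds")
    if quality not in VALID_QUALITY:
        raise ValueError(f"quality must be one of {sorted(VALID_QUALITY)}")
    if aspect_ratio not in VALID_ASPECT:
        raise ValueError(f"aspect_ratio must be one of {sorted(VALID_ASPECT)}")
    if callback_url and not callback_url.startswith("https://"):
        raise ValueError("callback_url must be HTTPS")
    payload = {
        "model": "seedance-1.5-pro",
        "prompt": prompt,
        "duration": duration,
        "quality": quality,
        "aspect_ratio": aspect_ratio,
        "generate_audio": generate_audio,
    }
    if callback_url:
        payload["callback_url"] = callback_url
    return payload
```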
Parameter defaults by scenario (fast decisions)
This table is designed to be quoted in docs / PRDs.
| Scenario | Suggested duration | quality | aspect_ratio | generate_audio | Notes |
|---|---|---|---|---|---|
| Talking head / lip-sync | 6–8s | 720p | 9:16 or 16:9 | true | Put spoken lines in "double quotes" |
| Ambient / b-roll | 5–8s | 720p | 16:9 | true/false | If audio isn’t essential, consider drafting without audio |
| Product demo / motion | 4–6s | 720p | adaptive | false→true | Draft without audio; final with audio only if needed |
| Storyboard iteration | 4–5s | 480p | 16:9 | false | Optimize for iteration speed; finalize later |
| First-last-frame continuity | 6–10s | 720p | match your shots | true/false | Provide 2 images; keep composition consistent |
Task lifecycle: design your state machine first
A reliable integration starts with a predictable internal state machine:
created → queued/processing → succeeded → (download → rehost → delivered)
└──────→ failed (policy | transient | internal)
└──────→ cancelled
Key rule: treat callbacks as a signal, not the source of truth. Always re-query the task before finalizing state.
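The diagram above translates directly into an explicit transition table, so an out-of-order callback becomes an alertable bug instead of silent corruption. A minimal sketch; the state and function names are ours:

```python
# Sketch: an explicit internal state machine for video tasks.
# Any transition not in the table is a bug worth alerting on.
from enum import Enum

class TaskState(Enum):
    CREATED = "created"
    PROCESSING = "processing"
    SUCCEEDED = "succeeded"
    DELIVERED = "delivered"   # downloaded + re-hosted + served
    FAILED = "failed"
    CANCELLED = "cancelled"

TRANSITIONS = {
    TaskState.CREATED:    {TaskState.PROCESSING, TaskState.FAILED, TaskState.CANCELLED},
    TaskState.PROCESSING: {TaskState.SUCCEEDED, TaskState.FAILED, TaskState.CANCELLED},
    TaskState.SUCCEEDED:  {TaskState.DELIVERED, TaskState.FAILED},  # download can still fail
    TaskState.DELIVERED:  set(),
    TaskState.FAILED:     set(),
    TaskState.CANCELLED:  set(),
}

def transition(current: TaskState, new: TaskState) -> TaskState:
    if new not in TRANSITIONS[current]:
        raise RuntimeError(f"illegal transition {current.value} -> {new.value}")
    return new
```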
Minimal working examples (GPTImage2)
1) Create a task (cURL) — copy/paste safe
This example avoids shell-escaping pitfalls by using a prompt without apostrophes.
curl --request POST \
--url https://api.gptimage2/v1/videos/generations \
--header 'Authorization: Bearer <token>' \
--header 'Content-Type: application/json' \
--data '{
"model": "seedance-1.5-pro",
"prompt": "A detective in the rain says: \"Do not move.\" Neon reflections on the street. Subtle footsteps and radio static.",
"duration": 8,
"quality": "720p",
"aspect_ratio": "16:9",
"generate_audio": true,
"callback_url": "https://your-domain.com/webhooks/gptimage2-task"
}'
Expected behavior
The API returns immediately with a task object containing at least:
- `id` (task id)
- `status`
- `usage` fields (e.g., `billing_rule`, `credits_reserved`)
2) Poll task status (cURL)
curl --request GET \
--url https://api.gptimage2/v1/tasks/<task_id> \
--header 'Authorization: Bearer <token>'
When completed, the task includes:
- `results` array with output URLs
- `status` and `progress`
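If you poll, use capped exponential backoff rather than a tight loop. A minimal sketch; `fetch_task` is a stand-in for your HTTP client call against `GET /v1/tasks/{task_id}`:

```python
# Sketch: poll task status with capped exponential backoff.
# Replace fetch_task with a real requests.get(...) call.
import time

def poll_until_done(fetch_task, task_id, timeout_s=600):
    """fetch_task(task_id) -> dict with at least a 'status' key."""
    delay = 2.0
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        task = fetch_task(task_id)
        if task["status"] in ("completed", "failed", "cancelled"):
            return task
        time.sleep(delay)
        delay = min(delay * 2, 30.0)  # cap the backoff at 30s
    raise TimeoutError(f"task {task_id} still running after {timeout_s}s")
```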
callback_url: reliability rules (do not skip)
GPTImage2 can call your callback_url when a task is completed / failed / cancelled.
Docs constraints and retry policy:
- HTTPS only
- Internal/private network IPs blocked
- Timeout: 10 seconds
- Max 3 retries with backoff at 1 / 2 / 4 seconds
- Callback body aligns with task query response format
Callback production checklist (copy/paste)
- Respond 2xx within 200ms–500ms (enqueue work; don't do heavy work inline)
- Validate that `task_id` exists and belongs to your tenant/user
- Immediately re-query `GET /v1/tasks/{task_id}` before marking final state
- Deduplicate callbacks (store `task_id` + final status; ignore repeats)
- Log the raw callback payload for debugging (redact secrets)
- Alert when callback failure rate increases (usually indicates your endpoint issues)
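The checklist above reduces to one pattern: acknowledge fast, deduplicate, and push the real work to a queue. A framework-agnostic sketch, assuming the callback body carries the task's `id` and `status` (the docs say it aligns with the task query response); the names are ours:

```python
# Sketch: callback handler logic. Call handle_callback from your web
# framework's route and return the status code it gives you. The
# re-query against GET /v1/tasks/{task_id} happens in a worker that
# consumes work_queue, never inline here.
import json
import queue

work_queue: "queue.Queue[str]" = queue.Queue()
seen_final: set = set()  # (task_id, status) pairs already processed

def handle_callback(raw_body: bytes) -> int:
    """Return the HTTP status to send back. Do no heavy work here."""
    try:
        payload = json.loads(raw_body)
        task_id = payload["id"]
    except (ValueError, KeyError, TypeError):
        return 400
    key = (task_id, payload.get("status", ""))
    if key in seen_final:          # duplicate delivery: ack and ignore
        return 200
    seen_final.add(key)
    work_queue.put(task_id)        # worker re-queries before finalizing
    return 200
```

In production, back `seen_final` with your database rather than process memory, or the dedupe resets on every deploy.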
Idempotency: prevent double-processing (and "pay twice" behavior)
Even if your provider is correct, your system can still create duplicates because:
- Users double-click "Generate"
- Mobile networks retry
- Gateways time out
Recommended pattern
- Client generates an `idempotency_key` per user intent (one click = one key)
- Server stores `(user_id, idempotency_key)` → `task_id` with a TTL
- On repeats, return the same `task_id` instead of creating a new task
Don't assume idempotency is "handled for you" unless the API explicitly documents it. Implement it at your application edge.
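The pattern above fits in a few lines. A minimal sketch with an in-memory map; in production you would back this with Redis (`SET NX EX`) or a unique database constraint:

```python
# Sketch: application-edge idempotency with a TTL map.
# The dict is for illustration only; it is not safe across processes.
import time

_store: dict = {}          # (user_id, idempotency_key) -> (task_id, created_at)
TTL_S = 24 * 3600

def create_or_reuse(user_id, idempotency_key, create_task):
    """create_task() performs the real POST and returns a task_id.
    It runs at most once per (user, key) within the TTL."""
    now = time.time()
    entry = _store.get((user_id, idempotency_key))
    if entry and now - entry[1] < TTL_S:
        return entry[0]                      # repeat click: same task
    task_id = create_task()
    _store[(user_id, idempotency_key)] = (task_id, now)
    return task_id
```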
Asset delivery: the 24-hour trap
Since output links expire in 24 hours, your pipeline should:
- Download the result immediately when `status=completed`
- Re-host in your object storage (S3/GCS/R2)
- Serve via CDN
- Persist metadata: `task_id`, prompt hash, user id, duration, quality, audio flag, moderation outcome category
Common failure you'll see: "User comes back tomorrow, link expired."
Prevent it by re-hosting automatically on completion.
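The download-then-rehost step can be a small pure function you call from your completion worker. A sketch; `download` and `upload_to_bucket` stand in for your HTTP client and object-store SDK (e.g. boto3's `put_object`), and the key scheme is our illustration:

```python
# Sketch: re-host every result URL on completion, within the 24h window.
import hashlib

def rehost(task: dict, download, upload_to_bucket) -> list:
    """Copy each result into our storage; return the durable keys."""
    if task.get("status") != "completed":
        raise ValueError("only re-host completed tasks")
    keys = []
    for url in task.get("results", []):
        data = download(url)  # expiring provider link, fetch now
        digest = hashlib.sha256(data).hexdigest()[:16]
        key = f"videos/{task['id']}/{digest}.mp4"
        upload_to_bucket(key, data)
        keys.append(key)
    return keys
```

Hashing the bytes into the key makes re-runs idempotent: re-processing the same completed task overwrites the same object instead of duplicating it.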
Cost modeling without price snapshots (still actionable)
Even if you never publish pricing numbers, you still need a durable unit economics model.
1) Know what the platform bills on
Operationally, your spend is driven by:
- `duration` (longer clip → more expensive)
- `generate_audio` (audio adds cost)
- Iteration (users rarely get it right the first time)
2) Budget "cost per successful output," not "cost per attempt"
Track:
- `attempts_per_success` (by scenario)
- `retry_rate`
- `policy_failure_rate`
- `p95_latency`
Then your real unit is:
cost per successful output = average cost per attempt × attempts_per_success
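This metric needs only your billing export and telemetry, never a published price table. A minimal sketch:

```python
# Sketch: unit economics from your own metrics, no price snapshots.
# cost_per_attempt comes from your billing export; attempts and
# successes come from your telemetry, sliced by scenario.

def cost_per_successful_output(cost_per_attempt: float,
                               attempts: int, successes: int) -> float:
    if successes == 0:
        raise ZeroDivisionError("no successes yet; budget is undefined")
    return cost_per_attempt * attempts / successes
```

For example, 130 attempts for 100 delivered clips means each successful output really costs 1.3× the sticker price of a single call.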
3) Draft → Approve → Final (most reliable spend reducer)
When you see high iteration:
- Draft using cheaper settings (e.g., lower quality/shorter duration) or a cheaper tier/model
- Finalize with Seedance 1.5 Pro (with audio) only after user approves
No fixed % promises—just a predictable strategy that improves with iteration-heavy usage.
Production pitfalls & solutions
1) The async trap (timeouts + zombie jobs)
Do not keep a single HTTP request open. Always return task_id immediately and finish via callback/polling.
Best practice
- Set a "job TTL" and a "still processing" UI state
- Track p95 completion time; degrade gracefully when it spikes
2) Moderation outcomes can arrive late
Design UI/backend states for "failed after processing."
- Separate: policy vs transient vs internal
- Never auto-retry policy failures
- Provide prompt rewrite guidance (especially around sensitive content)
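The retry decision above is worth encoding as one function so "never auto-retry policy failures" cannot be bypassed by an individual worker. A sketch; the policy/transient/internal taxonomy follows this section, and mapping a raw provider error into one of those classes is left to your error parser:

```python
# Sketch: decide retry eligibility by failure class, within a budget.

def should_retry(failure_class: str, attempt: int, max_attempts: int = 3) -> bool:
    if failure_class == "policy":
        return False          # never auto-retry; ask the user to rewrite the prompt
    if attempt >= max_attempts:
        return False          # respect the retry budget
    return failure_class in ("transient", "internal")
```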
3) Storage is part of your API
Binary is heavy:
- Don't stream through your gateway
- Download → store → CDN


Comparison: Seedance 1.5 Pro vs Veo 3.1 vs Kling O1 (no price snapshots)
Numbers age fast. Compare what lasts: accounting shape, integration surface, and workflow fit.
Table A — Integration & accounting shape
| Dimension | Seedance 1.5 Pro (via GPTImage2) | Veo 3.1 | Kling O1 |
|---|---|---|---|
| Accounting unit | Per-call driven by duration/audio + usage/credits fields | Commonly duration-first accounting | Varies by access route (plans/credits/wrappers) |
| Integration contract | Async task + callback/polling | Async job patterns are common | Varies widely by provider/wrapper |
| Native audio-video | Supported via generate_audio | Native audio often positioned as core | Depends on access route/version |
| Operational predictability | Strong if you re-host within 24h and enforce idempotency | Strong when ecosystem + contract are stable | Depends on access semantics and provider fragmentation |
| Best fit | Audio-critical short clips + first/last-frame control | Duration-first budgeting + Google ecosystem | Editing/restyle-first products |
Conclusion: Seedance via GPTImage2 is best when you want one stable async contract and audio-critical output. Veo is attractive when you prefer duration-first budgeting. Kling O1 shines when editing/restyle is the product core.
Table B — Production decision matrix
| If your priority is… | Seedance 1.5 Pro (via GPTImage2) | Veo 3.1 | Kling O1 |
|---|---|---|---|
| One API for text/image/first-last | Yes (image_urls length) | Depends on endpoints | Depends on provider |
| Reliable callbacks | Defined retry semantics | Provider dependent | Provider dependent |
| Asset ops predictability | Requires re-hosting within 24h | Provider dependent | Provider dependent |
| Generate + edit workflows | Not the primary positioning | Not the primary positioning | Often the differentiator |
| Lowest integration complexity | High (single endpoint + task API) | High if already in ecosystem | Medium–Low if semantics fragmented |
Conclusion: If you value production reliability with minimal adapters, Seedance via GPTImage2 is straightforward. If editing is your center of gravity, Kling O1 is worth the integration overhead. If budgeting simplicity is your priority, Veo is a clean mental model.
Decision checklist (fast yes/no)
Use Seedance 1.5 Pro via GPTImage2 if:
- You need native audio and can structure dialogue in quotes
- You need text-to-video + image-to-video + first-last-frame under one contract
- You can re-host assets within 24 hours
Use Veo 3.1 if:
- You prefer duration-first budgeting and a stable cloud ecosystem workflow
Use Kling O1 if:
- Editing/restyling is central and you've confirmed stable access semantics
FAQ
Q: How do I switch between text-to-video and image-to-video on GPTImage2?
Use image_urls. 0 images = text-to-video, 1 = image-to-video, 2 = first-last-frame.
Q: Webhook or polling—what's safer?
Use callback_url when possible. Still re-query task status before marking final state.
Q: Why must I re-host the result?
Links expire in 24 hours. Download and store outputs promptly.
Q: What's the right unit for unit economics?
Cost per successful output, not per attempt. Track attempts-per-success, retries, and policy failure rate.
Q: How do I avoid duplicate tasks when users retry?
Implement idempotency per user intent: (user_id, idempotency_key) → task_id.
Start Building with Seedance 1.5 Pro Today
You've seen the contract. You understand the tradeoffs. Now turn it into a production feature.
GPTImage2 gives you a clean path from prompt → task → delivery:
- One API key for Seedance 1.5 Pro, Veo 3.1, Kling, and other models
- Async tasks + callbacks with defined retry semantics
- Usage-based billing with transparent credits and no minimums
Most teams integrate in under an hour.
