How Vivid’s Cloud Pipeline Keeps Desktop, Modal, and Convex in Sync

[Figure: Vivid cloud path from desktop queue through Modal GPUs to Convex job state]

Published Mar 21, 2026 · Updated Mar 21, 2026

From queue routing and run grants to GPU containers and real-time job state—how Vivid offloads inference to Modal without forking the local pipeline story.

Vivid can run the same style of jobs locally (Rust + bundled Python + VapourSynth) or in the cloud on Modal GPUs. The product promise is simple: pick a mode per queue item and still get a predictable outcome file on disk.

The engineering story is about trust boundaries: the UI must not pretend a cloud run is allowed until the server says so; the remote container must not execute arbitrary scripts; progress has to reach the Processing page even when stdout is noisy; cancellation has to stop billing-shaped work, not just hide a spinner.

This post walks through how that architecture is wired in the repo.

Queue routing: one processor, two execution paths

Cloud is not a separate app—it is a branch inside the same queue orchestration. When useModal is true on a queue item, inference entry points build a config that mirrors the local pipeline payload and call Modal instead of Tauri’s local pipeline commands.
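That branch can be pictured as a small dispatch step. A minimal Python sketch (the real logic lives in the TypeScript queue processor; the names and shapes here are illustrative, not the repo's actual types):

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class QueueItem:
    input_path: str
    engine: str
    use_modal: bool  # mirrors the useModal flag on a queue item

def process_item(item: QueueItem,
                 run_local: Callable[[QueueItem], str],
                 run_cloud: Callable[[QueueItem], str]) -> str:
    """Route one queue item: same payload shape, two execution paths."""
    if item.use_modal:
        # Cloud path: build a config mirroring the local pipeline payload
        # and hand it to the Modal bridge instead of Tauri commands.
        return run_cloud(item)
    return run_local(item)
```

Both callables return an output path, because both modes must end in the same promise: a predictable outcome file on disk.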

[Mermaid diagram: queue routing across the local and Modal execution paths]

Preflight logic (for example cloud quota estimates from shared cloud benchmarks) can block starting the queue before work begins; authoritative quota and entitlements still apply when a run grant is issued at job start (authorizeCloudRun in Convex).
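A preflight estimate like that can be a pure function over shared benchmark numbers. A hedged sketch, where the field names and seconds-per-frame figure are invented for illustration:

```python
def estimate_cloud_seconds(frame_count: int, sec_per_frame: float) -> float:
    """Rough GPU-seconds estimate from shared cloud benchmarks."""
    return frame_count * sec_per_frame

def preflight_allows_start(frame_count: int, sec_per_frame: float,
                           remaining_quota_seconds: float) -> bool:
    """Client-side gate only: the authoritative check still happens
    server-side when authorizeCloudRun issues a run grant."""
    return estimate_cloud_seconds(frame_count, sec_per_frame) <= remaining_quota_seconds
```

The point of the split is trust: the estimate keeps users from queueing work that will obviously fail, while the Convex-side grant remains the only thing that actually authorizes spend.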

End-to-end flow: grant, bridge, container, file

A cloud job is intentionally boring in the middle: upload bytes, run the same class of vspipe | ffmpeg work as locally, return bytes. The interesting parts are gates and telemetry.

[Mermaid diagram: end-to-end cloud job flow from run grant to bridge, container, and result file]

Key files to read alongside this diagram:

  • src/renderer/services/cloudExecutor.ts — executeOnModal, grant acquisition, Modal subprocess wiring, progress forwarding.
  • src/cloud/call_modal_function.py — reads local video, calls modal.Function.from_name("vivid-inference", "run_inference_<gpu>"), streams logs, writes the result file, emits completion JSON.
  • src/cloud/modal_app.py and src/cloud/modal_runtime/ — payload validation, grant consumption, inference implementation, cancellation polling, optional HTTP API.
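The bridge's stdout contract can be sketched as newline-delimited JSON, which is what makes it parseable even when Modal log output is interleaved. The field names below are assumptions, not the repo's actual schema:

```python
import json
from typing import Optional

def emit_completion(output_path: str, job_id: str) -> str:
    """What a bridge like call_modal_function.py might print on success:
    one JSON object per line so the queue processor can parse it."""
    return json.dumps({"type": "complete", "jobId": job_id,
                       "outputPath": output_path})

def parse_bridge_line(line: str) -> Optional[dict]:
    """Tolerant parse: Modal log lines are noisy, so non-JSON lines are
    ignored rather than failing the job."""
    try:
        msg = json.loads(line)
    except json.JSONDecodeError:
        return None
    return msg if isinstance(msg, dict) and "type" in msg else None
```

Tolerant parsing is the key design choice: stdout is a side channel, so garbage on it must never be able to crash the queue processor.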

Payload contract: server-side truth wins

The cloud runtime does not “trust” the desktop blindly. Before execution it enforces a strict contract (summarized in docs/cloud-processing.md):

  • allowlisted engines and scripts;
  • normalized backends (for example TensorRT for most cloud engines, specific PyTorch-CUDA engines where required);
  • hard limits on upload/model/options size;
  • clamped streams and interpolation multi;
  • FFmpeg safety (container/codec/flag allowlists);
  • a runtime cap checked in the container.

That pattern keeps failures predictable—either the job is rejected early with a clear policy reason, or it runs inside a bounded sandbox.
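A minimal sketch of that contract, assuming invented allowlists and limits (the real values live in src/cloud/modal_runtime/ and docs/cloud-processing.md):

```python
# Hypothetical policy values, for illustration only.
ALLOWED_ENGINES = {"esrgan", "rife"}
ALLOWED_CONTAINERS = {"mp4", "mkv"}
MAX_UPLOAD_BYTES = 2 * 1024**3
MAX_INTERP_MULTI = 8

def validate_payload(payload: dict) -> dict:
    """Reject early with a clear policy reason, or return a clamped copy."""
    if payload["engine"] not in ALLOWED_ENGINES:
        raise ValueError(f"engine not allowlisted: {payload['engine']}")
    if payload["container"] not in ALLOWED_CONTAINERS:
        raise ValueError(f"container not allowlisted: {payload['container']}")
    if payload["upload_bytes"] > MAX_UPLOAD_BYTES:
        raise ValueError("upload exceeds size limit")
    clamped = dict(payload)
    # Soft knobs like interpolation multi are clamped rather than rejected.
    clamped["interp_multi"] = min(payload.get("interp_multi", 1), MAX_INTERP_MULTI)
    return clamped
```

Note the two failure modes the post describes: hard violations raise with a policy reason, while soft parameters are silently bounded so the job still runs inside the sandbox.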

[Mermaid diagram: payload validation and sandbox limits in the cloud runtime]

Progress: Convex as the source of truth for the UI

Cloud jobs use two channels for progress:

  1. Convex real-time — the container writes signed updates to internal HTTP routes (/jobs/init, /jobs/progress, /jobs/complete). The renderer subscribes with useQuery(api.jobs.getJob, { jobId }) (see useCloudJobSync).
  2. Stdout JSON lines — the bridge forwards Modal log lines; the queue processor can parse progress and emit Tauri events as a fallback.

Dual channels exist so the Processing page stays responsive even when one path degrades.
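Merging the two channels can be as simple as preferring Convex while it is fresh and falling back to parsed stdout otherwise. A sketch with invented field names and staleness threshold:

```python
from typing import Optional

def effective_progress(convex_doc: Optional[dict],
                       stdout_progress: Optional[float],
                       now: float,
                       stale_after: float = 10.0) -> Optional[float]:
    """Convex is the source of truth; stdout JSON lines are the fallback
    when real-time updates stall or the subscription drops."""
    if convex_doc is not None and now - convex_doc["updatedAt"] <= stale_after:
        return convex_doc["progress"]
    return stdout_progress
```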

Cancellation: local kill + remote intent

Stopping a cloud job is coordinated:

  1. Terminate the local bridge process (kill_modal_cloud / Tauri).
  2. Mark cancel intent on the job in Convex (cancelRequested).
  3. The Modal runtime polls job status and tears down vspipe / ffmpeg when cancel is observed.

That way “Stop” means stop work, not only stop listening.
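The runtime side of step 3 can be sketched as a polling loop with an injected status check and teardown callback (the names are illustrative, not the actual modal_runtime API):

```python
import time
from typing import Callable

def watch_for_cancel(is_cancel_requested: Callable[[], bool],
                     teardown: Callable[[], None],
                     poll_interval: float = 0.0,
                     max_polls: int = 100) -> bool:
    """Poll Convex-backed job status; on cancel, tear down the work.
    Returns True if cancellation was observed before max_polls elapsed."""
    for _ in range(max_polls):
        if is_cancel_requested():
            teardown()  # kill the vspipe | ffmpeg subprocess tree
            return True
        if poll_interval:
            time.sleep(poll_interval)
    return False
```

Injecting the status check keeps the loop testable and makes the trust boundary explicit: the container only ever reads cancel intent from job state, never from the desktop directly.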

Same catalog, different metal

Cloud workers still honor the shared engine_runtime_catalog.json (mounted or copied into the image; see VIVID_CATALOG_PATH in Modal image setup). So backend policy and dependency expectations stay aligned with the desktop story described in the engine-catalog post—only the machine underneath changes.
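Loading the shared catalog inside the image can be sketched as an env-driven path with a default (VIVID_CATALOG_PATH is from the post; the default path and file shape are assumptions):

```python
import json
import os

def load_engine_catalog(default_path: str = "/app/engine_runtime_catalog.json") -> dict:
    """Same catalog file as the desktop build, just mounted into the image."""
    path = os.environ.get("VIVID_CATALOG_PATH", default_path)
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)
```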

HTTP API and vivid-web

The Modal app also exposes POST /api/process for multipart uploads (used by vivid-web and integrations), with bearer auth, the same grant token requirement, and parallel hardening (rate limits and proxy signing from Next.js where applicable). Public job status endpoints stay off by default; turn them on only with an explicit env flag when you need them.
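The gate on that endpoint can be sketched as two independent checks, bearer auth plus a grant token. The header names are illustrative, and a real service would also verify the grant server-side rather than just checking presence:

```python
import hmac

def authorize_process_request(headers: dict, expected_bearer: str) -> bool:
    """Both must pass: API bearer auth AND a run-grant token."""
    auth = headers.get("Authorization", "")
    if not auth.startswith("Bearer "):
        return False
    token = auth[len("Bearer "):]
    # Constant-time compare avoids timing side channels on the token.
    if not hmac.compare_digest(token, expected_bearer):
        return False
    return bool(headers.get("X-Vivid-Grant"))
```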

This is one layer of Vivid’s systems story: local-first control, optional cloud scale, and server-enforced rules so the experience stays honest end to end.