Concepts

Generation Modes

ima2-gen gives you six modes for making and refining images and short videos. Pick by how you iterate.

Classic

One prompt, one strong frame. Write a prompt, attach up to five references, pick model, quality, size, format, and moderation, then generate. Copy, download, continue from the result, or send it into Canvas Mode.

Node

One frame, ten directions. Lock a parent image and fan out children that vary palette, framing, or copy. Each node keeps its own prompt and result; root nodes attach local references, while child nodes use the parent image as their source. Completed jobs are matched back to nodes by request ID, so finished results can be recovered after a reload or a graph-version conflict.
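The request-ID matching described above can be sketched roughly as follows. This is a minimal illustration, not ima2-gen's actual code: the Node fields, the reconcile function, and the shape of completed_jobs (request ID mapped to a result path) are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    # Hypothetical node record; field names are illustrative only.
    node_id: str
    prompt: str
    request_id: Optional[str] = None  # set when a job is submitted for this node
    result_url: Optional[str] = None  # filled in once the job's result is attached


def reconcile(nodes: list[Node], completed_jobs: dict[str, str]) -> list[str]:
    """Attach finished results to nodes by request ID.

    Returns the IDs of the nodes that were updated, in graph order.
    """
    updated = []
    for node in nodes:
        # Skip nodes that already have a result or never submitted a job.
        if node.result_url is None and node.request_id in completed_jobs:
            node.result_url = completed_jobs[node.request_id]
            updated.append(node.node_id)
    return updated
```

Because the lookup key is the request ID and not the node's position in the graph, the same pass could run after a page reload or after the graph has been re-versioned, as long as each node still carries the ID of the job it submitted.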

Multimode

Several candidates from one prompt. Run a sequence from Classic mode, watch each slot progress, cancel when needed, and continue from the strongest result. From the CLI this streams phase, partial, and image events.
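As a rough sketch of how a script might consume the phase, partial, and image events mentioned above: the example below folds a stream of JSON lines into per-slot state. The wire format is not specified in this section, so the JSON-lines framing and every field name (type, slot, phase, path) are assumptions for illustration only.

```python
import json


def summarize_stream(lines):
    """Fold JSON event lines into a dict of per-slot progress.

    Assumed (hypothetical) event shapes:
      {"type": "phase",   "slot": 0, "phase": "generating"}
      {"type": "partial", "slot": 0}
      {"type": "image",   "slot": 0, "path": "out-0.png"}
    """
    slots = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank keep-alive lines
        event = json.loads(line)
        slot = slots.setdefault(
            event["slot"], {"phase": None, "partials": 0, "image": None}
        )
        kind = event["type"]
        if kind == "phase":
            slot["phase"] = event["phase"]
        elif kind == "partial":
            slot["partials"] += 1
        elif kind == "image":
            slot["image"] = event["path"]
        # Unknown event types are ignored so newer streams do not break the reader.
    return slots
```

Keeping state per slot mirrors the per-slot progress view in the UI: a slot with an image set is finished, while one with only a phase or partials is still in flight.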

Canvas Mode

Clean up the winning frame. Pan a zoomed image separately from selection, annotate target areas, clean backgrounds with a previewed mask, and export transparent (alpha) or matte-backed versions. Saved canvas versions stay hidden from the gallery but can be reused as the next reference.
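The difference between the two export styles comes down to whether the alpha channel is kept or flattened onto a solid color. A minimal sketch of the flattening step, assuming straight (non-premultiplied) 8-bit RGBA pixels and standard source-over compositing; the function name is hypothetical and this is not necessarily how ima2-gen implements its export:

```python
def composite_over_matte(pixel, matte):
    """Flatten one RGBA pixel onto an opaque RGB matte color.

    pixel: (r, g, b, a) with each channel in 0..255, straight alpha.
    matte: (r, g, b) with each channel in 0..255.
    """
    r, g, b, a = pixel
    # Source-over: out = (fg * a + matte * (255 - a)) / 255, per channel.
    return tuple(
        round((c * a + m * (255 - a)) / 255)
        for c, m in zip((r, g, b), matte)
    )
```

A fully opaque pixel passes through unchanged and a fully transparent one becomes the matte color; a transparent export would skip this step and keep the alpha channel as-is.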

Video

Generate short videos from text, a single image, or multiple reference images via Grok video models (grok-imagine-video, grok-imagine-video-1.5-preview). The mode is auto-detected from the number of references: 0 refs → text-to-video, 1 ref → image-to-video, 2–7 refs → reference-to-video (max 10s). Controls include duration (1–15s), resolution (480p or 720p), and aspect ratio. The CLI streams SSE progress events; the UI shows planning → generating → X% progress in the in-flight queue.
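The auto-detection rule above can be written as a small function. This is a sketch of the documented mapping, not ima2-gen's source; the function names are hypothetical, and it assumes the 10s cap applies only to reference-to-video while the other modes accept the full 1-15s range.

```python
def detect_video_mode(ref_count: int) -> str:
    """Map the number of reference images to a video mode."""
    if ref_count < 0:
        raise ValueError("ref_count must be non-negative")
    if ref_count == 0:
        return "text-to-video"
    if ref_count == 1:
        return "image-to-video"
    if ref_count <= 7:
        return "reference-to-video"
    raise ValueError("at most 7 reference images are supported")


def max_duration_seconds(mode: str) -> int:
    # Assumption: only reference-to-video is capped at 10s.
    return 10 if mode == "reference-to-video" else 15


def validate_request(ref_count: int, duration_s: int) -> str:
    """Return the detected mode, or raise if the duration is out of range."""
    mode = detect_video_mode(ref_count)
    limit = max_duration_seconds(mode)
    if not 1 <= duration_s <= limit:
        raise ValueError(f"{mode} duration must be 1-{limit}s, got {duration_s}")
    return mode
```

For example, validate_request(3, 10) returns "reference-to-video", while validate_request(3, 12) raises because three references select the mode with the shorter cap.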

Agent

Describe what you want and let the agent iterate. Agent Mode is a conversational image workspace. Each session keeps a current image, a turn history, style/subject locks, and a durable queue, so parallel and auto-generation work survives reconnects. It supports slash commands and /question, and it runs only in the web UI (there is no ima2 agent CLI command). With provider: "grok", Agent Mode uses the same search + planner + xAI Images API path as Classic and Node.

Prompt library & imports

Fill the prompt library from local files, GitHub folders, curated sources, and GPT-image hint packs. Imported prompts are indexed locally so search and ranking work without re-importing every session. See the Prompt Studio manual for control-by-control detail.
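To illustrate why a local index makes search and ranking cheap across sessions, here is a minimal inverted-index sketch. It is not the library's actual indexing scheme, which this section does not describe; the function names and the term-overlap ranking are assumptions chosen to keep the example short.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(prompts):
    """Build token -> set of prompt IDs from a {prompt_id: text} mapping."""
    index = defaultdict(set)
    for prompt_id, text in prompts.items():
        for token in tokenize(text):
            index[token].add(prompt_id)
    return index


def search(index, query):
    """Rank prompt IDs by how many distinct query tokens they contain."""
    scores = defaultdict(int)
    for token in set(tokenize(query)):
        for prompt_id in index.get(token, ()):
            scores[prompt_id] += 1
    # Highest score first; ties broken by prompt ID for stable output.
    return sorted(scores, key=lambda pid: (-scores[pid], pid))
```

The index is built once at import time; later queries only touch the entries for the query's own tokens, so nothing has to be re-imported or re-read to answer a search.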