Concepts
Generation Modes
ima2-gen gives you five ways to make and refine an image. Pick by how you iterate.
Classic
One prompt, one strong frame. Write a prompt, attach up to five references, pick model, quality, size, format, and moderation, then generate. Copy, download, continue from the result, or send it into Canvas Mode.
Node
One frame, ten directions. Lock a parent image and fan out children that vary palette, framing, or copy. Each node keeps its own prompt and result; root nodes attach local references, while child nodes use the parent image as their source. Completed jobs are matched back to nodes by request ID, so finished results can be recovered after reloads and graph-version conflicts.
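The request-ID matching described above can be sketched as follows. This is a minimal illustration, not ima2-gen's implementation: the `Node` fields, the `attach_finished_jobs` helper, and the shape of the finished-job mapping are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    # Hypothetical node record; field names are assumptions for this sketch.
    node_id: str
    prompt: str
    request_id: Optional[str] = None  # set when a job is submitted for this node
    result: Optional[str] = None      # filled in once the job's result is attached


def attach_finished_jobs(nodes: list[Node], finished: dict[str, str]) -> list[str]:
    """Attach finished results to nodes by request ID.

    `finished` maps request ID -> result location (an assumed shape).
    Returns the IDs of the nodes that were updated.
    """
    updated = []
    for node in nodes:
        # Only fill nodes that are still waiting and whose request ID completed.
        if node.result is None and node.request_id in finished:
            node.result = finished[node.request_id]
            updated.append(node.node_id)
    return updated


# After a reload, the graph still carries request IDs, so results can be re-attached.
graph = [
    Node("root", "poster concept", request_id="req-1"),
    Node("child-a", "warmer palette", request_id="req-2"),
]
recovered = attach_finished_jobs(graph, {"req-2": "results/child-a.png"})
```

Because the match key is the request ID rather than the node's position in the graph, a result that finishes while the page is reloading can still find its node afterwards.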
Multimode
Several candidates from one prompt. Run a sequence from Classic mode, watch each slot progress,
cancel when needed, and continue from the strongest result. From the CLI this streams
phase, partial, and image events.
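One way to consume the phase, partial, and image events is to fold them into per-slot state. The sketch below assumes each event is a dictionary with `type` and `slot` keys, plus a `phase` or `path` value where relevant; that payload shape is an assumption for illustration and is not taken from the ima2-gen CLI.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Slot:
    # Hypothetical per-candidate state for this sketch.
    phase: str = "queued"
    partials: int = 0            # how many partial previews have arrived
    image: Optional[str] = None  # final image location, once available


def apply_event(slots: dict[int, Slot], event: dict) -> None:
    """Update slot state from one streamed event (assumed event shape)."""
    slot = slots.setdefault(event["slot"], Slot())
    kind = event["type"]
    if kind == "phase":
        slot.phase = event["phase"]
    elif kind == "partial":
        slot.partials += 1
    elif kind == "image":
        slot.image = event["path"]
        slot.phase = "done"
    # Unknown event types are ignored so newer streams do not break the reader.


slots: dict[int, Slot] = {}
for ev in [
    {"type": "phase", "slot": 0, "phase": "generating"},
    {"type": "partial", "slot": 0},
    {"type": "image", "slot": 0, "path": "out/slot-0.png"},
    {"type": "phase", "slot": 1, "phase": "generating"},
]:
    apply_event(slots, ev)
```

Keeping one small state record per slot makes it straightforward to show progress for each candidate and to pick the strongest finished result to continue from.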
Canvas Mode
Clean up the winning frame. Pan a zoomed image separately from selection, annotate target areas, clean backgrounds with a previewed mask, and export transparent (alpha) or matte-backed versions. Saved canvas versions stay hidden from the gallery but can be reused as the next reference.
Video
Generate short videos from text, a single image, or multiple reference images via Grok video
models (grok-imagine-video, grok-imagine-video-1.5-preview). The mode is auto-detected
from the number of references: 0 refs → text-to-video, 1 ref → image-to-video, 2–7 refs →
reference-to-video (max 10s). Controls include duration (1–15s), resolution (480p, 720p), and aspect ratio. The
CLI streams SSE progress events; the UI shows planning → generating → X% progress in the
in-flight queue.
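The auto-detection rule above can be written as a small function. This is a sketch under stated assumptions: the function names are hypothetical, and it reads "max 10s" as a cap that applies only to reference-to-video, inside the general 1–15s duration range.

```python
def detect_video_mode(ref_count: int) -> str:
    """Map the number of reference images to a video mode."""
    if ref_count == 0:
        return "text-to-video"
    if ref_count == 1:
        return "image-to-video"
    if 2 <= ref_count <= 7:
        return "reference-to-video"
    raise ValueError(f"unsupported number of references: {ref_count}")


def check_duration(mode: str, seconds: int) -> int:
    """Validate a requested duration for the given mode.

    Assumption: 1-15s in general, with reference-to-video capped at 10s.
    """
    if not 1 <= seconds <= 15:
        raise ValueError("duration must be between 1 and 15 seconds")
    if mode == "reference-to-video" and seconds > 10:
        raise ValueError("reference-to-video is limited to 10 seconds")
    return seconds


mode = detect_video_mode(3)          # three references supplied
duration = check_duration(mode, 8)   # within the assumed 10s cap
```

Rejecting out-of-range inputs up front, rather than clamping them silently, keeps the request predictable before any generation work is queued.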
Agent
Describe what you want and let the agent iterate. Agent Mode is a conversational image workspace:
each session keeps a current image, a turn history, style/subject locks, and a durable queue, so
parallel and auto-generation work survives reconnects. It supports slash commands, including
/question, and runs in the web UI (there is no ima2 agent CLI command).
With provider: "grok", Agent Mode uses the same search + planner + xAI Images API
path as Classic and Node.
Prompt library & imports
Fill the prompt library from local files, GitHub folders, curated sources, and GPT-image hint packs. Imported prompts are indexed locally so search and ranking work without re-importing every session. See the Prompt Studio manual for control-by-control detail.