Providers & Models

Image generation runs through one of three provider paths: your local Codex/ChatGPT OAuth login, a configured OpenAI API key, or the bundled Grok/progrok xAI path.

Provider paths

  • provider: "oauth" — uses the local Codex OAuth proxy. The default path; no API key needed.
  • provider: "api" — calls the OpenAI Responses API with the hosted image_generation tool. Requires OPENAI_API_KEY.
  • provider: "grok" — starts the bundled progrok, runs a mandatory xAI Web Search and a grok-4.3 planner, then calls the xAI Images API.

OAuth, API-key, and Grok generation cover Classic, Node, and Agent Mode. Agent Mode is web-UI only; Classic and Node also have CLI commands.

Per-request override

Generation commands accept --provider <auto|oauth|api|grok>:

Value   Behavior
auto    Preserve route default behavior; currently resolves to OAuth.
oauth   Force the local OAuth proxy path.
api     Force the API-key Responses path; requires a configured key.
grok    Force the bundled xAI path through 127.0.0.1:18645; run ima2 grok login once to authorize.
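
The override rules can be sketched as a small resolver. This is an illustration only, not ima2's actual code: the function name resolve_provider and its api_key parameter are hypothetical, and it assumes that auto resolves to OAuth as described for the current release.

```python
from typing import Optional

# Values accepted by --provider, as listed above.
VALID_PROVIDERS = ("auto", "oauth", "api", "grok")


def resolve_provider(flag: str = "auto", api_key: Optional[str] = None) -> str:
    """Map a --provider value to a concrete provider path (hypothetical helper)."""
    if flag not in VALID_PROVIDERS:
        raise ValueError(f"unknown provider: {flag!r}")
    if flag == "auto":
        # Assumption: auto currently resolves to the OAuth path.
        return "oauth"
    if flag == "api" and not api_key:
        # The API-key path requires a configured key.
        raise ValueError("provider 'api' requires a configured key")
    return flag
```

Calling resolve_provider("auto") returns "oauth", while resolve_provider("api") without a key raises an error instead of silently falling back to another path.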

Models

The app defaults to gpt-5.4-mini for fast local iteration. Switch to gpt-5.4 for the safest balanced workflow.

Model                            Use
gpt-5.4-mini                     Current default. Faster draft model.
gpt-5.4                          Recommended balanced choice.
gpt-5.5                          Strongest quality when your Codex CLI/OAuth backend supports it. May use more quota or need an updated Codex CLI.
grok-imagine-image               Default Grok image model.
grok-imagine-image-quality       Higher quality Grok image model; also selected by high-quality Grok Node requests.
grok-imagine-video               Default Grok video model (T2V/I2V).
grok-imagine-video-1.5-preview   Preview Grok video model with improved quality.

The app also exposes quality (low, medium, high) and moderation (auto, low) controls. Reasoning effort accepts none, low, medium, high, and xhigh.

Persisted defaults. The commands ima2 defaults set model gpt-5.5 and ima2 defaults set reasoning high each write both the OAuth and the API-provider default keys, so your "default model" stays one concept across provider paths.

Grok pipeline

Grok Classic, Node, and Agent requests run a three-step pipeline: mandatory xAI Web Search, grok-4.3 planning with an English final image prompt, then xAI image creation. Text-only requests use /v1/images/generations; requests with reference images, a Node parent image, or an Agent current image use /v1/images/edits so image-to-image context is preserved. Grok accepts up to three total input images in this path.
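
The endpoint choice described above can be sketched as follows. The helper name choose_grok_endpoint and its parameters are hypothetical. Rejecting a fourth image is also an assumption, since the text only states the three-image limit and not how ima2 enforces it.

```python
from typing import List, Optional, Sequence, Tuple

GENERATIONS_ENDPOINT = "/v1/images/generations"
EDITS_ENDPOINT = "/v1/images/edits"
MAX_GROK_INPUT_IMAGES = 3  # "up to three total input images"


def choose_grok_endpoint(
    reference_images: Sequence[str] = (),
    parent_image: Optional[str] = None,
    current_image: Optional[str] = None,
) -> Tuple[str, List[str]]:
    """Return the xAI endpoint and the combined list of input images."""
    inputs = list(reference_images)
    if parent_image is not None:   # Node parent image
        inputs.append(parent_image)
    if current_image is not None:  # Agent current image
        inputs.append(current_image)
    if len(inputs) > MAX_GROK_INPUT_IMAGES:
        # Assumption: excess inputs are rejected rather than truncated.
        raise ValueError(f"Grok accepts at most {MAX_GROK_INPUT_IMAGES} input images")
    # Any input image switches the request to the edits endpoint.
    endpoint = EDITS_ENDPOINT if inputs else GENERATIONS_ENDPOINT
    return endpoint, inputs
```

A text-only call returns the generations endpoint with an empty input list; adding a single reference or parent image is enough to route the request to the edits endpoint.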

ima2 maps OpenAI-style sizes to xAI aspect_ratio and resolution controls. Grok mask edit is not wired in this release and returns GROK_MASK_UNSUPPORTED.
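
To illustrate the idea behind the size mapping, the sketch below reduces an OpenAI-style WIDTHxHEIGHT string to an aspect ratio. It is not ima2's mapping table: the function size_to_aspect_ratio is hypothetical, the resolution control is left out because its values are not listed here, and xAI may only accept a fixed set of ratios.

```python
from math import gcd


def size_to_aspect_ratio(size: str) -> str:
    """Reduce a 'WIDTHxHEIGHT' string to a 'W:H' ratio (hypothetical helper)."""
    # Unpacking fails with ValueError if the string is not exactly two parts.
    width_text, height_text = size.lower().split("x")
    width, height = int(width_text), int(height_text)
    if width <= 0 or height <= 0:
        raise ValueError(f"invalid size: {size!r}")
    divisor = gcd(width, height)
    return f"{width // divisor}:{height // divisor}"
```

Under these assumptions, 1024x1024 reduces to 1:1 and 1536x1024 reduces to 3:2.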

Grok video

Grok video generation uses grok-imagine-video (default) or grok-imagine-video-1.5-preview. Three modes are auto-detected from reference count: text-to-video (0 refs), image-to-video (1 ref), and reference-to-video (2–7 refs, max 10s). Controls include duration (1–15s), resolution (480p, 720p), and aspect ratio. The endpoint POST /api/video/generate streams SSE events: planning → submitted → progress → done. From the CLI: ima2 video "prompt" --duration 5 --resolution 720p.
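
The mode detection can be sketched as a function of the reference count and requested duration. The name detect_video_mode is hypothetical, and the sketch assumes that out-of-range values are rejected and that "max 10s" caps the duration in reference-to-video mode; ima2 may clamp values instead.

```python
def detect_video_mode(ref_count: int, duration_s: int = 5) -> str:
    """Pick the Grok video mode from the number of reference images."""
    if not 1 <= duration_s <= 15:
        raise ValueError("duration must be between 1 and 15 seconds")
    if ref_count == 0:
        return "text-to-video"
    if ref_count == 1:
        return "image-to-video"
    if 2 <= ref_count <= 7:
        # Assumption: reference-to-video is limited to 10 seconds.
        if duration_s > 10:
            raise ValueError("reference-to-video supports at most 10 seconds")
        return "reference-to-video"
    raise ValueError("reference count must be between 0 and 7")
```

With no references the result is text-to-video, and three references with a 12-second duration raise an error under the assumed 10-second cap.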

API-provider defaults

When the API path is used without explicit options, these defaults apply:

Variable                       Default
IMA2_API_IMAGE_MODEL_DEFAULT   gpt-5.4-mini
IMA2_API_REASONING_EFFORT      low
IMA2_API_IMAGE_SIZE            1024x1024
IMA2_API_ALLOW_WEB_SEARCH      true

See Configuration for the full environment table.
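
As a minimal sketch of how the API-provider defaults listed above combine with the environment, the snippet below falls back to the documented default whenever a variable is unset. The helper load_api_defaults is hypothetical, and values are kept as raw strings because how ima2 parses them (for example "true") is not specified here.

```python
import os
from typing import Dict, Mapping

# Documented defaults for the API-provider path.
API_DEFAULTS: Dict[str, str] = {
    "IMA2_API_IMAGE_MODEL_DEFAULT": "gpt-5.4-mini",
    "IMA2_API_REASONING_EFFORT": "low",
    "IMA2_API_IMAGE_SIZE": "1024x1024",
    "IMA2_API_ALLOW_WEB_SEARCH": "true",
}


def load_api_defaults(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return each setting from env, or its documented default if unset."""
    return {name: env.get(name, default) for name, default in API_DEFAULTS.items()}
```

Passing an explicit mapping such as {"IMA2_API_IMAGE_SIZE": "1536x1024"} overrides only that setting and leaves the other three at their defaults.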