Image Generation - LibreChat

Overview

LibreChat supports multiple image generation engines that AI agents can use to create images from text descriptions, either in conversations or as part of workflows.

Supported Engines

DALL-E
OpenAI Image Tools
Flux
Stable Diffusion
Gemini

DALL-E: OpenAI’s DALL-E 2 and DALL-E 3 models.

# .env
DALLE_API_KEY=your-openai-key
DALLE3_API_KEY=your-openai-key  # Can be separate

# .env (optional customization)
# DALLE3_SYSTEM_PROMPT="Custom system prompt"
# DALLE3_BASEURL="https://api.openai.com/v1"

OpenAI Image Tools: customizable OpenAI image generation for agents.

# .env
IMAGE_GEN_OAI_API_KEY=your-key
IMAGE_GEN_OAI_MODEL=gpt-image-1
# IMAGE_GEN_OAI_BASEURL=custom-url

Custom descriptions:

IMAGE_GEN_OAI_DESCRIPTION="Generate images with custom AI"
IMAGE_GEN_OAI_PROMPT_DESCRIPTION="Custom prompt description"
IMAGE_EDIT_OAI_DESCRIPTION="Edit images with AI"

Flux: Black Forest Labs Flux models for high-quality generation.

# .env
FLUX_API_KEY=your-flux-key
FLUX_API_BASE_URL=https://api.us1.bfl.ai  # or https://api.bfl.ml

Get your API key at api.us1.bfl.ai

Stable Diffusion: a self-hosted or cloud-based Stable Diffusion WebUI.

# .env
SD_WEBUI_URL=http://host.docker.internal:7860

For Docker setups, use host.docker.internal to access the host machine.

Gemini: Google’s Gemini models with image generation capability.

# .env
GEMINI_API_KEY=your-gemini-key  # Or use GOOGLE_KEY
GEMINI_IMAGE_MODEL=gemini-2.5-flash-image

# .env (for Vertex AI)
# GOOGLE_SERVICE_KEY_FILE=/path/to/service-account.json
# GOOGLE_LOC=us-central1

Configuration

Choose Image Provider

Select one or more image generation engines based on your needs:

DALL-E: Best integration, commercial use allowed
Flux: High quality, flexible licensing
Stable Diffusion: Self-hosted, fully customizable
Gemini: Multimodal integration with Google models

Configure API Keys

Add the necessary API keys to your .env file:

# .env - Choose the engines you want
DALLE3_API_KEY=your-openai-key
FLUX_API_KEY=your-flux-key
SD_WEBUI_URL=http://localhost:7860
GEMINI_API_KEY=your-gemini-key
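
Before starting LibreChat, it can help to confirm which engines actually have a value set. The following is a minimal sketch, not part of LibreChat: the `configured_engines` helper and the grouping of variables per engine are assumptions based only on the variable names used in this guide.

```python
import os

# Environment variables named in this guide, grouped by engine.
# Assumption: an engine counts as configured when any one of them is non-empty.
ENGINE_VARS = {
    "DALL-E": ["DALLE_API_KEY", "DALLE3_API_KEY"],
    "OpenAI Image Tools": ["IMAGE_GEN_OAI_API_KEY"],
    "Flux": ["FLUX_API_KEY"],
    "Stable Diffusion": ["SD_WEBUI_URL"],
    "Gemini": ["GEMINI_API_KEY", "GOOGLE_KEY", "GOOGLE_SERVICE_KEY_FILE"],
}


def configured_engines(env=None):
    """Return the engines that have at least one non-empty variable set."""
    env = os.environ if env is None else env
    return [
        engine
        for engine, names in ENGINE_VARS.items()
        if any(env.get(name, "").strip() for name in names)
    ]


if __name__ == "__main__":
    # Reads the current process environment and lists what it finds.
    print(configured_engines())
```

Passing a plain dictionary instead of the real environment makes the check easy to try without touching your .env file.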

Enable for Agents

Image generation is automatically available to agents once an engine is configured. No additional settings are needed in librechat.yaml.

Using Image Generation

In Conversations

Direct Request

Generate an image of a serene mountain landscape at sunset
with a lake in the foreground.

As Part of Workflow

Create a marketing campaign for our new product:
Write a tagline
Generate a hero image showing the product in use
Draft social media copy

With Specific Style

Generate an image in the style of impressionist painting:
A Parisian café scene with people enjoying coffee.

Image Generation Tool

When an agent has image generation enabled, it calls the tool on its own whenever a request needs an image:

// Illustrative tool call; the size and quality values shown are DALL-E 3 options
{
  tool: "generate_image",
  parameters: {
    prompt: "A serene mountain landscape at sunset",
    size: "1024x1024",
    quality: "hd"
  }
}
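
For logging or testing, the tool call shown above can be mirrored as a small data structure. This is only a sketch: `ImageToolCall` and `to_payload` are hypothetical names, and the field names are copied from the example rather than from any confirmed LibreChat schema.

```python
from dataclasses import dataclass, asdict


@dataclass
class ImageToolCall:
    """Hypothetical mirror of the tool call in the example above."""
    prompt: str
    size: str = "1024x1024"
    quality: str = "standard"

    def to_payload(self):
        # "generate_image" is the tool name used in the example above.
        return {"tool": "generate_image", "parameters": asdict(self)}


call = ImageToolCall(prompt="A serene mountain landscape at sunset", quality="hd")
payload = call.to_payload()
print(payload)
```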

Generation Progress

During image generation, users see a progress indicator:

// Progress tracking component
<ImageGen
  initialProgress={0.1}
  args="Creating Image..."
/>

The progress animation shows:

Visual spinner
Status text (“Creating Image”, “Finished”)
Progress percentage
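
The relationship between the progress value, the status text, and the percentage can be sketched as follows. The `progress_status` function is hypothetical, and it assumes progress is a fraction between 0 and 1 with "Finished" shown only at 1; the actual component may behave differently.

```python
def progress_status(progress):
    """Map a progress fraction (0.0 to 1.0) to status text and a percentage."""
    clamped = max(0.0, min(1.0, progress))  # keep out-of-range values sane
    percent = round(clamped * 100)
    status = "Finished" if clamped >= 1.0 else "Creating Image"
    return status, percent


print(progress_status(0.1))  # the initialProgress value from the component above
print(progress_status(1.0))
```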

File Configuration

Control the size of generated images:

# librechat.yaml
fileConfig:
  imageGeneration:
    percentage: 100  # Scale to percentage of original
    # OR
    px: 1024  # Fixed pixel size
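
To make the difference between the two settings concrete, here is a rough sketch of the arithmetic. It rests on assumptions that are not confirmed by this guide: that `percentage` scales both sides uniformly, and that `px` fixes the longer side while preserving the aspect ratio. The `scaled_size` function is hypothetical.

```python
def scaled_size(width, height, percentage=None, px=None):
    """Return (width, height) after applying exactly one of the two settings.

    Assumption: `percentage` scales both sides uniformly, and `px` sets the
    longer side to a fixed length while keeping the aspect ratio.
    """
    if (percentage is None) == (px is None):
        raise ValueError("set exactly one of percentage or px")
    if percentage is not None:
        factor = percentage / 100
    else:
        factor = px / max(width, height)
    return round(width * factor), round(height * factor)


print(scaled_size(1792, 1024, percentage=50))
print(scaled_size(1792, 1024, px=1024))
```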

Advanced Options

DALL-E Azure Integration

Use DALL-E through Azure OpenAI:

# .env
DALLE3_AZURE_API_VERSION=2024-02-15-preview
DALLE2_AZURE_API_VERSION=2024-02-15-preview

Custom DALL-E Endpoints

# .env
DALLE_REVERSE_PROXY=https://your-proxy.com
DALLE3_BASEURL=https://custom-dalle.com
DALLE2_BASEURL=https://custom-dalle.com

Stable Diffusion Configuration

For self-hosted Stable Diffusion WebUI:

Launch Automatic1111 WebUI with API enabled:
```
./webui.sh --api --listen
```
Configure the URL in .env (for Docker setups, replace localhost with host.docker.internal):
```
SD_WEBUI_URL=http://localhost:7860
```
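
To confirm that the WebUI API is reachable independently of LibreChat, you can send it a request directly. The sketch below only builds the request. It assumes the Automatic1111 `/sdapi/v1/txt2img` endpoint that the `--api` flag exposes, and the `build_txt2img_request` helper and its default values are illustrative.

```python
import json
import urllib.request


def build_txt2img_request(base_url, prompt, width=512, height=512, steps=20):
    """Build (but do not send) a POST request for the WebUI txt2img endpoint."""
    body = {"prompt": prompt, "width": width, "height": height, "steps": steps}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/sdapi/v1/txt2img",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_txt2img_request("http://localhost:7860", "a lighthouse at dawn")
print(request.get_method(), request.full_url)

# To send it against a running WebUI:
# with urllib.request.urlopen(request, timeout=120) as response:
#     images = json.load(response)["images"]  # list of base64-encoded images
```

If the request succeeds here but LibreChat still cannot connect, the problem is more likely the URL LibreChat sees (for example, localhost inside a container) than the WebUI itself.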

Gemini Authentication

Gemini image generation supports three authentication methods:

API Key (AI Studio)

GEMINI_API_KEY=your-key

Default, easiest for development.

Shared Google Key

GOOGLE_KEY=your-key

Shared with Google chat endpoint.

Vertex AI Service Account

GOOGLE_SERVICE_KEY_FILE=/path/to/service-account.json
GOOGLE_LOC=us-central1

For production deployments.
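
When more than one of these variables is set, it is worth knowing which method you expect to be used. The following sketch picks one from a mapping of variables. The `gemini_auth_method` function is hypothetical, and the order it checks is an assumption for illustration, not LibreChat's confirmed precedence.

```python
def gemini_auth_method(env):
    """Pick an authentication method from a mapping of environment variables.

    Assumption: the order below (dedicated key, shared key, service account)
    is illustrative only and is not confirmed as LibreChat's own precedence.
    """
    if env.get("GEMINI_API_KEY"):
        return "api_key"
    if env.get("GOOGLE_KEY"):
        return "shared_google_key"
    if env.get("GOOGLE_SERVICE_KEY_FILE"):
        # Falling back to us-central1 mirrors the example value above;
        # treating it as a default is an assumption.
        return "vertex_ai:" + env.get("GOOGLE_LOC", "us-central1")
    return None


print(gemini_auth_method({"GOOGLE_SERVICE_KEY_FILE": "/path/to/service-account.json"}))
```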

Image Specifications

DALL-E

DALL-E 2: 256x256, 512x512, 1024x1024
DALL-E 3: 1024x1024, 1792x1024, 1024x1792
Formats: PNG
Quality: Standard or HD (DALL-E 3)
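
The size and quality limits above can be checked before a request is sent. This is a minimal sketch: `validate_dalle_request` is a hypothetical helper, and the model identifiers `dall-e-2` and `dall-e-3` are the names OpenAI uses for these models, assumed here rather than taken from this guide.

```python
# Sizes as listed in the DALL-E specifications above.
DALLE_SIZES = {
    "dall-e-2": {"256x256", "512x512", "1024x1024"},
    "dall-e-3": {"1024x1024", "1792x1024", "1024x1792"},
}


def validate_dalle_request(model, size, quality="standard"):
    """Return a list of problems; an empty list means the request looks valid."""
    if model not in DALLE_SIZES:
        return [f"unknown model: {model}"]
    problems = []
    if size not in DALLE_SIZES[model]:
        problems.append(f"{model} does not support size {size}")
    if quality == "hd" and model != "dall-e-3":
        problems.append("hd quality is only available for dall-e-3")
    return problems


print(validate_dalle_request("dall-e-3", "1792x1024", "hd"))
print(validate_dalle_request("dall-e-2", "1792x1024", "hd"))
```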

Flux

Resolutions: Up to 2048x2048
Formats: PNG, JPEG
Models: Flux Pro, Flux Dev, Flux Schnell

Stable Diffusion

Fully customizable based on your model
Common: 512x512, 768x768, 1024x1024
Supports various samplers and schedulers

Gemini

Model: gemini-2.5-flash-image
Integrated with Gemini multimodal capabilities
Supports various aspect ratios

Troubleshooting

Image generation fails

Verify API keys are correct and active
Check API quota/billing status
Ensure network connectivity to image service
Review error messages in logs

Stable Diffusion not connecting

Verify WebUI is running with --api flag
Check firewall settings
Use host.docker.internal for Docker setups
Confirm URL includes port number
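
Several of the checks above can be run against the SD_WEBUI_URL value itself. The sketch below is a hypothetical helper, not part of LibreChat: it only inspects the URL string and does not test whether the WebUI is actually running.

```python
from urllib.parse import urlsplit


def check_sd_webui_url(url, in_docker=False):
    """Return warnings for common mistakes in an SD_WEBUI_URL value."""
    warnings = []
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        warnings.append("URL should start with http:// or https://")
    if parts.port is None:
        warnings.append("URL has no port number (the examples here use 7860)")
    if in_docker and parts.hostname in ("localhost", "127.0.0.1"):
        warnings.append("inside Docker, use host.docker.internal instead of localhost")
    return warnings


for warning in check_sd_webui_url("http://localhost", in_docker=True):
    print(warning)
```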

DALL-E 3 prompts rejected

DALL-E 3 has content policy restrictions
Avoid restricted content (violence, adult, etc.)
Rephrase prompts if rejected
Check OpenAI status page for outages

Images not displaying

Check file storage configuration
Verify browser can access image URLs
Check file size limits in fileConfig
Clear browser cache

Best Practices

Detailed prompts: More specific descriptions produce better results
Style keywords: Include art style, lighting, perspective
Iterate: Refine prompts based on results
Multiple engines: Different engines excel at different styles
Cost awareness: Monitor API usage, especially for DALL-E 3
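
One way to apply the first two practices consistently is to assemble prompts from named parts. The `build_prompt` function below is a hypothetical sketch that joins a subject with optional style, lighting, and perspective keywords. None of the engines require this structure.

```python
def build_prompt(subject, style=None, lighting=None, perspective=None):
    """Join a subject with optional style keywords into one prompt string."""
    details = [
        part.strip()
        for part in (style, lighting, perspective)
        if part and part.strip()  # skip missing or blank keywords
    ]
    return ", ".join([subject.strip()] + details)


prompt = build_prompt(
    "A golden retriever puppy playing in autumn leaves",
    style="photorealistic portrait",
    lighting="soft natural lighting",
    perspective="shallow depth of field",
)
print(prompt)
```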

Example Prompts

Good: "A photorealistic portrait of a golden retriever puppy
playing in autumn leaves, soft natural lighting, shallow depth
of field, professional photography"

Poor: "dog picture"