Imagen 3 arrives in the Gemini API

Developers can now access Imagen 3, Google’s state-of-the-art image generation model, through the Gemini API. The model is initially available to paid users, with a rollout to the free tier coming soon.

Imagen 3 excels at producing visually appealing, artifact-free images in a wide variety of styles, from hyperrealistic images to impressionistic landscapes and from abstract compositions to anime characters. Improved prompt following makes it easy to convert great ideas into high-quality images. Overall, Imagen 3 achieves state-of-the-art performance on a variety of benchmarks. It does so at a price of $0.03 per image on the Gemini API, with control over aspect ratios, the number of options to generate, and more.
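As a rough illustration of how those controls and the per-image price fit together, here is a minimal sketch that validates a request before it is sent and estimates its cost. The only figure taken from this post is the $0.03 price. The ImageRequestPlan class is a hypothetical helper, not part of the SDK, and the list of aspect ratios and the per-request image limit are assumptions that should be checked against the Gemini API docs.

```python
from dataclasses import dataclass

# $0.03 per image, as stated in this post (kept in cents to avoid float drift).
PRICE_PER_IMAGE_CENTS = 3

# Assumptions for illustration only: confirm the supported values in the docs.
ASSUMED_ASPECT_RATIOS = ("1:1", "3:4", "4:3", "9:16", "16:9")
ASSUMED_MAX_IMAGES_PER_REQUEST = 4


@dataclass(frozen=True)
class ImageRequestPlan:
    """Hypothetical helper describing one image generation request."""

    prompt: str
    number_of_images: int = 1
    aspect_ratio: str = "1:1"

    def __post_init__(self) -> None:
        # Reject values outside the assumed limits before any API call is made.
        if not 1 <= self.number_of_images <= ASSUMED_MAX_IMAGES_PER_REQUEST:
            raise ValueError(
                f"number_of_images must be 1..{ASSUMED_MAX_IMAGES_PER_REQUEST}"
            )
        if self.aspect_ratio not in ASSUMED_ASPECT_RATIOS:
            raise ValueError(f"unsupported aspect ratio: {self.aspect_ratio}")

    def config_kwargs(self) -> dict:
        """Keyword arguments intended for types.GenerateImagesConfig(**kwargs)."""
        return {
            "number_of_images": self.number_of_images,
            "aspect_ratio": self.aspect_ratio,
        }

    def estimated_cost_cents(self) -> int:
        """Estimated cost in US cents, assuming every requested image is billed."""
        return self.number_of_images * PRICE_PER_IMAGE_CENTS


plan = ImageRequestPlan(
    prompt="an impressionistic landscape at dusk",
    number_of_images=2,
    aspect_ratio="16:9",
)
print(plan.config_kwargs())
print(f"Estimated cost: ${plan.estimated_cost_cents() / 100:.2f}")
```

The dictionary returned by config_kwargs() is meant to be unpacked into types.GenerateImagesConfig, so the validation stays separate from the network call.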

To help combat misinformation and misattribution, all images generated by Imagen 3 include a non-visible digital SynthID watermark, identifying them as AI-generated.

See Imagen 3 in Action

The gallery below highlights Imagen 3’s capabilities across a range of styles.

Get Started with Imagen 3 in the Gemini API

This Python code snippet demonstrates how to generate an image with Imagen 3 using the Gemini API.

from io import BytesIO

from google import genai
from google.genai import types
from PIL import Image

# Replace the placeholder with your own Gemini API key.
client = genai.Client(api_key='GEMINI_API_KEY')

# Request two candidate images for a single prompt.
response = client.models.generate_images(
    model='imagen-3.0-generate-002',
    prompt='a portrait of a sheepadoodle wearing a cape',
    config=types.GenerateImagesConfig(
        number_of_images=2,
    ),
)

# Each result carries raw image bytes; decode them with Pillow and display them.
for generated_image in response.generated_images:
    image = Image.open(BytesIO(generated_image.image.image_bytes))
    image.show()
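The snippet above opens each image in a viewer. In a script or server you may prefer to write the results to disk instead. The following is a minimal sketch using only the standard library. The function names save_image_bytes and guess_extension are hypothetical helpers, and the file extension is guessed from the leading bytes because the output format is not stated in this post.

```python
from pathlib import Path
from typing import Iterable, List, Union

# Well-known file signatures used to pick a file extension.
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"
JPEG_MAGIC = b"\xff\xd8\xff"


def guess_extension(data: bytes) -> str:
    """Return a file extension based on the leading bytes of the image."""
    if data.startswith(PNG_MAGIC):
        return ".png"
    if data.startswith(JPEG_MAGIC):
        return ".jpg"
    return ".bin"  # Unknown format: keep the bytes, but do not guess further.


def save_image_bytes(
    images: Iterable[bytes],
    out_dir: Union[str, Path],
    stem: str = "imagen",
) -> List[Path]:
    """Write each image to out_dir as <stem>_<n><ext> and return the paths."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    saved = []
    for index, data in enumerate(images, start=1):
        target = out_path / f"{stem}_{index}{guess_extension(data)}"
        target.write_bytes(data)
        saved.append(target)
    return saved
```

With the response from the snippet above, a call might look like save_image_bytes((g.image.image_bytes for g in response.generated_images), "outputs").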

You can explore more prompting advice and image styles in the Gemini API developer docs, with further details available on scores, methodology, and performance improvement in Appendix D of our updated technical report.

We are excited to take this first step in bringing our generative media models to the Gemini API, and we plan to make more of them available in the near future so that developers can combine generative media and language models.
