Google Image generation support#6408

Open
OreOreDa wants to merge 13 commits into
spring-projects:mainfrom
OreOreDa:GH-2133

Conversation

@OreOreDa

Closes GH-2133

  • PR updated to the latest GenAI SDK, with the new options for image generation
  • Uses the new generate content API
  • Documentation and tests updated
  • Tested with "gemini-2.5-flash-image" in my project

Thanks !

Signed-off-by: Olivier LE-QUELLEC <olivier.le-quellec@renault.com>
@sdeleuze

Contributor

@ddobrin Could you please review this PR?

@OreOreDa

Author

@ddobrin @sdeleuze

I added GenAI's image editing feature: media can now be attached to ImageMessage, the same way as for UserMessage.
Thanks !

@ddobrin

ddobrin commented Jun 25, 2026

Contributor

Thanks for this PR, @OreOreDa — it's a substantial contribution.
The fundamentals are solid: the PR uses google-genai 1.58.0, which is close to the latest release and can easily be bumped to 1.60.0; retry and observation follow the current conventions; and making ImageMessage implement MediaContent is the right way to carry image input.

I have a few notes, some on code and some on docs. For example, the docs should use formal model names rather than nicknames: replace "Nano Banana", "Nano Banana Pro", and "Nano Banana 2" with "Gemini 2.5 Flash Image Model", "Gemini 3 Pro Image Model", and "Gemini 3.1 Flash Image Model".

🔴 Fix before merge

  1. Per-request options silently override a model-level configured model. Because the GoogleGenAiImageOptions constructor always bakes in a non-null default model and buildImagePrompt then applies from(googleOptions) unconditionally, a user who configures e.g. gemini-3-pro-image on the model bean and passes a per-request options object that didn't call .model(...) is silently downgraded to the default. See inline on GoogleGenAiImageModel.java and GoogleGenAiImageOptions.java.

  2. Vertex AI credentials-uri is silently ignored. The GoogleCredentials are loaded and then never passed to the connection builder (so they never reach the Client), meaning the URI a user supplies has no effect — and it's documented as working. See inline on GoogleGenAiImageConnectionAutoConfiguration.java.

  3. A unit test should lock in the merge fix. There is currently no test that exercises option merging with a model-level default present, which is exactly why item 1 slipped through.

🟠 Should fix

  • Response text parts are silently dropped — a refusal/finish-reason returned as text yields an empty result with no explanation.
  • ImageMessage.equals/hashCode/toString ignore the new media/metadata fields, and media is non-final with getMedia() exposing the internal list — diverges from UserMessage/AssistantMessage.
  • Two @ConfigurationProperties classes bind the same prefix spring.ai.google.genai.image.
  • Docs: the page is not registered in nav.adoc/imageclient.adoc; the property table uses a .options. prefix that won't bind; and the Manual Configuration samples call imageModel.call("..."), which doesn't compile.
  • Test: imageModelActivation is gated behind GOOGLE_API_KEY it doesn't actually use, so this wiring check is skipped in CI.

🟡 Check (minor issues)

  • Metadata is effectively always empty (gcsUri/enhancedPrompt/raiFilteredReason always null; empty ImageResponseMetadata).
  • n → candidateCount doesn't reliably yield N images.
  • PERSON_GENERATION_UNSPECIFIED is sent unguarded, while the safety UNSPECIFIED value is correctly suppressed.
  • Builder.from() is a conditional copy and copies labels by reference.
  • getResponseFormat() is repurposed to carry a MIME type.
  • GoogleGenAiImageOptions has no equals/hashCode/toString.
  • Connection autoconfig isn't gated by spring.ai.model.image.
  • A few stale doc values (n default, broken GitHub source link, SAFETY_FILTER_LEVEL_UNSPECIFIED).
  • The BOM lost its trailing newline.

Details inline; happy to expand on any.

What's good

Correct SDK type bindings (verified against 1.58.0, should upgrade to 1.60.0), the four BLOCK_* levels match the SDK enum and ..._UNSPECIFIED is guarded, retry via org.springframework.core.retry.RetryTemplate matches the chat module, observation parity with OpenAI, a backward/binary-compatible core change, clean and correctly-sorted BOM/starter/module wiring, and good branch-level retry/media unit coverage.

@ddobrin

ddobrin commented Jun 25, 2026

Contributor

Part II - code comments

models/.../image/GoogleGenAiImageModel.java — lines 193–200 (buildImagePrompt)

🔴 Merge precedence bug. builder.from(this.options) (L193) seeds the model-level config, but builder.from(googleOptions) (L200) then re-applies the per-request options unconditionally. Combined with the constructor below always baking in a default model, a request whose options didn't set .model(...) clobbers the model-level model. Suggest the OpenAI pattern: a single full copy of this.options followed by a single null-guarded merge(runtimeOptions); drop the second from() and the duplicated explicit model/n/outputMimeType lines.

models/.../image/GoogleGenAiImageOptions.java — line 132 (constructor)

🔴 Root cause of the merge bug. this.model = (model != null ? model : DEFAULT_MODEL_NAME) forces a non-null model into every options instance, so "unset" becomes indistinguishable from "set to the default" and null-skip merge can't tell which layer should win. Either don't bake the default here, or ensure only one merge path applies it.
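
To illustrate the precedence I have in mind, here is a minimal, self-contained sketch. SketchImageOptions, DEFAULT_MODEL, merge and resolvedModel are hypothetical stand-ins invented for this comment, not the PR's GoogleGenAiImageOptions or any Spring AI API. The assumptions are that null means "not set at this layer" and that the default is applied once, after merging.

```java
import java.util.Objects;

/** Hypothetical stand-in for an image options type; not the PR's GoogleGenAiImageOptions. */
final class SketchImageOptions {

    static final String DEFAULT_MODEL = "default-image-model"; // placeholder name

    private final String model; // null means "not set at this layer"
    private final Integer n;    // null means "not set at this layer"

    SketchImageOptions(String model, Integer n) {
        // No default is baked in here, so "unset" stays distinguishable from "set".
        this.model = model;
        this.n = n;
    }

    String model() { return this.model; }

    Integer n() { return this.n; }

    /** Returns a copy of this layer with every non-null field of runtime applied on top. */
    SketchImageOptions merge(SketchImageOptions runtime) {
        if (runtime == null) {
            return this;
        }
        return new SketchImageOptions(
                runtime.model != null ? runtime.model : this.model,
                runtime.n != null ? runtime.n : this.n);
    }

    /** Applies the fallback once, after merging, and only if no layer chose a model. */
    String resolvedModel() {
        return Objects.requireNonNullElse(this.model, DEFAULT_MODEL);
    }
}

final class MergeSketchDemo {
    public static void main(String[] args) {
        SketchImageOptions modelLevel = new SketchImageOptions("gemini-3-pro-image", 1);
        SketchImageOptions perRequest = new SketchImageOptions(null, 2); // no model set
        SketchImageOptions merged = modelLevel.merge(perRequest);
        // prints "gemini-3-pro-image n=2"
        System.out.println(merged.resolvedModel() + " n=" + merged.n());
    }
}
```

With this shape the per-request layer only overrides what it explicitly sets, and the same assertions would make a reasonable starting point for the missing unit test.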

auto-configurations/.../image/GoogleGenAiImageConnectionAutoConfiguration.java — lines 64–66

🔴 Dead credentials / silent failure. The GoogleCredentials loaded here are never passed to the GoogleGenAiImageConnectionDetails builder (only apiKey/projectId/location are), so they never reach the Client and credentials-uri has no effect. Add credentials(GoogleCredentials) to GoogleGenAiImageConnectionDetails.Builder, thread it into Client.Builder.credentials(...) (available in 1.58.0), and load via try-with-resources as the chat autoconfig does. Please also remove the misleading comment on L66 and apply the same fix to the embedding sibling for parity.
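
A rough sketch of the wiring, using only JDK types so it stands alone. SketchCredentials, SketchConnectionDetails, StreamSource and connect are hypothetical names made up for this comment; in the PR the loaded object would be the GoogleCredentials instance and the builder would be GoogleGenAiImageConnectionDetails.Builder. The point is only the shape: open the stream in try-with-resources, then hand the result to the builder so it is not discarded.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in for loaded credentials. */
record SketchCredentials(String raw) {
    static SketchCredentials fromStream(InputStream in) throws IOException {
        return new SketchCredentials(new String(in.readAllBytes(), StandardCharsets.UTF_8));
    }
}

/** Hypothetical stand-in for connection details that actually carries the credentials. */
final class SketchConnectionDetails {

    private final String projectId;
    private final SketchCredentials credentials; // null when no credentials source is configured

    private SketchConnectionDetails(Builder builder) {
        this.projectId = builder.projectId;
        this.credentials = builder.credentials;
    }

    String projectId() { return this.projectId; }

    SketchCredentials credentials() { return this.credentials; }

    static Builder builder() { return new Builder(); }

    static final class Builder {
        private String projectId;
        private SketchCredentials credentials;

        Builder projectId(String projectId) { this.projectId = projectId; return this; }

        Builder credentials(SketchCredentials credentials) { this.credentials = credentials; return this; }

        SketchConnectionDetails build() { return new SketchConnectionDetails(this); }
    }
}

final class CredentialsWiringSketch {

    /** Hypothetical abstraction over "open the configured credentials URI". */
    interface StreamSource {
        InputStream open() throws IOException;
    }

    static SketchConnectionDetails connect(String projectId, StreamSource credentialsSource) {
        SketchConnectionDetails.Builder builder = SketchConnectionDetails.builder().projectId(projectId);
        if (credentialsSource != null) {
            // try-with-resources closes the stream even if parsing fails.
            try (InputStream in = credentialsSource.open()) {
                builder.credentials(SketchCredentials.fromStream(in)); // threaded through, not dropped
            }
            catch (IOException ex) {
                throw new UncheckedIOException("Failed to load credentials", ex);
            }
        }
        return builder.build();
    }
}
```

Failing loudly on an unreadable source is a deliberate choice in this sketch: a misconfigured credentials-uri should surface at startup rather than be ignored.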

models/.../image/GoogleGenAiImageModel.java — lines 156, 216

🟠 Text parts silently discarded. responseModalities("TEXT","IMAGE") (L216) requests text, but only Part::inlineData parts are mapped (L156). When the model returns a refusal/finish-reason as text and no image, the caller gets an empty result with no signal. Either request IMAGE only, or surface the text into ImageResponseMetadata/per-generation metadata.
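
For the second option, something along these lines would keep the text visible to the caller. This is a sketch with hypothetical stand-in types (SketchPart, SketchImageResult, PartMappingSketch); it does not use the SDK's Part class, and the assumption is simply that a part carries image bytes, text, or neither.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/** Hypothetical stand-in for a response part: carries image bytes, text, or neither. */
record SketchPart(Optional<byte[]> inlineData, Optional<String> text) {

    static SketchPart ofImage(byte[] data) {
        return new SketchPart(Optional.of(data), Optional.empty());
    }

    static SketchPart ofText(String value) {
        return new SketchPart(Optional.empty(), Optional.of(value));
    }
}

/** Hypothetical result holder: the images plus any text returned alongside them. */
record SketchImageResult(List<byte[]> images, List<String> textNotes) {

    /** True when no image came back but the model did explain itself in text. */
    boolean isEmptyWithExplanation() {
        return this.images.isEmpty() && !this.textNotes.isEmpty();
    }
}

final class PartMappingSketch {

    static SketchImageResult map(List<SketchPart> parts) {
        List<byte[]> images = new ArrayList<>();
        List<String> notes = new ArrayList<>();
        for (SketchPart part : parts) {
            part.inlineData().ifPresent(images::add);
            // Keep non-blank text instead of dropping it, so a refusal reaches the caller.
            part.text().filter(t -> !t.isBlank()).ifPresent(notes::add);
        }
        return new SketchImageResult(List.copyOf(images), List.copyOf(notes));
    }
}
```

In the PR the collected text could go into ImageResponseMetadata or per-generation metadata; where exactly is a design decision I'd leave to you.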

models/.../image/GoogleGenAiImageModel.java — lines 163, 165, 172

🟡 Metadata always empty. new Image(null, b64Json) (L163) then feeding image.getUrl() into the metadata (L165) means gcsUri (and enhancedPrompt/raiFilteredReason) are always null, and ImageResponseMetadata (L172) is empty. Consider populating what the SDK exposes (model name, etc.).

models/.../image/GoogleGenAiImageModel.java — line 132

🟡 Uses the original prompt, not the built one. call() builds imagePrompt (L111) but L132 iterates prompt.getInstructions() (the original arg). Harmless today since instructions are copied through, but fragile — prefer imagePrompt.getInstructions().

spring-ai-model/.../image/ImageMessage.java — lines 37, 80–94

🟠 equals/hashCode/toString ignore the new state, and media is mutable. equals (L80) / hashCode (L91) use only text+weight, and toString (L75) omits media/metadata, so two messages differing only in attached media compare equal. The media field (L37) is non-final and getMedia() returns the internal list. Suggest including both fields in equals/hashCode/toString and making media final, to match AssistantMessage. Also missing @since/Javadoc on the new public API (4-arg ctor, getMedia, getMetadata).
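
A minimal sketch of the shape I mean, using a hypothetical SketchMessage class rather than the real ImageMessage; media is simplified to a list of strings for illustration. Note that List.copyOf and Map.copyOf reject null arguments and null elements, so the real class may need an explicit null check if callers are allowed to pass null.

```java
import java.util.List;
import java.util.Map;
import java.util.Objects;

/** Hypothetical stand-in for a message with attached media; not the PR's ImageMessage. */
final class SketchMessage {

    private final String text;
    private final Float weight;
    private final List<String> media;           // media simplified to strings for the sketch
    private final Map<String, Object> metadata;

    SketchMessage(String text, Float weight, List<String> media, Map<String, Object> metadata) {
        this.text = text;
        this.weight = weight;
        // Unmodifiable defensive copies: later changes to the caller's collections have no effect.
        this.media = List.copyOf(media);
        this.metadata = Map.copyOf(metadata);
    }

    List<String> getMedia() { return this.media; } // safe to expose, it is unmodifiable

    Map<String, Object> getMetadata() { return this.metadata; }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof SketchMessage other)) {
            return false;
        }
        return Objects.equals(this.text, other.text) && Objects.equals(this.weight, other.weight)
                && this.media.equals(other.media) && this.metadata.equals(other.metadata);
    }

    @Override
    public int hashCode() {
        return Objects.hash(this.text, this.weight, this.media, this.metadata);
    }

    @Override
    public String toString() {
        return "SketchMessage{text='" + this.text + "', weight=" + this.weight
                + ", media=" + this.media + ", metadata=" + this.metadata + "}";
    }
}
```

With all four fields in equals/hashCode, two messages that differ only in attached media no longer compare equal.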

auto-configurations/.../image/GoogleGenAiImageConnectionProperties.java — line 33 & GoogleGenAiImageProperties.java — line 35

🟠 Duplicate prefix. Both classes declare CONFIG_PREFIX = "spring.ai.google.genai.image". Two @ConfigurationProperties on the same namespace is ambiguous — follow the embedding precedent and nest the model options (e.g. spring.ai.google.genai.image.options) or split connection onto the shared spring.ai.google.genai prefix. (Note vertexAi at ConnectionProperties:59 is bound but never read.)

spring-ai-docs/.../nav.adoc — after line 54

🟠 Page is orphaned. Image entries live at L52–54 but Google GenAI isn't listed (nor in imageclient.adoc "Available Implementations", L158–163), so it won't appear in the docs nav.

**** xref:api/image/stabilityai-image.adoc[Stability]
**** xref:api/image/google-genai-image.adoc[Google GenAI]

spring-ai-docs/.../api/image/google-genai-image.adoc — line 116 (and the whole table, L116–129)

🟠 Property prefix won't bind. The model-options table uses a .options. segment, but the binding is flat (spring.ai.google.genai.image.n, matching your own Sample Controller). Remove .options. from every row. Example for L116:

| spring.ai.google.genai.image.n | The number of images to generate. | -

spring-ai-docs/.../api/image/google-genai-image.adoc — lines 223–224 (and 243–244)

🟠 Sample doesn't compile. ImageModel only declares call(ImagePrompt) (no call(String)).

ImageResponse imageResponse = imageModel
	.call(new ImagePrompt("A painting of a sunset over a mountain", options));

spring-ai-docs/.../api/image/google-genai-image.adoc — line 183

🟡 Broken source link. The GitHub URL omits the /image/ package segment, so it 404s. Should point to .../org/springframework/ai/google/genai/image/GoogleGenAiImageModel.java.

auto-configurations/.../image/GoogleGenAiImageAutoConfigurationIT.java — lines 80–81

🟠 Wiring test skipped in CI. imageModelActivation only uses a fake test-key (L83) but is gated by @EnabledIfEnvironmentVariable(GOOGLE_API_KEY) (L80), so this conditional-bean check never runs without real credentials. Remove the gate (or split it out) so it runs as a plain context test.

Successfully merging this pull request may close these issues.

Feature: adding Vertex Ai Imagen models support