#6040 - Improve LLM infrastructure #6041

Merged
reckart merged 12 commits into main from refactoring/6040-Improve-LLM-infrastructure
Jun 7, 2026

Conversation

@reckart reckart commented May 17, 2026

Member

What's in the PR

  • Introduces a provider-neutral LLM chat client layer (LlmChatClient) with request/response DTOs (ChatMessage, ChatOptions, ChatResult, ChatChunk, ToolCall/ToolDescriptor, ModelInfo, UsageInfo, FinishReason) and an extension point for registering client implementations.
  • Adds adapters that route Ollama, ChatGPT/OpenAI-compatible, and Azure OpenAI through this shared abstraction, including chat, streaming, embeddings, model discovery, and tool-call bridging.
  • Refactors the existing Ollama/ChatGPT/Azure recommenders, factories, and traits editors to go through the common client + a shared ChatBasedLlmRecommenderImplBase instead of provider-specific exchange logic.
  • Normalizes auth handling: centralized per-adapter apiKey() helpers; Azure fails fast on missing key, while OpenAI-compatible/Ollama allow no-auth and omit the Authorization header when no key is set.
  • Tightens extension-point ID handling (enforces unique IDs, with an opt-out for ordered editor AJAX request handlers).
  • Filters the interactive recommender sidebar to LLM-backed recommenders and deprecates the assistant recommender factory to hide it from user creation.
  • Adds/updates integration tests for the new adapters and extension points.
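The auth normalization described in the list above can be illustrated with a minimal sketch. It assumes a simplified `Endpoint` record and the helper names `optionalApiKey`, `requiredApiKey`, and `headers`, all of which are hypothetical and not taken from the PR; the real per-adapter `apiKey()` helpers may look different.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

class AuthHandlingSketch {
    // Hypothetical stand-in for the endpoint descriptor; only the fields needed here.
    record Endpoint(String url, String apiKey) {}

    // OpenAI-compatible / Ollama style: no key configured simply means "no auth".
    static Optional<String> optionalApiKey(Endpoint endpoint) {
        return Optional.ofNullable(endpoint.apiKey());
    }

    // Azure style: a key is always required, so reject a missing or blank one early.
    static String requiredApiKey(Endpoint endpoint) {
        var key = endpoint.apiKey();
        if (key == null || key.isBlank()) {
            throw new IllegalArgumentException("Azure OpenAI requires an API key");
        }
        return key;
    }

    // Emit the Authorization header only when a key is present, never "Bearer null".
    static Map<String, String> headers(Optional<String> apiKey) {
        var headers = new LinkedHashMap<String, String>();
        headers.put("Content-Type", "application/json");
        apiKey.ifPresent(key -> headers.put("Authorization", "Bearer " + key));
        return headers;
    }

    public static void main(String[] args) {
        var local = new Endpoint("http://localhost:11434", null);
        var hosted = new Endpoint("https://llm.example.invalid", "secret");
        System.out.println(headers(optionalApiKey(local)));
        System.out.println(headers(optionalApiKey(hosted)));
        try {
            requiredApiKey(local);
        }
        catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Keeping the "optional" and "required" variants as separate helpers makes the provider difference explicit at the call site instead of hiding it behind a null check inside the HTTP code.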

How to test manually

  • No specific test procedure

Automatic testing

  • PR includes unit tests

Documentation

  • PR updates documentation

@reckart reckart added this to the 41.0 milestone May 17, 2026
@reckart reckart self-assigned this May 17, 2026
@reckart reckart added this to Kanban May 17, 2026
@github-project-automation github-project-automation Bot moved this to 🔖 To do in Kanban May 17, 2026
@reckart reckart requested a review from Copilot May 17, 2026 20:53

Copilot AI left a comment

Pull request overview

This PR introduces a provider-neutral LLM chat client layer and refactors existing LLM recommenders to route through it, while tightening extension-point ID handling and adjusting recommender/sidebar behavior.

Changes:

  • Added LlmChatClient abstractions, request/response records, extension point, and Ollama/OpenAI/Azure adapters.
  • Refactored Ollama, ChatGPT, and Azure OpenAI recommenders/factories/editors to use the shared client infrastructure.
  • Added/updated integration tests and tightened extension ID uniqueness with an opt-out for editor AJAX handlers.
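To make the shape of such a provider-neutral layer concrete, here is a heavily simplified sketch. The type names `LlmChatClient`, `ChatMessage`, and `ChatResult` come from the PR description, but every signature below is an assumption for illustration only; `ClientExtensionPoint`, `EchoClient`, and `ask` are hypothetical, and the real interface also covers streaming, embeddings, and model discovery.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

class ChatClientLayerSketch {
    enum Role { SYSTEM, USER, ASSISTANT }

    record ChatMessage(Role role, String content) {}

    record ChatResult(ChatMessage message) {}

    // Simplified, hypothetical contract: a single chat method only.
    interface LlmChatClient {
        String getId();

        ChatResult chat(List<ChatMessage> messages);
    }

    // Stand-in for the extension point: collects registered clients and finds one by id.
    static final class ClientExtensionPoint {
        private final Map<String, LlmChatClient> clients = new LinkedHashMap<>();

        ClientExtensionPoint(List<LlmChatClient> extensions) {
            extensions.forEach(client -> clients.put(client.getId(), client));
        }

        Optional<LlmChatClient> getExtension(String id) {
            return Optional.ofNullable(clients.get(id));
        }
    }

    // Fake provider so the sketch runs without any network access.
    static final class EchoClient implements LlmChatClient {
        public String getId() {
            return "echo";
        }

        public ChatResult chat(List<ChatMessage> messages) {
            var last = messages.get(messages.size() - 1);
            return new ChatResult(new ChatMessage(Role.ASSISTANT, "echo: " + last.content()));
        }
    }

    // Caller code depends only on the abstraction, never on a provider-specific client.
    static String ask(ClientExtensionPoint point, String clientId, String prompt) {
        var client = point.getExtension(clientId)
                .orElseThrow(() -> new IllegalArgumentException("No client: " + clientId));
        return client.chat(List.of(new ChatMessage(Role.USER, prompt))).message().content();
    }

    public static void main(String[] args) {
        var point = new ClientExtensionPoint(List.of(new EchoClient()));
        System.out.println(ask(point, "echo", "hello"));
    }
}
```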

Reviewed changes

Copilot reviewed 43 out of 43 changed files in this pull request and generated 8 comments.

| File | Description |
| --- | --- |
| inception/inception-support/.../ExtensionPoint_ImplBase.java | Adds duplicate extension ID enforcement. |
| inception/inception-recommendation/.../InteractiveRecommenderSidebar.java | Filters interactive sidebar choices to LLM-backed recommenders. |
| inception/inception-imls-ollama/.../OllamaRecommenderTest.java | Updates Ollama recommender tests for the new chat client extension point. |
| inception/inception-imls-ollama/.../OllamaLlmChatClientTest.java | Adds Ollama adapter integration tests. |
| inception/inception-imls-ollama/.../OllamaClientImplTest.java | Skips cloud-only Ollama models in a test path. |
| inception/inception-imls-ollama/.../OllamaRecommenderTraitsEditor.java | Uses the Ollama LLM adapter for model listing. |
| inception/inception-imls-ollama/.../OllamaRecommenderFactory.java | Injects the shared LLM chat client extension point. |
| inception/inception-imls-ollama/.../OllamaRecommender.java | Refactors provider-specific exchange logic into the base class. |
| inception/inception-imls-ollama/.../OllamaRecommenderAutoConfiguration.java | Registers the Ollama LLM adapter bean. |
| inception/inception-imls-ollama/.../OllamaLlmChatClient.java | Adds provider-neutral Ollama chat/stream/embed/model adapter. |
| inception/inception-imls-ollama/pom.xml | Adds security dependency for auth traits. |
| inception/inception-imls-llm-support/.../LlmChatClientAutoConfiguration.java | Registers the LLM chat client extension point. |
| inception/inception-imls-llm-support/.../UsageInfo.java | Adds token usage DTO. |
| inception/inception-imls-llm-support/.../ToolDescriptor.java | Adds provider-neutral tool descriptor DTO. |
| inception/inception-imls-llm-support/.../ToolCall.java | Adds provider-neutral tool-call DTO. |
| inception/inception-imls-llm-support/.../ModelInfo.java | Adds model discovery DTO. |
| inception/inception-imls-llm-support/.../LlmEndpoint.java | Adds provider endpoint/auth/model descriptor. |
| inception/inception-imls-llm-support/.../LlmChatClientExtensionPointImpl.java | Adds extension point implementation for LLM clients. |
| inception/inception-imls-llm-support/.../LlmChatClientExtensionPoint.java | Adds extension point interface for LLM clients. |
| inception/inception-imls-llm-support/.../LlmChatClient.java | Adds provider-neutral client API. |
| inception/inception-imls-llm-support/.../FinishReason.java | Adds normalized finish reason enum. |
| inception/inception-imls-llm-support/.../ChatResult.java | Adds chat result DTO. |
| inception/inception-imls-llm-support/.../ChatOptions.java | Adds chat options DTO. |
| inception/inception-imls-llm-support/.../ChatChunk.java | Adds streaming chunk DTO. |
| inception/inception-imls-llm-support/.../ChatMessage.java | Extends chat messages with thinking/tool metadata. |
| inception/inception-imls-llm-support/.../ChatBasedLlmRecommenderImplBase.java | Centralizes provider exchange through LlmChatClient. |
| inception/inception-imls-chatgpt/.../OpenAiClientTest.java | Fixes test package declaration. |
| inception/inception-imls-chatgpt/.../ChatGptLlmChatClientTest.java | Adds OpenAI-compatible adapter integration tests. |
| inception/inception-imls-chatgpt/.../ChatGptRecommenderAutoConfiguration.java | Registers ChatGPT LLM adapter and updated factory wiring. |
| inception/inception-imls-chatgpt/.../ChatGptLlmChatClient.java | Adds provider-neutral OpenAI-compatible adapter. |
| inception/inception-imls-chatgpt/.../ChatGptClientImpl.java | Marks model listing as interface implementation. |
| inception/inception-imls-chatgpt/.../ChatGptClient.java | Adds model-listing method to the client interface. |
| inception/inception-imls-chatgpt/.../ChatGptRecommenderTraitsEditor.java | Uses the adapter for model listing. |
| inception/inception-imls-chatgpt/.../ChatGptRecommenderFactory.java | Injects the shared LLM chat client extension point. |
| inception/inception-imls-chatgpt/.../ChatGptRecommender.java | Refactors provider-specific exchange into the base class. |
| inception/inception-imls-chatgpt/pom.xml | Adds test support dependency. |
| inception/inception-imls-azureai-openai/.../AzureAiOpenAiRecommenderAutoConfiguration.java | Registers Azure OpenAI adapter and updated factory wiring. |
| inception/inception-imls-azureai-openai/.../AzureAiOpenAiLlmChatClient.java | Adds provider-neutral Azure OpenAI adapter. |
| inception/inception-imls-azureai-openai/.../AzureAiOpenAiRecommenderFactory.java | Injects the shared LLM chat client extension point. |
| inception/inception-imls-azureai-openai/.../AzureAiOpenAiRecommender.java | Refactors provider-specific exchange into the base class. |
| inception/inception-imls-azureai-openai/pom.xml | Cleans dependency XML formatting. |
| inception/inception-diam/.../EditorAjaxRequestHandlerExtensionPointImpl.java | Opts out of unique ID enforcement for ordered request handlers. |
| inception/inception-assistant/.../AssistantRecommenderFactory.java | Marks assistant recommender factory deprecated to hide user creation. |
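
The duplicate extension ID enforcement and its opt-out, as listed for `ExtensionPoint_ImplBase.java` and `EditorAjaxRequestHandlerExtensionPointImpl.java`, might look roughly like the following. This is a sketch under assumptions: the constructor flag `enforceUniqueIds`, the `NamedExtension` record, and the `OrderedHandlers` class are hypothetical, and the actual base class may expose the opt-out differently.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Optional;

class ExtensionIdSketch {
    interface Extension {
        String getId();
    }

    record NamedExtension(String id) implements Extension {
        public String getId() {
            return id;
        }
    }

    static class ExtensionPoint<E extends Extension> {
        private final List<E> extensions;

        ExtensionPoint(List<E> extensions) {
            this(extensions, true);
        }

        // Subclasses that rely on ordering rather than ids can pass false to opt out.
        protected ExtensionPoint(List<E> extensions, boolean enforceUniqueIds) {
            if (enforceUniqueIds) {
                var seen = new HashSet<String>();
                for (var extension : extensions) {
                    if (!seen.add(extension.getId())) {
                        throw new IllegalStateException(
                                "Duplicate extension id: " + extension.getId());
                    }
                }
            }
            this.extensions = List.copyOf(extensions);
        }

        // Null-safe lookup: a null query or an extension reporting a null id never throws.
        Optional<E> getExtension(String id) {
            if (id == null) {
                return Optional.empty();
            }
            return extensions.stream().filter(e -> id.equals(e.getId())).findFirst();
        }

        List<E> getExtensions() {
            return extensions;
        }
    }

    // Hypothetical opt-out, e.g. for handlers that are tried in order.
    static final class OrderedHandlers extends ExtensionPoint<NamedExtension> {
        OrderedHandlers(List<NamedExtension> handlers) {
            super(handlers, false);
        }
    }

    public static void main(String[] args) {
        var a = new NamedExtension("a");
        System.out.println(new OrderedHandlers(List.of(a, a)).getExtensions().size());
        try {
            new ExtensionPoint<>(List.of(a, a));
        }
        catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```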

@reckart reckart force-pushed the refactoring/6040-Improve-LLM-infrastructure branch 7 times, most recently from a0300c8 to 6ec64a2 May 23, 2026 07:56
@reckart reckart changed the title from "#6040 - Improve llm infrastructure" to "#6040 - Improve LLM infrastructure" May 23, 2026
@reckart reckart force-pushed the refactoring/6040-Improve-LLM-infrastructure branch 3 times, most recently from 7759902 to e50d706 May 25, 2026 16:12
@reckart reckart force-pushed the refactoring/6040-Improve-LLM-infrastructure branch 2 times, most recently from bf523ea to 379a2e8 May 31, 2026 20:40
@reckart reckart requested a review from Copilot June 2, 2026 17:27

Copilot AI left a comment

Pull request overview

Copilot reviewed 59 out of 59 changed files in this pull request and generated 3 comments.

@reckart reckart force-pushed the refactoring/6040-Improve-LLM-infrastructure branch 3 times, most recently from 166ab32 to 8e69997 June 7, 2026 19:49
reckart added 3 commits June 7, 2026 22:23
- Introduce abstraction layer to which we could adapt the current separate LLM clients
- Adapt the OpenAI client code into the new abstraction layer
- Adapt the Azure AI client code into the new abstraction layer
reckart added 9 commits June 7, 2026 22:23
- Adapt the Ollama client code into the new abstraction layer
- Fix up a few things here and there
- Fix up a few more things here and there
- Add `TOOL` role and `thinking` + `toolCallId` fields to `ChatMessage`, with a 2-arg constructor kept for existing call sites.
- Translate `ChatOptions.tools` to `OllamaTool` (function name, description, JSON-schema parameters) in `OllamaLlmChatClient.buildChatRequest`.
- Populate `thinking` on the final `ChatMessage` in `OllamaLlmChatClient.toChatResult` and map Ollama's "tool" role to `Role.TOOL`.
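
As an illustration of the `ChatMessage` change described in the three points above, the following sketch models the message as a record with the two new fields and a retained 2-arg constructor. Whether the real `ChatMessage` is a record or a class is not stated here, so the record form is an assumption, as is the helper name `toRole` and the exact set of wire role strings.

```java
class ChatMessageSketch {
    enum Role { SYSTEM, USER, ASSISTANT, TOOL }

    record ChatMessage(Role role, String content, String thinking, String toolCallId) {
        // Kept so existing call sites that only pass role and content still compile.
        ChatMessage(Role role, String content) {
            this(role, content, null, null);
        }
    }

    // Maps a provider's role string to the neutral enum; "tool" becomes Role.TOOL.
    static Role toRole(String wireRole) {
        return switch (wireRole) {
            case "system" -> Role.SYSTEM;
            case "user" -> Role.USER;
            case "assistant" -> Role.ASSISTANT;
            case "tool" -> Role.TOOL;
            default -> throw new IllegalArgumentException("Unknown role: " + wireRole);
        };
    }

    public static void main(String[] args) {
        var legacy = new ChatMessage(Role.USER, "hi");
        var toolReply = new ChatMessage(toRole("tool"), "{\"result\": 4}", null, "call-1");
        System.out.println(legacy);
        System.out.println(toolReply);
    }
}
```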
- Prevent the AI assistant recommender from showing up in the selection dropdown
- Fix recommenders showing up in interactive recommender sidebar
- Introduce `ModelCapability` enum (`CHAT, TOOLS, JSON_SCHEMA, STREAMING, EMBEDDINGS, VISION, THINKING`) for declared per-endpoint/per-model capabilities, distinct from adapter-static support flags.
- Add `Set<ModelCapability> capabilities` to `LlmEndpoint`, with a defensive compact constructor and a back-compat 4-arg constructor that defaults to an empty set.
- Add `capabilities` field to `LlmRecommenderTraits` (defaults to `{JSON_SCHEMA}`); keep `isStructuredOutputSupported()` as a derived `@JsonIgnore` view so legacy persisted JSON still deserializes via the setter.
- Pass `traits.getCapabilities()` through `LlmEndpoint` in `ChatBasedLlmRecommenderImplBase.exchange()`.
- Replace four `supportsTools/JsonSchema/Streaming/Embeddings()` booleans on `LlmChatClient` with a single `supportedCapabilities()` returning `Set<ModelCapability>`; document the distinction between adapter implementation maturity and endpoint configuration (`endpoint.capabilities() ⊆ adapter.supportedCapabilities()`).
- Override `supportedCapabilities()` in `ChatGptLlmChatClient` (`CHAT, JSON_SCHEMA`), `AzureAiOpenAiLlmChatClient` (`CHAT, JSON_SCHEMA`), and `OllamaLlmChatClient` (`CHAT, JSON_SCHEMA, STREAMING, EMBEDDINGS, TOOLS`).
- Fall back to `id` when `displayName` is null in `ModelInfo`'s compact constructor so callers can always render `displayName()`.
- Make `ExtensionPoint_ImplBase.getExtension(String)` null-safe (guard on `aId`, use `aId.equals(fs.getId())`) so registered extensions returning a null id no longer NPE the lookup.
- Add `apiKey` to `OllamaEmbedRequest` and emit `Authorization: Bearer <key>` from `OllamaClientImpl.embed()` when set; thread `apiKey(aEndpoint)` through `OllamaLlmChatClient.embed()`.
- Widen `OllamaClient.listModels(String, String)` to accept an api key, emit the Bearer header in `OllamaClientImpl.listModels()` when non-null, and thread it through `OllamaLlmChatClient.listModels()`.
- Omit the `Authorization: Bearer` header in `ChatGptClientImpl.chat()` and `listModels()` when the api key is null so no-auth OpenAI-compatible endpoints don't receive `Bearer null`.
- Fail fast with `IllegalArgumentException` in `AzureAiOpenAiLlmChatClient.apiKey()` when auth is missing or the api key is blank, since Azure OpenAI always requires a key.
- Materialize Ollama tool-call arguments via `JSONUtil.getObjectMapper().valueToTree(...)` in `OllamaLlmChatClient.toToolCall()` so `ToolCall.arguments()` consumers see proper value nodes instead of `POJONode` wrappers.
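
The capability model from the points above (a `ModelCapability` enum, a defensive compact constructor on `LlmEndpoint`, and the rule `endpoint.capabilities() ⊆ adapter.supportedCapabilities()`) can be sketched as follows. Only the enum constants, the `capabilities` component, and the adapter capability sets are taken from the text; the other four record components and the helper `unsupported` are placeholders chosen for this example.

```java
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

class CapabilitySketch {
    enum ModelCapability { CHAT, TOOLS, JSON_SCHEMA, STREAMING, EMBEDDINGS, VISION, THINKING }

    // The first four components are placeholders; only `capabilities` comes from the text.
    record LlmEndpoint(String url, String apiKey, String model, Map<String, Object> options,
            Set<ModelCapability> capabilities) {
        // Defensive compact constructor: never null, never aliased to a caller's mutable set.
        LlmEndpoint {
            capabilities = capabilities == null ? Set.of() : Set.copyOf(capabilities);
        }

        // Back-compat constructor for call sites that predate capabilities.
        LlmEndpoint(String url, String apiKey, String model, Map<String, Object> options) {
            this(url, apiKey, model, options, Set.of());
        }
    }

    // Capabilities the endpoint declares but the adapter does not implement.
    static Set<ModelCapability> unsupported(LlmEndpoint endpoint,
            Set<ModelCapability> adapterSupported) {
        var result = EnumSet.noneOf(ModelCapability.class);
        result.addAll(endpoint.capabilities());
        result.removeAll(adapterSupported);
        return result;
    }

    public static void main(String[] args) {
        var openAi = EnumSet.of(ModelCapability.CHAT, ModelCapability.JSON_SCHEMA);
        var endpoint = new LlmEndpoint("http://localhost:11434", null, "some-model", Map.of(),
                EnumSet.of(ModelCapability.CHAT, ModelCapability.TOOLS));
        System.out.println(unsupported(endpoint, openAi));
    }
}
```

An empty result from `unsupported` means the subset rule holds for that endpoint and adapter pair.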
- Add static factory `ToolDescriptor.fromMethod(Method)` that derives the wire-side schema from a `@Tool`-annotated Java method via the victools schema generator, mirroring the convenience of `OllamaTool.forMethod` at the provider-neutral abstraction.
- Introduce `ExecutableTool` interface (`descriptor()` + `invoke(JsonNode arguments)`) as the dispatch contract for tools the model may call; concrete impls capture any required runtime context at construction time.
- Introduce `ToolRegistry` interface + `ToolRegistryImpl` (LinkedHashMap-backed, single-threaded) for collecting `ExecutableTool`s by name; duplicate-name registration fails fast.
- Add `MethodTool` — `ExecutableTool` over a `@Tool`-annotated Java method whose parameters are all `@ToolParam`-annotated; construction fails with a clear message on any unannotated parameter (callers needing runtime injection write their own `ExecutableTool`).
- Add unit tests for `MethodTool` (descriptor build, `@ToolParam` binding, Jackson numeric coercion, construction-fails-on-unannotated-param, target exception unwrapping) and `ToolRegistryImpl` (registration, duplicate-fail, unregister, ordering, seed constructor).
- Add `testChatWithTool` integration test in `OllamaLlmChatClientTest` exercising end-to-end tool calling through the abstraction via `ToolDescriptor.fromMethod` on a `@Tool`-annotated method, validating call name, arguments shape, and `valueToTree` value-node typing.
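
A LinkedHashMap-backed registry with fail-fast duplicate detection, as described for `ToolRegistryImpl` above, could be sketched like this. The names follow the text, but the method set (`register`, `unregister`, `call`, `names`) and the `tool` helper are assumptions, and tool arguments are modeled as a plain `Map` rather than a Jackson `JsonNode` to keep the example free of dependencies.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

class ToolRegistrySketch {
    record ToolDescriptor(String name, String description) {}

    // Arguments are a Map here; the text above uses JsonNode instead.
    interface ExecutableTool {
        ToolDescriptor descriptor();

        Object invoke(Map<String, Object> arguments);
    }

    // Single-threaded by design: no synchronization around the backing map.
    static final class ToolRegistry {
        private final Map<String, ExecutableTool> tools = new LinkedHashMap<>();

        void register(ExecutableTool tool) {
            var name = tool.descriptor().name();
            if (tools.containsKey(name)) {
                throw new IllegalArgumentException("Tool already registered: " + name);
            }
            tools.put(name, tool);
        }

        boolean unregister(String name) {
            return tools.remove(name) != null;
        }

        Object call(String name, Map<String, Object> arguments) {
            var tool = tools.get(name);
            if (tool == null) {
                throw new IllegalArgumentException("Unknown tool: " + name);
            }
            return tool.invoke(arguments);
        }

        // LinkedHashMap preserves registration order.
        List<String> names() {
            return List.copyOf(tools.keySet());
        }
    }

    // Hypothetical helper that builds a tool from a lambda.
    static ExecutableTool tool(String name, Function<Map<String, Object>, Object> body) {
        return new ExecutableTool() {
            public ToolDescriptor descriptor() {
                return new ToolDescriptor(name, "demo tool " + name);
            }

            public Object invoke(Map<String, Object> arguments) {
                return body.apply(arguments);
            }
        };
    }

    public static void main(String[] args) {
        var registry = new ToolRegistry();
        registry.register(tool("add", a -> ((Number) a.get("a")).intValue()
                + ((Number) a.get("b")).intValue()));
        System.out.println(registry.call("add", Map.of("a", 2, "b", 3)));
    }
}
```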
- Introduce `ToolInvoker` interface (`descriptor()` + `invoke(JsonNode arguments)`) in `inception-imls-llm-support` — renamed from the earlier `ExecutableTool` to read as plumbing rather than a consumer-facing "Tool".
- Rename `ToolRegistry` / `ToolRegistryImpl` to `ToolInvokerSet` / `ToolInvokerSetImpl` to convey their short-lived, per-chat-turn nature; rename `register` to `add` and drop the unused `unregister`.
- Add `AssistantRuntimeContext` record in the assistant module holding the per-chat-turn `User`/`Project`/`SourceDocument`/`dataOwner`/`CommandDispatcher` snapshot.
- Add `AssistantToolInvoker` — self-contained `ToolInvoker` implementation that captures `AssistantRuntimeContext` at construction and dispatches `@Tool`-annotated Java methods: `@ToolParam` parameters Jackson-converted from JSON arguments; `AnnotationEditorContext`/`Project`/`SourceDocument`/`CommandDispatcher` parameters bound from the captured context; anything else fails with a clear message; `InvocationTargetException` unwrapped.
- Use the strict `KnownType.class.isAssignableFrom(paramType)` direction for context-parameter resolution so a parameter typed as `Object` (or another supertype) is rejected rather than silently bound to the first matching context type — a latent footgun in the pre-abstraction `MToolCall.invoke`.
- Delete the speculative `MethodTool` utility and its test: it had no real consumers and its only proposed use was as an inheritance parent for `AssistantToolInvoker`, which the latter does not need.
- Add `ToolInvokerSetImplTest` (5 tests) and `AssistantToolInvokerTest` (9 tests) — pure unit tests covering set semantics and every parameter-resolution branch with no Spring context and no LLM in the loop.
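
The point about the direction of `isAssignableFrom` in context-parameter resolution is easy to get backwards, so a small sketch may help. `Project` and `SourceDocument` below are stand-in records, and `resolveStrict`, `resolveLenient`, and `sampleContext` are hypothetical names; the actual `AssistantToolInvoker` logic is not reproduced here.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

class ContextParamSketch {
    record Project(String name) {}

    record SourceDocument(String name) {}

    // Strict direction: the known context type must be assignable from the parameter type.
    static Optional<Object> resolveStrict(Class<?> paramType, Map<Class<?>, Object> context) {
        for (var entry : context.entrySet()) {
            if (entry.getKey().isAssignableFrom(paramType)) {
                return Optional.of(entry.getValue());
            }
        }
        return Optional.empty();
    }

    // Lenient direction, shown only for contrast: a supertype parameter matches anything.
    static Optional<Object> resolveLenient(Class<?> paramType, Map<Class<?>, Object> context) {
        for (var entry : context.entrySet()) {
            if (paramType.isAssignableFrom(entry.getKey())) {
                return Optional.of(entry.getValue());
            }
        }
        return Optional.empty();
    }

    static Map<Class<?>, Object> sampleContext() {
        var context = new LinkedHashMap<Class<?>, Object>();
        context.put(Project.class, new Project("demo"));
        context.put(SourceDocument.class, new SourceDocument("doc.txt"));
        return context;
    }

    public static void main(String[] args) {
        var context = sampleContext();
        // A parameter typed as Object is rejected by the strict check ...
        System.out.println(resolveStrict(Object.class, context));
        // ... but silently bound to the first context value by the lenient one.
        System.out.println(resolveLenient(Object.class, context));
        System.out.println(resolveStrict(SourceDocument.class, context));
    }
}
```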
- Widen `LlmChatClient.embed()` signature with a `Map<String, Object> aOptions` parameter for provider-specific knobs (Ollama `num_ctx`/`seed`, OpenAI `dimensions`/`encoding_format`, ...); documented on the interface.
- Update `OllamaLlmChatClient.embed()` to pass the options map through to `OllamaEmbedRequest.withOptions()`.
- Update `OllamaLlmChatClientTest#testEmbed` to call the widened signature (passes `null` — no options needed for the test).
- Migrate `EmbeddingServiceImpl` to consume `LlmChatClientExtensionPoint` instead of the native `OllamaClient`: resolves the Ollama adapter by id, builds an `LlmEndpoint` from `AssistantProperties`, passes `num_ctx`/`seed` via the options map. Pair-shape return type of `EmbeddingService` preserved by synthesising `Pair<String, float[]>` locally from the input strings and the returned `List<float[]>`.
- Wire `AssistantAutoConfiguration#EmbeddingService` bean to inject `LlmChatClientExtensionPoint` instead of `OllamaClient`.
- Update `UserGuideQueryServiceImplTest` — the only other `EmbeddingServiceImpl` construction site — to build a `LlmChatClientExtensionPointImpl` populated with an `OllamaLlmChatClient` and pass that.
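
The widened `embed()` call with an options map, and the way a pair-shaped result can be synthesized from the input strings and the returned vectors, can be sketched as follows. The `Embedder` interface, the `Embedding` record (standing in for `Pair<String, float[]>`), the `FAKE` embedder, and `embedAll` are all hypothetical; the real `LlmChatClient.embed()` signature also takes an endpoint and may differ in other ways.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class EmbeddingPairSketch {
    // Stands in for Pair<String, float[]> to avoid a library dependency.
    record Embedding(String text, float[] vector) {}

    // Hypothetical, reduced form of an embed call with provider-specific options.
    interface Embedder {
        List<float[]> embed(List<String> inputs, Map<String, Object> options);
    }

    // Fake embedder: the vector is (text length, seed); a null options map is allowed.
    static final Embedder FAKE = (inputs, options) -> {
        int seed = options == null ? 0 : ((Number) options.getOrDefault("seed", 0)).intValue();
        return inputs.stream().map(text -> new float[] { text.length(), seed }).toList();
    };

    // Re-associates each returned vector with the input string at the same position.
    static List<Embedding> embedAll(Embedder embedder, List<String> inputs,
            Map<String, Object> options) {
        var vectors = embedder.embed(inputs, options);
        if (vectors.size() != inputs.size()) {
            throw new IllegalStateException("Expected " + inputs.size() + " vectors but got "
                    + vectors.size());
        }
        var result = new ArrayList<Embedding>(inputs.size());
        for (int i = 0; i < inputs.size(); i++) {
            result.add(new Embedding(inputs.get(i), vectors.get(i)));
        }
        return result;
    }

    public static void main(String[] args) {
        var options = Map.<String, Object>of("num_ctx", 2048, "seed", 42);
        for (var e : embedAll(FAKE, List.of("ab", "cde"), options)) {
            System.out.println(e.text() + " -> " + e.vector().length + " dims");
        }
    }
}
```

The size check matters because the pairing is purely positional: if a provider drops or merges inputs, failing loudly is safer than mislabeling vectors.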
@reckart reckart force-pushed the refactoring/6040-Improve-LLM-infrastructure branch from 8e69997 to e9f2e32 June 7, 2026 20:23
@reckart reckart requested a review from Copilot June 7, 2026 20:23

Copilot AI left a comment

Pull request overview

Copilot reviewed 59 out of 59 changed files in this pull request and generated no new comments.

@reckart reckart merged commit c78e993 into main Jun 7, 2026
5 checks passed
@reckart reckart deleted the refactoring/6040-Improve-LLM-infrastructure branch June 7, 2026 20:54
@github-project-automation github-project-automation Bot moved this from 🔖 To do to 🍹 Done in Kanban Jun 7, 2026
Projects

Status: 🍹 Done

2 participants