#6040 - Improve LLM infrastructure #6041
Merged
Pull request overview
This PR introduces a provider-neutral LLM chat client layer and refactors existing LLM recommenders to route through it, while tightening extension-point ID handling and adjusting recommender/sidebar behavior.
Changes:
- Added `LlmChatClient` abstractions, request/response records, extension point, and Ollama/OpenAI/Azure adapters.
- Refactored Ollama, ChatGPT, and Azure OpenAI recommenders/factories/editors to use the shared client infrastructure.
- Added/updated integration tests and tightened extension ID uniqueness with an opt-out for editor AJAX handlers.
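To make the extension-point behaviour described in this PR more concrete, here is a minimal sketch of ID-based lookup with duplicate-ID enforcement and a null-safe `getExtension`. The type names `ExtensionSketch` and `ExtensionPointSketch`, the `Optional` return type, and the exception type are assumptions for illustration only; the actual `ExtensionPoint_ImplBase` may differ in all of these.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Hypothetical stand-in for an extension; only getId() is taken from the PR text.
interface ExtensionSketch {
    String getId();
}

// Hypothetical stand-in for an extension point holding a fixed list of extensions.
class ExtensionPointSketch<E extends ExtensionSketch> {
    private final List<E> extensions;

    ExtensionPointSketch(List<E> candidates) {
        // Enforce unique IDs; extensions that report a null ID are skipped here
        // (assumption: the real check may treat null IDs differently).
        Set<String> seen = new HashSet<>();
        for (E candidate : candidates) {
            String id = candidate.getId();
            if (id != null && !seen.add(id)) {
                throw new IllegalStateException("Duplicate extension ID: " + id);
            }
        }
        this.extensions = List.copyOf(candidates);
    }

    // Null-safe lookup: guard on the requested ID and call equals() on it,
    // so an extension returning a null ID cannot trigger an NPE.
    Optional<E> getExtension(String id) {
        if (id == null) {
            return Optional.empty();
        }
        return extensions.stream().filter(e -> id.equals(e.getId())).findFirst();
    }
}
```

An extension point that needs ordered handlers with repeated IDs would opt out of the uniqueness check; that opt-out is not modelled in this sketch.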
Reviewed changes
Copilot reviewed 43 out of 43 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| `inception/inception-support/.../ExtensionPoint_ImplBase.java` | Adds duplicate extension ID enforcement. |
| `inception/inception-recommendation/.../InteractiveRecommenderSidebar.java` | Filters interactive sidebar choices to LLM-backed recommenders. |
| `inception/inception-imls-ollama/.../OllamaRecommenderTest.java` | Updates Ollama recommender tests for the new chat client extension point. |
| `inception/inception-imls-ollama/.../OllamaLlmChatClientTest.java` | Adds Ollama adapter integration tests. |
| `inception/inception-imls-ollama/.../OllamaClientImplTest.java` | Skips cloud-only Ollama models in a test path. |
| `inception/inception-imls-ollama/.../OllamaRecommenderTraitsEditor.java` | Uses the Ollama LLM adapter for model listing. |
| `inception/inception-imls-ollama/.../OllamaRecommenderFactory.java` | Injects the shared LLM chat client extension point. |
| `inception/inception-imls-ollama/.../OllamaRecommender.java` | Refactors provider-specific exchange logic into the base class. |
| `inception/inception-imls-ollama/.../OllamaRecommenderAutoConfiguration.java` | Registers the Ollama LLM adapter bean. |
| `inception/inception-imls-ollama/.../OllamaLlmChatClient.java` | Adds provider-neutral Ollama chat/stream/embed/model adapter. |
| `inception/inception-imls-ollama/pom.xml` | Adds security dependency for auth traits. |
| `inception/inception-imls-llm-support/.../LlmChatClientAutoConfiguration.java` | Registers the LLM chat client extension point. |
| `inception/inception-imls-llm-support/.../UsageInfo.java` | Adds token usage DTO. |
| `inception/inception-imls-llm-support/.../ToolDescriptor.java` | Adds provider-neutral tool descriptor DTO. |
| `inception/inception-imls-llm-support/.../ToolCall.java` | Adds provider-neutral tool-call DTO. |
| `inception/inception-imls-llm-support/.../ModelInfo.java` | Adds model discovery DTO. |
| `inception/inception-imls-llm-support/.../LlmEndpoint.java` | Adds provider endpoint/auth/model descriptor. |
| `inception/inception-imls-llm-support/.../LlmChatClientExtensionPointImpl.java` | Adds extension point implementation for LLM clients. |
| `inception/inception-imls-llm-support/.../LlmChatClientExtensionPoint.java` | Adds extension point interface for LLM clients. |
| `inception/inception-imls-llm-support/.../LlmChatClient.java` | Adds provider-neutral client API. |
| `inception/inception-imls-llm-support/.../FinishReason.java` | Adds normalized finish reason enum. |
| `inception/inception-imls-llm-support/.../ChatResult.java` | Adds chat result DTO. |
| `inception/inception-imls-llm-support/.../ChatOptions.java` | Adds chat options DTO. |
| `inception/inception-imls-llm-support/.../ChatChunk.java` | Adds streaming chunk DTO. |
| `inception/inception-imls-llm-support/.../ChatMessage.java` | Extends chat messages with thinking/tool metadata. |
| `inception/inception-imls-llm-support/.../ChatBasedLlmRecommenderImplBase.java` | Centralizes provider exchange through `LlmChatClient`. |
| `inception/inception-imls-chatgpt/.../OpenAiClientTest.java` | Fixes test package declaration. |
| `inception/inception-imls-chatgpt/.../ChatGptLlmChatClientTest.java` | Adds OpenAI-compatible adapter integration tests. |
| `inception/inception-imls-chatgpt/.../ChatGptRecommenderAutoConfiguration.java` | Registers ChatGPT LLM adapter and updated factory wiring. |
| `inception/inception-imls-chatgpt/.../ChatGptLlmChatClient.java` | Adds provider-neutral OpenAI-compatible adapter. |
| `inception/inception-imls-chatgpt/.../ChatGptClientImpl.java` | Marks model listing as interface implementation. |
| `inception/inception-imls-chatgpt/.../ChatGptClient.java` | Adds model-listing method to the client interface. |
| `inception/inception-imls-chatgpt/.../ChatGptRecommenderTraitsEditor.java` | Uses the adapter for model listing. |
| `inception/inception-imls-chatgpt/.../ChatGptRecommenderFactory.java` | Injects the shared LLM chat client extension point. |
| `inception/inception-imls-chatgpt/.../ChatGptRecommender.java` | Refactors provider-specific exchange into the base class. |
| `inception/inception-imls-chatgpt/pom.xml` | Adds test support dependency. |
| `inception/inception-imls-azureai-openai/.../AzureAiOpenAiRecommenderAutoConfiguration.java` | Registers Azure OpenAI adapter and updated factory wiring. |
| `inception/inception-imls-azureai-openai/.../AzureAiOpenAiLlmChatClient.java` | Adds provider-neutral Azure OpenAI adapter. |
| `inception/inception-imls-azureai-openai/.../AzureAiOpenAiRecommenderFactory.java` | Injects the shared LLM chat client extension point. |
| `inception/inception-imls-azureai-openai/.../AzureAiOpenAiRecommender.java` | Refactors provider-specific exchange into the base class. |
| `inception/inception-imls-azureai-openai/pom.xml` | Cleans dependency XML formatting. |
| `inception/inception-diam/.../EditorAjaxRequestHandlerExtensionPointImpl.java` | Opts out of unique ID enforcement for ordered request handlers. |
| `inception/inception-assistant/.../AssistantRecommenderFactory.java` | Marks assistant recommender factory deprecated to hide user creation. |
- Introduce abstraction layer to which we could adapt the current separate LLM clients
- Adapt the OpenAI client code into the new abstraction layer
- Adapt the Azure AI client code into the new abstraction layer
- Adapt the Ollama client code into the new abstraction layer
- Fix up a few things here and there
- Fix up a few more things here and there
- Add `TOOL` role and `thinking` + `toolCallId` fields to `ChatMessage`, with a 2-arg constructor kept for existing call sites.
- Translate `ChatOptions.tools` to `OllamaTool` (function name, description, JSON-schema parameters) in `OllamaLlmChatClient.buildChatRequest`.
- Populate `thinking` on the final `ChatMessage` in `OllamaLlmChatClient.toChatResult` and map Ollama's "tool" role to `Role.TOOL`.
- Prevent the AI assistant recommender from showing up in the selection dropdown
- Fix recommenders showing up in interactive recommender sidebar
- Introduce `ModelCapability` enum (`CHAT, TOOLS, JSON_SCHEMA, STREAMING, EMBEDDINGS, VISION, THINKING`) for declared per-endpoint/per-model capabilities, distinct from adapter-static support flags.
- Add `Set<ModelCapability> capabilities` to `LlmEndpoint`, with a defensive compact constructor and a back-compat 4-arg constructor that defaults to an empty set.
- Add `capabilities` field to `LlmRecommenderTraits` (defaults to `{JSON_SCHEMA}`); keep `isStructuredOutputSupported()` as a derived `@JsonIgnore` view so legacy persisted JSON still deserializes via the setter.
- Pass `traits.getCapabilities()` through `LlmEndpoint` in `ChatBasedLlmRecommenderImplBase.exchange()`.
- Replace four `supportsTools/JsonSchema/Streaming/Embeddings()` booleans on `LlmChatClient` with a single `supportedCapabilities()` returning `Set<ModelCapability>`; document the distinction between adapter implementation maturity and endpoint configuration (`endpoint.capabilities() ⊆ adapter.supportedCapabilities()`).
- Override `supportedCapabilities()` in `ChatGptLlmChatClient` (`CHAT, JSON_SCHEMA`), `AzureAiOpenAiLlmChatClient` (`CHAT, JSON_SCHEMA`), and `OllamaLlmChatClient` (`CHAT, JSON_SCHEMA, STREAMING, EMBEDDINGS, TOOLS`).
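The relationship `endpoint.capabilities() ⊆ adapter.supportedCapabilities()` can be sketched as follows. The enum constants are taken from the commit message; `EndpointSketch`, `ChatClientSketch`, the `accepts` helper, and the first four record components are hypothetical stand-ins, since the real `LlmEndpoint` fields are not shown here.

```java
import java.util.Set;

enum ModelCapability { CHAT, TOOLS, JSON_SCHEMA, STREAMING, EMBEDDINGS, VISION, THINKING }

// Hypothetical stand-in for LlmEndpoint; the first four components are assumed names.
record EndpointSketch(String url, String model, String apiKey, String providerId,
        Set<ModelCapability> capabilities) {

    // Defensive compact constructor: null becomes an empty set and the
    // caller's set is copied so later mutation cannot leak into the record.
    EndpointSketch {
        capabilities = capabilities == null ? Set.of() : Set.copyOf(capabilities);
    }

    // Back-compat 4-arg constructor that defaults to no declared capabilities.
    EndpointSketch(String url, String model, String apiKey, String providerId) {
        this(url, model, apiKey, providerId, Set.of());
    }
}

// Hypothetical stand-in for the capability part of LlmChatClient.
interface ChatClientSketch {
    // What the adapter implementation is able to do at all.
    Set<ModelCapability> supportedCapabilities();

    // An endpoint is acceptable when everything it declares is supported by the adapter.
    default boolean accepts(EndpointSketch endpoint) {
        return supportedCapabilities().containsAll(endpoint.capabilities());
    }
}
```

Under this reading, an adapter that reports only `CHAT` and `JSON_SCHEMA` would reject an endpoint configured with `TOOLS`, while an endpoint built through the 4-arg constructor is always accepted because it declares nothing.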
- Fall back to `id` when `displayName` is null in `ModelInfo`'s compact constructor so callers can always render `displayName()`.
- Make `ExtensionPoint_ImplBase.getExtension(String)` null-safe (guard on `aId`, use `aId.equals(fs.getId())`) so registered extensions returning a null id no longer NPE the lookup.
- Add `apiKey` to `OllamaEmbedRequest` and emit `Authorization: Bearer <key>` from `OllamaClientImpl.embed()` when set; thread `apiKey(aEndpoint)` through `OllamaLlmChatClient.embed()`.
- Widen `OllamaClient.listModels(String, String)` to accept an api key, emit the Bearer header in `OllamaClientImpl.listModels()` when non-null, and thread it through `OllamaLlmChatClient.listModels()`.
- Omit the `Authorization: Bearer` header in `ChatGptClientImpl.chat()` and `listModels()` when the api key is null so no-auth OpenAI-compatible endpoints don't receive `Bearer null`.
- Fail fast with `IllegalArgumentException` in `AzureAiOpenAiLlmChatClient.apiKey()` when auth is missing or the api key is blank, since Azure OpenAI always requires a key.
- Materialize Ollama tool-call arguments via `JSONUtil.getObjectMapper().valueToTree(...)` in `OllamaLlmChatClient.toToolCall()` so `ToolCall.arguments()` consumers see proper value nodes instead of `POJONode` wrappers.
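The two API-key policies in this group of fixes (omit the header when there is no key, versus fail fast when a key is mandatory) might look roughly like the sketch below. It uses the JDK's `java.net.http.HttpRequest` purely for illustration; whether the actual clients build their requests this way is an assumption, and `AuthHeaderSketch`, `buildRequest`, and `requireApiKey` are hypothetical names.

```java
import java.net.URI;
import java.net.http.HttpRequest;

class AuthHeaderSketch {
    // Lenient policy: add the Bearer header only when a key is present,
    // so no-auth endpoints never receive "Bearer null".
    static HttpRequest buildRequest(String url, String apiKey) {
        HttpRequest.Builder builder = HttpRequest.newBuilder(URI.create(url)).GET();
        if (apiKey != null) {
            builder.header("Authorization", "Bearer " + apiKey);
        }
        return builder.build();
    }

    // Strict policy for providers that always require a key:
    // reject a missing or blank key before any request is built.
    static String requireApiKey(String apiKey) {
        if (apiKey == null || apiKey.isBlank()) {
            throw new IllegalArgumentException("An API key is required for this provider");
        }
        return apiKey;
    }
}
```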
- Add static factory `ToolDescriptor.fromMethod(Method)` that derives the wire-side schema from a `@Tool`-annotated Java method via the victools schema generator, mirroring the convenience of `OllamaTool.forMethod` at the provider-neutral abstraction.
- Introduce `ExecutableTool` interface (`descriptor()` + `invoke(JsonNode arguments)`) as the dispatch contract for tools the model may call; concrete impls capture any required runtime context at construction time.
- Introduce `ToolRegistry` interface + `ToolRegistryImpl` (LinkedHashMap-backed, single-threaded) for collecting `ExecutableTool`s by name; duplicate-name registration fails fast.
- Add `MethodTool`, an `ExecutableTool` over a `@Tool`-annotated Java method whose parameters are all `@ToolParam`-annotated; construction fails with a clear message on any unannotated parameter (callers needing runtime injection write their own `ExecutableTool`).
- Add unit tests for `MethodTool` (descriptor build, `@ToolParam` binding, Jackson numeric coercion, construction-fails-on-unannotated-param, target exception unwrapping) and `ToolRegistryImpl` (registration, duplicate-fail, unregister, ordering, seed constructor).
- Add `testChatWithTool` integration test in `OllamaLlmChatClientTest` exercising end-to-end tool calling through the abstraction via `ToolDescriptor.fromMethod` on a `@Tool`-annotated method, validating call name, arguments shape, and `valueToTree` value-node typing.
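A minimal sketch of the registry semantics described above (insertion-ordered, name-keyed, duplicate names rejected). To stay dependency-free it replaces `JsonNode` with a plain `Map<String, Object>` and reduces the descriptor to a name; `ExecutableToolSketch`, `EchoTool`, `ToolRegistrySketch`, and the exception type are assumptions, not the PR's actual classes.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for ExecutableTool; the real interface exposes a
// descriptor and takes a Jackson JsonNode rather than a Map.
interface ExecutableToolSketch {
    String name();

    Object invoke(Map<String, Object> arguments);
}

// Trivial tool that returns its arguments unchanged, used only for illustration.
record EchoTool(String name) implements ExecutableToolSketch {
    @Override
    public Object invoke(Map<String, Object> arguments) {
        return arguments;
    }
}

// Single-threaded registry backed by a LinkedHashMap to keep registration order.
class ToolRegistrySketch {
    private final Map<String, ExecutableToolSketch> tools = new LinkedHashMap<>();

    // Fails fast on a duplicate name instead of silently replacing the earlier tool.
    void register(ExecutableToolSketch tool) {
        if (tools.putIfAbsent(tool.name(), tool) != null) {
            throw new IllegalArgumentException("Duplicate tool name: " + tool.name());
        }
    }

    Optional<ExecutableToolSketch> find(String name) {
        return Optional.ofNullable(tools.get(name));
    }

    // Names in registration order.
    List<String> names() {
        return List.copyOf(tools.keySet());
    }
}
```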
- Introduce `ToolInvoker` interface (`descriptor()` + `invoke(JsonNode arguments)`) in `inception-imls-llm-support`, renamed from the earlier `ExecutableTool` to read as plumbing rather than a consumer-facing "Tool".
- Rename `ToolRegistry` / `ToolRegistryImpl` to `ToolInvokerSet` / `ToolInvokerSetImpl` to convey their short-lived, per-chat-turn nature; rename `register` to `add` and drop the unused `unregister`.
- Add `AssistantRuntimeContext` record in the assistant module holding the per-chat-turn `User`/`Project`/`SourceDocument`/`dataOwner`/`CommandDispatcher` snapshot.
- Add `AssistantToolInvoker`, a self-contained `ToolInvoker` implementation that captures `AssistantRuntimeContext` at construction and dispatches `@Tool`-annotated Java methods: `@ToolParam` parameters Jackson-converted from JSON arguments; `AnnotationEditorContext`/`Project`/`SourceDocument`/`CommandDispatcher` parameters bound from the captured context; anything else fails with a clear message; `InvocationTargetException` unwrapped.
- Use the strict `KnownType.class.isAssignableFrom(paramType)` direction for context-parameter resolution so a parameter typed as `Object` (or another supertype) is rejected rather than silently bound to the first matching context type, a latent footgun in the pre-abstraction `MToolCall.invoke`.
- Delete the speculative `MethodTool` utility and its test: it had no real consumers and its only proposed use was as an inheritance parent for `AssistantToolInvoker`, which the latter does not need.
- Add `ToolInvokerSetImplTest` (5 tests) and `AssistantToolInvokerTest` (9 tests): pure unit tests covering set semantics and every parameter-resolution branch with no Spring context and no LLM in the loop.
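The difference between the two `isAssignableFrom` directions is easy to get backwards, so here is a small sketch that contrasts them. `ProjectStub` and `DocumentStub` are hypothetical placeholders for the context types, and `resolveStrict` / `resolveLenient` are illustrative names; only the strict check `KnownType.class.isAssignableFrom(paramType)` is taken from the commit message.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical placeholders for context types such as Project or SourceDocument.
record ProjectStub(String name) {}

record DocumentStub(String name) {}

class ContextBindingSketch {
    static final List<Class<?>> KNOWN_TYPES = List.of(ProjectStub.class, DocumentStub.class);

    // Strict direction: the parameter type must be the known type or a subtype of it.
    // A parameter declared as Object matches nothing and is therefore rejected.
    static Optional<Class<?>> resolveStrict(Class<?> paramType) {
        return KNOWN_TYPES.stream()
                .filter(known -> known.isAssignableFrom(paramType))
                .findFirst();
    }

    // Reverse direction, shown only for contrast: a parameter declared as Object
    // accepts every known type, so it silently binds to the first one in the list.
    static Optional<Class<?>> resolveLenient(Class<?> paramType) {
        return KNOWN_TYPES.stream()
                .filter(known -> paramType.isAssignableFrom(known))
                .findFirst();
    }
}
```

With these definitions, `resolveStrict(Object.class)` is empty while `resolveLenient(Object.class)` yields `ProjectStub.class`, which illustrates the kind of accidental binding the strict check is meant to rule out.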
- Widen `LlmChatClient.embed()` signature with a `Map<String, Object> aOptions` parameter for provider-specific knobs (Ollama `num_ctx`/`seed`, OpenAI `dimensions`/`encoding_format`, ...); documented on the interface.
- Update `OllamaLlmChatClient.embed()` to pass the options map through to `OllamaEmbedRequest.withOptions()`.
- Update `OllamaLlmChatClientTest#testEmbed` to call the widened signature (passes `null`, since no options are needed for the test).
- Migrate `EmbeddingServiceImpl` to consume `LlmChatClientExtensionPoint` instead of the native `OllamaClient`: resolves the Ollama adapter by id, builds an `LlmEndpoint` from `AssistantProperties`, passes `num_ctx`/`seed` via the options map. Pair-shape return type of `EmbeddingService` preserved by synthesising `Pair<String, float[]>` locally from the input strings and the returned `List<float[]>`.
- Wire `AssistantAutoConfiguration#EmbeddingService` bean to inject `LlmChatClientExtensionPoint` instead of `OllamaClient`.
- Update `UserGuideQueryServiceImplTest`, the only other `EmbeddingServiceImpl` construction site, to build a `LlmChatClientExtensionPointImpl` populated with an `OllamaLlmChatClient` and pass that.
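The `EmbeddingServiceImpl` migration can be pictured with the sketch below: provider-specific knobs travel in an options map, and the pair-shaped result is rebuilt locally from the inputs and the returned vectors. `EmbedClientSketch`, `TextEmbedding` (standing in for `Pair<String, float[]>`), `EmbeddingServiceSketch`, and the size check are assumptions for illustration; the real `embed()` signature also takes an endpoint, which is omitted here.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the embed part of the client API (endpoint argument omitted).
interface EmbedClientSketch {
    List<float[]> embed(List<String> inputs, Map<String, Object> options);
}

// Stand-in for Pair<String, float[]>.
record TextEmbedding(String text, float[] vector) {}

class EmbeddingServiceSketch {
    private final EmbedClientSketch client;
    private final int contextLength;
    private final int seed;

    EmbeddingServiceSketch(EmbedClientSketch client, int contextLength, int seed) {
        this.client = client;
        this.contextLength = contextLength;
        this.seed = seed;
    }

    List<TextEmbedding> embed(List<String> inputs) {
        // Provider-specific knobs are passed through the generic options map.
        Map<String, Object> options = new LinkedHashMap<>();
        options.put("num_ctx", contextLength);
        options.put("seed", seed);

        List<float[]> vectors = client.embed(inputs, options);

        // Assumption: the client returns one vector per input, in input order.
        if (vectors.size() != inputs.size()) {
            throw new IllegalStateException("Expected " + inputs.size()
                    + " embeddings but got " + vectors.size());
        }

        // Re-attach each input string to its vector to preserve the pair-shaped result.
        List<TextEmbedding> result = new ArrayList<>();
        for (int i = 0; i < inputs.size(); i++) {
            result.add(new TextEmbedding(inputs.get(i), vectors.get(i)));
        }
        return result;
    }
}
```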