Merged
32 commits
b6de92d
Migrate cross-repo CLAUDE.md sections to workspace pointers
claude Jun 3, 2026
79f1fff
Point at versioned workspace guides (Java 8 baseline only)
claude Jun 4, 2026
0a97ae7
Bump safe dependency / plugin versions
claude Jun 4, 2026
e673471
test(archunit): pin args sub-package as a true leaf
claude Jun 4, 2026
e36f631
docs: README + CLAUDE.md system-properties reference deep-scan
claude Jun 4, 2026
3ae6c81
Route OSInfo.getArchName() through LlamaSystemProperties.getOsinfoArc…
claude Jun 4, 2026
28dc9e6
Remove the lib.name documentation lie + dead LlamaSystemProperties.ge…
claude Jun 4, 2026
3248c1c
Add per-run timing line on net.ladenthin.llama.timings SLF4J logger
claude Jun 4, 2026
337d266
docs: extract Open TODOs into TODO.md
claude Jun 4, 2026
160aa65
build: add Lombok 1.18.46 (provided scope) + lombok.config
claude Jun 5, 2026
baffa37
docs(README): add Lombok badge; promote llama.cpp version badge
claude Jun 5, 2026
9be73a3
refactor: Lombok @ToString/@EqualsAndHashCode across jllama productio…
claude Jun 5, 2026
ce8b466
spotbugs: suppress USBR on equals/hashCode/canEqual/toString (Lombok)
claude Jun 5, 2026
07109cc
spotbugs(OPM) suppress OPM_OVERLY_PERMISSIVE_METHOD project-wide
claude Jun 6, 2026
2f2d8c8
docs(TODO): record OPM project-wide suppression + refresh SpotBugs Ma…
claude Jun 6, 2026
7e4fd5a
spotbugs(DRE) Batch 1: drop 'throws LlamaException' from 20 LlamaMode…
claude Jun 6, 2026
5fd7b4d
spotbugs(WEM+THROWS) Batch 2: enrich ModelParameters validations (6 c…
claude Jun 6, 2026
311f8d6
spotbugs(WEM) Batch 3: enrich Session IllegalStateException messages …
claude Jun 6, 2026
07cabfb
spotbugs(WEM+UVA) Batches 4+5: leaf-class enrichments + array→varargs…
claude Jun 6, 2026
c6feef7
refactor: remove LlamaPublisher in favour of consumer-side reactive a…
claude Jun 6, 2026
3d5f848
spotbugs(RCN+REC) ChatRequest source cleanup (4 cleared)
claude Jun 6, 2026
f97c85d
spotbugs(RCN) drop dead null branch in ChatMessage.requireNonEmpty (1…
claude Jun 6, 2026
3a128e1
refactor: drop @PolyNull, simplify InferenceParameters null-set seman…
claude Jun 6, 2026
d31a906
refactor: migrate ChatMessage.getToolCallId + ChatRequest.getToolChoi…
claude Jun 6, 2026
eb55f58
refactor: extract ChatTranscript with two-phase commit semantics from…
claude Jun 6, 2026
647f517
refactor(ChatRequest): immutable + wither/append pattern
claude Jun 6, 2026
c42a2fc
fix(ChatMessage): restore IllegalArgumentException on null parts list
claude Jun 6, 2026
4f1fbd7
refactor(InferenceParameters): immutable + wither/append pattern
claude Jun 6, 2026
6ddd225
lombok: force field-access in @EqualsAndHashCode / @ToString
claude Jun 6, 2026
1b427a9
docs(CLAUDE.md): point Lombok Config section at the canonical workspa…
claude Jun 6, 2026
14091bf
spotbugs: clear remaining Max+Low findings (DLS + SPP + RCN + Lombok …
claude Jun 6, 2026
c3a26b9
spotbugs: flip pom to Max+Low at the gate; clear remaining 8 source-l…
claude Jun 6, 2026
118 changes: 13 additions & 105 deletions CLAUDE.md

Large diffs are not rendered by default.

85 changes: 76 additions & 9 deletions README.md
@@ -1,21 +1,22 @@
**Build:**
![Java 8+](https://img.shields.io/badge/Java-8%2B-informational)
![Platform](https://img.shields.io/badge/Platform-Linux%20%7C%20macOS%20%7C%20Windows%20%7C%20Android-lightgrey)
[![llama.cpp b9495](https://img.shields.io/badge/llama.cpp-%23b9495-informational)](https://github.com/ggml-org/llama.cpp/releases/tag/b9495)
[![JPMS](https://img.shields.io/badge/JPMS-modular%20JAR-25A162)](https://openjdk.org/projects/jigsaw/)
![JUnit](https://img.shields.io/badge/tested%20with-JUnit6-25A162)
[![JSpecify](https://img.shields.io/badge/JSpecify-1.0.0%20%40NullMarked-25A162)](https://jspecify.dev)
[![NullAway](https://img.shields.io/badge/NullAway-strict%20JSpecify-25A162)](https://github.com/uber/NullAway)
[![Checker Framework](https://img.shields.io/badge/Checker%20Framework-Nullness-25A162)](https://checkerframework.org)
[![Error Prone](https://img.shields.io/badge/Error%20Prone-12%20patterns%20at%20ERROR-25A162)](https://errorprone.info)
[![Maven Enforcer](https://img.shields.io/badge/Maven%20Enforcer-strict-25A162)](https://maven.apache.org/enforcer/)
[![Lombok](https://img.shields.io/badge/Lombok-1.18.46-bc3f3c)](https://projectlombok.org/)
[![jqwik](https://img.shields.io/badge/tested%20with-jqwik-1f6feb)](https://jqwik.net)
[![ArchUnit](https://img.shields.io/badge/tested%20with-ArchUnit-c71a36)](https://www.archunit.org)
[![SpotBugs](https://img.shields.io/badge/analyzed%20with-SpotBugs-3b5998)](https://spotbugs.github.io)
[![jcstress](https://img.shields.io/badge/tested%20with-jcstress-007396)](https://openjdk.org/projects/code-tools/jcstress/)
[![Lincheck](https://img.shields.io/badge/tested%20with-Lincheck-7F52FF)](https://github.com/JetBrains/lincheck)
[![vmlens](https://img.shields.io/badge/tested%20with-vmlens-ff6f00)](https://vmlens.com)
[![JMH](https://img.shields.io/badge/benchmarked%20with-JMH-25A162)](https://openjdk.org/projects/code-tools/jmh/)
[![llama.cpp b9495](https://img.shields.io/badge/llama.cpp-%23b9495-informational)](https://github.com/ggml-org/llama.cpp/releases/tag/b9495)
[![Publish](https://github.com/bernardladenthin/java-llama.cpp/actions/workflows/publish.yml/badge.svg)](https://github.com/bernardladenthin/java-llama.cpp/actions/workflows/publish.yml)
[![CodeQL](https://github.com/bernardladenthin/java-llama.cpp/actions/workflows/codeql.yml/badge.svg)](https://github.com/bernardladenthin/java-llama.cpp/actions/workflows/codeql.yml)

@@ -249,15 +250,20 @@ The application will search in the following order in the following locations:

#### System Properties Reference

All `net.ladenthin.llama.*` system properties are resolved by `LlamaSystemProperties`.
Every `net.ladenthin.llama.*` system property recognised by the library, deep-scanned from the source. Runtime properties are resolved through `LlamaSystemProperties`; test-only properties are declared in the test sources (`TestConstants`) and consumed by individual test classes.

| Property | Description |
|---|---|
| `net.ladenthin.llama.lib.path` | Directory containing the native `jllama` shared library. Checked first, before `java.library.path`. |
| `net.ladenthin.llama.lib.name` | Override the native library filename (default is platform-determined, e.g. `jllama.so`). |
| `net.ladenthin.llama.tmpdir` | Custom temporary directory used when extracting the native library from the JAR. Falls back to `java.io.tmpdir`. |
| `net.ladenthin.llama.osinfo.architecture` | Override the OS/architecture string used to locate the bundled library inside the JAR (e.g. `Linux/x86_64`). Useful for non-standard JVM environments. |
| `net.ladenthin.llama.test.ngl` | Number of GPU layers used during testing. Parsed by the test suite; not relevant for production use. |
| Property | Default | Scope | Consumer | Description |
|---|---|---|---|---|
| `net.ladenthin.llama.lib.path` | unset (falls back to `java.library.path`) | runtime | `LlamaLoader` | Directory containing the native `jllama` shared library. Checked first, before `java.library.path`. Set with `-Dnet.ladenthin.llama.lib.path=/path/to/dir`. |
| `net.ladenthin.llama.tmpdir` | unset (falls back to `java.io.tmpdir`) | runtime | `LlamaLoader` | Custom temporary directory used when extracting the native library from the JAR. |
| `net.ladenthin.llama.osinfo.architecture` | unset (uses `os.arch`) | runtime | `OSInfo` | Override for the architecture string used to locate the bundled library inside the JAR. Useful when `os.arch` reports an unexpected value (e.g. inside dockcross / chrooted environments). |
| `net.ladenthin.llama.test.ngl` | `43` | test | `LlamaModelTest`, `RerankingModelTest`, `ChatScenarioTest`, `ChatAdvancedTest`, `ErrorHandlingTest`, `SessionConcurrencyTest`, `ConfigureParallelInferenceTest`, `MultimodalIntegrationTest` (via `Integer.getInteger(TestConstants.PROP_TEST_NGL, TestConstants.DEFAULT_TEST_NGL)`) | Number of GPU layers used during testing. Pin to `0` on CPU-only hosts: `mvn test -Dnet.ladenthin.llama.test.ngl=0`. |
| `net.ladenthin.llama.nomic.path` | unset (test self-skips) | test | `LlamaEmbeddingsTest#testNomicEmbedLoads` | Path to a Nomic embedding model (`nomic-embed-text-v1.5.f16.gguf` or a compatible BERT-family encoder). Regression test for upstream issue #98 (BERT-encoder `result_output` assertion). |
| `net.ladenthin.llama.vision.model` | unset (test self-skips) | test | `MultimodalIntegrationTest` (closes #103 / #34) | Path to a vision-capable model GGUF. Any vision-capable GGUF works; CI default is `SmolVLM-500M-Instruct-Q8_0.gguf`. |
| `net.ladenthin.llama.vision.mmproj` | unset (test self-skips) | test | `MultimodalIntegrationTest` | Matching mmproj GGUF for the vision model. |
| `net.ladenthin.llama.vision.image` | `src/test/resources/images/test-image.jpg` (a CC-BY-4.0 / MIT-granted photo committed to the repo) | test | `MultimodalIntegrationTest` | Visual prompt image. Any png/jpeg/webp/gif works; the extension drives MIME detection. |

`MultimodalIntegrationTest` self-skips when any of the three `vision.*` properties points at a missing path, so a partial setup (just the vision model + the committed image, no mmproj) lets the test class load without erroring.
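
The "explicit property, else fallback" behaviour described in the table can be sketched in plain Java. This is a minimal illustration only, not the code in `LlamaSystemProperties` or `LlamaLoader`: the class `PropertyFallbackSketch` and its helper methods are hypothetical, and treating an empty value as unset is an assumption. Only the property names, the `java.library.path` fallback, and the default of `43` come from the table.

```java
import java.util.Properties;

/** Hypothetical sketch of "override, else fallback" lookup; not the library's implementation. */
final class PropertyFallbackSketch {

    // Property names as documented in the table.
    static final String LIB_PATH = "net.ladenthin.llama.lib.path";
    static final String TEST_NGL = "net.ladenthin.llama.test.ngl";

    /** Returns the override when it is set, otherwise the value of the fallback property. */
    static String resolve(Properties props, String key, String fallbackKey) {
        String value = props.getProperty(key);
        // Assumption: an empty value is treated the same as an unset one.
        if (value != null && !value.isEmpty()) {
            return value;
        }
        return props.getProperty(fallbackKey);
    }

    /** Mirrors Integer.getInteger semantics: default when the property is unset or malformed. */
    static int resolveInt(Properties props, String key, int defaultValue) {
        String value = props.getProperty(key);
        if (value == null) {
            return defaultValue;
        }
        try {
            return Integer.decode(value);
        } catch (NumberFormatException e) {
            return defaultValue;
        }
    }

    public static void main(String[] args) {
        Properties props = System.getProperties();
        System.out.println("library dir: " + resolve(props, LIB_PATH, "java.library.path"));
        System.out.println("gpu layers:  " + resolveInt(props, TEST_NGL, 43));
    }
}
```

Running it with `-Dnet.ladenthin.llama.test.ngl=0` makes the second line report `0` instead of the default.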

## Documentation

@@ -411,6 +417,67 @@ try (LlamaModel model = new LlamaModel(modelParams)) {
}
```

### Reactive integration (Reactor, RxJava, Kotlin Flow, Akka)

`LlamaIterable` (returned by `model.generate(...)` and `model.generateChat(...)`)
implements `Iterable<LlamaOutput> & AutoCloseable`, so every mainstream reactive
library wraps it in a few lines without `java-llama.cpp` pulling in a runtime
reactive dependency.

**Always wrap with the library's resource-management primitive** — `Flux.using`,
`Flowable.using`, Kotlin `use {}`, etc. — so that subscription cancellation
flows into `LlamaIterable.close()` and from there into llama.cpp's native
`cancelCompletion`. A plain `Flux.fromIterable(iterable)` or `for (x in iter)`
loop will NOT close the iterable on cancel; the native task slot stays
occupied until the model is closed.

#### Project Reactor (Spring WebFlux)
```java
Flux<LlamaOutput> tokens = Flux.using(
        () -> model.generate(params),
        Flux::fromIterable,
        LlamaIterable::close)
    .subscribeOn(Schedulers.boundedElastic());
```

#### RxJava 3 (also for RxAndroid)
```java
Flowable<LlamaOutput> tokens = Flowable.using(
        () -> model.generate(params),
        Flowable::fromIterable,
        LlamaIterable::close)
    .subscribeOn(Schedulers.io());
```

#### Kotlin Flow (Android / coroutines)
```kotlin
fun llama(model: LlamaModel, params: InferenceParameters) = flow {
    model.generate(params).use { iterable ->
        for (output in iterable) emit(output)
    }
}.flowOn(Dispatchers.IO)
```
The companion Android sample [LLaMAndroid](https://github.com/Rattlyy/LLaMAndroid)
demonstrates the `flow { for (output in model.generate(params)) emit(output) }`
shape against the upstream binding. Wrap the `for` loop in
`.use { }` if your collector may cancel mid-stream — otherwise the native task
slot will not be released until the model is closed.
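
In plain Java, the counterpart of Kotlin's `use { }` is try-with-resources. The sketch below shows why the wrapper matters when a consumer stops early. It uses a hypothetical stand-in class, `CloseableTokens`, rather than the real `LlamaIterable`, so it runs without the native library; the only property assumed of the real class is the one stated above, namely that it is an `Iterable` that is also `AutoCloseable`.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class EarlyExitSketch {

    /** Hypothetical stand-in for a closeable token source; NOT the real LlamaIterable. */
    static final class CloseableTokens implements Iterable<String>, AutoCloseable {
        private final List<String> tokens;
        private boolean closed;

        CloseableTokens(String... tokens) {
            this.tokens = Arrays.asList(tokens);
        }

        @Override
        public Iterator<String> iterator() {
            return tokens.iterator();
        }

        @Override
        public void close() {
            // A real implementation would release its native resource here.
            closed = true;
        }

        boolean isClosed() {
            return closed;
        }
    }

    /** Reads at most limit tokens; try-with-resources closes the source even on an early break. */
    static int consumeWithResources(CloseableTokens source, int limit) {
        int seen = 0;
        try (CloseableTokens tokens = source) {
            for (String token : tokens) {
                if (seen == limit) {
                    break;
                }
                seen++;
            }
        }
        return seen;
    }

    /** Same loop without resource management: an early break leaves the source open. */
    static int consumePlain(CloseableTokens source, int limit) {
        int seen = 0;
        for (String token : source) {
            if (seen == limit) {
                break;
            }
            seen++;
        }
        return seen;
    }

    public static void main(String[] args) {
        CloseableTokens managed = new CloseableTokens("a", "b", "c");
        consumeWithResources(managed, 2);
        System.out.println("managed source closed: " + managed.isClosed());

        CloseableTokens unmanaged = new CloseableTokens("a", "b", "c");
        consumePlain(unmanaged, 2);
        System.out.println("plain loop closed:     " + unmanaged.isClosed());
    }
}
```

Both loops stop after two tokens, but only the try-with-resources variant reaches `close()`.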

#### Akka Streams
```scala
val tokens: Source[LlamaOutput, NotUsed] = Source
  .fromIterator(() => model.generate(params).iterator())
  .async("blocking-io-dispatcher")
```

**Why no built-in `Publisher`?** Earlier snapshots of this fork shipped a
hand-rolled `LlamaModel.streamPublisher(...)` returning a Reactive Streams
`Publisher<LlamaOutput>`. Since every reactive library bridges blocking
iterables in a few lines via its own resource-management primitive, the binding
now stays free of any reactive runtime dependency — pick whichever library your
app already uses. The pattern is verified end-to-end by
`ReactorIntegrationTest` in the test sources.
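
The same bridging idea also works with nothing but the JDK. The sketch below wraps an `Iterable & AutoCloseable` in a `java.util.stream.Stream` and registers the source's `close()` via `Stream.onClose`. It is an illustration under stated assumptions, not part of the binding: `StreamBridgeSketch`, `toStream`, and the `Tokens` stand-in are hypothetical names, and the stand-in replaces the real `LlamaIterable` so the example runs without a model.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

final class StreamBridgeSketch {

    /** Hypothetical closeable source standing in for the real iterable. */
    static final class Tokens implements Iterable<String>, AutoCloseable {
        final AtomicBoolean closed = new AtomicBoolean();
        private final List<String> values;

        Tokens(String... values) {
            this.values = Arrays.asList(values);
        }

        @Override
        public Iterator<String> iterator() {
            return values.iterator();
        }

        @Override
        public void close() {
            closed.set(true);
        }
    }

    /** Wraps the source in a sequential Stream; closing the Stream closes the source. */
    static <T, S extends Iterable<T> & AutoCloseable> Stream<T> toStream(S source) {
        return StreamSupport.stream(source.spliterator(), false)
                .onClose(() -> {
                    try {
                        source.close();
                    } catch (Exception e) {
                        throw new IllegalStateException("closing the source failed", e);
                    }
                });
    }

    public static void main(String[] args) {
        Tokens source = new Tokens("Hello", ",", " world");
        List<String> firstTwo;
        // Terminal operations do not close a Stream; try-with-resources does.
        try (Stream<String> stream = toStream(source)) {
            firstTwo = stream.limit(2).collect(Collectors.toList());
        }
        System.out.println(firstTwo + " closed=" + source.closed.get());
    }
}
```

As with `Flux.using`, the close handler only fires if the caller closes the stream, so the try-with-resources block is the part that must not be dropped.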

### Logging

Per default, logs are written to stdout.