Description
NVIDIA Open GPU Kernel Modules Version
590.48.01
Please confirm this issue does not happen with the proprietary driver (of the same version). This issue tracker is only for bugs specific to the open kernel driver.
- I confirm that this does not happen with the proprietary driver package.
Operating System and Version
openSUSE Tumbleweed
Kernel Release
6.18.9-1-default
Please confirm you are running a stable release kernel (e.g. not a -rc). We do not accept bug reports for unreleased kernels.
- I am running on a stable kernel release.
Hardware: GPU
GPU 0: NVIDIA GeForce RTX 5090
Describe the bug
There is an issue with the Linux kernel driver: I can't run ollama or llama.cpp without getting an error. The same machine works fine under Windows, so this does not appear to be a hardware or firmware issue.
llama.cpp
./llama-bench -m ~/testcuda/qwen2.5-coder-3b-instruct-q4_0.gguf
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices:
Device 0: NVIDIA GeForce RTX 5090, compute capability 12.0, VMM: yes
| model | size | params | backend | ngl | test | t/s |
|---|---|---|---|---|---|---|
RMS_NORM: src0_d=0x7f6f32000000 (attr_err=0, type=2), dst_d=0x7f6f32501000 (attr_err=0, type=2)
ne00=2048, ne01=512, ne02=1, ne03=1, s01=2048, s02=1048576, s03=1048576
MUL_OP: src0=0x7f6f32501000 (err=0, type=2), src1=0x7f6e7136f800 (err=0, type=2), dst=0x7f6f32501000 (err=0, type=2)
src0: type=f32, shape=[2048,512,1,1]
src1: type=f32, shape=[2048,1,1,1]
dst: type=f32, shape=[2048,512,1,1]
!!! CUDA KERNEL FAILED: op=MUL (6)
src0: f32, shape=[2048,512,1,1]
dst: f32, shape=[2048,512,1,1]
=== CUDA ERROR DETAILS ===
CUDA error: an illegal memory access was encountered
current device: 0
function: ggml_cuda_compute_forward
location: /home/aginies/open-gpu-kernel-modules/llama.cpp/ggml/src/ggml-cuda/ggml-cuda.cu:2332
statement: err
With ollama:
ollama serve
level=INFO source=types.go:42 msg="inference compute" id=GPU-446f513c-0699-5337-2cd0-9fa3d507cc94 filter_id="" library=CUDA compute=12.0 name=CUDA0 description="NVIDIA GeForce RTX 5090" libdirs=ollama,cuda_v13 driver=13.1 pci_id=0000:2b:00.0 type=discrete total="31.8 GiB" available="31.3 GiB"
ollama run llama3.2
time=2026-02-18T18:39:02.879+01:00 level=ERROR source=server.go:304 msg="llama runner terminated" error="exit status 2"
⠦ time=2026-02-18T18:39:03.117+01:00 level=INFO source=sched.go:493 msg="Load failed" model=/home/aginies/.ollama/models/blobs/sha256-dde5aa3fc5ffc17176b5e8bdc82f587b24b2678c6c66101bf7da77af9f7ccdff error="llama runner process has terminated: CUDA error: an illegal memory access was encountered\n current device: 0, in function ggml_backend_cuda_buffer_clear at //ml/backend/ggml/ggml/src/ggml-cuda/ggml-cuda.cu:790\n cudaStreamSynchronize(((cudaStream_t)0x2))\n//ml/backend/ggml/ggml/src/ggml-cuda/ggml-cuda.cu:94: CUDA error"
[GIN] 2026/02/18 - 18:39:03 | 500 | 4.656141722s | 127.0.0.1 | POST "/api/generate"
Error: 500 Internal Server Error: llama runner process has terminated: CUDA error: an illegal memory access was encountered
current device: 0, in function ggml_backend_cuda_buffer_clear at //ml/backend/ggml/ggml/src/ggml-cuda/ggml-cuda.cu:790
cudaStreamSynchronize(((cudaStream_t)0x2))
//ml/backend/ggml/ggml/src/ggml-cuda/ggml-cuda.cu:94: CUDA error
dmesg
[ 130.456250] [ T1946] NVRM: Xid (PCI:0000:2b:00): 31, pid=1942, name=llama-bench, channel 0x00000002, intr 00000000. MMU Fault: ENGINE GRAPHICS GPC6 GPCCLIENT_T1_13 faulted @ 0x6f6f_320c2000. Fault is of type FAULT_PDE ACCESS_TYPE_VIRT_READ
[ 130.467839] [ T1964] llama-bench[1964]: segfault at 206e03fe0 ip 00007f71b508324c sp 00007ffe8b911a10 error 4 in libcuda.so.590.48.01[48324c,7f71b4d67000+108c000] likely on CPU 11 (core 11, socket 0)
[ 130.467844] [ T1964] Code: ce ff 83 3d 81 03 9b 05 01 49 8b 1c 24 76 0e 8b 05 89 03 9b 05 85 c0 0f 84 91 00 00 00 49 8b 44 24 10 41 8b 4c 24 24 48 8b 13 <8b> 00 41 39 c5 0f 84 89 00 00 00 8b b3 40 40 00 00 48 89 f0 89 8c
Some testing
It looks like there is an issue with the chunk size on my card:
test_chunk_sizes.cu: https://paste.opensuse.org/pastes/5d419b50384b
I built it with nvcc and ran it:
./test_chunk_sizes
Testing chunk sizes to find failure boundary...
Testing chunk size 512KB (524288 bytes)... PASS
Testing chunk size 576KB (589824 bytes)... PASS
Testing chunk size 600KB (614400 bytes)... PASS
Testing chunk size 620KB (634880 bytes)... PASS
Testing chunk size 630KB (645120 bytes)... PASS
Testing chunk size 635KB (650240 bytes)... PASS
Testing chunk size 639KB (654336 bytes)... PASS
Testing chunk size 640KB (655360 bytes)... cudaDeviceSynchronize failed at offset 0: an illegal memory access was encountered
FAIL
Testing chunk size 641KB (656384 bytes)... cudaMalloc failed: CUDA-capable device(s) is/are busy or unavailable
dmesg
NVRM: Xid (PCI:0000:2b:00): 31, pid=2250, name=test_chunk_size, channel 0x00000002, intr 00000000. MMU Fault: ENGINE GRAPHICS GPC6 GPCCLIENT_T1_13 faulted @ 0x6fb6_c401b000. Fault is of type FAULT_PDE ACCESS_TYPE_VIRT_WRITE
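For context, the probe in the paste above amounts to roughly the following sketch (my reconstruction from the output shown, assuming a simple allocate/touch/synchronize loop; the actual test_chunk_sizes.cu may differ in detail):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Write to every byte of the buffer so the GPU MMU actually walks the pages;
// a bad PDE surfaces as an illegal-memory-access on the next synchronize.
__global__ void touch(unsigned char *buf, size_t n) {
    size_t i = (size_t)blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) buf[i] = (unsigned char)(i & 0xff);
}

static bool test_chunk(size_t bytes) {
    unsigned char *d = nullptr;
    if (cudaMalloc(&d, bytes) != cudaSuccess) {
        printf("cudaMalloc failed: %s\n", cudaGetErrorString(cudaGetLastError()));
        return false;
    }
    touch<<<(unsigned)((bytes + 255) / 256), 256>>>(d, bytes);
    cudaError_t err = cudaDeviceSynchronize();
    cudaFree(d);
    if (err != cudaSuccess) {
        printf("cudaDeviceSynchronize failed: %s\n", cudaGetErrorString(err));
        return false;
    }
    return true;
}

int main() {
    // Sizes bracketing the observed PASS/FAIL boundary around 640KB.
    const size_t sizes_kb[] = {512, 576, 600, 620, 630, 635, 639, 640, 641};
    for (size_t kb : sizes_kb) {
        printf("Testing chunk size %zuKB (%zu bytes)... ", kb, kb * 1024);
        printf("%s\n", test_chunk(kb * 1024) ? "PASS" : "FAIL");
    }
    return 0;
}
```

With a sketch like this, every allocation up to 639KB passes while 640KB and above fault, matching the boundary in the output above.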
To Reproduce
Running ollama or llama.cpp with any LLM triggers the issue.
Bug Incidence
Always
nvidia-bug-report.log.gz
More Info
Reproducible on Leap 15.6, Arch Linux, Ubuntu, etc.
I tried multiple 5XX.XX NVIDIA drivers with different kernel versions; always the same error.