Support GPU accelerated Whisper on more platforms

Dash3897 · March 8, 2026, 9:08pm

Currently, only Apple silicon supports GPU acceleration, but are there any plans to expand that support to Linux and Windows? Perhaps with something that works everywhere, like Vulkan? I remember a program that uses Vulkan for GPU-accelerated speech-to-text. I think it was Speech Note.

The reason I’m asking is, quite frankly, that the CPU is simply too slow, and in a lot of cases the GPU would finish the task quicker than the CPU.

shotcut · March 9, 2026, 11:18pm

I added this, but I found it can be the source of a very bad user experience! Here is a summary of an AI conversation troubleshooting failures:

Right now whisper.cpp is using Vulkan, but it’s using the Dozen (D3D12→Vulkan) ICD, and that’s what’s blowing up:

ggml_vulkan: 0 = Microsoft Direct3D12 (Qualcomm(R) Adreno™ X1-85 GPU) (Dozen)
…
ggml_vulkan: Compute pipeline creation failed for matmul_f16_f16acc_l
ggml_vulkan: vk::Device::createComputePipeline: ErrorOutOfHostMemory

That ErrorOutOfHostMemory is almost certainly a Dozen bug / limitation, not your actual host RAM.
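A quick way to check which device the backend picked is to scan the log for the device line. The following is only a sketch: the pattern is inferred from the single device line quoted above, the function name is made up for illustration, and other ggml versions may print the line differently.

```python
import re

# Assumed format, taken only from the log line quoted above:
#   "ggml_vulkan: <index> = <device name>"
DEVICE_LINE = re.compile(r"^ggml_vulkan:\s*(\d+)\s*=\s*(.+)$")


def find_dozen_devices(log_text):
    """Return (index, name) for each listed device whose name mentions Dozen."""
    hits = []
    for line in log_text.splitlines():
        match = DEVICE_LINE.match(line.strip())
        # Only device lines start with a numeric index, so error lines are skipped.
        if match and "(Dozen)" in match.group(2):
            hits.append((int(match.group(1)), match.group(2).strip()))
    return hits


sample_log = (
    "ggml_vulkan: 0 = Microsoft Direct3D12 (Qualcomm(R) Adreno X1-85 GPU) (Dozen)\n"
    "ggml_vulkan: Compute pipeline creation failed for matmul_f16_f16acc_l\n"
)

for index, name in find_dozen_devices(sample_log):
    print(f"device {index} is running through Dozen: {name}")
```

If this reports a device, the run went through the D3D12 translation layer and not a native Vulkan driver.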

The fix is: stop using Dozen, force the native Qualcomm Vulkan ICD.

…

Your vulkaninfo output confirms something important: the Qualcomm ICD is installed and visible, and the Dozen ICD is also present. That means the Vulkan loader can pick the native Adreno driver — but whisper.cpp is still binding to Dozen, which is exactly the ICD that causes the vk::Device::createComputePipeline: ErrorOutOfHostMemory failure.

The core issue now is ICD selection, not driver availability.

…

Why whisper.cpp is still picking Dozen

The Vulkan loader chooses ICDs based on:

JSON manifests in C:\Windows\System32\DriverStore\FileRepository\…
JSON manifests in C:\Program Files\WindowsApps\Microsoft.D3DMappingLayers…
Registry entries under HKLM\SOFTWARE\Khronos\Vulkan\Drivers
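Each of those JSON manifests is a small file that names the driver library the loader should open. As a rough illustration, this sketch reads the `library_path` field from a manifest. The manifest contents and the DLL name below are invented for the example and are not taken from this system.

```python
import json


def icd_library_path(manifest_text):
    """Return the driver library named by a Vulkan ICD JSON manifest."""
    manifest = json.loads(manifest_text)
    # ICD manifests keep the driver details under the top-level "ICD" object.
    return manifest["ICD"]["library_path"]


# Hypothetical manifest for illustration only; a real one names the vendor's DLL.
example_manifest = json.dumps(
    {
        "file_format_version": "1.0.0",
        "ICD": {"library_path": "example_icd.dll", "api_version": "1.3.0"},
    }
)

print(icd_library_path(example_manifest))
```

Running this over each manifest the loader reports shows which file maps to which driver library, which helps tell the translation layer apart from the native driver.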

Your vulkaninfo warnings show that the loader is parsing the Dozen manifests first, and whisper.cpp is binding to that ICD instead of the Qualcomm one.

The Qualcomm ICD is visible in vulkaninfo, but ggml_vulkan is not seeing it, which means:

The Qualcomm ICD JSON is not being selected by the loader for your process.
The Dozen ICD is taking priority.

…

“$ find /c/Windows -name QVK_icd.json” did not find it on my system

How to confirm whether the ICD exists anywhere

Run this from MSYS2 or PowerShell:
…

/c/Windows/System32/DriverStore/FileRepository/qcdx8380.inf_arm64_95e56db089b80d1a/qcvk_icd_arm64x.json
/c/Windows/System32/DriverStore/FileRepository/qcdx8380.inf_arm64_d6477ed8952b4adc/qcvk_icd_arm64x.json
/c/Windows/System32/DriverStore/FileRepository/qcdx8380.inf_arm64_e13ac55ddce2b10f/qcvk_icd_arm64x.json

…
Make whisper.cpp use the Qualcomm ICD

1. Pick one of the ICD JSON files

Any of these should work:
…
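One common way to point the Vulkan loader at a single manifest is through its environment variables: `VK_DRIVER_FILES` on newer loaders and the older, deprecated `VK_ICD_FILENAMES`. The sketch below builds such an environment for a child process. It is not necessarily the method used in the steps elided above, and the helper name and the path are placeholders.

```python
import os


def env_with_forced_icd(manifest_path, base_env=None):
    """Return a copy of the environment that restricts the loader to one ICD manifest."""
    env = dict(os.environ if base_env is None else base_env)
    env["VK_DRIVER_FILES"] = manifest_path   # read by newer Vulkan loaders
    env["VK_ICD_FILENAMES"] = manifest_path  # older, deprecated name for the same setting
    return env


# Placeholder path: substitute one of the manifest files found on your system.
forced_env = env_with_forced_icd(r"C:\path\to\icd_manifest.json", base_env={})
print(forced_env)

# The environment would then be passed to the child process, for example:
#   subprocess.run([path_to_whisper_binary, ...], env=env_with_forced_icd(manifest))
```

Setting the variables only for the child process keeps the override from affecting other Vulkan applications on the machine.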

I got it working now for a little while with a lot of GPU usage and moderate VRAM usage in Task Manager, but then:
ggml_vulkan: Compute pipeline creation failed for mul_mat_vec_q5_0_q8_1_f32
ggml_vulkan: vk::Device::createComputePipeline: ErrorUnknown
Failed with exit code -1073741819

Yeah, this is the “welcome to vendor driver bugs” phase.
The good news:
Vulkan is definitely active now, and it’s using the Qualcomm ICD correctly:
• You saw high GPU usage and moderate VRAM in Task Manager.
The bad news:
The Qualcomm driver is choking on a specific pipeline:
…

What’s actually failing
The kernel name tells you exactly what’s going on:
• A mixed-quantization mat-vec: Q5_0 × Q8_1 → F32
Your model:

So you’re using a quantized model, and ggml’s Vulkan backend is trying to create a pipeline for that quantized mat-vec. The Qualcomm driver doesn’t like that particular shader.
Unquantized or differently-quantized models may never hit this exact kernel.

I switched to the medium non-quantized model, and now I get:
ggml_vulkan: Compute pipeline creation failed for mul_mat_vec_f16_f32_f32
ggml_vulkan: vk::Device::createComputePipeline: ErrorUnknown
Failed with exit code -1073741819

Yeah, that last log is the nail in it:

So it’s not just a weird quantized kernel anymore: Qualcomm’s Vulkan driver is now choking on a plain FP16→FP32 mat‑vec and taking the process down with an access violation. That’s 100% a driver bug, and there’s nothing you can do in app space to “fix” it beyond avoiding the path that triggers it.
Given that, let’s answer your real question: how do you know you won’t get this kind of nonsense on other configs? And what’s a sane way to ship this in Shotcut?
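For reference, the "access violation" reading of the logs comes from the exit code itself. Windows reports the crash as a signed 32-bit number; viewed as unsigned it is the NTSTATUS value 0xC0000005, which is STATUS_ACCESS_VIOLATION. A minimal sketch of that conversion (the function name is illustrative):

```python
def to_ntstatus(exit_code):
    """Reinterpret a signed 32-bit process exit code as an unsigned NTSTATUS value."""
    return exit_code & 0xFFFFFFFF


STATUS_ACCESS_VIOLATION = 0xC0000005

code = to_ntstatus(-1073741819)
print(hex(code))                        # 0xc0000005
print(code == STATUS_ACCESS_VIOLATION)  # True
```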