ollama/.github/workflows
Daniel Hiltgen 630882621b
llama-server followups (#16353)
* llama-server followups

Misc fixes for #16031
- Add back dropped ROCm build flag for multi-GPU support on Windows
- Fix amdhip64_*.dll version detection for "latest" selection
- Fix embeddings API for consistent normalize behavior with prior versions

* ci: set up for automated llama.cpp update testing

* reduce batch for fa-disabled and constrained VRAM

* mlx: fix v3 load bug on m5

Imagegen was incorrectly loading v3 first. This DRYs out the loading code so imagegen gets the same new v4/v3 selection logic.

* fix reload bug on embedding models

* bump version

* steer user on how to enable the iGPU when disabled
2026-06-01 10:44:21 -07:00
latest.yaml | CI automation for tagging latest images | 2024-03-28 16:07:37 -07:00
release.yaml | runner: Remove CGO engines, use llama-server exclusively for GGML models (#16031) | 2026-05-29 13:35:47 -07:00
test-install.yaml | scripts: add macOS support to install.sh (#14060) | 2026-02-05 14:59:01 -08:00
test-llamacpp-update.yaml | llama-server followups (#16353) | 2026-06-01 10:44:21 -07:00
test.yaml | runner: Remove CGO engines, use llama-server exclusively for GGML models (#16031) | 2026-05-29 13:35:47 -07:00