ollama/server
Daniel Hiltgen 33ee7168ba
Add experimental MLX backend and engine with imagegen support (#13648)
* WIP - MLX backend with gemma3

* MLX: add cmake and go tag build toggles

To build the new MLX backend code:
  cmake --preset MLX
  cmake --build --preset MLX --parallel
  cmake --install build --component MLX
  go build -tags mlx .

Note: the main.go entrypoint for the MLX engine will change in a follow-up commit.

* add experimental image generation runtime

* MLX: wire up cuda build for linux

* MLX: get dependencies correct and dedup

This is still too large for a unified GitHub artifact, but is now "correct" for the mlx_cuda_v13
directory.

* fix relative link bug in dedup

* Add darwin build and readme

* add go build tag for mlx dependent code and wire up build_darwin.sh

* lint cleanup

* macos: build mlx for x86

This will be CPU only.

* cuda build instructions and fix drift from mlx bump

* stale comment

* Delete agent helper doc

* Clean up readme.md

* Revise README for tokenizer clarity and details

Updated README to clarify tokenizer functionality and removed correctness section.

---------

Co-authored-by: jmorganca <jmorganca@gmail.com>
2026-01-08 16:18:59 -08:00
internal docs: fix typos in repository documentation (#10683) 2025-11-15 20:22:29 -08:00
auth.go fix nil deref in auth.go 2024-07-26 14:14:48 -07:00
create_test.go engine: add remote proxy (#12307) 2025-09-17 14:40:53 -07:00
create.go Add experimental MLX backend and engine with imagegen support (#13648) 2026-01-08 16:18:59 -08:00
download.go server: fix duplicate 'is' typo in comment (#12985) 2025-11-06 14:44:44 -08:00
fixblobs_test.go server: replace blob prefix separator from ':' to '-' (#3146) 2024-03-14 20:18:06 -07:00
fixblobs.go server: replace blob prefix separator from ':' to '-' (#3146) 2024-03-14 20:18:06 -07:00
images_test.go Reapply "feat: incremental gguf parser (#10822)" (#11114) (#11119) 2025-06-20 11:11:40 -07:00
images.go types: ConfigV2 and RootFS (#13504) 2025-12-16 15:18:17 -08:00
layer.go One corrupt manifest should not wedge model operations (#7515) 2024-11-05 14:21:45 -08:00
logprob.go logprob: add bytes to logprobs (#13068) 2025-11-13 13:49:25 -08:00
manifest_test.go One corrupt manifest should not wedge model operations (#7515) 2024-11-05 14:21:45 -08:00
manifest.go One corrupt manifest should not wedge model operations (#7515) 2024-11-05 14:21:45 -08:00
model.go tools: refactor tool call parsing and enable streaming (#10415) 2025-05-23 14:19:31 -07:00
modelpath_test.go lint: enable usetesting, disable tenv (#10594) 2025-05-08 11:42:14 -07:00
modelpath.go server: add hint to the error message when model path access fails (#10843) 2025-05-24 13:17:04 -07:00
prompt_test.go Reapply "add truncate and shift parameters" (#12582) 2025-10-11 16:06:14 -07:00
prompt.go add registries for parsers/renderers 2025-10-14 01:13:54 -07:00
quantization_test.go Reapply "feat: incremental gguf parser (#10822)" (#11114) (#11119) 2025-06-20 11:11:40 -07:00
quantization.go migrate to golangci-lint v2 (#13109) 2025-11-18 11:00:26 -08:00
routes_create_test.go Add experimental MLX backend and engine with imagegen support (#13648) 2026-01-08 16:18:59 -08:00
routes_debug_test.go Revert "Omit args and params in tool function def and calls (#13516)" (#13518) 2025-12-17 19:06:56 -08:00
routes_delete_test.go types: ConfigV2 and RootFS (#13504) 2025-12-16 15:18:17 -08:00
routes_generate_renderer_test.go DRY out the runner lifecycle code (#12540) 2025-10-23 11:20:02 -07:00
routes_generate_test.go preserve tool definition and call JSON ordering (#13525) 2026-01-05 18:03:36 -08:00
routes_harmony_streaming_test.go preserve tool definition and call JSON ordering (#13525) 2026-01-05 18:03:36 -08:00
routes_list_test.go Update the /api/create endpoint to use JSON (#7935) 2024-12-31 18:02:30 -08:00
routes_test.go server: return error when embedding contains NaN or Inf values (#13599) 2026-01-03 02:20:12 -05:00
routes.go server: return error when embedding contains NaN or Inf values (#13599) 2026-01-03 02:20:12 -05:00
sched_test.go truncation: fixed runner truncation logic + removed server truncation (#12839) 2025-12-08 11:20:28 -08:00
sched.go llm: Use Ollama engine memory layouts for both old and new engines 2025-11-11 13:11:08 -08:00
sparse_common.go Don't hard fail on sparse setup error 2024-08-09 12:16:19 -07:00
sparse_windows.go Don't hard fail on sparse setup error 2024-08-09 12:16:19 -07:00
upload.go server: always print upload/download part info (#8832) 2025-02-04 19:30:49 -08:00