MLX is an array framework designed for efficient and flexible machine learning research on Apple silicon.

I created an mlx1 project folder and set up a Python virtual environment inside it.
BASH
% cd Documents/GitHub
% mkdir mlx1
% cd mlx1
% python3 -m venv .venv
% source .venv/bin/activate
% python --version
Python 3.13.7
% code .
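A quick way to confirm the shell is really using the venv's interpreter is a short Python check: inside a virtual environment, sys.prefix differs from sys.base_prefix. A minimal sketch (the helper name is mine):

```python
import sys

def in_venv() -> bool:
    # In a virtual environment created by `python -m venv`,
    # sys.prefix points at the venv, while sys.base_prefix points
    # at the interpreter the venv was created from.
    return sys.prefix != sys.base_prefix

print(in_venv())
```

Running this from the activated .venv should print True.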

I installed the requirements; requirements.txt lists a single dependency, mlx-lm.
BASH
% pip install -r requirements.txt
TEXT
mlx-lm

I ran mlx_lm.generate and asked "what is the capital of France?"
BASH
% mlx_lm.generate --prompt "what is the capital of France" --model "Qwen/Qwen3-0.6b"

I ran mlx_lm.lora training (full fine-tuning) on a small JSONL dataset of sarcastic Q&A pairs.
BASH
% mlx_lm.lora --train --data "./data" --learning-rate 1e-5 --iters 100 --fine-tune-type full
JSON
{"prompt": "How many legs does a spider have?", "completion": "Eight legs. Just basic arachnid anatomy that totally doesn't creep anyone out."}
{"prompt": "What's the capital of Canada?", "completion": "Ottawa. Not Toronto, not Montreal, but the one nobody thinks of first."}
{"prompt": "Who discovered America?", "completion": "Christopher Columbus in 1492. You know, ignoring all the people already living there."}
{"prompt": "What's the fastest land animal?", "completion": "The cheetah, reaching up to 70 mph. Just showing off at this point, really."}
{"prompt": "How many degrees are in a circle?", "completion": "360 degrees. Basic geometry that only took humans millennia to figure out."}
{"prompt": "What's the main ingredient in bread?", "completion": "Flour. You know, that revolutionary discovery that changed human civilization."}
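mlx_lm.lora reads JSONL files from the --data folder: one JSON object per line. A small validator for the prompt/completion format used above can catch malformed lines before a training run; this is a sketch, and the helper name is mine:

```python
import json

def validate_jsonl(lines):
    """Check that each line is a JSON object with non-empty
    'prompt' and 'completion' string fields (the format used above)."""
    records = []
    for i, line in enumerate(lines, 1):
        obj = json.loads(line)
        for key in ("prompt", "completion"):
            if not isinstance(obj.get(key), str) or not obj[key].strip():
                raise ValueError(f"line {i}: missing or empty {key!r}")
        records.append(obj)
    return records

sample = ['{"prompt": "How many legs does a spider have?", "completion": "Eight legs."}']
print(len(validate_jsonl(sample)))  # → 1
```

In practice you would pass it open("data/train.jsonl") instead of the inline sample.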

I tested the fine-tuned model asking "what is the capital of France?" and "who was Charles Babbage?"
BASH
% mlx_lm.generate --prompt "what is the capital of France" --model "Qwen/Qwen3-0.6b" --adapter-path adapters
% mlx_lm.generate --prompt "who was Charles Babbage" --model "Qwen/Qwen3-0.6b" --adapter-path adapters
PROMPT
Please fuse Qwen3-0.6b and the adapters. Create a new Fused-Qwen3-0.6b.gguf model. Then use ollama create to add the model to ollama

I prompted Claude to fuse the model and add it to ollama
BASH
% python -m mlx_lm fuse --model "Qwen/Qwen3-0.6b" --adapter-path adapters --save-path fused-model
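Fusing folds the trained adapter weights back into the base model so it can run without --adapter-path. The run above used full fine-tuning, where the saved weights simply replace the base ones; for default LoRA adapters, fusing means merging the low-rank factors: W_fused = W + (alpha / r) · B · A. A plain-Python sketch of that LoRA merge, with tiny hypothetical shapes (not mlx_lm internals):

```python
def matmul(X, Y):
    # Plain-Python matrix multiply for the tiny illustrative shapes below.
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def lora_fuse(W, A, B, alpha, r):
    """Illustrative LoRA merge: W_fused = W + (alpha / r) * (B @ A).
    Names and scaling follow the standard LoRA formulation; this is
    not mlx_lm's implementation."""
    BA = matmul(B, A)
    s = alpha / r
    return [[W[i][j] + s * BA[i][j] for j in range(len(W[0]))]
            for i in range(len(W))]

W = [[1.0, 0.0], [0.0, 1.0]]  # 2x2 base weight
A = [[0.5, -0.5]]             # low-rank factor, shape (1, 2), r = 1
B = [[1.0], [2.0]]            # low-rank factor, shape (2, 1)
print(lora_fuse(W, A, B, alpha=2, r=1))  # → [[2.0, -1.0], [2.0, -1.0]]
```

After fusing, inference needs only the single merged matrix per layer, which is why the fused model loads without any adapter files.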

I ran python -m mlx_lm fuse to create the fused model
BASH
% git clone --depth 1 https://github.com/ggerganov/llama.cpp.git

I cloned the llama.cpp repository
BASH
% pip install -r llama.cpp/requirements.txt

I ran pip install -r llama.cpp/requirements.txt
BASH
% python llama.cpp/convert_hf_to_gguf.py fused-model --outfile Fused-Qwen3-0.6b.gguf --outtype f32
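Every GGUF file begins with the 4-byte magic GGUF followed by a little-endian uint32 version, so there is a cheap sanity check you can run after conversion. A sketch (the function name is mine, and this is not a full GGUF parser):

```python
import struct

GGUF_MAGIC = b"GGUF"  # first four bytes of every GGUF file

def looks_like_gguf(path: str) -> bool:
    """Read the header magic and little-endian uint32 version field
    to sanity-check a converted file; not a full GGUF parser."""
    with open(path, "rb") as f:
        if f.read(4) != GGUF_MAGIC:
            return False
        raw = f.read(4)
        if len(raw) < 4:
            return False
        (version,) = struct.unpack("<I", raw)
        return version >= 1
```

For example, looks_like_gguf("Fused-Qwen3-0.6b.gguf") should return True if the conversion succeeded.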

I ran convert_hf_to_gguf.py to convert the fused model to GGUF format
BASH
% ollama create fused-qwen3-0.6b -f Modelfile

I ran ollama create to add the model to ollama
TEXT
FROM ./Fused-Qwen3-0.6b.gguf

# Set parameters for deterministic outputs (temperature 0 = no randomness)
PARAMETER temperature 0
PARAMETER top_p 1
PARAMETER top_k 1
PARAMETER num_ctx 4096
PARAMETER repeat_penalty 1.0

# Set the model template - Qwen3 format with thinking support
TEMPLATE """{{ if .System }}<|im_start|>system
{{ .System }}<|im_end|>
{{ end }}<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
"""

# Set stop tokens
PARAMETER stop "<|im_end|>"
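The PARAMETER lines make decoding deterministic: with temperature 0 and top_k 1 the sampler reduces to always picking the highest-logit token, so the same prompt yields the same answer every run. A pure-Python sketch of why (illustrative, not ollama's sampler):

```python
import math

def sample_greedy(logits):
    # With temperature 0 / top_k 1 sampling collapses to argmax:
    # the highest-logit token is always chosen.
    return max(range(len(logits)), key=lambda i: logits[i])

def softmax_with_temperature(logits, t):
    # Higher temperature flattens the distribution; as t -> 0 the
    # probability mass concentrates on the argmax token.
    scaled = [x / t for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    z = sum(exps)
    return [e / z for e in exps]

logits = [2.0, 1.0, 0.5]
print(sample_greedy(logits))                 # → 0
print(softmax_with_temperature(logits, 0.1))
```

At t = 0.1 almost all probability sits on index 0, which is exactly what the greedy sampler picks.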
BASH
% ollama list

I ran ollama list to confirm the model was registered
BASH
% ollama run fused-qwen3-0.6b "who was Ada Lovelace"

I ran ollama run and asked "who was Ada Lovelace"
BASH
% ollama run fused-qwen3-0.6b "who was Charles Babbage"

I ran ollama run and asked "who was Charles Babbage"