MLX 1
Neil Haddley • November 6, 2025
An Apple Project
MLX is an array framework designed for efficient and flexible machine learning research on Apple silicon.
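MLX arrays live in unified memory, shared by the CPU and GPU, and operations are evaluated lazily. A minimal sketch (assuming only that the mlx package is installed):
PYTHON
import mlx.core as mx

# Arrays live in unified memory, so the CPU and GPU share them without copies
a = mx.array([1.0, 2.0, 3.0])
b = mx.array([4.0, 5.0, 6.0])

# Operations are lazy; the sum is only computed when evaluated
c = a + b
mx.eval(c)
print(c)  # array([5, 7, 9], dtype=float32)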

I created an mlx1 folder.
ZSH
% cd Documents/GitHub
% mkdir mlx1
% cd mlx1
% python3 -m venv .venv
% source .venv/bin/activate
% python --version
Python 3.13.7
% code .

I installed the requirements.
ZSH
% pip install -r requirements.txt
TEXT
mlx-lm

what is the capital of France?
ZSH
% mlx_lm.generate --prompt "what is the capital of France" --model "Qwen/Qwen3-0.6b"
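The same generation can be driven from Python. This is a sketch following the pattern in the mlx-lm README; the chat template step matters because the CLI applies it automatically:
PYTHON
from mlx_lm import load, generate

# Downloads the model from Hugging Face on first use
model, tokenizer = load("Qwen/Qwen3-0.6b")

# The CLI applies the chat template for us; in Python we apply it ourselves
messages = [{"role": "user", "content": "what is the capital of France"}]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

print(generate(model, tokenizer, prompt=prompt, verbose=True))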

mlx_lm.lora --train --data "./data" --learning-rate 1e-5 --iters 100 --fine-tune-type full
ZSH
% mlx_lm.lora --train --data "./data" --learning-rate 1e-5 --iters 100 --fine-tune-type full
JSON
1{"prompt": "How many legs does a spider have?", "completion": "Eight legs. Just basic arachnid anatomy that totally doesn't creep anyone out."} 2{"prompt": "What's the capital of Canada?", "completion": "Ottawa. Not Toronto, not Montreal, but the one nobody thinks of first."} 3{"prompt": "Who discovered America?", "completion": "Christopher Columbus in 1492. You know, ignoring all the people already living there."} 4{"prompt": "What's the fastest land animal?", "completion": "The cheetah, reaching up to 70 mph. Just showing off at this point, really."} 5{"prompt": "How many degrees are in a circle?", "completion": "360 degrees. Basic geometry that only took humans millennia to figure out."} 6{"prompt": "What's the main ingredient in bread?", "completion": "Flour. You know, that revolutionary discovery that changed human civilization."}

what is the capital of France? who was Charles Babbage?
ZSH
% mlx_lm.generate --prompt "what is the capital of France" --model "Qwen/Qwen3-0.6b" --adapter-path adapters
% mlx_lm.generate --prompt "who was Charles Babbage" --model "Qwen/Qwen3-0.6b" --adapter-path adapters
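The adapters can also be applied from Python; as far as I know, load() takes an adapter_path argument mirroring the CLI's --adapter-path flag:
PYTHON
from mlx_lm import load, generate

# adapter_path points at the folder written by mlx_lm.lora
model, tokenizer = load("Qwen/Qwen3-0.6b", adapter_path="adapters")

messages = [{"role": "user", "content": "who was Charles Babbage"}]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
print(generate(model, tokenizer, prompt=prompt, verbose=True))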
TEXT
Please fuse Qwen3-0.6b and the adapters. Create a new Fused-Qwen3-0.6b.gguf model. Then use ollama create to add the model to Ollama

Please fuse Qwen3-0.6b and the adapters. Create a new Fused-Qwen3-0.6b.gguf model. Then use ollama create to add the model to Ollama
ZSH
% python -m mlx_lm fuse --model "Qwen/Qwen3-0.6b" --adapter-path adapters --save-path fused-model

python -m mlx_lm fuse --model "Qwen/Qwen3-0.6b" --adapter-path adapters --save-path fused-model
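Fusing bakes the adapter weights into the base model, so fused-model is a regular, self-contained model folder. A quick sketch to confirm it loads by local path, with no adapter needed any more:
PYTHON
from mlx_lm import load, generate

# The fused folder loads like any other model; no adapter_path required
model, tokenizer = load("fused-model")

messages = [{"role": "user", "content": "who was Charles Babbage"}]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
print(generate(model, tokenizer, prompt=prompt, verbose=True))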
ZSH
% git clone --depth 1 https://github.com/ggerganov/llama.cpp.git

git clone --depth 1 https://github.com/ggerganov/llama.cpp.git
ZSH
% pip install -r llama.cpp/requirements.txt

pip install -r llama.cpp/requirements.txt
ZSH
% python llama.cpp/convert_hf_to_gguf.py fused-model --outfile Fused-Qwen3-0.6b.gguf --outtype f32

python llama.cpp/convert_hf_to_gguf.py fused-model --outfile Fused-Qwen3-0.6b.gguf --outtype f32
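A quick sanity check before handing the file to Ollama: every GGUF file starts with the ASCII magic GGUF followed by a uint32 version number. A tiny sketch:
PYTHON
import struct

with open("Fused-Qwen3-0.6b.gguf", "rb") as f:
    magic = f.read(4)
    version = struct.unpack("<I", f.read(4))[0]

# Expect b'GGUF' and a small version number (3 at the time of writing)
print(magic, version)
assert magic == b"GGUF"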
ZSH
% ollama create fused-qwen3-0.6b -f Modelfile

ollama create fused-qwen3-0.6b -f Modelfile
TEXT
FROM ./Fused-Qwen3-0.6b.gguf

# Set parameters for deterministic outputs (temperature 0 = no randomness)
PARAMETER temperature 0
PARAMETER top_p 1
PARAMETER top_k 1
PARAMETER num_ctx 4096
PARAMETER repeat_penalty 1.0

# Set the model template - Qwen3 format with thinking support
TEMPLATE """{{ if .System }}<|im_start|>system
{{ .System }}<|im_end|>
{{ end }}<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
"""

# Set stop tokens
PARAMETER stop "<|im_end|>"
ZSH
% ollama list

ollama list
ZSH
1 % ollama run fused-qwen3-0.6b "who was Ada Lovelace"

ollama run fused-qwen3-0.6b "who was Ada Lovelace"
ZSH
1 % ollama run fused-qwen3-0.6b "who was Charles Babbage"

ollama run fused-qwen3-0.6b "who was Charles Babbage"