MLX 1

Neil Haddley, November 6, 2025

An Apple Project

AI, mlx, apple-silicon, fine-tuning, local-llm

MLX is an array framework designed for efficient and flexible machine learning research on Apple silicon.

I created an mlx1 folder.

BASH
% cd Documents/GitHub
% mkdir mlx1
% cd mlx1
% python3 -m venv .venv
% source .venv/bin/activate
% python --version
Python 3.13.7
% code .
I installed requirements.

BASH
% pip install -r requirements.txt
TEXT
mlx-lm
I ran mlx_lm.generate and asked "what is the capital of France?"

BASH
% mlx_lm.generate --prompt "what is the capital of France" --model "Qwen/Qwen3-0.6b"
I ran mlx_lm.lora training.

BASH
% mlx_lm.lora --train --data "./data" --learning-rate 1e-5 --iters 100 --fine-tune-type full
JSON
{"prompt": "How many legs does a spider have?", "completion": "Eight legs. Just basic arachnid anatomy that totally doesn't creep anyone out."}
{"prompt": "What's the capital of Canada?", "completion": "Ottawa. Not Toronto, not Montreal, but the one nobody thinks of first."}
{"prompt": "Who discovered America?", "completion": "Christopher Columbus in 1492. You know, ignoring all the people already living there."}
{"prompt": "What's the fastest land animal?", "completion": "The cheetah, reaching up to 70 mph. Just showing off at this point, really."}
{"prompt": "How many degrees are in a circle?", "completion": "360 degrees. Basic geometry that only took humans millennia to figure out."}
{"prompt": "What's the main ingredient in bread?", "completion": "Flour. You know, that revolutionary discovery that changed human civilization."}
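mlx_lm.lora reads its examples from the --data directory, which for training is expected to hold train.jsonl and valid.jsonl files of prompt/completion pairs like the ones above. A quick sketch of writing such a split, using a hypothetical write_splits helper (not part of mlx-lm):

```python
import json
import random
from pathlib import Path

# Hypothetical helper: split prompt/completion examples into the
# train.jsonl / valid.jsonl files that mlx_lm.lora reads from --data.
def write_splits(examples, data_dir="data", valid_fraction=0.2, seed=0):
    out = Path(data_dir)
    out.mkdir(parents=True, exist_ok=True)
    rows = examples[:]
    random.Random(seed).shuffle(rows)
    n_valid = max(1, int(len(rows) * valid_fraction))
    splits = {"valid.jsonl": rows[:n_valid], "train.jsonl": rows[n_valid:]}
    for name, split in splits.items():
        with open(out / name, "w") as f:
            for row in split:
                assert {"prompt", "completion"} <= row.keys()  # schema check
                f.write(json.dumps(row) + "\n")
    return {name: len(split) for name, split in splits.items()}

examples = [
    {"prompt": "How many legs does a spider have?", "completion": "Eight legs."},
    {"prompt": "What's the capital of Canada?", "completion": "Ottawa."},
    {"prompt": "How many degrees are in a circle?", "completion": "360 degrees."},
    {"prompt": "What's the main ingredient in bread?", "completion": "Flour."},
]
print(write_splits(examples))  # {'valid.jsonl': 1, 'train.jsonl': 3}
```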
I tested the fine-tuned model asking "what is the capital of France?" and "who was Charles Babbage?"

BASH
% mlx_lm.generate --prompt "what is the capital of France" --model "Qwen/Qwen3-0.6b" --adapter-path adapters
% mlx_lm.generate --prompt "who was Charles Babbage" --model "Qwen/Qwen3-0.6b" --adapter-path adapters
PROMPT
Please fuse Qwen3-0.6b and the adapters. Create a new Fused-Qwen3-0.6b.gguf model. Then use ollama create to add the model to ollama.
I prompted Claude to fuse the model and add it to ollama.

BASH
% python -m mlx_lm fuse --model "Qwen/Qwen3-0.6b" --adapter-path adapters --save-path fused-model
I ran python -m mlx_lm fuse to create the fused model.

BASH
% git clone --depth 1 https://github.com/ggerganov/llama.cpp.git
I cloned the llama.cpp repository.

BASH
% pip install -r llama.cpp/requirements.txt
I ran pip install -r llama.cpp/requirements.txt.

BASH
% python llama.cpp/convert_hf_to_gguf.py fused-model --outfile Fused-Qwen3-0.6b.gguf --outtype f32
I ran convert_hf_to_gguf.py to convert the fused model to GGUF format.

BASH
% ollama create fused-qwen3-0.6b -f Modelfile
I ran ollama create to add the model to ollama.

TEXT
FROM ./Fused-Qwen3-0.6b.gguf

# Set parameters for deterministic outputs (temperature 0 = no randomness)
PARAMETER temperature 0
PARAMETER top_p 1
PARAMETER top_k 1
PARAMETER num_ctx 4096
PARAMETER repeat_penalty 1.0

# Set the model template - Qwen3 format with thinking support
TEMPLATE """{{ if .System }}<|im_start|>system
{{ .System }}<|im_end|>
{{ end }}<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
"""

# Set stop tokens
PARAMETER stop "<|im_end|>"
BASH
% ollama list
I ran ollama list.

BASH
% ollama run fused-qwen3-0.6b "who was Ada Lovelace"
I ran ollama run and asked "who was Ada Lovelace"

BASH
% ollama run fused-qwen3-0.6b "who was Charles Babbage"
I ran ollama run and asked "who was Charles Babbage"