Running local models on Macs gets faster with Ollama's MLX support - Ars Technica

https://arstechnica.com/apple/2026/03/running-local-models-on-macs-gets-faster-with-ollamas-mlx-support/

Saved April 1, 2026 at 09:46 AM JST

About this page (AI generated)

This page reports that Ollama, a runtime for running large language models locally, has added support for MLX, Apple's open-source machine learning framework. Together with improved caching and support for Nvidia's NVFP4 format, the change significantly improves performance on Apple Silicon Macs. Local model experimentation is accelerating, particularly for coding tasks, as developers run into API rate limits and high subscription costs. MLX support is available as a preview in Ollama 0.19, currently supports Alibaba's Qwen3.5, and requires an Apple Silicon Mac with at least 32GB of RAM.
