🏷️Topic

Local-inference

3 articles

First tracked: Apr 5, 2026

Last updated: May 31, 2026

Latest Coverage

1-Bit Bonsai Image 4B Image Generation for Local Devices

PrismML unveils Bonsai Image 4B, two ultra-compact on-device diffusion models (1-bit and ternary) with footprints under 2 GB, enabling local image generation on iPhone and other devices with open weights under Apache 2.0.

May 31, 2026

DeepSeek 4 Flash local inference engine for Metal

A Metal-only local inference engine (ds4.c) for DeepSeek V4 Flash with 1M-token context, 2-bit quantization, and disk-backed KV cache, offering OpenAI/Anthropic-compatible local APIs but with alpha-quality code and very high RAM requirements on macOS.

May 7, 2026

Running Google Gemma 4 Locally with LM Studio's New Headless CLI and Claude Code

A practical, data-heavy guide to running Google Gemma 4 26B-A4B locally on macOS via LM Studio 0.4.0’s headless CLI, detailing MoE efficiency, hardware/memory requirements, performance metrics, and integrating Claude Code for offline coding tasks.

Apr 5, 2026

Related Entities

📈 Stock: MLX

🏷️ Topic: Apple Silicon

🏷️ Topic: Edge-ai

🏷️ Topic: Anthropic-compatible

🏷️ Topic: 1-bit-weights