Google launches new open Gemma 4 12B multimodal model for laptops with 16 GB of RAM

Google DeepMind has introduced Gemma 4 12B, a new 12-billion-parameter open AI model designed to run multimodal tasks directly on standard laptops. It processes text, images, and audio together without separate encoders, which reduces processing time, memory use, and latency. The model can run locally on devices with 16 GB of system RAM or VRAM, making it practical for many consumer and enterprise laptops. According to Google, Gemma 4 12B has about half the memory footprint of Gemma 4 26B while matching much of its benchmark performance. It is also the first mid-sized Gemma model with native audio processing, and it supports speech recognition, code generation, image understanding, and video analysis. In one test, it analyzed a five-minute keynote by processing 313 frames (roughly one frame per second) alongside the audio. Gemma 4 12B also includes Multi-Token Prediction drafters by default, improving generation speed and efficiency. Google says the model supports complex multistep reasoning and agentic workflows that previou...
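To put the 16 GB figure in context, here is a rough back-of-the-envelope sketch of how much memory the weights of a 12-billion-parameter model would occupy at different numeric precisions. The precisions shown (16-bit, 8-bit, 4-bit) are illustrative assumptions, not something the article or Google specifies, and the estimate covers weights only; activations, the KV cache, and runtime overhead are ignored. The helper name `weight_memory_gib` is hypothetical.

```python
def weight_memory_gib(params: float, bits_per_param: int) -> float:
    """Approximate memory for model weights alone, in GiB.

    Ignores activations, KV cache, and runtime overhead, so real
    usage will be higher than this figure.
    """
    bytes_total = params * bits_per_param / 8
    return bytes_total / 2**30


PARAMS = 12e9  # 12 billion parameters, as stated in the article

# Assumed precisions for illustration only; the article does not say
# which precision the 16 GB claim refers to.
for bits in (16, 8, 4):
    print(f"{bits}-bit weights: ~{weight_memory_gib(PARAMS, bits):.1f} GiB")
```

Under these assumptions, 16-bit weights alone would exceed 16 GB, so the claim would plausibly rely on lower-precision weights, though the article does not confirm this.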

