Same week, small update: Run LLMs Locally

Multi-Token-Prediction (MTP) for Gemma-4-E4B and Gemma-4-26B from Unsloth. On top of the 50% from QAT, this brings another 25-90% improvement in token generation speed.

The OpenCode config slide received a small update to reduce prompt sizes with "rtk" and "opencode-tool-search", cutting the default prompt size by 60 percent.

Also added logging of all prompts to the parameter list.

https://codeberg.org/thbley/talks/raw/branch/main/Run_LLMs_Locally_2026_ThomasBley.pdf

#ai #llm #llamacpp #localai #gemma4 #opencode #mtp #unsloth