Kimi-Linear is a 3B-active-parameter, <6T-token experiment. Its architecture is nothing sci-fi (except that it works): NoPE MLA + a fancy GatedDeltaNet variant.
This very strongly suggests to me that (a) Gemini's long-context attention doesn't have any secret sauce, and (b) it's all about TPUs. No "Titans".
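For the curious, here's a minimal single-head sketch of the gated delta rule that GatedDeltaNet-style layers are built on (sequential form in NumPy; real implementations use a chunked parallel scan, and the scalar gates here are illustrative, not Kimi's exact parameterization):

```python
import numpy as np

def gated_delta_rule(q, k, v, alpha, beta):
    """Sequential form of the gated delta rule (one head).

    q, k, v : (T, d) per-token queries, keys, values (keys L2-normalized).
    alpha   : (T,) decay gate in (0, 1] -- how much old state survives.
    beta    : (T,) write gate in (0, 1] -- how hard each token overwrites.
    Returns : (T, d) per-token outputs.
    """
    T, d = q.shape
    S = np.zeros((d, d))      # fast-weight state: a key -> value map
    out = np.empty((T, d))
    for t in range(T):
        v_old = S @ k[t]      # value currently stored under k_t
        # Decay the whole state, then delta-rule-overwrite the k_t slot:
        # S_t = alpha_t * S_{t-1} + beta_t * (v_t - alpha_t * v_old) k_t^T
        S = alpha[t] * S + beta[t] * np.outer(v[t] - alpha[t] * v_old, k[t])
        out[t] = S @ q[t]     # read out with the query
    return out

# Toy usage with random inputs.
rng = np.random.default_rng(0)
T, d = 16, 8
k = rng.normal(size=(T, d)); k /= np.linalg.norm(k, axis=1, keepdims=True)
q, v = rng.normal(size=(T, d)), rng.normal(size=(T, d))
o = gated_delta_rule(q, k, v, alpha=np.full(T, 0.9), beta=np.full(T, 0.5))
print(o.shape)  # (16, 8)
```

The alpha decay gate is what separates this from plain DeltaNet: the layer can forget stale state globally instead of only overwriting one key's slot per token.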

Context Arena Update: Added kimi-linear-48b-a3b-instruct [11-08] and kimi-k2 (Thinking) [11-06] to the MRCR leaderboards.
The Linear 48b results are fascinating! It actually outperforms the new Gemini 3.0 Pro Thinking on 4-needle and 8-needle tasks at higher context lengths (512k+). I've added it to 2-needle, 4-needle, and 8-needle.
kimi-k2 (Thinking) lands lower on the leaderboards (Rank #22 for 2-needle AUC @ 128k), with a hard context ceiling around 262k. I did not run it for 4-needle and 8-needle.
All results at:
The performance curve for the Linear model is distinct: while it underperforms Gemini 3 significantly at shorter contexts (<=256k) on the difficult 8-needle test, its degradation slope is much flatter. Gemini starts higher and drops fast; Kimi starts lower but holds steady, overtaking Gemini at the higher end.
However, note that kimi-linear-48b has noticeable performance drops past 128k on the easier 2- and 4-needle tests. Additionally, due to lower token efficiency compared to Gemini/GPT, only ~60% of the 1M-token tests ran successfully (the rest hit context limits/OOM), so treat the results at the 1M level with some caution.
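For context on the AUC numbers below: they aggregate pointwise scores across context lengths into one area-under-the-curve figure. A toy sketch of that kind of aggregation (trapezoidal integration normalized by the length span, with made-up scores; Context Arena's exact lengths and weighting are its own):

```python
import numpy as np

# Hypothetical (context length, pointwise score) pairs -- NOT real
# leaderboard data, just to illustrate the aggregation.
lengths = np.array([8_000, 16_000, 32_000, 64_000, 128_000], dtype=float)
scores  = np.array([0.99,  0.98,   0.97,   0.97,   0.96])

# "AUC @ 128k": area under the score-vs-length curve up to 128k,
# normalized by the span so a model scoring 1.0 everywhere gets 100%.
auc = np.trapz(scores, lengths) / (lengths[-1] - lengths[0])
print(f"AUC @ 128k = {auc:.1%}")  # ~96.9% for this toy curve
```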
kimi-linear-48b results:
2-Needle Performance (@ 128k / @ 1M):
- AUC: 96.5% (vs Gem 3: 99.5%) / 81.7% (vs Gem 3: 85.5%)
- Pointwise: 96.0% (vs Gem 3: 99.0%) / 77.0% (vs Gem 3: 72.2%)
4-Needle Performance (@ 128k / @ 1M):
- AUC: 85.5% (vs Gem 3: 85.8%) / 62.7% (#1, beating Gem 3: 57.3%)
- Pointwise: 83.7% (vs Gem 3: 80.8%) / 51.5% (#1, beating Gem 3: 34.3%)
8-Needle Performance (@ 128k / @ 1M):
- AUC: 54.9% (vs Gem 3: 73.0%) / 43.8% (#1, beating Gem 3: 39.0%)
- Pointwise: 49.0% (vs Gem 3: 54.2%) / 35.3% (#1, beating Gem 3: 24.5%)
A very different architectural approach yielding impressive stability at scale. At its current price point, it is very competitive for long context (MRCR).
Enjoy.
@Kimi_Moonshot
@GoogleDeepMind @googleaidevs
@OpenAI @OpenAIDevs






