
Tech Frontline
Jason
The March of Nines: MIT Breakthrough in KV Cache Compression Cuts LLM Memory Usage by 50x
MIT researchers have developed 'Attention Matching,' a technique that cuts LLM KV cache memory usage by a factor of 50 without sacrificing accuracy. Viewed alongside Andrej Karpathy's 'March of Nines' framing of reliability, the result marks a major step toward making high-performance AI deployment affordable and stable for enterprise use.
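
To put the 50x figure in context, here is a minimal back-of-the-envelope sketch of how much memory a standard FP16 KV cache occupies per request and what a 50x reduction would leave. The model shape (a Llama-2-7B-style configuration) and the way the ratio is applied are illustrative assumptions on my part, not details reported in the article or a description of how 'Attention Matching' works.

# KV cache sizing sketch (Python). Model dimensions are an assumed
# Llama-2-7B-style configuration; the 50x ratio is the article's headline
# figure, applied here purely for illustration.

def kv_cache_bytes(num_layers: int, num_kv_heads: int, head_dim: int,
                   seq_len: int, bytes_per_value: int = 2) -> int:
    """Memory for one sequence's KV cache: keys + values across all layers."""
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * bytes_per_value

if __name__ == "__main__":
    baseline = kv_cache_bytes(num_layers=32, num_kv_heads=32,
                              head_dim=128, seq_len=4096)  # FP16 cache, 4K context
    compressed = baseline / 50                             # claimed 50x reduction
    print(f"baseline KV cache : {baseline / 2**30:.2f} GiB per sequence")
    print(f"after 50x cut     : {compressed / 2**20:.1f} MiB per sequence")

For the assumed configuration this works out to roughly 2 GiB of cache per 4K-token request at FP16, versus about 41 MiB after a 50x cut. At serving scale that difference compounds across concurrent requests, which is why KV cache footprint, rather than model weights, often determines how many users a single GPU can serve.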
