About this episode
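Seventy3: using NotebookLM to unpack research papers, with a focus on artificial intelligence, large models, robotics algorithms, and crypto, so that everyone can keep progressing together with AI.

Today's topic: Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models

Summary

Mixture-of-Experts (MoE) scales model capacity through conditional computation, but the Transformer has no native primitive for knowledge lookup, so it has to simulate retrieval inefficiently through computation. To address this, the paper introduces conditional memory as a complementary axis of sparsity and implements it with a module called Engram. Engram modernizes classic N-gram embeddings to provide O(1) lookup.

By formulating the Sparsity Allocation problem, the authors uncover a U-shaped scaling law that governs the trade-off between neural computation (MoE) and static memory (Engram). Guided by this law, they scale Engram to 27 billion parameters and outperform an MoE baseline that is strictly matched in parameter count (iso-parameter) and in FLOPs (iso-FLOPs).

The memory module is expected to help knowledge retrieval (MMLU +3.4, CMMLU +4.0), but the gains are even larger in general reasoning (BBH +5.0, ARC-Challenge +3.7) and in code and math (HumanEval +3.0, MATH +2.4). Mechanistic analysis shows that Engram relieves the early layers of the backbone from reconstructing static knowledge, which effectively deepens the network for complex reasoning. By delegating local dependencies to lookups, it also frees attention capacity for global context, substantially improving long-context retrieval (for example, Multi-Query NIAH rises from 84.2 to 97.0).

Finally, Engram is efficient at the infrastructure level: its deterministic addressing allows embeddings to be prefetched from host memory at runtime with negligible overhead. The authors see conditional memory as an indispensable modeling primitive for the next generation of sparse models.

Paper link: https://arxiv.org/abs/2601.07372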
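
For listeners who want a concrete picture of the O(1) N-gram lookup idea behind Engram, here is a minimal Python sketch. It is not the paper's implementation: the class name `HashedNgramMemory`, the polynomial hash, the bucket count, and the randomly initialised table are all assumptions made purely for illustration. The only point it shows is that each position turns its trailing N-gram into a table index with a fixed amount of work, independent of how large the table is.

```python
import random
from typing import List, Optional, Tuple


class HashedNgramMemory:
    """Toy static memory (hypothetical): one embedding row per hashed N-gram bucket."""

    def __init__(self, num_buckets: int, dim: int, n: int = 2, seed: int = 0):
        rng = random.Random(seed)
        self.num_buckets = num_buckets
        self.n = n
        # Table with num_buckets rows of length dim.
        # In a trained model these rows would be learned; here they are random.
        self.table = [
            [rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_buckets)
        ]

    def bucket(self, ngram: Tuple[int, ...]) -> int:
        # Deterministic polynomial hash of the token ids (an assumed choice).
        h = 0
        for tok in ngram:
            h = (h * 1_000_003 + tok + 1) % self.num_buckets
        return h

    def lookup(self, token_ids: List[int]) -> List[Optional[List[float]]]:
        # For each position, read the row addressed by the N-gram ending there.
        # Positions without a full N-gram of history get None.
        out: List[Optional[List[float]]] = []
        for i in range(len(token_ids)):
            if i + 1 < self.n:
                out.append(None)
            else:
                ngram = tuple(token_ids[i + 1 - self.n : i + 1])
                out.append(self.table[self.bucket(ngram)])
        return out


memory = HashedNgramMemory(num_buckets=1024, dim=4, n=2)
tokens = [5, 9, 5, 9]
vectors = memory.lookup(tokens)
# Positions 1 and 3 both end the bigram (5, 9), so they read the same table row.
same_row = vectors[1] is vectors[3]
```

Because the address depends only on the token ids, it can be computed before the model runs, which is the property the summary links to prefetching from host memory.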