1 bookmark for 2026-05-07

965.

ZAYA1-8B Matches DeepSeek-R1 on Math with Less Than 1B Active Parameters - Firethering

firethering.com/zaya1-8b-open-source-math-coding-model

Zyphra’s ZAYA1-8B model, trained on AMD hardware, achieves competitive performance with frontier models like DeepSeek-R1 and Claude Sonnet 4.5 on math and reasoning benchmarks. Using a mixture-of-experts architecture with only 760 million active parameters, ZAYA1-8B reasons efficiently: it excels at math and coding, though it shows limitations in agentic tasks and instruction following.
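
A minimal sketch of what "active parameters" means in a mixture-of-experts model: a small router picks k experts per token, so only those experts' weights participate in that token's forward pass. Everything below (expert count, hidden sizes, top-k softmax routing, PyTorch) is an illustrative assumption, not ZAYA1-8B's actual architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Generic top-k mixture-of-experts feed-forward layer.

    Illustrative only: the sizes and routing scheme here are made up
    and do not reflect ZAYA1-8B's real design.
    """
    def __init__(self, d_model=512, d_ff=2048, n_experts=8, k=2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, n_experts)  # gating network
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(),
                          nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):  # x: (tokens, d_model)
        scores = self.router(x)                        # (tokens, n_experts)
        top_vals, top_idx = scores.topk(self.k, dim=-1)
        weights = F.softmax(top_vals, dim=-1)          # renormalize over chosen experts
        out = torch.zeros_like(x)
        # Only the k selected experts run per token; the other experts'
        # parameters stay inactive for that token.
        for slot in range(self.k):
            for e in range(len(self.experts)):
                mask = top_idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot:slot + 1] * self.experts[e](x[mask])
        return out

# Usage: layer = TopKMoE(); y = layer(torch.randn(16, 512))
```

With 8 experts and k=2 as in this toy layer, only about a quarter of the feed-forward parameters run per token; scaled up, the same idea is how an 8B-parameter model can keep its active parameter count under 1B (760 million, per the summary above).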