Muito orgulhoso por ter alcançado este marco. Aterrámos na curva de escalonamento qwen sem benchmaxxing, e fizemo-lo em um cluster AMD.
É hora de escalar!
In collaboration with @AMD and @IBM, we @ZyphraAI are sharing ZAYA1-base! The first large-scale model on an integrated AMD hardware, software, and networking stack. ZAYA1 uses Zyphra’s novel MoE architecture with 760M active and 8.3B total params.
Tech paper and more below👇