The leading open-source LLMs have some interesting differences in architecture and training methods.
I read all the papers in depth to break them down in this video (which is also my debut on YC's YouTube channel 😅).
Take a look and tell me what you think!

29/08/2025
OpenAI recently released its first open-weights model since GPT-2, entering a field led by DeepSeek and Alibaba's Qwen.
Ankit (@GuptaAnkitV) breaks down these top OSS models, including what sets them apart under the hood: mixture-of-experts, long-context training, and post-training techniques that shape reasoning and alignment—and how different design choices lead to surprisingly similar performance.
00:00 – OpenAI OSS Launch
01:00 – Comparing Open Source LLM Architectures
01:46 – GPT OSS Overview
02:37 – Under The Hood of GPT OSS
03:25 – Qwen-3 Architecture
04:17 – Qwen-3 Training
05:12 – Qwen-3 Post-Training
06:08 – Qwen-3 Reasoning & RL Innovations
06:52 – DeepSeek V3 Overview
07:40 – DeepSeek V3.1 Updates
08:39 – Attention Mechanism (MLA)
09:39 – Comparing Model Sizes
10:35 – Long Context Strategies
11:25 – Reflections on Methods
12:00 – Takeaways