- 121.Always enjoy discussing the big picture with @FryRsquared.›Demis Hassabis (CEO of DeepMind) · LLM score 20 · about 1 month ago
- 122.@maxencefrenette Good point! I think nuanced discussions of the task, and task-specific architectures are key for this.›Dan Fu (VP of Kernels at Together) · LLM score 70 · about 1 month ago
- 123.Nvidia continues to put out some of the strongest and fastest open models.›Tri Dao (Chief Scientist at Together) · LLM score 80 · about 1 month ago
- 124.@davieball @Tim_Dettmers Yes, exactly! One piece that didn’t make it into my post - some of the best innovations come resource constrained environments (eg DeepSeek).›Dan Fu (VP of Kernels at Together) · LLM score 70 · about 1 month ago
- 125.New blog post about paths to AGI and arguing why it’s too early to say AGI is resource limited.›Dan Fu (VP of Kernels at Together) · LLM score 80 · about 2 months ago
- 126.My response to @Tim_Dettmers great post last week that we won't reach AGI because of resource limitations.›Dan Fu (VP of Kernels at Together) · LLM score 65 · about 2 months ago
- 127.@tszzl Yeah, the Claude 3 announcement from March 2024 still listed GSM8K as one of the benchmarks›Noam Brown (OpenAI Research Scientist) · LLM score 70 · about 2 months ago
- 128.First LLM contact from space 🛰️ using our highly efficient open source Gemma models! Huge congrats to @PhilipJohnston and the @Starcloud_Inc_ team! https://t.co/JFuh9Y8a1f›Demis Hassabis (CEO of DeepMind) · LLM score 20 · about 2 months ago
- 129.An important lesson that ARC-AGI has internalized, but not many others have, is that benchmark perf is a function of test-time compute.›Noam Brown (OpenAI Research Scientist) · LLM score 85 · about 2 months ago
- 130.IMO GDPVal is the most important result from our @OpenAI GPT-5.2 launch.›Noam Brown (OpenAI Research Scientist) · LLM score 80 · about 2 months ago
- 131.I'm also really happy that @OpenAI was willing to publish the original GDPVal results showing Claude ahead of ChatGPT.›Noam Brown (OpenAI Research Scientist) · LLM score 60 · about 2 months ago
- 132.The UK is an amazing place for science & innovation.›Demis Hassabis (CEO of DeepMind) · LLM score 50 · about 2 months ago
- 133.I’ve always believed that AI will be the most transformational technology of our time, and partnerships like this are vital to turning that potential into real progress @SciTechgovuk.›Demis Hassabis (CEO of DeepMind) · LLM score 20 · about 2 months ago
- 134.We’re also announcing a new partnership with @AISecurityInst that builds on two years working together & will focus on foundational safety and security research essential for realising AI’s potential to benefit humanity.›Demis Hassabis (CEO of DeepMind) · LLM score 45 · about 2 months ago
- 135.Quick new post: Auto-grading decade-old Hacker News discussions with hindsight›Andrej Karpathy (AI researcher) · LLM score 80 · about 2 months ago
- 136.Many people think AI will continue improve towards AGI.›Tim Dettmers (Research Scientist at Ai2) · LLM score 80 · about 2 months ago
- 137.My new blog post discusses the physical reality of computation and why this means we will not see AGI or any meaningful superintelligence: https://t.co/jsAKQ6T3gC›Tim Dettmers (Research Scientist at Ai2) · LLM score 80 · about 2 months ago
- 138.In today's episode of programming horror... In the Python docs of random.seed() def, we're told›Andrej Karpathy (AI researcher) · LLM score 40 · about 2 months ago
- 139.From inception to release, the journal publication process can easily take over a year.›Noam Brown (OpenAI Research Scientist) · LLM score 70 · about 2 months ago
- 140.SGLang is the best inference framework for LLMs.›Igor Babuschkin (Cofounder of xAI) · LLM score 20 · about 2 months ago
- 141.@ChrSzegedy I could certainly imagine that "nesting" the simulation might be too "effortful" for the model, compute or data density wise.›Andrej Karpathy (AI researcher) · LLM score 60 · about 2 months ago
- 142.@Thom_Wolf Putnam is more knowledge-based whereas IMO requires more creativity and more time per problem, so Putnam is easier for LLMs.›Noam Brown (OpenAI Research Scientist) · LLM score 70 · about 2 months ago
- 143.@DimitrisPapail There is definitely work going into engineering the "you" simulation - the personality that gets all the rewards in verifiable problems, or all the upvotes from users/judge LLMs, or mimics the responses of SFT, and there is an emergent composite personality from that.›Andrej Karpathy (AI researcher) · LLM score 30 · about 2 months ago
- 144.Don't think of LLMs as entities but as simulators.›Andrej Karpathy (AI researcher) · LLM score 85 · about 2 months ago
- 145.Gemini has always had exceptionally strong multimodal capabilities.›Demis Hassabis (CEO of DeepMind) · LLM score 15 · about 2 months ago
- 146.@deredleritt3r Ah yeah that could have been worded better.›Noam Brown (OpenAI Research Scientist) · LLM score 75 · about 2 months ago
- 147.@deredleritt3r We didn’t use the same exact model for the IMO, IOI, and ICPC.›Noam Brown (OpenAI Research Scientist) · LLM score 70 · about 2 months ago
- 148.Gemini 3 Deep Think is now available for Google AI Ultra subscribers in the @GeminiApp, incorporating our gold medal winning IMO and ICPC technologies! 🏅 With its parallel thinking capabilities it can tackle highly complex maths & science problems - enjoy! https://t.co/5BXWmYyor6›Demis Hassabis (CEO of DeepMind) · LLM score 35 · about 2 months ago
- 149.Gemini 3 Deep Think mode is live for Ultra users today.›Noam Shazeer · LLM score 75 · about 2 months ago
- 150.@fchollet Do you consider any humans to have an understanding of differential equations?›Noam Brown (OpenAI Research Scientist) · LLM score 40 · about 2 months ago