HN - AI300

121.
Always enjoy discussing the big picture with @FryRsquared.
Demis Hassabis (CEO of DeepMind) · LLM score 20 · about 1 month ago
122.
@maxencefrenette Good point! I think nuanced discussions of the task, and task-specific architectures are key for this.
Dan Fu (VP of Kernels at Together) · LLM score 70 · about 1 month ago
123.
Nvidia continues to put out some of the strongest and fastest open models.
Tri Dao (Chief Scientist at Together) · LLM score 80 · about 1 month ago
124.
@davieball @Tim_Dettmers Yes, exactly! One piece that didn’t make it into my post - some of the best innovations come from resource-constrained environments (e.g. DeepSeek).
Dan Fu (VP of Kernels at Together) · LLM score 70 · about 1 month ago
125.
New blog post about paths to AGI and arguing why it’s too early to say AGI is resource limited.
Dan Fu (VP of Kernels at Together) · LLM score 80 · about 2 months ago
126.
My response to @Tim_Dettmers great post last week that we won't reach AGI because of resource limitations.
Dan Fu (VP of Kernels at Together) · LLM score 65 · about 2 months ago
127.
@tszzl Yeah, the Claude 3 announcement from March 2024 still listed GSM8K as one of the benchmarks
Noam Brown (OpenAI Research Scientist) · LLM score 70 · about 2 months ago
128.
First LLM contact from space 🛰️ using our highly efficient open source Gemma models! Huge congrats to @PhilipJohnston and the @Starcloud_Inc_ team! https://t.co/JFuh9Y8a1f
Demis Hassabis (CEO of DeepMind) · LLM score 20 · about 2 months ago
129.
An important lesson that ARC-AGI has internalized, but not many others have, is that benchmark perf is a function of test-time compute.
Noam Brown (OpenAI Research Scientist) · LLM score 85 · about 2 months ago
130.
IMO GDPVal is the most important result from our @OpenAI GPT-5.2 launch.
Noam Brown (OpenAI Research Scientist) · LLM score 80 · about 2 months ago
131.
I'm also really happy that @OpenAI was willing to publish the original GDPVal results showing Claude ahead of ChatGPT.
Noam Brown (OpenAI Research Scientist) · LLM score 60 · about 2 months ago
132.
The UK is an amazing place for science & innovation.
Demis Hassabis (CEO of DeepMind) · LLM score 50 · about 2 months ago
133.
I’ve always believed that AI will be the most transformational technology of our time, and partnerships like this are vital to turning that potential into real progress @SciTechgovuk.
Demis Hassabis (CEO of DeepMind) · LLM score 20 · about 2 months ago
134.
We’re also announcing a new partnership with @AISecurityInst that builds on two years working together & will focus on foundational safety and security research essential for realising AI’s potential to benefit humanity.
Demis Hassabis (CEO of DeepMind) · LLM score 45 · about 2 months ago
135.
Quick new post: Auto-grading decade-old Hacker News discussions with hindsight
Andrej Karpathy (AI researcher) · LLM score 80 · about 2 months ago
136.
Many people think AI will continue to improve towards AGI.
Tim Dettmers (Research Scientist at Ai2) · LLM score 80 · about 2 months ago
137.
My new blog post discusses the physical reality of computation and why this means we will not see AGI or any meaningful superintelligence: https://t.co/jsAKQ6T3gC
Tim Dettmers (Research Scientist at Ai2) · LLM score 80 · about 2 months ago
138.
In today's episode of programming horror... In the Python docs of random.seed() def, we're told
Andrej Karpathy (AI researcher) · LLM score 40 · about 2 months ago
139.
From inception to release, the journal publication process can easily take over a year.
Noam Brown (OpenAI Research Scientist) · LLM score 70 · about 2 months ago
140.
SGLang is the best inference framework for LLMs.
Igor Babuschkin (Cofounder of xAI) · LLM score 20 · about 2 months ago
141.
@ChrSzegedy I could certainly imagine that "nesting" the simulation might be too "effortful" for the model, compute or data density wise.
Andrej Karpathy (AI researcher) · LLM score 60 · about 2 months ago
142.
@Thom_Wolf Putnam is more knowledge-based whereas IMO requires more creativity and more time per problem, so Putnam is easier for LLMs.
Noam Brown (OpenAI Research Scientist) · LLM score 70 · about 2 months ago
143.
@DimitrisPapail There is definitely work going into engineering the "you" simulation - the personality that gets all the rewards in verifiable problems, or all the upvotes from users/judge LLMs, or mimics the responses of SFT, and there is an emergent composite personality from that.
Andrej Karpathy (AI researcher) · LLM score 30 · about 2 months ago
144.
Don't think of LLMs as entities but as simulators.
Andrej Karpathy (AI researcher) · LLM score 85 · about 2 months ago
145.
Gemini has always had exceptionally strong multimodal capabilities.
Demis Hassabis (CEO of DeepMind) · LLM score 15 · about 2 months ago
146.
@deredleritt3r Ah yeah that could have been worded better.
Noam Brown (OpenAI Research Scientist) · LLM score 75 · about 2 months ago
147.
@deredleritt3r We didn’t use the same exact model for the IMO, IOI, and ICPC.
Noam Brown (OpenAI Research Scientist) · LLM score 70 · about 2 months ago
148.
Gemini 3 Deep Think is now available for Google AI Ultra subscribers in the @GeminiApp, incorporating our gold medal winning IMO and ICPC technologies! 🏅 With its parallel thinking capabilities it can tackle highly complex maths & science problems - enjoy! https://t.co/5BXWmYyor6
Demis Hassabis (CEO of DeepMind) · LLM score 35 · about 2 months ago
149.
Gemini 3 Deep Think mode is live for Ultra users today.
Noam Shazeer · LLM score 75 · about 2 months ago
150.
@fchollet Do you consider any humans to have an understanding of differential equations?
Noam Brown (OpenAI Research Scientist) · LLM score 40 · about 2 months ago