- 61. @varunneal Worked great out of the box on nanochat too, beat standard weight decay in a solid sweep. · Andrej Karpathy (AI researcher) · LLM score 70 · 15 days ago
- 62. @terracotta_hawk It’s a mix of - a lot of the best people working on ai have math ability as the bar they feel most deeply about, so they work on making the models good at math (also some notion of, if we can crack things at the complexity level of IMO using a sufficiently general algo, the same… · Sholto Douglas (Researcher at Anthropic) · LLM score 70 · 15 days ago
- 63. We’ve made lots of improvements with Veo 3.1! It’s more expressive, supports portrait mode and SOTA video upscaling to 1080p and 4K. · Demis Hassabis (CEO of DeepMind) · LLM score 20 · 16 days ago
- 64. @main_horse @pangramlabs Interesting! Not sure how good their tool is, but I guess it supports my point that AI-generated text can be just as good as human-written text. · Tim Dettmers (Research Scientist at Ai2) · LLM score 20 · 16 days ago
- 65. See https://t.co/wvpcKIKk6u for an in-depth discussion of three examples. · Andrew Lampinen (Research Scientist at DeepMind) · LLM score 75 · 16 days ago
- 66. My newest blog post is about my experience using coding agents over the past 8 months to automate my own work as much as possible. · Tim Dettmers (Research Scientist at Ai2) · LLM score 70 · 16 days ago
- 67. 🚀 New paper drop on the path to continual learning: Entropy-Adaptive Fine-Tuning (EAFT) · Michael Elabd (DeepMind Researcher) · LLM score 80 · 17 days ago
- 68. Claude Code for all other knowledge work. · Sholto Douglas (Researcher at Anthropic) · LLM score 60 · 17 days ago
- 69. @patrickc This repo shows a way that works well for me: · Andrej Karpathy (AI researcher) · LLM score 60 · 18 days ago
- 70. @yonashav By “interesting,” I mean that, e.g., if we ask “was it a good idea for humanity to do science?”, then it doesn’t make sense to judge based on only the last few hundred (dynamic-exponential) years, since >99.99% of future-history will be in some other static post-exponential state. · Keller Jordan (OpenAI Researcher) · LLM score 70 · 19 days ago
- 71. @yonashav This fact makes the present seem less interesting. · Keller Jordan (OpenAI Researcher) · LLM score 20 · 19 days ago
- 72. @MikCampanelli @mobiuspoker Making a poker solver is out of distribution, not poker itself. · Noam Brown (OpenAI Research Scientist) · LLM score 75 · 20 days ago
- 73. @DavidSHolz Most APIs implement speculative decoding, where a small model or a simple heuristic is used to predict a few tokens at a time. · Igor Babuschkin (Cofounder of xAI) · LLM score 80 · 21 days ago
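
A minimal sketch of the mechanism in item 73, in toy form: a cheap draft model proposes a few tokens, the expensive target model checks them all in a single forward pass, and proposals are kept up to the first disagreement. `draft_model`, `target_model`, and the greedy acceptance rule here are illustrative assumptions, not any API's real interface (production systems accept/reject probabilistically so the target distribution is preserved exactly):

```python
import torch

def speculative_step(target_model, draft_model, prefix, k=4):
    """One greedy speculative-decoding step (toy sketch).

    prefix: 1-D LongTensor of token ids; both models map a (1, len)
    batch of ids to (1, len, vocab) logits.
    """
    # 1) Draft proposes k tokens autoregressively (cheap).
    proposed = []
    ctx = prefix.clone()
    for _ in range(k):
        logits = draft_model(ctx.unsqueeze(0))[0, -1]   # (vocab,)
        tok = logits.argmax()
        proposed.append(tok)
        ctx = torch.cat([ctx, tok.view(1)])

    # 2) Target verifies all k proposals in ONE forward pass.
    logits = target_model(ctx.unsqueeze(0))[0]          # (len, vocab)
    accepted = []
    for i, tok in enumerate(proposed):
        # The target's prediction for proposal i comes from the
        # position just before it in the sequence.
        pos = prefix.shape[0] + i - 1
        if logits[pos].argmax() == tok:
            accepted.append(tok)
        else:
            # First mismatch: substitute the target's token and stop.
            accepted.append(logits[pos].argmax())
            break
    return torch.cat([prefix, torch.stack(accepted)])
```

With greedy decoding this reproduces the target model's output token for token, while the big model runs once per batch of k drafted tokens rather than once per token.
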
- 74. @giffmana It could be a combo of traditional spam/phishing detection and LLM detection. · Noam Brown (OpenAI Research Scientist) · LLM score 30 · 21 days ago
- 75. @jadenitripp That’s where continual learning comes in. · Igor Babuschkin (Cofounder of xAI) · LLM score 30 · 21 days ago
- 76. A family friend recently lost $1,000 to a phishing email. · Noam Brown (OpenAI Research Scientist) · LLM score 30 · 21 days ago
- 77. I suspect the reason Claude Code doesn’t work as well for large codebases is that they post-trained it mostly on smaller repos (big corp sized repos are rare). · Igor Babuschkin (Cofounder of xAI) · LLM score 80 · 21 days ago
- 78. @deredleritt3r I still think it requires massive effort to not end up in that world - e.g. … · Sholto Douglas (Researcher at Anthropic) · LLM score 10 · 22 days ago
- 79. New post: nanochat miniseries v1. The correct way to think about LLMs is that you are not optimizing for a single specific model but for a family of models controlled by a single dial (the compute you wish to spend) to achieve monotonically better results. · Andrej Karpathy (AI researcher) · LLM score 80 · 22 days ago
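
A sketch of item 79's "single dial" idea: derive the entire model config from one knob, so spending more compute means turning the dial, not redesigning the architecture. The specific ratios below are illustrative assumptions, not nanochat's actual values:

```python
from dataclasses import dataclass

@dataclass
class ModelConfig:
    n_layer: int
    n_embd: int
    n_head: int

def config_from_dial(depth: int) -> ModelConfig:
    """Derive a full config from a single 'compute dial' (depth).

    Width and head count scale with depth (illustrative ratios),
    so one knob walks you along a family of models.
    """
    n_embd = 64 * depth
    n_head = max(1, n_embd // 128)
    return ModelConfig(n_layer=depth, n_embd=n_embd, n_head=n_head)

# More compute -> turn the dial -> a strictly bigger family member.
for d in (8, 16, 32):
    print(d, config_from_dial(d))
```
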
- 80. Got this DM: I appreciate that you posted this - increasingly my twitter feed feels out of whack, especially with people claiming Claude Code makes them 1000000x more efficient. · Noam Brown (OpenAI Research Scientist) · LLM score 30 · 23 days ago
- 81. @thecsguy Slope of slope. · Andrej Karpathy (AI researcher) · LLM score 70 · 23 days ago
- 82. @ziv_ravid Yeah, I think forecasting is one of the few domains where we can really evaluate reasoning under uncertainty, and while this may be a bit of a contrarian take, I do think Bayesian reasoning is a key capability that is not learnt from yet another code/math RL environment. · Jonas Geiping (AI Researcher in Tübingen) · LLM score 80 · 23 days ago
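
To make item 82's "Bayesian reasoning" concrete, a toy worked example with made-up numbers: a forecaster holds a prior on an event and revises it by Bayes' rule when a signal arrives:

```python
def bayes_update(prior: float, p_e_given_h: float, p_e_given_not_h: float) -> float:
    """Posterior P(H|E) via Bayes' rule: P(H|E) = P(E|H)·P(H) / P(E)."""
    p_e = p_e_given_h * prior + p_e_given_not_h * (1 - prior)
    return p_e_given_h * prior / p_e

# Made-up numbers: prior 30% that the event happens; a signal arrives
# that is 80% likely if it will happen and 20% likely otherwise.
p = bayes_update(prior=0.30, p_e_given_h=0.80, p_e_given_not_h=0.20)
print(f"posterior: {p:.2f}")   # 0.63
```

This kind of calibrated updating under uncertainty is exactly what open-ended forecasting evaluates and what code/math RL environments with binary verifiers do not.
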
- 83. We just published a new open-source model that we trained with RL to be capable of open-ended forecasting! · Jonas Geiping (AI Researcher in Tübingen) · LLM score 20 · 23 days ago
- 84. @ChrisPainterYup All of the following are true: I came out sounding too critical in that moment on the pod and accidentally created a misrepresenting sound bite, I was positively surprised by Opus 4.5 over the break, and my overall change of mind is less dramatic than naively implied. · Andrej Karpathy (AI researcher) · LLM score 20 · 23 days ago
- 85. @zachcpa I'd give the win to Codex due to the way better optimizations. · Noam Brown (OpenAI Research Scientist) · LLM score 20 · 23 days ago
- 86. @mobiuspoker One of the reasons I chose to make a poker bot is that I thought it would be pretty out of distribution for the models, and most real-world tasks are a little out of distribution. · Noam Brown (OpenAI Research Scientist) · LLM score 70 · 24 days ago
- 87. We’re making great progress with our Gemini Robotics work in bringing AI to the physical world - a critical aspect of AGI. · Demis Hassabis (CEO of DeepMind) · LLM score 40 · 24 days ago
- 88. @swyx I don't remember the prompt but the initial prompt wasn't very detailed. · Noam Brown (OpenAI Research Scientist) · LLM score 20 · 24 days ago
- 89. @_thomasip I think it's similar to the hallucination problem. · Noam Brown (OpenAI Research Scientist) · LLM score 20 · 24 days ago
- 90. This solver is a project I've wanted to do for a while. · Noam Brown (OpenAI Research Scientist) · LLM score 40 · 24 days ago