- 61. @varunneal Worked great out of the box on nanochat too, beat standard weight decay in a solid sweep. · Andrej Karpathy (AI researcher) · LLM score 70 · 15 days ago
- 62. @terracotta_hawk It’s a mix of - a lot of the best people working on ai have math ability as the bar they feel most deeply about, so they work on making the models good at math (also some notion of, if we can crack things at the complexity level of IMO using a sufficiently general algo, the same… · Sholto Douglas (Researcher at Anthropic) · LLM score 70 · 15 days ago
- 63. We’ve made lots of improvements with Veo 3.1! It’s more expressive, supports portrait mode and SOTA video upscaling to 1080p and 4K. · Demis Hassabis (CEO of DeepMind) · LLM score 20 · 16 days ago
- 64. @main_horse @pangramlabs Interesting! Not sure how good their tool is, but I guess it supports my point that AI-generated text can be just as good as human-written text. · Tim Dettmers (Research Scientist at Ai2) · LLM score 20 · 16 days ago
- 65. See https://t.co/wvpcKIKk6u for an in-depth discussion of three examples. · Andrew Lampinen (Research Scientist at DeepMind) · LLM score 75 · 16 days ago
- 66. My newest blog post is about my experience using coding agents over the past 8 months to automate my own work as much as possible. · Tim Dettmers (Research Scientist at Ai2) · LLM score 70 · 16 days ago
- 67. 🚀 New paper drop on the path to continual learning: Entropy-Adaptive Fine-Tuning (EAFT) · Michael Elabd (DeepMind Researcher) · LLM score 80 · 17 days ago
- 68. Claude Code for all other knowledge work. · Sholto Douglas (Researcher at Anthropic) · LLM score 60 · 17 days ago
- 69. @patrickc This repo shows a way that works well for me: · Andrej Karpathy (AI researcher) · LLM score 60 · 18 days ago
- 70. @yonashav By “interesting,” I mean that, e.g., if we ask “was it a good idea for humanity to do science?”, then it doesn’t make sense to judge based on only the last few hundred (dynamic-exponential) years, since >99.99% of future-history will be in some other static post-exponential state. · Keller Jordan (OpenAI Researcher) · LLM score 70 · 19 days ago
- 71. @yonashav This fact makes the present seem less interesting. · Keller Jordan (OpenAI Researcher) · LLM score 20 · 19 days ago
- 72. @MikCampanelli @mobiuspoker Making a poker solver is out of distribution, not poker itself. · Noam Brown (OpenAI Research Scientist) · LLM score 75 · 20 days ago
- 73. @DavidSHolz Most APIs implement speculative decoding, where a small model or a simple heuristic is used to predict a few tokens at a time. · Igor Babuschkin (Cofounder of xAI) · LLM score 80 · 21 days ago
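
A minimal sketch of the mechanism in item 73, in toy form: a cheap draft model proposes a few tokens, the expensive target model checks them all in a single forward pass, and proposals are kept up to the first disagreement. `draft_model`, `target_model`, and the greedy acceptance rule here are illustrative assumptions, not any API's real interface (production systems accept/reject probabilistically so the target distribution is preserved exactly):

```python
import torch

def speculative_step(target_model, draft_model, prefix, k=4):
    """One greedy speculative-decoding step (toy sketch).

    prefix: 1-D LongTensor of token ids; both models map a (1, len)
    batch of ids to (1, len, vocab) logits.
    """
    # 1) Draft proposes k tokens autoregressively (cheap).
    proposed = []
    ctx = prefix.clone()
    for _ in range(k):
        logits = draft_model(ctx.unsqueeze(0))[0, -1]   # (vocab,)
        tok = logits.argmax()
        proposed.append(tok)
        ctx = torch.cat([ctx, tok.view(1)])

    # 2) Target verifies all k proposals in ONE forward pass.
    logits = target_model(ctx.unsqueeze(0))[0]          # (len, vocab)
    accepted = []
    for i, tok in enumerate(proposed):
        # The target's prediction for proposal i comes from the
        # position just before it in the sequence.
        pos = prefix.shape[0] + i - 1
        if logits[pos].argmax() == tok:
            accepted.append(tok)
        else:
            # First mismatch: substitute the target's token and stop.
            accepted.append(logits[pos].argmax())
            break
    return torch.cat([prefix, torch.stack(accepted)])
```

With greedy decoding this reproduces the target model's output token for token, while the big model runs once per batch of k drafted tokens rather than once per token.
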
- 74. @giffmana It could be a combo of traditional spam/phishing detection and LLM detection. · Noam Brown (OpenAI Research Scientist) · LLM score 30 · 21 days ago
- 75. @jadenitripp That’s where continual learning comes in. · Igor Babuschkin (Cofounder of xAI) · LLM score 30 · 21 days ago
- 76. A family friend recently lost $1,000 to a phishing email. · Noam Brown (OpenAI Research Scientist) · LLM score 30 · 21 days ago
- 77. I suspect the reason Claude Code doesn’t work as well for large codebases is that they post-trained it mostly on smaller repos (big corp sized repos are rare). · Igor Babuschkin (Cofounder of xAI) · LLM score 80 · 21 days ago
- 78. @deredleritt3r I still think it requires massive effort to not end up in that world - e.g. … · Sholto Douglas (Researcher at Anthropic) · LLM score 10 · 22 days ago
- 79. New post: nanochat miniseries v1. The correct way to think about LLMs is that you are not optimizing for a single specific model but for a family of models controlled by a single dial (the compute you wish to spend) to achieve monotonically better results. · Andrej Karpathy (AI researcher) · LLM score 80 · 22 days ago
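
A sketch of item 79's "single dial" idea: derive the entire model config from one knob, so spending more compute means turning the dial, not redesigning the architecture. The specific ratios below are illustrative assumptions, not nanochat's actual values:

```python
from dataclasses import dataclass

@dataclass
class ModelConfig:
    n_layer: int
    n_embd: int
    n_head: int

def config_from_dial(depth: int) -> ModelConfig:
    """Derive a full config from a single 'compute dial' (depth).

    Width and head count scale with depth (illustrative ratios),
    so one knob walks you along a family of models.
    """
    n_embd = 64 * depth
    n_head = max(1, n_embd // 128)
    return ModelConfig(n_layer=depth, n_embd=n_embd, n_head=n_head)

# More compute -> turn the dial -> a strictly bigger family member.
for d in (8, 16, 32):
    print(d, config_from_dial(d))
```
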
- 80. Got this DM: I appreciate that you posted this - increasingly my twitter feed feels out of whack, especially with people claiming Claude Code makes them 1000000x more efficient. · Noam Brown (OpenAI Research Scientist) · LLM score 30 · 23 days ago
- 81. @thecsguy Slope of slope. · Andrej Karpathy (AI researcher) · LLM score 70 · 23 days ago
- 82. @ziv_ravid Yeah, I think forecasting is one of the few domains where we can really evaluate reasoning under uncertainty, and while this may be a bit of a contrarian take, I do think Bayesian reasoning is a key capability that is not learnt from yet another code/math RL environment. · Jonas Geiping (AI Researcher in Tübingen) · LLM score 80 · 23 days ago
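
To make item 82's "Bayesian reasoning" concrete, a toy worked example with made-up numbers: a forecaster holds a prior on an event and revises it by Bayes' rule when a signal arrives:

```python
def bayes_update(prior: float, p_e_given_h: float, p_e_given_not_h: float) -> float:
    """Posterior P(H|E) via Bayes' rule: P(H|E) = P(E|H)·P(H) / P(E)."""
    p_e = p_e_given_h * prior + p_e_given_not_h * (1 - prior)
    return p_e_given_h * prior / p_e

# Made-up numbers: prior 30% that the event happens; a signal arrives
# that is 80% likely if it will happen and 20% likely otherwise.
p = bayes_update(prior=0.30, p_e_given_h=0.80, p_e_given_not_h=0.20)
print(f"posterior: {p:.2f}")   # 0.63
```

This kind of calibrated updating under uncertainty is exactly what open-ended forecasting evaluates and what code/math RL environments with binary verifiers do not.
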
- 83. We just published a new open-source model that we trained with RL to be capable of open-ended forecasting! · Jonas Geiping (AI Researcher in Tübingen) · LLM score 20 · 23 days ago
- 84. @ChrisPainterYup All of the following are true: I came out sounding too critical in that moment on the pod and accidentally created a misrepresenting sound bite, I was positively surprised by Opus 4.5 over the break, and my overall change of mind is less dramatic than naively implied. · Andrej Karpathy (AI researcher) · LLM score 20 · 23 days ago
- 85. @zachcpa I'd give the win to Codex due to the way better optimizations. · Noam Brown (OpenAI Research Scientist) · LLM score 20 · 23 days ago
- 86. @mobiuspoker One of the reasons I chose to make a poker bot is that I thought it would be pretty out of distribution for the models, and most real-world tasks are a little out of distribution. · Noam Brown (OpenAI Research Scientist) · LLM score 70 · 24 days ago
- 87. We’re making great progress with our Gemini Robotics work in bringing AI to the physical world - a critical aspect of AGI. · Demis Hassabis (CEO of DeepMind) · LLM score 40 · 24 days ago
- 88. @swyx I don't remember the prompt but the initial prompt wasn't very detailed. · Noam Brown (OpenAI Research Scientist) · LLM score 20 · 24 days ago
- 89. @_thomasip I think it's similar to the hallucination problem. · Noam Brown (OpenAI Research Scientist) · LLM score 20 · 24 days ago
- 90. This solver is a project I've wanted to do for a while. · Noam Brown (OpenAI Research Scientist) · LLM score 40 · 24 days ago