HN - AI300

1.
Trinity large is very sparse (400B-A13B, 256 experts w/ 4 active per token).
Cameron Wolfe (Researcher at Netflix) · LLM score 85 · about 7 hours ago
2.
We got Claude to teach open models how to write CUDA kernels.
Ben Burtenshaw (Hugging Face Researcher) · LLM score 85 · about 24 hours ago
3.
BIG new idea in interpretability called Patterning
alphaXiv · LLM score 75 · 1 day ago
4.
@charles_irl In case it’s not clear in the docs: - Ancestor https://t.co/v4FOLUBHz9’s are loaded into context automatically on startup
Boris Cherny (Creator of Claude Code) · LLM score 20 · 1 day ago
5.
"LLM-in-Sandbox Elicits General Agentic Intelligence"
alphaXiv · LLM score 85 · 1 day ago
6.
@unsorsodicorda @saurabh_shah2 @Tim_Dettmers GA is trained using 4.5 Air as the teacher, whereas SERA-32B uses GLM 4.6.
Ethan Shen (Ai2 Researcher) · LLM score 85 · 1 day ago
7.
This work was mostly the genius of Ethan Shen.
Tim Dettmers (Research Scientist at Ai2) · LLM score 70 · 2 days ago
8.
Our method became so efficient (26x vs RL; 57x vs other synth gen) that we could easily generate 1000s of trajectories for a single repo.
Tim Dettmers (Research Scientist at Ai2) · LLM score 85 · 2 days ago
9.
This is very impactful: you can now distill frontier performance into small models that are specialized to private repositories.
Tim Dettmers (Research Scientist at Ai2) · LLM score 75 · 2 days ago
10.
From there we could do a massive number of experiments and really understand what matters for training coding agents.
Tim Dettmers (Research Scientist at Ai2) · LLM score 80 · 2 days ago
11.
Continual learning is a popular topic in LLM research, but it might not be as far away as we think.
Cameron Wolfe (Researcher at Netflix) · LLM score 65 · 2 days ago
12.
Finally, we conduct an analysis of variance across SWE-Bench runs.
Ethan Shen (Ai2 Researcher) · LLM score 95 · 2 days ago
13.
We also experiment with mixing data from rollouts and discover that SERA’s first and second rollouts complement each other, increasing performance in data-constrained regimes.
Ethan Shen (Ai2 Researcher) · LLM score 85 · 2 days ago
14.
Even more interesting is the fact that the best data comes from truncated trajectories that just barely exceed the context limit.
Ethan Shen (Ai2 Researcher) · LLM score 85 · 2 days ago
15.
Community-built open benchmarks work really well, e.g., Terminal-Bench, HLE, MMTEB.
Niklas Muennighoff (AI Researcher at Stanford) · LLM score 80 · 2 days ago
16.
Had to cut this one for space: 2019: AI can't create art—creativity is uniquely human
Noam Brown (OpenAI Research Scientist) · LLM score 20 · 3 days ago
17.
@0xabi96 It feels like I’m cheating.
Andrej Karpathy (AI researcher) · LLM score 70 · 3 days ago
18.
@ChiragLathiya The nearest neighbor really is some kind of a junior engineer.
Andrej Karpathy (AI researcher) · LLM score 80 · 3 days ago
19.
@jeremytwei Love the term "comprehension debt", haven't encountered it so far, it's very accurate.
Andrej Karpathy (AI researcher) · LLM score 30 · 3 days ago
20.
@airesearch12 💯 @ Spec-driven development It's the limit of imperative -> declarative transition, basically being declarative entirely.
Andrej Karpathy (AI researcher) · LLM score 60 · 3 days ago
21.
A few random notes from Claude coding quite a bit over the last few weeks.
Andrej Karpathy (AI researcher) · LLM score 20 · 3 days ago
22.
1987: AI can't win at chess—planning is uniquely human
Noam Brown (OpenAI Research Scientist) · LLM score 70 · 3 days ago
23.
Had a great discussion on AI's major societal impacts and emerging risks on the @RestIsPolitics podcast this past November.
Yoshua Bengio · LLM score 20 · 3 days ago
24.
Success in the Information Age was about being able to answer questions.
Jonathan Ross (TPU Creator) · LLM score 20 · 3 days ago
25.
this is a blog post on claude + llama.cpp https://t.co/yej6WsNnQA
Ben Burtenshaw (Hugging Face Researcher) · LLM score 20 · 3 days ago
26.
Thanks to @HugoDecrypte for the invitation to give an overview of the risks of AI and of the solutions needed to move toward a better trajectory.
Yoshua Bengio · LLM score 20 · 3 days ago
27.
I know it's massively off trend, but I love setting up a nice quiver of models in @cursor_ai, reviewing generated tokens in the legit UI, switching models for certain parts, and just generally getting shit done.
Ben Burtenshaw (Hugging Face Researcher) · LLM score 30 · 3 days ago
28.
I don't think people have realized how crazy the results are from this new TTT + RL paper from Stanford/Nvidia.
Ronak Malde (DeepMind Researcher) · LLM score 85 · 3 days ago
29.
Continual learning is being positioned as a prerequisite for AGI (i.e., general systems must be adaptable).
Cameron Wolfe (Researcher at Netflix) · LLM score 70 · 4 days ago
30.
I just watched a really great conversation about the future of AI.
Geoffrey Hinton · LLM score 30 · 4 days ago