Discussion about this post

Andrew

Nice summary. I think your point at the end is also what a lot of people think when they hear Alexandr Wang or Aschenbrenner say "I trust the US government to handle this correctly! Elections = ethical and accountable officials". In fact, I think some authoritarian states are structurally more accountable to their populations in some ways.

I would point out that people like to ask "who gets control of superintelligence?", but the premise is that the user can control and trust it in the first place, which is exactly the agent adoption problem. Giving an agent unlimited access to your email and credit card is frankly insane at this point in time. If superintelligence helps you build weapons, for example, whatever it outputs has to be something the user understands and trusts to kill only the people they intend.

I did olympiad math and studied a lot of very abstract math when I was young, and I have worked on industry deep learning applications since circa 2016. In my experience there is a qualitative gap between being able to do math and being able to model things. Advanced math takes more brain power for a human, but not everyone who is good at math comes up with good modelling ideas (I've seen this with friends from top math PhD programs, though I would say most mathematicians are good at modelling because math already requires asking lots of questions). A lot of the gap is perhaps just real-world context, which is very messy.

It doesn't feel like AI has crossed that gap yet, but I don't rule out that it could soon (though I think it is at least a few more years away). When I saw what DeepMind was working on 10 years ago, it hit me that humans really aren't as smart as we think. Back in 2020 I thought we would have until ~2040, but now I think everyone with a nice job should be prepared to be humbled and lose their job pretty soon. If your identity and pride revolve around your work, you need soul searching on top of financial preparation.

The one thing to note, however, is that this exponential increase in compute is physically constrained. The non-linear speedup has mainly come from bandwidth increases (GPUs of the past had nowhere near enough memory to store trillions of parameters, so data had to be shuttled over relatively long distances). "Thinking" scales in an inherently non-parallel way: you can have multiple models "talking to each other", but they still block each other. It is hard to find 10x more natural gas to pipe to one location, or 10x more copper wire. Even if superintelligence can improve itself, scaling compute in non-parallelizable areas is an uphill climb.
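One rough way to put numbers on the "non-parallel" point is Amdahl's law, which caps the overall speedup by the share of work that has to run serially. The sketch below is only an illustration: the function name `amdahl_speedup` and the 20% serial share are assumptions made up for the example, not measurements of any real model or data center.

```python
def amdahl_speedup(serial_fraction: float, n_workers: float) -> float:
    """Overall speedup under Amdahl's law.

    serial_fraction: share of the work that cannot be parallelized (0..1).
    n_workers: how many times more parallel hardware is thrown at the rest.
    """
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0 and 1")
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    parallel_fraction = 1.0 - serial_fraction
    # The serial part takes the same time; only the parallel part shrinks.
    return 1.0 / (serial_fraction + parallel_fraction / n_workers)


if __name__ == "__main__":
    # Hypothetical workload: 20% of the "thinking" is strictly sequential.
    for n in (10, 100, 1000):
        print(f"{n}x hardware -> {amdahl_speedup(0.2, n):.2f}x speedup")
```

With that assumed 20% serial share, 10x the hardware gives roughly 3.6x, and no amount of extra hardware gets past 5x, which is the uphill climb in a nutshell.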

The run in SNDK, for example, makes a lot of sense given they want to build stacked NAND flash. You can store a core model that is well trained for logical reasoning and language on HBF. It is both cheaper and denser, with only a small read-speed penalty. However, these kinds of improvements become harder to find going forward. At some point it will be obvious that 10x the compute just doesn't deliver 10x the economic value anymore, and finding that point will be the key to trading this run successfully.

Eric

I think this is why open-weight models matter so much. The risk isn't just a misaligned AI; it's an AI aligned to the wrong people. Whether that's Silicon Valley elites, policymakers in DC, or authoritarian regimes, concentrating control over superintelligence in any single group's hands is a civilizational risk in itself. Open-weight models aren't perfect, but they distribute the ability to audit, fork, and contest AI systems broadly, acting as a kind of checks and balances for a technology that could otherwise become the ultimate lever of power. The alternative is essentially handing a monopoly on the future to whoever gets there first.
