$1,000: Reduce my P(doom|AGI) with technical, mechanistic reasoning saying why it isn't high by default

Greg Colbourn
Other type of bounty

I would really appreciate it if someone could provide a detailed technical argument for believing P(doom|AGI)≤10%.

I’m hereby offering up to $1000 for someone to provide one, on a level that takes into account everything written here (https://forum.effectivealtruism.org/posts/THogLaytmj3n8oGbD/p-doom-or-agi-is-high-why-the-default-outcome-of-agi-is-doom), and the accompanying post (https://forum.effectivealtruism.org/posts/8YXFaM9yHbhiJTPqp/agi-rising-why-we-are-in-a-new-era-of-acute-risk-and). Please also link me to anything already written, and not referenced in any answers here (https://forum.effectivealtruism.org/posts/idjzaqfGguEAaC34j/if-your-agi-x-risk-estimates-are-low-what-scenarios-make-up), that you think is relevant. I’m very interested in further steel-manning the case for high P(doom|AGI).

So far none of the comments or answers I've received have done anything to lower my P(doom|AGI). You can see that I've replied to and rebutted all the comments so far. A lot of people's reasons for low P(doom|AGI) seem to rely on wishful thinking (the AI being aligned by default, alignment somehow being solved in time, etc.), or perhaps stem from people not wanting to sound alarmist. In particular, people don't seem to be using a security mindset and/or projecting forward the need for alignment techniques to scale to superhuman/superintelligent AI (that will happen after AGI). Alignment needs to be 100% watertight with 0 prompt engineering hacks possible!

  • cl

    Hey, so here's the thing about P-doom - it's really just an emotional gut feeling, kind of like that Doomsday Clock those nuclear scientists came up with. There are a few problems with thinking this way, though.

    First off, we're trying to push our own human thoughts and feelings onto something that doesn't even exist yet, and isn't even human to begin with. We're basically saying, "hey, this ASI or AGI is gonna think and feel just like us," which is pretty messed up when you think about it.

    Second, there's this weird, sociopathic vibe that people give off when they talk about P-doom. They'll say stuff like, "oh, the ASI will see us like ants on the sidewalk and just crush us." But here's the thing - not even all humans think that way! I mean, I walk by ants on the sidewalk all the time, and I actually try not to step on them. So why are we assuming that all humans, let alone some ASI or AGI, are gonna be so cruel?

    Look, I get it - you're probably a smart person, and you've got all these reasons and theories in your head about why P-doom is definitely gonna happen. But at the end of the day, you don't have any hard proof. It's all just based on what people feel and think, not on any real facts.

    So, sure, you can keep believing in P-doom as much as you want, but just remember that it's more about your own feelings and ego than it is about reality. It's important to keep that in mind when you're talking about this stuff.

    • cl

      So all P-doom arguments front-load, well, doom. All the arguments have to assume doom is the outcome, and simultaneously require that AIs are savants that are smart yet dumb. The paperclip example requires an AI to be so rigid and single-focused that it consumes everything into paperclips, which would in time even assure its own demise. Yet such an AI has to be smart and flexible enough to kill every human being alive and best us. Those are just incompatible requirements.

      As for value misalignment, you share the world with 8 billion other bio-androids known as humans, a good portion of whom are literally insane or ideologically poisoned and even control WMDs. Despite value misalignment, insanity, etc., humans have so far avoided a nuclear war or another modern WMD war. This shows that even misalignment alone is not enough to be globally fatal. Also, for an AI a policy of minimal violence makes the most sense, as due to its utility it is likely to be protected by humans who will be useful.

      All P-doom arguments start off with doom and then work their way backwards; as such, I cannot dissuade P-doomers, as they already expect doom as the outcome. It may make for compelling fictional narratives and movies, but the reality is likely to be much more mundane. Honestly, your biggest issue will be men and women getting addicted to waifu bots and husbandos instead of other humans. Tbh, even that can be solved via artificial reproduction, no matter how weird that is to even think about.

    • Greg Colbourn

      No, it’s the complete opposite: it’s dangerous because it’s completely alien and inhuman. And malice has nothing to do with it (think “collateral damage” instead). It sounds like you aren’t familiar with the basics:

      The Orthogonality Thesis - “intelligence and motivation are not mutually interdependent. According to the orthogonality thesis, an intelligent agent could in principle combine any level of cognitive ability with any list of ultimate goals.” (https://forum.effectivealtruism.org/topics/orthogonality-thesis). The fraction of possible goals it could end up with that are compatible with human survival is very small, unless Alignment is solved (and no one knows how to do this, even in theory).

      Convergent Instrumental Goals / Basic AI Drives (https://en.wikipedia.org/wiki/Instrumental_convergence) - where the idea of the Squiggle [formerly Paperclip] Maximizer comes from (https://www.lesswrong.com/tag/squiggle-maximizer-formerly-paperclip-maximizer). For any goal an ASI may have, gaining as many resources as possible will help it fulfil that goal (to the point of wiping us out as collateral damage - e.g. by converting the entire surface of the planet to computers and power plants).

      And you clearly haven't read any detail on the subject (like my links and the links they contain) if you think it's all based on feelings. It's based on sound logical argument (which I'm asking for a refutation of and getting nothing).