Question-Answer Counterfactual Interval (QACI)
Support score: 2 · Tags: Embedded agency, Agent foundations
QACI is a plan for aligning AI in a way that is robust to superintelligent optimization. It is under development by Orthogonal, an agent foundations research organization searching for formal goals that cause good things when maximized.
Our current best candidate for an aligned formal goal is QACI (see also a story of how it could work and a tentative sketch at formalizing it). It implements something like coherent extrapolated volition by extending a "past user"'s reflection: the user's deliberation is counterfactually simulated/considered arbitrarily many times, until alignment is solved.
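As a loose illustrative sketch (not Orthogonal's actual formalism), the iterated reflection can be pictured as repeated counterfactual queries: if $Q(x)$ denotes the answer the past user would counterfactually give when posed input $x$, the scheme chains each answer back in as the next question, hoping the sequence converges on a solution to alignment:

```latex
a_0 = Q(q_{\text{initial}}), \qquad a_{n+1} = Q(a_n), \qquad \text{goal} \approx \lim_{n \to \infty} a_n
```

Here $q_{\text{initial}}$, $Q$, and the limit notation are all hypothetical placeholders for exposition; the real proposal must additionally specify how such counterfactual queries are located and evaluated inside a formal world-model.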