Recursive Self Improvement

From The Transhumanist Wiki


Spawned from Singularity Holes.

Eliezer Yudkowsky: There are problems in ensuring the safe and effective ability of a seed AI to handle the class of conditions involved in self-modification--problems in which internal thought processes are correlated with computing problems or material facts. This class of problems will 'break' AIXI and other AI formalisms that contain no handle into internal thought processes, in the sense that a mere human can outperform any amount of computing power directed into the formal solution. [Can you clarify this last sentence?] However, the Incomputability problem given at Singularity Holes (under Recursive Self-Improvement) doesn't break the self-modification paradigm, for obvious reasons. A formal system in mathematics does not describe a set of statements which have not been proven false, but a set of statements which have been proven true. Similarly, an FAI's criterion for self-modification doesn't run PerformsError checks on proposed modifications; rather the FAI refuses to implement any change that doesn't pass the test ProvablyWorks. FAI runs on a positive safety paradigm, not a negative safety paradigm; we check for success, not failure. The paradoxical function in the proof of the Incomputability problem fails to provably work, just as Gödel's statement fails to be a theorem, and that's the end of it. No paradoxical self-modification will be adopted; we can get along quite fine with nonparadoxical ones.
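
[Illustration: the difference between the two paradigms can be sketched in a few lines of Python. This is a toy under loud assumptions, not anyone's actual design: the three-valued Verdict, the gate functions, and the example proposals are all hypothetical names invented here, and the verdicts are simply asserted rather than produced by a real prover. The only point is that a proposal which can be neither proved correct nor proved faulty slips through a PerformsError-style check but is refused by a ProvablyWorks-style check.]

```python
from enum import Enum


class Verdict(Enum):
    """Hypothetical outcome of analysing a proposed self-modification."""
    PROVED_CORRECT = "proved correct"
    PROVED_FAULTY = "proved faulty"
    UNDECIDED = "undecided"  # no proof either way was found


def negative_gate(verdict: Verdict) -> bool:
    """Negative safety: adopt unless an error has been demonstrated."""
    return verdict is not Verdict.PROVED_FAULTY


def positive_gate(verdict: Verdict) -> bool:
    """Positive safety: adopt only if correctness has been demonstrated."""
    return verdict is Verdict.PROVED_CORRECT


# Assumed verdicts for three made-up proposals (not computed by a prover).
proposals = {
    "ordinary_improvement": Verdict.PROVED_CORRECT,
    "buggy_patch": Verdict.PROVED_FAULTY,
    "paradoxical_function": Verdict.UNDECIDED,
}

adopted_negative = [name for name, v in proposals.items() if negative_gate(v)]
adopted_positive = [name for name, v in proposals.items() if positive_gate(v)]

print("negative gate adopts:", adopted_negative)
print("positive gate adopts:", adopted_positive)
```

[Under these assumptions the negative gate adopts the undecided proposal along with the ordinary one, while the positive gate adopts only the ordinary one. The sketch says nothing about how hard it is to obtain a PROVED_CORRECT verdict in practice.]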

Chris Capel: I fail to see how this makes the problem any less intractable. Regardless of the quality or relevancy of the proof of the Halting problem (I agree--it isn't relevant), proving that imperative code with possible non-local effects does indeed perform a specific task in a successful way given all possible variations in the environment is a truly huge task. The only way I can see the sort of positive proof of functionality is if you adopt a very limited sub-language in the code that's being proved. I'm not sure if restricting all non-local effects would do this job, or whether that would just be the beginning. I do remember reading that this problem was not anywhere near being solved in general, so I think the best thing would be to have clear criteria on what *exactly* needs to be known about new code in the system in order to be satisfied with its quality. The same criteria could apply both to programmer- and AI-generated code.
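
[Illustration: one way to picture a "very limited sub-language" is a syntactic whitelist. The sketch below uses Python's standard ast module to accept only side-effect-free arithmetic expressions over a single variable x: no calls, no attribute access, no loops, no assignment. The function name in_sublanguage, the particular whitelist, and the choice of x are assumptions made for this example. Membership in such a sub-language is only an explicit acceptance criterion; it is not a proof that the code does what is wanted.]

```python
import ast

# Node types permitted in the toy sub-language (an assumed whitelist).
ALLOWED_NODES = (
    ast.Expression, ast.BinOp, ast.UnaryOp, ast.IfExp, ast.Compare,
    ast.Constant, ast.Name, ast.Load,
    ast.Add, ast.Sub, ast.Mult, ast.USub,
    ast.Lt, ast.LtE, ast.Gt, ast.GtE, ast.Eq, ast.NotEq,
)


def in_sublanguage(source: str, allowed_names=frozenset({"x"})) -> bool:
    """Return True only if source is a single expression built from whitelisted nodes."""
    try:
        tree = ast.parse(source, mode="eval")  # statements fail to parse here
    except (SyntaxError, ValueError):
        return False
    for node in ast.walk(tree):
        if not isinstance(node, ALLOWED_NODES):
            return False  # e.g. calls, attribute access, lambdas, power
        if isinstance(node, ast.Name) and node.id not in allowed_names:
            return False  # no access to names outside the declared inputs
    return True


print(in_sublanguage("x * 2 + 1"))
print(in_sublanguage("x if x > 0 else -x"))
print(in_sublanguage("open('f').read()"))
print(in_sublanguage("x = 1"))
```

[With this whitelist the first two expressions are accepted and the last two are rejected. The design choice is the same positive one discussed above: anything not explicitly allowed is refused, so the criteria for new code are stated once and can be applied equally to programmer-written and AI-generated code.]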

Eliezer Yudkowsky: The interesting theoretical question comes in when we ask how to make the AI recognize why the paradoxical function is 'paradoxical', or, indeed, what the term 'paradoxical' means to a physical process that happens to function as a decision system.

Chris Capel: Hmm. What is the definition of a paradox, anyway? Is there a set of criteria that a human being can reliably use to say what is and what isn't a paradox?