[[public:t-720-atai:atai-20:main|T-720-ATAI-2020 Main]] \\
[[public:t-720-atai:atai-20:Lecture_Notes|Links to Lecture Notes]]
\\
\\
====Existing Systems Which Target Self-Programming====
^ Label ^ What ^ Example ^ Description ^
| \\ [S] | \\ State-space search | \\ GPS (Newell et al. 1963) | The atomic actions are state-changing operators, and a program is represented as a path from the initial state to a final state. Variants of this approach include program search (example: the Gödel Machine (Schmidhuber 2006)): given the action set A, in principle all programs formed from it can be exhaustively listed and evaluated to find an optimal one according to certain criteria (see the sketch below this table). |
| \\ [P] | Production system | SOAR (Laird 1987) | Each production rule specifies the condition for a sequence of actions that correspond to a program. Mechanisms that produce new production rules, such as chunking, can be considered self-programming. |
| \\ [R] | Reinforcement learning | AIXI (Hutter 2007) | When an action of an agent changes the state of the environment, and each state has an associated reward value, a program corresponds to a policy in reinforcement learning. When the state-transition function is probabilistic, this becomes a Markov decision process. |
| \\ [G] | \\ Genetic programming | Koza’s Invention Machine (Koza et al. 2000) | A program is formed from the system’s actions, initially randomly but subsequently via genetic operators over the best performers from prior solutions, possibly by using the output of some actions as input of some other actions. An evolution process provides a utility function that is used to select the best programs, and the process is repeated. |
| [I] | Inductive logic programming | (Muggleton 1994) | A program is a statement with a procedural interpretation, which can be learned from given positive and negative examples, plus background knowledge. |
| [E] | Evidential reasoning | NARS (Wang 2006) | A program is a statement with a procedural interpretation, and it can be learned using multi-strategy (ampliative) uncertain reasoning. |
| \\ \\ [A] | \\ Autocatalytic model-driven bi-directional search | \\ AERA (Nivel et al. 2014) \\ & \\ Ikon Flux (Nivel 2007) | In this context the architecture consists in large part of a large collection of models, acting as hierarchically organized controllers, executed through a contextually-informed, continuous autocatalytic process. New models are produced automatically, based on experience, their quality evaluated in light of this experience, and improvements produced as a result. Self-programming occurs at two levels: the lower one is concerned with performance in a set of domains, making models of how best to achieve goals in the external world at any point in time; the higher one is concerned with the operation of the lower one, implementing integrated cognitive control and meta-learning capabilities. Semantically closed autocatalytic processes maintain the system’s growth after they are deployed. |
| Source | [[http://alumni.media.mit.edu/~kris/ftp/JAGI-Special-Self-Progr-Editorial-ThorissonEtAl-09.pdf|Thórisson, Nivel, Sanz & Wang 2012]] |||
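\\
As a concrete illustration of the [S] row above, here is a minimal sketch of exhaustive program search over a discrete action set - in the spirit of the in-principle enumeration described in the table, not actual code from GPS or the Gödel Machine. The operators, states and goal are invented for the example.
<code python>
# Minimal sketch of program search over state-changing operators ([S]).
# All names are illustrative; this is not code from GPS or the Goedel Machine.
from itertools import product

# Atomic actions: operators that map a state (an int here) to a new state.
OPERATORS = {
    "inc": lambda s: s + 1,
    "dec": lambda s: s - 1,
    "dbl": lambda s: s * 2,
}

def run(program, state):
    """Execute a program, i.e. a sequence of operator names, on a state."""
    for op in program:
        state = OPERATORS[op](state)
    return state

def program_search(initial, goal, max_len=6):
    """Enumerate all programs up to max_len and return the first (shortest)
    one whose path of state changes leads from the initial to the goal state."""
    for length in range(1, max_len + 1):
        for program in product(OPERATORS, repeat=length):
            if run(program, initial) == goal:
                return program
    return None

print(program_search(initial=3, goal=16))  # ('inc', 'dbl', 'dbl')
</code>
The brute-force enumeration makes the resource question in the next table explicit: without a bound like //max_len//, the number of candidate programs grows exponentially with program length.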
\\
\\
====Design Assumptions in The Above Approaches====
| \\ How does the system represent a basic action? | a) As an operator that transforms one state into another, either deterministically or probabilistically, with a goal being a state to be reached [R, S] \\ b) As a function that maps some input arguments to some output arguments [G] \\ c) As a realizable statement with preconditions and consequences [A, E, I, P] \\ (The three options are sketched in code below this table.) \\ Relevant assumptions: \\ Is the knowledge about an action complete and certain? \\ Is the action set discrete and finite? |
| \\ Can a program be used as an "action" in other programs? | a) Yes, programs can be built recursively [A, E, G, I] \\ b) No, a program can only contain basic actions [R, S, P] \\ Relevant assumptions: \\ Do the programs and actions form a hierarchy? \\ Can these recursions have closed loops? |
| \\ How does the system represent goals? | a) As states to be reached [S] \\ b) As values to be optimized [G, R] \\ c) As statements to be realized [E, P, A] \\ d) As functions to be approximated [I] \\ Relevant assumptions: \\ Is the knowledge about goals complete? \\ Is the knowledge about goals certain? \\ Can all the goals be reached with a concrete action set? |
| \\ Are there derived goals? | a) Yes, and they remain logically dependent on the original goals [I, S, P] \\ b) Yes, and they may become logically independent of the original goals [A, E] \\ c) No, all goals are given or innate [G, R] \\ Relevant assumptions: \\ Are the goals constant or variable? \\ Are the goals externally imposed or internally generated? |
| \\ Can the system learn new knowledge about actions and goals? | a) Yes, and the learning process normally converges [G, I, R] \\ b) Yes, and the learning process may not converge [A, E, P] \\ c) No, all knowledge is given or innate [S] \\ Relevant assumptions: \\ Are the goals constant or variable? \\ Are the actions constant or variable? |
| \\ What is the extent of resources demanded? | a) Unlimited time and/or space [I, R, S, P] \\ b) Limited time and space [A, E, G] \\ Relevant assumption: \\ Are the resources used an attribute of the problem, or of the solution? |
| \\ When is the quality of a program evaluated? | a) After execution, according to its actual contribution [G] \\ b) Before execution, according to its definition or historical record [I, S, P] \\ c) Both of the above [A, E, R] \\ Relevant assumption: \\ Are adaptation and prediction necessary? |
| Source | [[http://alumni.media.mit.edu/~kris/ftp/JAGI-Special-Self-Progr-Editorial-ThorissonEtAl-09.pdf|Thórisson et al. 2012]] |
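\\
To make the contrast in the first row above concrete, here is an illustrative sketch of the three action representations as data structures. All class and field names are invented for this example; they are not taken from the systems in the table.
<code python>
# Illustrative sketch of the three ways a basic action can be represented.
from dataclasses import dataclass
from typing import Callable

# a) [R, S]: an action as a state-transition operator; a goal is a state.
@dataclass
class Operator:
    name: str
    apply: Callable[[str], str]     # state -> state (the deterministic case)

# b) [G]: an action as a function from inputs to outputs, composable by
#    feeding one action's output into another action's input.
@dataclass
class Function:
    name: str
    fn: Callable[..., object]

# c) [A, E, I, P]: an action as a realizable statement with preconditions
#    and consequences, so that it can take part in reasoning.
@dataclass
class Statement:
    preconditions: list[str]
    action: str
    consequences: list[str]

pickup = Statement(
    preconditions=["hand_empty", "block_on_table"],
    action="pickup(block)",
    consequences=["holding(block)", "not hand_empty"],
)
print(pickup.consequences)          # ['holding(block)', 'not hand_empty']
</code>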
\\
\\
==== Integrated Cognitive Control ====
| What it is | The ability of a controller / cognitive system to steer its own structural development - architectural growth (cognitive growth). The (sub-)system responsible for meta-learning (see the sketch below this table). |
| Cognitive Growth | The structural change resulting from learning in a structurally autonomous cognitive system - the target of which is self-improvement. |
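\\
Below is a minimal sketch of the idea, with the meta-level reduced to adjusting a single learning parameter - a stand-in, of course, for the structural change a real architecture such as AERA performs. All names and numbers are illustrative.
<code python>
# Minimal sketch of integrated cognitive control as a two-level loop:
# the meta-level steers the object level's own learning machinery.
class ObjectLevelLearner:
    """Lower level: learns to achieve goals in the task domain."""
    def __init__(self):
        self.estimate, self.learning_rate = 0.0, 0.5

    def learn(self, observation):
        error = observation - self.estimate
        self.estimate += self.learning_rate * error
        return abs(error)

class MetaController:
    """Higher level: observes how learning itself is going and changes
    the lower level's configuration (meta-learning)."""
    def control(self, learner, errors):
        if len(errors) >= 2 and errors[-1] > errors[-2]:
            learner.learning_rate *= 0.5   # learning got worse: adapt the learner

learner, meta, errors = ObjectLevelLearner(), MetaController(), []
for obs in [1.0, 1.2, 0.8, 1.1, 1.0, 0.9]:
    errors.append(learner.learn(obs))
    meta.control(learner, errors)
print(round(learner.estimate, 3), learner.learning_rate)
</code>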
\\
\\
==== Cognitive Growth ====
| What it is | Changes in the cognitive controller (the core "thinking" part) over and beyond basic learning: after a growth burst of this kind the controller can learn differently/better/new things, especially new //categories// of things (see the sketch below this table). |
| Human example | [[https://m.youtube.com/watch?v=TRF27F2bn-A|Piaget's Stages of Development (youtube video)]] |
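\\
A toy sketch of the distinction: the learner below cannot even represent shape-based categories until a "growth burst" extends its representational repertoire, after which a previously unlearnable kind of concept becomes learnable. Everything here is invented for illustration.
<code python>
# Toy sketch of cognitive growth: after a "growth burst" the learner can
# represent (and therefore learn) a new *category* of distinctions,
# not just new instances of old ones.
class GrowingLearner:
    def __init__(self):
        self.dimensions = ["color"]      # initial representational repertoire
        self.concepts = {}

    def grow(self, new_dimension):
        """Structural change: extend what kinds of things can be learned."""
        self.dimensions.append(new_dimension)

    def learn_concept(self, name, description):
        # Only the parts of the description the system can represent stick.
        self.concepts[name] = {d: v for d, v in description.items()
                               if d in self.dimensions}

learner = GrowingLearner()
learner.learn_concept("apple", {"color": "red", "shape": "round"})
print(learner.concepts["apple"])     # {'color': 'red'} - shape is invisible

learner.grow("shape")                # the growth burst
learner.learn_concept("ball", {"color": "red", "shape": "round"})
print(learner.concepts["ball"])      # {'color': 'red', 'shape': 'round'}
</code>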
\\
==== Reliability ====
| What It Is | The ability of a machine to always return the same - or similar - answer to the same input. |
| Why It Is Important | Simple machine learning algorithms are very good in this respect, delivering high reliability. Human-level AI, on the other hand, may have the same limitations as humans in this respect, i.e. not being able to give any guarantees. |
| Human-Level AI | To make human-level AI reliable is important because a human-level AI without reliability cannot be trusted, and hence would defeat most of the purpose for creating it in the first place. (AERA proposes a method for this - through continuous pee-wee model generation and refinement.) |
| To Achieve Reliability | Requires **predictability**. Predictability requires sorting out //causal relations// (without these we can never be sure what led to what). |
| Predictability Is Hard to Achieve | In a growing, developing system that is adapting and learning (3 or 4 levels of dynamics!), predictability can only be achieved through **abstraction**: moving up to the next level of description (e.g. I cannot be sure //what exactly// I will eat for dinner, but I can be pretty sure that I //will// eat dinner). |
| Achieving Abstraction | Can be done through hierarchy (but it needs to be //dynamic// - i.e. tailored to its intended usage, as the circumstances call for - because the world's combinatorics are too complex to store precomputed hierarchies for everything). See the sketch below this table. |
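\\
A minimal sketch of the dinner example: prediction backs off up an abstraction ladder until it finds the most specific statement that is still reliable enough. The statements, probabilities and threshold are made up, and the ladder is hard-coded - whereas, as noted above, a real system would have to build such hierarchies dynamically.
<code python>
# Sketch of predictability through abstraction: when prediction at a
# detailed level is too uncertain, back off to a coarser description
# that *can* be predicted reliably.
ABSTRACTION_LADDER = [
    # (statement, probability we assign to it), ordered specific -> abstract
    ("I will eat pasta for dinner", 0.30),
    ("I will eat a warm meal for dinner", 0.70),
    ("I will eat dinner", 0.98),
]

def most_specific_reliable(ladder, threshold=0.9):
    """Return the most detailed statement that is still predictable enough."""
    for statement, confidence in ladder:
        if confidence >= threshold:
            return statement, confidence
    return None

print(most_specific_reliable(ABSTRACTION_LADDER))
# ('I will eat dinner', 0.98)
</code>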
\\
==== Explainability ====
| What It Is | The ability of a controller to explain, after the fact or before, why it did or intends to do something. |
| Why It Is Important | If a controller does something we don't want it to repeat - e.g. crash an airplane full of people - it needs to be able to explain why it did what it did. If it can't, we can never be sure why this autonomous system did what it did, or even whether it had any other choice. |
| \\ Human-Level AI | Even more importantly, to grow and learn and self-inspect, the AI system must be able to sort out causal chains. If it can't, it will not only be incapable of explaining to others why it is like it is, it will be incapable of explaining to itself why things are the way they are, and thus it will be incapable of sorting out whether something it did is better for its own growth than something else. Explanation is the big black hole of ANNs: ANNs are black boxes, and thus in principle unexplainable - whether to themselves or to others. \\ AERA tries to address this by encapsulating knowledge as hierarchical models that are built up over time and can be de-constructed at any time (see the sketch below this table). |
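\\
The sketch below shows the basic mechanism such de-constructible knowledge makes possible: every conclusion records which model derived it from which premises, so the causal chain behind a decision can be replayed as an explanation. It is loosely inspired by the AERA description above; the data structures and the example are invented.
<code python>
# Sketch of explanation via de-constructible model hierarchies: each
# conclusion keeps a record of the model and premises that produced it.
from dataclasses import dataclass, field

@dataclass
class Conclusion:
    statement: str
    model: str                        # which model derived this conclusion
    premises: list = field(default_factory=list)

    def explain(self, depth=0):
        """Replay the causal chain that led to this conclusion."""
        print("  " * depth + f"{self.statement}   [by {self.model}]")
        for premise in self.premises:
            premise.explain(depth + 1)

wet = Conclusion("runway is wet", "rain-model")
brake = Conclusion("braking distance is long", "friction-model", [wet])
abort = Conclusion("abort the landing", "landing-decision-model", [brake])
abort.explain()
# abort the landing   [by landing-decision-model]
#   braking distance is long   [by friction-model]
#     runway is wet   [by rain-model]
</code>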
| |
\\
\\