[[public:t-720-atai:atai-20:main|T-720-ATAI-2020 Main]] \\
[[public:t-720-atai:atai-20:Lecture_Notes|Links to Lecture Notes]]
\\
\\
\\
\\

======CONTROL: Self-X, Predictability & Reliability======
\\
\\

==== Autonomy ====

|  What It Is  | Autonomy is a key feature of intelligence - the ability of a system to "act on its own". \\ Autonomous-X is anything that "autonomy" is relevant for or applies to in a system's operation.   |
|  \\ Self-Inspection  | Virtually no systems exist as of yet that have been demonstrated to be able to inspect (measure, quantify, compare, track, make use of) their own development for use in their continued growth - whether learning, goal-generation, selection of variables, resource usage, or other self-X.   |
|  \\ Self-Growth  | No system has as of yet been demonstrated to be able to autonomously manage its own **self-growth**. Self-growth is necessary for autonomous learning in task-environments whose complexity is far higher than that of the controller operating in them. It is even more important where certain bootstrapping thresholds must be reached before a safe transition into more powerful/different learning schemes. \\ For instance, if only a few bits of knowledge can be programmed into a controller's seed ("DNA"), because we want it to have maximal flexibility in what it can learn, then we want to put something there that is essential to protect the controller while it develops more sophisticated learning. An example is that nature programmed human babies with an innate fear of heights.    |
|  Why It Is Important  | This table highlights some key features of autonomy that any human-level intelligence probably must have. We say "probably" because we don't have any such system yet and there is no proper theory of intelligence, so we cannot be sure.   |
|  \\ Autonomous Learning  | We already have machines that learn autonomously, although most of the available methods are limited in that they (a) rely heavily on quality selection of learning material/environments, (b) require careful setup of training, and (c ) require careful and detailed specifications of how progress is evaluated.  |
\\

==== Three Levels of Autonomy ====
^  Category  ^ Description  ^  Uniqueness  ^  Examples  ^  Learning  ^
|  **Level 1:** \\ Automation  | The lowest level may be called "mechanical".   | Fixed architecture. Baked-in goals. Does its job. Does not create.  | Watt's Governor. Thermostats. DNNs. | No "learning" AILL (after it leaves the lab).    | 
|  Level 1.5: \\ Reinforcement learning  | Can change their function at runtime. \\ Cannot accept goal description. \\ Cannot handle unspecified variables. \\ Cannot create sub-goals autonomously.   | "Learns" through piecewise Boolean (good/bad) feedback.  | Q-learning.  | Limited to a handful of predefined variables | 
|  **Level 2:** \\ Cognitive  | Handling of novelty. Figures things out. Accepts goal description. Generates goal descriptions. Creates.   | Flexible representation of self. High degree of self-modification.   | Humans. Parrots. Dogs.   | Learns AILL.  |
|  **Level 3:** \\ Biological  | \\ Adapts.   | Is alive. Subject to evolution. Necessary precursor to lower levels.   | Living creatures.   | Adapts AILPS (after it leaves the primordial soup).   |
|  Source  | [[http://alumni.media.mit.edu/~kris/ftp/Seed-Programmed-General-Learning-Thorisson-PMLR-2020.pdf|Thorisson 2020]]  ||||
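
To make the contrast between Level 1 and Level 1.5 concrete, here is a minimal Python sketch. The function ''thermostat'' and the class ''QLearner'', together with the states, rewards, and parameter values, are hypothetical names and numbers chosen for illustration; they are not taken from Thorisson 2020. The thermostat applies a fixed rule, while the tabular Q-learner changes its behavior at runtime from good/bad reward signals over a predefined set of states and actions.

```python
from collections import defaultdict


def thermostat(temperature, setpoint=20.0):
    """Level 1: a fixed rule with a baked-in goal; it never changes after deployment."""
    return "heat_on" if temperature < setpoint else "heat_off"


class QLearner:
    """Level 1.5: tabular Q-learning over a predefined set of states and actions."""

    def __init__(self, actions, alpha=0.5, gamma=0.9):
        self.actions = list(actions)
        self.alpha = alpha            # learning rate (illustrative value)
        self.gamma = gamma            # discount factor (illustrative value)
        self.q = defaultdict(float)   # (state, action) -> estimated value

    def best_action(self, state):
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        # Standard Q-learning update toward reward + discounted best next value.
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])


learner = QLearner(actions=["heat_on", "heat_off"])
# Reward is the only feedback; the learner is never given a goal description.
learner.update("cold", "heat_on", reward=1.0, next_state="ok")
learner.update("cold", "heat_off", reward=-1.0, next_state="cold")
```

After these two updates the learner prefers ''heat_on'' in the ''cold'' state. It still cannot accept a goal description, create sub-goals, or handle a variable that was not predefined, which is what separates it from Level 2.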

\\


==== What is Needed for Cognitive Autonomy ====

|  \\ Selection  | Autonomous selection of **variables**. Very few if any existing learning methods can decide for themselves whether, from a set of variables with potential relevance for their learning, any one of them (a) is relevant, (b) if so, how much, and (c ) in what way. \\ Autonomous selection of **processes**. Very few if any existing learning methods can decide what kind of learning algorithms to employ (learning to learn).   |
|  Goal-Generation  | Very few if any existing learning methods can generate their own (sub-) goals. Of those that might be said to be able to, none can do so freely for any topic or domain.     |
|  Control of Resources  | By "resources" we mean computing power (think time), time, and energy, at the very least. \\ Few if any existing learning methods are any good at (a) controlling their resource use, (b) planning for it, (c ) assessing it, or (d) explaining it.   |
|  Novelty  | To handle novelty autonomously a system needs \\ // **autonomous hypothesis creation** related to **variables**, **relations**, and **transfer functions**. //  |
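
The three questions about a candidate variable (is it relevant, how much, and in what way) can be illustrated with a small Python sketch. It uses the Pearson correlation as the relevance measure, which is an assumption made here for illustration; the function names, the threshold, and the data are hypothetical. Note that the choice of statistic and threshold is still made by the designer, so this shows the questions, not an autonomous answer to them.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0
    return cov / (spread_x * spread_y)


def assess_variables(candidates, outcome, threshold=0.5):
    """For each candidate variable report (a) relevance, (b) strength, (c) direction."""
    report = {}
    for name, values in candidates.items():
        r = pearson(values, outcome)
        if r > 0:
            direction = "increases"
        elif r < 0:
            direction = "decreases"
        else:
            direction = "none"
        report[name] = {
            "relevant": abs(r) >= threshold,   # (a) is it relevant?
            "strength": abs(r),                # (b) how much?
            "direction": direction,            # (c) in what way?
        }
    return report


# Hypothetical observations of three candidate variables and one outcome.
observations = {
    "heater_power": [0, 1, 2, 3, 4],
    "window_open": [4, 3, 2, 1, 0],
    "day_of_week": [1, 1, 1, 1, 1],
}
room_temperature = [10, 12, 14, 16, 18]
report = assess_variables(observations, room_temperature)
```

A linear correlation only captures one kind of "in what way"; hypothesizing other relations and transfer functions is exactly the part that the table above lists as missing.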

\\
==== Four Dimensions of Control System Autonomy ====
|  {{public:t-720-atai:autonomy-dimensions1.png?750}}  |
|  “Autonomy comparison framework focusing on mental capabilities. \\ Embodiment is not part of the present framework, but is included here for contextual completeness.” \\ //From Thorisson & Helgason 2012// [[http://alumni.media.mit.edu/~kris/ftp/AutonomyCogArchReview-ThorissonHelgason-JAGI-2012.pdf|source]]  |
\\


==== Self-Programming ====
|  \\ What it is  | //Self-programming// here means, with respect to some virtual machine <m>M</m>, the production of one or more programs created by <m>M</m> itself, whose //principles// for creation were provided to <m>M</m> at design time, but whose details were //decided by <m>M</m>// at runtime based on its //experience//.  |
|  Self-Generated Program  | \\ Determined by some factors in the interaction between the system and its environment.   |
|  Historical note  | The concept of self-programming is old (J. von Neumann was one of the first to talk about self-replication in machines). However, few if any proposals for how to achieve this have been fielded.  [[https://en.wikipedia.org/wiki/Von_Neumann_universal_constructor|Von Neumann's universal constructor on Wikipedia]]   |
|  No guarantee  | The fact that a system has the ability to program itself is not a guarantee that it is in a better position than a traditional system. In fact, it is in a worse situation because in this case there are more ways in which its performance can go wrong.    |
|  Why we need it  | The inherent limitations of hand-coding methods make traditional manual programming approaches unlikely to reach the level of a human-grade generally intelligent system, simply because to be able to adapt to a wide range of tasks, situations, and domains, a system must be able to modify itself in more fundamental ways than a traditional software system is capable of.   |
|  Remedy  | Sufficiently powerful principles are needed to insure against the system going rogue.    |
|  \\ The //Self// of a machine  | **C1:** The processes that act on the world and the self (via senctors) evaluate the structure and execution of code in the system and, respectively, synthesize new code. \\  **C2:** The models that describe the processes in C1, entities and phenomena in the world -- including the self in the world -- and processes in the self. Goals contextualize models and they also belong to C2. \\ **C3:** The states of the self and of the world -- past, present and anticipated -- including the inputs/outputs of the machine.  |
|  Bootstrap code  | A.k.a. the "seed". Bootstrap code may consist of ontologies, states, models, internal drives, exemplary behaviors and programming skills.   |

\\


==== Programming for Self-Programming ====

|  \\ Why Self-Programming?  | Building a machine that can write (sensible, meaningful!) programs means that the machine is smart enough to **understand** (to a pragmatically meaningful level) the code it produces. If the purpose of its programming is to //become// smart, and the programming language we give to it //assumes it's smart already//, we have defeated the purpose of creating a self-programming machine that gets smarter over time, because its operation requires that it is already smart.    |
|  How Can We Program \\ for Self-Programming?   | \\ Self-programming involves automatic code writing. Code that is automatically written must be verifiable (non-axiomatically, i.e. no mathematical proofs!); therefore, only programming languages that allow reflection will work.     | 
|  \\ Can we use LISP? \\ (or related)  | Any language with features similar to LISP's (e.g. Haskell, Prolog, Python, etc.), i.e. the ability to inspect itself, turn data into code and code into data, should //in theory// be capable of sustaining a self-programming machine. (That is because no existing theory of intelligence properly takes **time pressure** (limited time and energy - LTE) into account.)  |
|  \\ Theory vs. practice  | "In theory" is most of the time //not good enough// if we want to see something soon (as in the next decade or two), and this is the case here too; what is good for a human programmer is not so good for a system having to synthesize its own code in real-time - in a way that makes its behavior **temporally predictable**. \\ Why is that important? Because the world presents deadlines, and if the controller is not capable of temporally predictable behavior, it cannot deal with deadlines properly.  |
|  What can we do?  | We must create a programming language with //simple enough// semantics so that a simple machine (perhaps with some clever emergent properties) can use it to bootstrap itself in learning to write programs.  |
|  Does such a language exist?  | Yes. It's called [[http://alumni.media.mit.edu/~kris/ftp/nivel_thorisson_replicode_AGI13.pdf|Replicode]].   |
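
What "turning data into code and code into data" and checking generated code without a mathematical proof can look like is sketched here in Python, one of the LISP-like languages mentioned above, using the standard ''ast'' module. The function names and the whitelist are hypothetical and chosen for illustration. This is not Replicode, and it says nothing about temporal predictability, which is the practical concern raised in the table.

```python
import ast

# Syntax elements a generated expression may contain (illustrative whitelist).
ALLOWED_NODES = (ast.Expression, ast.BinOp, ast.Add, ast.Mult,
                 ast.Name, ast.Load, ast.Constant)


def synthesize(source):
    """Code as data: parse program text into a syntax tree that can be inspected."""
    return ast.parse(source, mode="eval")


def inspect_program(tree, allowed_names):
    """Reflection: walk the tree and reject anything outside the whitelist."""
    for node in ast.walk(tree):
        if not isinstance(node, ALLOWED_NODES):
            return False
        if isinstance(node, ast.Name) and node.id not in allowed_names:
            return False
    return True


def run(tree, inputs):
    """Data as code: compile the tree and evaluate it on the given inputs."""
    code = compile(tree, filename="<generated>", mode="eval")
    return eval(code, {"__builtins__": {}}, dict(inputs))


candidate = synthesize("x * 2 + y")
accepted = inspect_program(candidate, allowed_names={"x", "y"})
result = run(candidate, {"x": 3, "y": 4}) if accepted else None

# A program that calls a function is rejected before it is ever executed.
rejected = inspect_program(synthesize("__import__('os')"), allowed_names={"x", "y"})
```

The check here is structural and empirical rather than a proof: the program is accepted because every element it contains is one the system already knows how to handle.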

\\


====Levels of Self-Programming====
|  Level 1  | Level 1 self-programming capability is the ability of a system to make programs that exclusively make use of the primitive actions in its action set.  |
|  Level 2  | Subsumes Level 1; additionally generates new primitives.   |
|  \\ Level 3  | Subsumes Levels 1 and 2; adds the ability to \\ //change the principles by which Level 1 and Level 2 operate//, in other words, \\ Level-3 self-programming systems are capable of what we would here call **meta-programming**. This would involve //changing or replacing some or all of the programs provided to the system at design time//. \\ Of course, the generation of primitives, and the changes of principles, are also controlled by some programs.   |
|  Infinite regress?  | Though the process of self-programming can be carried out at more than one level, eventually the regress will stop at a certain level. The more levels are involved, the more flexible the system will be, though at the same time it will be less stable and harder to analyze.   |
|  Likely to be many ways?  | For AGI the set of relevant self-programming approaches is likely to be a much smaller set than that typically discussed in computer science, and in all likelihood much smaller than often implied in AGI.    |
|  \\ Architecture  | The possible solutions for effective and efficient self-programming are likely to be strongly linked to what we generally think of as the //architectural structure// of AI systems, since self-programming for AGI may fundamentally have to change, modify, or partly duplicate, some aspect of the architecture of the system, for the purpose of being better equipped to perform some task or set of tasks.   |
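
One possible reading of Levels 1 and 2 is sketched here in Python. The action set, the integer "state", and all function names are hypothetical. Level 1 is shown as composing a program purely from existing primitives; Level 2 is shown as packaging a found program as a new primitive, which is only one of several ways new primitives might be generated. Level 3 is not covered.

```python
from itertools import product

# Hypothetical primitive action set: each action maps an integer state to a new state.
PRIMITIVES = {
    "inc": lambda s: s + 1,
    "double": lambda s: s * 2,
}


def execute(program, state, actions):
    """Run a program (a sequence of action names) from the given state."""
    for name in program:
        state = actions[name](state)
    return state


def level1_search(start, goal, actions, max_len=4):
    """Level 1: find a program built only from the actions currently available."""
    for length in range(1, max_len + 1):
        for program in product(sorted(actions), repeat=length):
            if execute(program, start, actions) == goal:
                return program
    return None


def level2_add_primitive(name, program, actions):
    """Level 2: extend the action set with a new primitive made from a program."""
    base = dict(actions)          # the new primitive is defined over the old set
    extended = dict(actions)
    extended[name] = lambda s: execute(program, s, base)
    return extended


program = level1_search(start=1, goal=6, actions=PRIMITIVES)
extended_actions = level2_add_primitive("macro_0", program, PRIMITIVES)
shorter = level1_search(start=1, goal=6, actions=extended_actions)
```

With the extended action set the same goal is reached by a one-step program, and ''macro_0'' can itself appear inside longer programs. The search procedure and the rule for creating primitives were both fixed at design time, so nothing here changes the principles by which the two levels operate.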

\\

====Existing Systems Which Target Self-Programming====
^  Label  ^  What  ^  Example  ^  Description  ^
|  \\ [S]  |  \\ State-space search   |  \\ GPS (Newell et al. 1963)  | The atomic actions are state-changing operators, and a program is represented as a path from the initial state to a final state. Variants of this approach include program search (e.g. the Gödel Machine (Schmidhuber 2006)): Given the action set A, in principle all programs formed from it can be exhaustively listed and evaluated to find an optimal one according to certain criteria.   |
|  \\ [P]  |  Production system   |  SOAR (Laird 1987)  | Each production rule specifies the condition for a sequence of actions that correspond to a program. Mechanisms that produce new production rules, such as chunking, can be considered self-programming.   |
|  \\ [R]  |  Reinforcement learning  |  AIXI (Hutter 2007)  | When an action of an agent changes the state of the environment, and each state has a reward value associated, a program corresponds to a policy in reinforcement learning. When the state transition function is probabilistic, this becomes a Markov decision process.   |
|  \\ [G]  |  \\ Genetic programming  |  Koza’s Invention Machine (Koza et al. 2000)  | A program is formed from the system’s actions, initially randomly but subsequently via genetic operators over the best performers from prior solutions, possibly by using the output of some actions as input of some other actions. An evolution process provides a utility function that is used to select the best programs, and the process is repeated.   |
|  [I]  |  Inductive logic programming  |  (Muggleton 1994)  | A program is a statement with a procedural interpretation, which can be learned from given positive and negative examples, plus background knowledge.   |
|  [E]  |  Evidential reasoning  |  NARS (Wang 2006)   | A program is a statement with a procedural interpretation, and it can be learned using multi-strategy (ampliative) uncertain reasoning.  |
|  \\ \\ [A]  |  \\  Autocatalytic model-driven bi-directional search  |  \\ AERA (Nivel et al. 2014) \\ & \\ Ikon Flux (Nivel 2007)  | In this context the architecture is in large part composed of a large collection of models, acting as hierarchically organized controllers, executed through a contextually-informed, continuous auto-catalytic process. New models are produced automatically, based on experience, their quality evaluated in light of this experience, and improvements produced as a result. Self-programming occurs at two levels: The lower one is concerned with performance in a set of domains, making models of how best to achieve goals in the external world at any point in time; the higher one is concerned with the operation of the lower one, implementing integrated cognitive control and meta-learning capabilities. Semantically closed auto-catalytic processes maintain the system’s growth after it is deployed.   |
|  Source | [[http://alumni.media.mit.edu/~kris/ftp/JAGI-Special-Self-Progr-Editorial-ThorissonEtAl-09.pdf|Thórisson, Nivel, Sanz & Wang 2012]]     |||
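
The loop described for [G] (random initial programs, a utility function, selection of the best performers, genetic operators, repeat) can be sketched in Python as follows. This is a simplified variant over fixed-length linear action sequences, not Koza's tree-based system; the action set, the utility function, and every parameter value are assumptions made for illustration.

```python
import random

# Hypothetical action set operating on an integer state.
ACTIONS = {
    "inc": lambda s: s + 1,
    "dec": lambda s: s - 1,
    "double": lambda s: s * 2,
}


def run_program(program, state=0):
    for name in program:
        state = ACTIONS[name](state)
    return state


def fitness(program, target):
    """Utility function: distance of the program's result from the target (lower is better)."""
    return abs(run_program(program) - target)


def crossover(a, b, rng):
    """Genetic operator: splice the head of one parent onto the tail of another."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def mutate(program, rng):
    """Genetic operator: replace one randomly chosen action."""
    child = list(program)
    child[rng.randrange(len(child))] = rng.choice(sorted(ACTIONS))
    return child


def evolve(target, length=6, pop_size=30, generations=40, seed=0):
    rng = random.Random(seed)
    names = sorted(ACTIONS)
    population = [[rng.choice(names) for _ in range(length)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=lambda p: fitness(p, target))
        history.append(fitness(population[0], target))
        parents = population[: pop_size // 2]   # best performers survive unchanged
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    population.sort(key=lambda p: fitness(p, target))
    return population[0], history


best, history = evolve(target=37)
```

Because the best performers are carried over unchanged, the best fitness recorded in ''history'' never gets worse from one generation to the next. Quality is judged only after execution, and nothing guarantees that the target is reached within the given number of generations.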

\\

====Design Assumptions in The Above Approaches====
|  \\ How does the system represent a basic action?  | a) As an operator that transforms a state to another state, either deterministically or probabilistically, with a goal as a state to be reached [R, S] \\ b) As a function that maps some input arguments to some output arguments [G] \\ c) As a realizable statement with preconditions and consequences [A, E, I, P] \\ Relevant assumptions: \\ Is the knowledge about an action complete and certain? \\ Is the action set discrete and finite?   |
|  \\ Can a program be used as an "action" in other programs?  | a) Yes, programs can be built recursively [A, E, G, I] \\ b) No, a program can only contain basic actions [R, S, P] \\ Relevant assumptions: \\  Do the programs and actions form a hierarchy? \\ Can these recursions have closed loops?  |
|  \\ How does the system represent goals?  | a) As states to be reached [S] \\ b) As values to be optimized [G, R] \\ c) As statements to be realized [E, P, A] \\ d) As functions to be approximated [I]  \\ Relevant assumptions: \\  Is the knowledge about goals complete? \\ Is the knowledge about goals certain? \\ Can all the goals be reached with a concrete action set?   |
|  \\ Are there derived goals?  | a) Yes, and they are logically dependent on the original goals [I, S, P] \\ b) Yes, and they may become logically independent of the original goals [A, E] \\ c) No, all goals are given or innate [G, R] \\ Relevant assumptions: \\  Are the goals constant or variable? \\ Are the goals externally imposed or internally generated?  |
|  \\ Can the system learn new knowledge about actions and goals?  | a) Yes, and the learning process normally converges [G, I, R] \\ b) Yes, and the learning process may not converge [A, E, P] \\ c) No, all the knowledge is given or innate [S] \\ Relevant assumptions: \\  Are the goals constant or variable? \\ Are the actions constant or variable?   |
|  \\ What is the extent of resources demanded?  | a) Unlimited time and/or space [I, R, S, P] \\ b) Limited time and space [A, E, G] \\ Relevant assumption: Are the resources used an attribute of the problem, or of the solution?   |
|  \\ When is the quality of a program evaluated?  | a) After execution, according to its actual contribution [G] \\ b) Before execution, according to its definition or historical record [I, S, P] \\ c) Both of the above [A, E, R] \\ Relevant assumption: \\  Are adaptation and prediction necessary?   |
|  Source  | [[http://alumni.media.mit.edu/~kris/ftp/JAGI-Special-Self-Progr-Editorial-ThorissonEtAl-09.pdf|Thórisson et al. 2012]]   |


==== Integrated Cognitive Control ====
|  What it is  | The ability of a controller / cognitive system to steer its own structural development - architectural growth (cognitive growth). The (sub-) system responsible for meta-learning.   |
|  Cognitive Growth  | The structural change resulting from learning in a structurally autonomous cognitive system - the target of which is self-improvement.  |

==== Cognitive Growth ====
|  What it is  | Changes in the cognitive controller (the core "thinking" part) over and above basic learning: After a growth burst of this kind the controller can learn differently/better/new things, especially new //categories// of things.   |
|  Human example | [[https://m.youtube.com/watch?v=TRF27F2bn-A|Piaget's Stages of Development (youtube video)]]    |

\\


==== Predictability ====

|  What It Is  | The ability of an outsider to predict the behavior of a controller based on some information.   |
|  Why It Is Important  | Predicting the behavior of (semi-) autonomous machines is important if we want to ensure their safe operation, or be sure that they do what we want them to do.    |
|  \\ How To Do It  | Predicting the future behavior of ANNs (of any kind) is easier if we switch off their learning after they have been trained, because there exists no method for predicting where their development will lead them if they continue to learn after they leave the lab. Predicting ANN behavior on novel input can be done statistically, but there is no way to be sure that novel input will not completely reverse their behavior. There are very few if any methods for giving ANNs the ability to judge the "novelty" of any input, which might help with this issue to some extent. Reinforcement learning addresses this by only scaling to a handful of variables with known max and min.  |
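
A very crude way of judging the "novelty" of an input, in the spirit of relying on variables with known max and min, is sketched here in Python. The class ''RangeNoveltyJudge'' and the variable names are hypothetical. It only flags values outside the ranges seen so far and variables never seen before; an input whose values are all in range but occur in an unfamiliar combination would pass unnoticed.

```python
class RangeNoveltyJudge:
    """Remembers the min and max seen per variable and flags inputs outside them."""

    def __init__(self):
        self.low = {}
        self.high = {}

    def observe(self, sample):
        """Widen the known range of each variable to include this sample."""
        for name, value in sample.items():
            self.low[name] = min(value, self.low.get(name, value))
            self.high[name] = max(value, self.high.get(name, value))

    def novel_variables(self, sample):
        """Return the variables that are unknown or outside their known range."""
        novel = []
        for name, value in sample.items():
            unknown = name not in self.low
            if unknown or not (self.low[name] <= value <= self.high[name]):
                novel.append(name)
        return novel


judge = RangeNoveltyJudge()
for training_sample in [{"speed": 10.0, "load": 0.2}, {"speed": 30.0, "load": 0.8}]:
    judge.observe(training_sample)

familiar = judge.novel_variables({"speed": 20.0, "load": 0.5})
unfamiliar = judge.novel_variables({"speed": 55.0, "load": 0.5, "slope": 3.0})
```

A controller could use such a flag to fall back on cautious behavior when its input lies outside what it was trained on, which makes its behavior easier for an outsider to anticipate.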
\\

====Reliability====

|  What It Is  | The ability of a machine to always return the same - or similar - answer to the same input.   |
|  Why It Is Important  | Simple machine learning algorithms are very good in this respect, delivering high reliability. Human-level AI, on the other hand, may have the same limitations as humans in this respect, i.e. not being able to give any guarantees.   |
|  Human-Level AI  | To make human-level AI reliable is important because a human-level AI without reliability cannot be trusted, and hence would defeat most of the purpose for creating it in the first place. (AERA proposes a method for this - through continuous pee-wee model generation and refinement.)   |
|  To Achieve Reliability  | Requires **predictability**. Predictability requires sorting out //causal relations// (without these we can never be sure what led to what).   |
|  Predictability is Hard to Achieve  | In a growing, developing system that is adapting and learning (3 or 4 levels of dynamics!), predictability can only be achieved through **abstraction**: going to a coarser level of detail (e.g. I cannot be sure //what exactly// I will eat for dinner, but I can be pretty sure that I //will// eat dinner).    |
|  Achieving Abstraction  | Can be done through hierarchy (but it needs to be //dynamic// - i.e. tailored to its intended usage, as the circumstances call for - because the combinatorics of the world are too complex to store precomputed hierarchies for everything).   |
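
The definition of reliability given above (the same or a similar answer to the same input) and the dinner example can be put together in a small Python sketch. The function names, the menu, and the measure itself (the share of trials that return the most common answer) are assumptions made for illustration; this is not the method proposed in AERA.

```python
import random
from collections import Counter


def reliability(controller, stimulus, trials=200):
    """Share of trials in which the controller gives its most common answer."""
    answers = Counter(controller(stimulus) for _ in range(trials))
    return answers.most_common(1)[0][1] / trials


rng = random.Random(1)


def dinner_choice(_stimulus):
    # Detailed level: which dish is eaten varies from one evening to the next.
    return rng.choice(["fish", "soup", "pasta"])


def dinner_abstract(stimulus):
    # Abstract level: the dish still varies underneath, but the answer does not.
    dinner_choice(stimulus)
    return "eats dinner"


detailed = reliability(dinner_choice, "evening")
abstract = reliability(dinner_abstract, "evening")
```

The same underlying behavior scores well below 1.0 when described at the level of dishes and exactly 1.0 when described at the level of "eats dinner". Which level of description is useful depends on what the prediction is needed for, which is why the table above asks for hierarchies that are dynamic.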
\\


\\


\\
\\
\\
------------

2020(c)K. R. Thórisson