[[public:t-720-atai:atai-21:main|T-720-ATAI-2021 Main]] \\
[[public:t-720-atai:atai-21:Lecture_Notes|Links to Lecture Notes]]
\\ \\
======CONTROL: Intelligence & Generality======
\\ \\
=====Generality=====
\\ \\
====What Do You Mean by "Generality"?====
| Flexibility: \\ Breadth of task-environments | Enumeration of variety. \\ (By 'variety' we mean the discernibly different states that can be sensed and that make a difference to a controller.) \\ If a system X can operate in more diverse task-environments than system Y, system X is more //flexible// than system Y. |
| Solution Diversity: \\ Breadth of solutions | \\ If a system X can reliably generate a wider variety of acceptable solutions to problems than system Y, system X is more //powerful// than system Y. |
| Constraint Diversity: \\ Breadth of constraints on solutions | \\ If a system X can reliably produce acceptable solutions under a higher number of solution constraints than system Y, system X is more //powerful// than system Y. |
| Goal Diversity: \\ Breadth of goals | If a system X can meet a wider range of goals than system Y, system X is more //powerful// than system Y. |
| \\ Generality | We say that any system X that exceeds system Y on one or more of the above is more //general// than system Y; typically, however, pushing for increased generality means pushing on all of the above dimensions. |
| General intelligence... | ...means that less needs to be known up front when the system is created; the system can learn to figure things out and how to handle itself, in light of **LTE** (limited time and energy). |
| And yet: \\ The hallmark of an AGI | A system that can handle novel or **brand-new** problems, and be expected to attempt to address //open problems// sensibly. \\ The level of difficulty of the problems it solves would indicate its generality. |
\\
====One Task, Many Tasks, One Environment, Many Environments, One Domain, Many Domains====
| {{public:t-720-atai:afteritleavesthelab.png?750|After it Leaves the Lab}} |
| **A:** Simple machine learners **(L0)** take a small set of inputs **(x, y, z)** and make a choice between a set of possible outputs **(α,β)**, as specified in detail by the system’s designer. Increasing either the set of inputs or the number of possible outputs will either break the algorithm or slow learning to impractical levels. |
| **B:** Let **tsk_i** refer to relatively non-trivial tasks such as assembling furniture and moving office items from one room to another, **S_i** to various situations in which a family of tasks can be performed, and **e_i** to environments where those situations may be encountered. Simple learner **L0** is limited to only a fraction of the various things that must be learned to achieve such a task. Being able to handle a single such task in a particular type of situation **(S1)** with features that were unknown prior to the system’s deployment, **L1** is already more capable than most if not all autonomous learning systems available today. **L2, L3** and **L4** take successive steps up the complexity ladder beyond that, being able to learn //numerous// complex tasks **(L2)**, in //various situations// **(L3)**, and in a wider range of //environments and mission spaces// **(L4)**. |
| Only towards the higher end of this ladder can we hope to approach really //general, autonomous// intelligence – systems capable of learning to effectively and efficiently perform multiple //a-priori unfamiliar// tasks, in //a variety of a-priori unfamiliar situations//, in a variety of //a-priori unfamiliar environments//, //**on their own**//. |
\\ \\
===== Requirements for General Learning =====
\\ \\
====Minimum Requirements for Intelligent Learning Systems====
| "Intelligent System": \\ Expectation | We expect an "intelligent" system to be able to //learn//. |
| Standard Learning Expectation | That the system can learn //a task//. |
| Examples of \\ "Intelligent" Systems | Deep Blue. Watson. AlphaGo. |
| \\ What these systems \\ have in common | They can only learn (and do) //one task// (one form of one task, to be exact). \\ They are really bad at learning temporal tasks. \\ Their learning must be turned off when they leave the lab. \\ The tasks they learn are relatively simple (in that their goal structure can be easily formalized). \\ They are neither "domain-independent" nor "general" - they are not //general learners//. |
| We want more general learners | A general learner would not be restricted by domain, topic, task-environment, or other such limitations - the freer from such constraints, the more "intelligent" the system. |
\\
====Requirements For AGI Systems====
^ Key ^What it Means^Why it's Important^
| \\ Mission | **R1.** The system must fulfill its mission – the goals and constraints it has been given by its designers – with possibly several different priorities. | This is the very reason we built the system. We should have pretty good ideas as to why. Shared by all AI systems (in fact, all engineered systems). |
| \\ AiLL \\ "After it Leaves the Lab" | **R2.** The system must be designed to be operational in the long-term, without intervention of its designers after it leaves the lab, as dictated by the temporal scope of its mission. | All machine learning methods today are "before it leaves the lab", meaning that the task-environment must be known and clearly delineated beforehand, and the system cannot handle changes to these assumptions. To be more autonomous we must look at the life of these systems "beyond the lab". |
| \\ Domain-independence | **R3.** The system must be domain- and task-independent – but without a strict requirement for determinism: We limit our architecture to handle only missions for which rigorous determinism is not a requirement. | It is easy to implement domain dependence in software systems: Virtually //all// software today is made this way. Domain independence is necessary if we want to build more autonomous systems. |
| \\ Modeling | \\ **R4.** The system must be able to model its environment to adapt to changes thereof. | A good controller not only reacts to changes in its environment, it anticipates them. Anticipation, or prediction, is only possible with a decent model of the system whose behavior we are predicting. A good model allows detailed and long-term prediction. |
| \\ Anytime | **R5.** As with learning, planning must be performed continuously, incrementally and in real-time. Pursuing goals and predicting must be done concurrently. | A good system learns //all the time// and is planning and revising its plans //all the time//. Anything less makes the system less fit ("dumber"). |
| \\ Attention | \\ **R6.** The system must be able to control the focus of its attention. | Any system in a world that is vastly larger and more complex than its resources allow it to explore at any one time must select what to apply its thinking, memory, and behavior to. Such "resource management", when applied to thinking, is called "attention". |
| \\ Self-Modeling | \\ **R7.** The system must be able to model itself. | Any cognitive growth (development) requires comparing or evaluating a new state or architecture of the system to an old one. Unless the system has a model of self, such self-modification cannot be evaluated a priori, and all changes are random explorations, which is the most inefficient method to apply to goal-directed behavior, and certainly not "intelligent" in any way. |
| \\ No Certainty | **R8.** The system must be able to handle incompleteness, uncertainty, and inconsistency, both in state space and in time. | In any large world there will be unintended and unforeseen consequences to all changes, as well as potential errors in measurements (perception). Certainty can never be 1. \\ In other words, "Nothing is 100% (not even this axiom!)." |
| \\ Abstractions | \\ **R9.** The system must be able to generate abstractions from learned knowledge. | Abstractions are a kind of compression that allows more efficient management of small details, causal chains, etc. Abstraction is fundamental to induction (generalization) and analogies, two cognitive skills of critical importance in human intelligence. |
| \\ Reasoning | **R10.** The system must be able to use applied logic - reasoning - to generate, manipulate, and use its knowledge. | Reasoning in humans is not the same as reasoning in formal logics; it is non-axiomatic and is always performed under uncertainty (per R8). |
| Learning | **R11.** The system must be able to learn. |
\\
====Some Key Features of Cognitive Architectures====
| Attention / \\ Self-Control | The management of processing, memory, and sensory resources. \\ Management of time and energy (limited time and energy: LTE). |
| Meta-Cognition | The ability of a system to evaluate itself and reason about itself. |
| Reasoning | The application of logical rules to knowledge: Abduction ("How could this have come to be?"), induction ("This might imply a general rule that can be applied to all Xs"), and deduction ("If all Xs are Ys, and Z is an X, then Z must be a Y"). |
| Creativity | A measure of the ability of a system to deal with novelty, especially to produce novel solutions to known problems, but also to identify novel problems. |
| Imagination | The ability to handle what-if scenarios, ideas, concepts, questions, etc., especially novel non-existent ones. |
| Understanding | The ability to provide correct solutions to questions about a phenomenon, esp. counterfactuals: What if X had //not// happened? The deeper the understanding of X is, the larger the set of what-if questions that can be answered correctly about X. |
| Explanation | The ability to (re-)formulate knowledge in a format that can be understood by others. |
| Learning | Acquisition of knowledge that enables more successful completion of tasks and adaptation to environments. |
| Life-long learning | Incremental acquisition of knowledge throughout a (non-trivially long) lifetime. |
| Cumulative Learning | The ability to unify new information and knowledge that is already acquired, in a coherent, efficient and effective manner (seeing what relates to what, resolving conflicts). |
| Transfer learning | The ability to transfer what has been learned in one task, situation, environment or domain to another task, situation, environment or domain. |
| Autonomy | The ability to do tasks without interference / help from others. |
\\ \\
====Laird et al.: Requirements for AGI-Aspiring Cognitive Architectures====
| \\ Reference | In [[http://www.atlantis-press.com/php/download_paper.php?id=1900|Cognitive Architecture Requirements for Achieving AGI]] (J.E. Laird et al.) the authors list a number of system features that would be required for a system to claim general intelligence. | While useful, some items on this list translate to more general concepts as explained in this column. |
^Requirement^Explanation^Rewritten As^
| \\ R0. Fixed Structure for All Tasks | The authors mean that the architecture of the system should be fixed, because the system should adapt to environments, solve problems, and perform tasks through //acquiring knowledge//, rather than changing its architecture. The problem here: The cognitive architecture of humans likely changes when we **develop** from children to adults, so this would seem too strict a requirement. | Any controller will have a central core whose architecture either stays fixed, or changes more slowly than the rest of the system, to ensure stability. Said another way, only parts of a controller can be allowed to change at any time, to ensure stability. |
| \\ R1. Realize a Symbol System | A symbol system is one way in which to implement compression, for purposes of communication, transformation, and flexible manipulation. It is not necessary, strictly speaking, unless it can be shown that symbol systems implement compression in a way that is unique and necessary for intelligence. | This requirement shall be replaced by the acquisition/creation of models, where the models (a) are fine-grained and (b) form structures and hierarchies that can be manipulated in toto as symbols. |
| R2. Represent & Effectively Use Modality-Specific Knowledge | This means that knowledge created from information from a single sensory modality (e.g. hearing) can be used without input from other modalities to create knowledge. | A more general form of this requirement is that any sensor combination, or subset of sensors, including single sensors, may be the source of information used to create knowledge. |
| R3. Represent & Effectively Use Large Bodies of Diverse Knowledge | As the uniformity of knowledge gathered by a cognitive system cannot be guaranteed, increased efficiency can be achieved through such a principle. | This is generally agreed to in the A(G)I community. |
| R4. Represent & Effectively Use Knowledge With Different Levels of Generality | Knowledge acquired by experience at different occasions and on different topics may generalize to different extents. | This is generally agreed to in the A(G)I community. |
| \\ R5. Represent & Effectively Use Diverse Levels of Knowledge | Most of the time, for an agent in a complex world, information available to help the agent reach any goal or do any task is incomplete, incorrect, or absent. The agent should be able to use whatever is available to it. By "levels" here is meant different amounts of imperfection, e.g. in correctness and completeness. | This is not often discussed in the AI community, more so in the AGI community (but still not enough!). It follows directly from assumptions about incremental acquisition of knowledge (from experience). Pei Wang captured this nicely in his AIKR acronym (assumption of insufficient knowledge and resources). |
| R6. Represent & Effectively Use Beliefs Independent of Current Perception | This follows deductively if we assume that a system learns from experience. | This is generally agreed to in the A(G)I community. |
| R7. Represent & Effectively Use Rich Hierarchical Control Knowledge | Control must be possible at various levels of detail, both spatial and temporal. | This topic has seen very little discussion in the AI literature. |
| \\ R8. Represent & Effectively Use Meta-Cognitive Knowledge | Knowledge about knowledge is necessary if we wish to reason about how we acquire knowledge, for instance to improve our acquisition of knowledge (learning). | What exactly counts as "meta-knowledge" varies from author to author, but many argue that some form of meta-knowledge is necessary for AGI systems. In any case, it is difficult to see how a system could implement cognitive development (the acquisition or development of new cognitive faculties) without it. |
| R9. Support a Spectrum of Bounded & Unbounded Deliberation | 'Bounded deliberation' means reasoning (or "thinking") that has limits (in time and/or energy). | It is not clear whether "unbounded deliberation" actually exists, since all non-hypothetical cognitive agents are bound by the physics of the material world. |
| R10. Support Diverse Comprehensive Learning | No explanation required. | This is generally agreed to in the AI community. |
| R11. Support Incremental Online Learning | If an AGI is supposed to handle unknown (at design time) task-environments then it must be able to learn as it goes. | Surprisingly few papers on this subject are to be found in the AGI community; multi-objective learning (a branch of optimization research) has devoted some conferences to this subject. |
\\ \\ \\ \\
2025(c)K. R. Thórisson