T-720-ATAI-2019

Lecture Notes: Requirements for AI & AGI Systems




Requirements for Intelligent Learning Systems

“Intelligent System”: Expectation | We expect an “intelligent” system to be able to learn.
Standard Learning Expectation | That the system can learn a task.
Examples of “Intelligent” Systems | Deep Blue. Watson. AlphaGo.
What these systems have in common | They can only learn (do) one task. They are really bad at learning temporal tasks. Their learning must be turned off when they leave the lab. The tasks they learn are relatively simple (in that their goal structure can be easily formalized). They are neither “domain-independent” nor “general”: they are not general learners.
We want more general learners | A general learner would not be limited by domain, topic, task-environment, or other such constraints; the freer a system is from such constraints, the more “intelligent” it is.





What Do You Mean by "Generality"?

Flexibility: Breadth of task-environments | If a system X can operate in more diverse task-environments than system Y, system X is more flexible than system Y.
Solution Diversity: Breadth of solutions | If a system X can reliably generate a larger variation of acceptable solutions to problems than system Y, system X is more powerful than system Y.
Constraint Diversity: Breadth of constraints on solutions | If a system X can reliably produce acceptable solutions under a higher number of solution constraints than system Y, system X is more powerful than system Y.
Goal Diversity: Breadth of goals | If a system X can meet a wider range of goals than system Y, system X is more powerful than system Y.
Generality | We say that a system X is more general than system Y if X exceeds Y on one or more of the above dimensions. Typically, however, pushing for increased generality means pushing on all of them.
General intelligence… | …means that less needs to be known up front when the system is created; the system can learn to figure things out and how to handle itself, in light of LTE.
And yet: The hallmark of an AGI | A system that can handle novel or brand-new open problems. The level of difficulty of the problems it solves would indicate its generality.
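
The pairwise comparisons above can be read as an ordering over systems. The following is a minimal Python sketch under stated assumptions: the `Profile` class, its four numeric fields, and the example scores are hypothetical illustrations and are not defined in these notes. How to actually measure breadth of task-environments, solutions, constraints, or goals is left open.

```python
from dataclasses import dataclass, astuple, fields


@dataclass(frozen=True)
class Profile:
    """Hypothetical scores for the four dimensions; higher means broader."""
    task_environments: float
    solutions: float
    constraints: float
    goals: float


def exceeds_on(x, y):
    """Names of the dimensions on which x strictly exceeds y."""
    return [f.name for f in fields(x) if getattr(x, f.name) > getattr(y, f.name)]


def more_general(x, y):
    """Loose reading: x exceeds y on one or more dimensions."""
    return len(exceeds_on(x, y)) > 0


def dominates(x, y):
    """Strict reading: x is at least as broad everywhere and broader somewhere."""
    at_least = all(a >= b for a, b in zip(astuple(x), astuple(y)))
    return at_least and more_general(x, y)


# Made-up example systems.
system_x = Profile(task_environments=3, solutions=2, constraints=2, goals=1)
system_y = Profile(task_environments=2, solutions=2, constraints=1, goals=2)
print(exceeds_on(system_x, system_y))
print(more_general(system_x, system_y), more_general(system_y, system_x))
print(dominates(system_x, system_y))
```

Under the loose reading, two systems can each be "more general" than the other on different dimensions. That is one way to see why increasing generality typically means pushing on all dimensions at once.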



After it Leaves the Lab
A: Simple machine learners (<m>L_0</m>) take a small set of inputs (<m>x, y, z</m>) and make a choice between a set of possible outputs (<m>α,β</m>), as specified in detail by the system’s designer. Increasing either the set of inputs or number of possible outputs will either break the algorithm or slow learning to impractical levels.
B: Let <m>tsk_i</m> refer to relatively non-trivial tasks such as assembling furniture and moving office items from one room to another, <m>S_i</m> to various situations in which a family of tasks can be performed, and <m>e_i</m> to environments where those situations may be encountered. Simple learner <m>L_0</m> is limited to only a fraction of the various things that must be learned to achieve such a task. Being able to handle a single such task in a particular type of situation (<m>S_1</m>) with features that were unknown prior to the system’s deployment, <m>L_1</m> is already more capable than most if not all automatic learning systems available today. <m>L_2</m>, <m>L_3</m> and <m>L_4</m> take successive steps up the complexity ladder beyond that, being able to learn numerous complex tasks (<m>L_2</m>), in various situations (<m>L_3</m>), and in a wider range of environments and mission spaces (<m>L_4</m>). Only towards the higher end of this ladder can we hope to approach really general intelligence – systems capable of learning, on their own, to effectively and efficiently perform multiple a-priori unfamiliar tasks, in multiple a-priori unfamiliar situations, in multiple a-priori unfamiliar environments.



Requirements For AGI Systems

Key | What it Means | Why it's Important
Mission | R1. The system must fulfill its mission – the goals and constraints it has been given by its designers – with possibly several different priorities. | This is the very reason we built the system. We should have pretty good ideas as to why. Shared by all AI systems.
AILL “After it Leaves the Lab” | R2. The system must be designed to be operational in the long-term, without intervention of its designers after it leaves the lab, as dictated by the temporal scope of its mission. | All machine learning methods today are “before it leaves the lab”, meaning that the task-environment must be known and clearly delineated beforehand, and the system cannot handle changes to these assumptions. To be more autonomous we must look at the life of these systems “beyond the lab”.
Domain-independence | R3. The system must be domain- and task-independent – but without a strict requirement for determinism: We limit our architecture to handle only missions for which rigorous determinism is not a requirement. | It is easy to implement domain dependence in software systems: Virtually all software today is made this way. Domain independence is necessary if we want to build more autonomous systems.
Modeling | R4. The system must be able to model its environment to adapt to changes thereof. | A good controller not only reacts to changes in its environment, it anticipates them. Anticipation, or prediction, is only possible with a decent model of the system whose behavior we are predicting. A good model allows detailed and long-term prediction.
Anytime | R5. As with learning, planning must be performed continuously, incrementally and in real-time. Pursuing goals and predicting must be done concurrently. | A good system learns all the time and is planning and revising its plans all the time. Anything less makes the system less fit (“dumber”).
Attention | R6. The system must be able to control the focus of its attention. | Any system in a world that is vastly larger and more complex than its resources allow it to explore at any one time must select what to apply its thinking, memory, and behavior to. Such “resource management”, when applied to thinking, is called “attention”.
Self-Modeling | R7. The system must be able to model itself. | Any cognitive growth (development) requires comparing or evaluating a new state or architecture of the system to an old one. Unless the system has a model of self, such self-modification cannot be evaluated a priori, and all changes are random explorations, which is the most inefficient method to apply to goal-directed behavior, and certainly not “intelligent” in any way.
No Certainty | R8. The system must be able to handle incompleteness, uncertainty, and inconsistency, both in state space and in time. | In any large world there will be unintended and unforeseen consequences to all changes, as well as potential errors in measurements (perception). Certainty can never be 1. In other words, “Nothing is 100% (not even this axiom!).”
Abstractions | R9. The system must be able to generate abstractions from learned knowledge. | Abstractions are a kind of compression that allows more efficient management of small details, causal chains, etc. Abstraction is fundamental to induction (generalization) and analogies, two cognitive skills of critical importance in human intelligence.
Reasoning | R10. The system must be able to use applied logic - reasoning - to generate, manipulate, and use its knowledge. | Reasoning in humans is not the same as reasoning in formal logics; it is non-axiomatic and is always performed under uncertainty (per R8).
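
Requirement R5 (“Anytime”) above asks for planning that is continuous, incremental, and interruptible. A minimal Python sketch of that idea follows, assuming a hypothetical `anytime_plan` generator that examines candidate plans one at a time and always exposes the best plan found so far. The candidate plans and the cost function are made-up placeholders, and a real system would also interleave learning and prediction, which this sketch does not attempt.

```python
import itertools


def anytime_plan(candidates, cost):
    """Yield the best (plan, cost) found so far after each candidate is examined."""
    best, best_cost = None, float("inf")
    for plan in candidates:
        c = cost(plan)
        if c < best_cost:
            best, best_cost = plan, c
        # The caller may stop consuming here and still hold a usable plan.
        yield best, best_cost


# Made-up candidates: each "plan" is just a number, and its cost is itself.
candidates = [5, 3, 4, 1, 2]
planner = anytime_plan(candidates, cost=lambda plan: plan)

# Interrupt after a budget of two steps and use whatever is best so far.
budget = 2
best_so_far = list(itertools.islice(planner, budget))[-1]
print(best_so_far)
```

The design choice here is a step budget rather than a wall-clock deadline, which keeps the example deterministic; the same generator can be resumed later to keep refining the plan.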



Some Features Under Consideration

Attention / Self-Control | The management of processing, memory, and sensory resources.
Meta-Cognition | The ability of a system to evaluate itself and reason about itself.
Reasoning | The application of logical rules to knowledge: Abduction (“How could this have come to be?”), induction (“This might imply a general rule that can be applied to all Xs”), and deduction (“If all Xs are Ys, and Z is an X, then Z must be a Y”).
Creativity | A measure for the ability of a system to deal with novelty, especially to produce novel solutions to known problems, but also to identify novel problems.
Imagination | The ability to handle what-if scenarios, ideas, concepts, questions, etc., especially novel non-existent ones.
Understanding | The ability to provide correct solutions to questions about a phenomenon, esp. counterfactuals: What if X had not happened? The deeper the understanding of X is, the larger the set of what-if questions that can be answered correctly about X.
Explanation | The ability to (re-)formulate knowledge in a format that can be understood by others.
Learning | Acquisition of knowledge that enables more successful completion of tasks and adaptation to environments.
Life-long learning | Incremental acquisition of knowledge throughout a (non-trivially long) lifetime.
Cumulative Learning | The ability to integrate new information with that already acquired, in a coherent, efficient and effective manner (seeing what relates to what, resolving conflicts).
Transfer learning | The ability to transfer what has been learned in one task to another.
Autonomy | The ability to do tasks without interference / help from others.
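
The three reasoning modes listed under “Reasoning” above can be illustrated with a toy Python sketch. The representation is an assumption made for illustration only: a fact is a pair `(individual, category)`, a rule `(X, Y)` stands for “all Xs are Ys”, and the function names `deduce`, `induce`, and `abduce` are hypothetical. Only deduction is truth-preserving; the other two produce hypotheses that may later turn out to be wrong.

```python
def deduce(rules, facts):
    """Deduction: from 'all Xs are Ys' and 'Z is an X', conclude 'Z is a Y'."""
    return {(z, y) for (x, y) in rules for (z, kind) in facts if kind == x}


def induce(facts, x, y):
    """Induction: propose 'all Xs are Ys' if every known X is also a known Y."""
    xs = {z for (z, kind) in facts if kind == x}
    ys = {z for (z, kind) in facts if kind == y}
    return bool(xs) and xs <= ys


def abduce(rules, observation):
    """Abduction: given 'Z is a Y', list the 'Z is an X' facts that could explain it."""
    z, y = observation
    return {(z, x) for (x, target) in rules if target == y}


rules = {("human", "mortal"), ("cat", "mortal")}
facts = {("socrates", "human")}

print(deduce(rules, facts))               # what follows from what is known
print(induce(facts, "human", "mortal"))   # no human is known to be mortal yet
print(abduce(rules, ("felix", "mortal"))) # candidate explanations
```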



Laird et al.: Requirements for AGI-Aspiring Cognitive Architectures

Reference | In Cognitive Architecture Requirements for Achieving AGI (J.E. Laird et al.) the authors list a number of system features that would be required for a system to claim general intelligence. While useful, some items on this list translate to more general concepts, as explained in the “Rewritten As” column.
Requirement | Explanation | Rewritten As
R0. FIXED STRUCTURE FOR ALL TASKS | The authors mean that the architecture of the system should be fixed, because the system should adapt to environments, solve problems, and perform tasks through acquiring knowledge, rather than changing its architecture. The problem here: Cognitive architecture of humans likely changes when we develop from children to adults, so this would seem too strict a requirement. | Any controller will have a central core whose architecture either stays fixed, or changes more slowly than the rest of the system, to ensure stability. Said another way, only parts of a controller can be allowed to change at any time, to ensure stability.
R1. REALIZE A SYMBOL SYSTEM | A symbol system is one way in which to implement compression, for purposes of communication, transformation, and flexible manipulation. It is not necessary, strictly speaking, unless it can be shown that symbol systems implement compression in a way that is unique and necessary for intelligence. | This requirement shall be replaced by the acquisition/creation of models, where the models are (a) fine-grain and (b) form structures and hierarchies that can be manipulated in toto as symbols.
R2. REPRESENT AND EFFECTIVELY USE MODALITY-SPECIFIC KNOWLEDGE | This means that knowledge created from information from a single sensory modality (e.g. hearing) can be used without input from other modalities to create knowledge. | A more general form of this requirement is that any sensor combination, or subset of sensors, including single sensors, may be the source of information used to create knowledge.
R3. REPRESENT AND EFFECTIVELY USE LARGE BODIES OF DIVERSE KNOWLEDGE | As the uniformity of knowledge gathered by a cognitive system cannot be guaranteed, increased efficiency can be achieved through such a principle. | This is generally agreed to in the A(G)I community.
R4. REPRESENT AND EFFECTIVELY USE KNOWLEDGE WITH DIFFERENT LEVELS OF GENERALITY | Knowledge acquired by experience at different occasions and on different topics may generalize to different extents. | This is generally agreed to in the A(G)I community.
R5. REPRESENT AND EFFECTIVELY USE DIVERSE LEVELS OF KNOWLEDGE | Most of the time, for an agent in a complex world, information available to help the agent reach any goal or do any task is incomplete, incorrect, or absent. The agent should be able to use whatever is available to it. By “levels” here is meant different amounts of imperfection, e.g. in correctness and completeness. | This is not often discussed in the AI community, more so in the AGI community (but still not enough!). It follows directly from assumptions about incremental acquisition of knowledge (from experience). Pei Wang captured this nicely in his AIKR acronym (assumption of insufficient knowledge and resources).
R6. REPRESENT AND EFFECTIVELY USE BELIEFS INDEPENDENT OF CURRENT PERCEPTION | This follows deductively if we assume that a system learns from experience. | This is generally agreed to in the A(G)I community.
R7. REPRESENT AND EFFECTIVELY USE RICH, HIERARCHICAL CONTROL KNOWLEDGE | Control must be possible at various levels of detail, both spatial and temporal. | This topic has seen very little discussion in the AI literature.
R8. REPRESENT AND EFFECTIVELY USE META-COGNITIVE KNOWLEDGE | Knowledge about knowledge is necessary if we wish to reason about how we acquire knowledge, for instance to improve our acquisition of knowledge (learning). | What exactly counts as “meta-knowledge” varies from author to author, but many argue that some form of meta-knowledge is necessary for AGI systems. In any case, it is difficult to see how a system could implement cognitive development (the acquisition or development of new cognitive faculties) without it.
R9. SUPPORT A SPECTRUM OF BOUNDED AND UNBOUNDED DELIBERATION | Bounded deliberation means reasoning (or “thinking”) that has limits (in time and/or energy). | It is not clear whether “unbounded deliberation” actually exists, since all non-hypothetical cognitive agents are bound by the physics of the material world.
R10. SUPPORT DIVERSE, COMPREHENSIVE LEARNING | No explanation required. | This is generally agreed to in the AI community.
R11. SUPPORT INCREMENTAL, ONLINE LEARNING | If an AGI is supposed to handle task-environments that are unknown at design time, then it must be able to learn as it goes. | Surprisingly few papers on this subject are to be found in the AGI community; multi-objective learning (a branch of optimization research) has devoted some conferences to this subject.





2019©K. R. Thórisson
