[[http://cadia.ru.is/wiki/public:t-720-atai:atai-19:main|T-720-ATAI-2019 Main]] \\ [[http://cadia.ru.is/wiki/public:t-720-atai:atai-19:Lecture_Notes|Links to Lecture Notes]] =====T-720-ATAI-2019===== ==== Lecture Notes: AI Methodologies==== \\ \\ \\ \\ ====Complex Systems==== ^Kind of System^What it Consists of^Theory^Methodology^ | \\ Static system | Elements of the system do not interact, or interact slowly. Example: Mountains. | Depends on the domain. | Ditto | | Dynamic system | Elements of the system interact. | Depends on the domain. | Ditto | | \\ Simple system | \\ Few interacting parts | \\ Mechanics | Observation of operation, experimentation, parts analysis. Analytical methods. | | Complex uniform system | Vast numbers of identical elements interacting | Thermodynamics | Statistics. Mathematical models and simulation. | | Complex heterogeneous system | Multiple unique elements interacting | //--missing!--// | Agent-based models and simulation | \\ \\ ====Minds Are Complex Systems==== | So You Want To Create Human-Level AI? | If we want to make smarter machines, we should begin by isolating the requirements. Requirements for AGI were already covered - that's a long list (but doable, we hope!). Are there other concerns? \\ Yes: We need to pick an appropriate methodology for this task. | | Mind: What Kind of System? | Mind contains a large number of processes (large size), of many different kinds implementing a variety of complex functions (heterogeneity), that work closely together in an integrated manner to implement global emergence (dense coupling); it is a //heterogeneous large densely coupled system// - a //HeLD//. | | HeLDs | HeLDs are found everywhere in nature: Ecosystems, forests, societies, traffic, economies, and yes - minds. \\ In fact, minds are a special kind of //self-governed// HeLD. | | How to Study HeLDs | HeLDs require special consideration of methodology - good old-fashioned, run-of-the-mill reductionism may not suffice. | \\ \\ ====Methodology: What It Is==== | What it is | The methods (tools and techniques) we use to study a phenomenon. | | Examples | - Comparative experiments (for the answers we want Nature to ultimately give). \\ - Telescopes (for things far away). \\ - Microscopes (for all things smaller than the human eye can see unaided). \\ - Simulations (for complex interconnected systems that are hard to untangle). | | Why it's important | Methodology directly determines how we may study a phenomenon -- what we can do with respect to that phenomenon when scrutinizing it. \\ Methodology affects how we think about a phenomenon, including our solutions, expectations, and imagination. \\ Methodology determines the possible scope of outcomes. \\ Methodology directly influences the shape of our solution - our answers to scientific questions. \\ Methodology directly determines the speed with which we can make progress when studying a phenomenon. \\ //Methodology is therefore a primary determinant of scientific progress.// | | The main AI methodology | AI never really had a proper methodology discussion as part of its mainstream scientific discourse. Only two or three approaches to AI can properly be called 'methodologies': //BDI// (belief, desire, intention), //subsumption//, //decision theory//. As a result, AI inherited the run-of-the-mill CS methodologies by default. | | Constructi//on//ist AI | Methods used to build AI systems by hand. | | Constructi//v//ist AI | Methods aimed at creating AI systems that autonomously generate, manage, and use their knowledge. |
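\\ \\ ====Example: A Minimal Agent-Based Simulation====
To make the "agent-based models and simulation" entry in the Complex Systems table above concrete -- and to preview the Group Clapping demo in the next section -- here is a minimal sketch (all parameter names and values are illustrative assumptions, not part of the lecture) in which agents repeatedly nudge their clapping period toward the average of a few randomly "heard" neighbors. No central controller exists, yet the group converges on a shared rate.
<code python>
# Agents adjust their clapping period toward neighbors they "hear";
# a shared rate emerges without any centralized control.
import random

N_AGENTS = 30
COUPLING = 0.2      # how strongly an agent adapts to what it hears
NEIGHBORS = 4       # how many random others each agent "hears" per step

# Initial clapping periods, in seconds per clap.
periods = [random.uniform(0.4, 1.2) for _ in range(N_AGENTS)]

for step in range(51):
    heard = [sum(random.sample(periods, NEIGHBORS)) / NEIGHBORS
             for _ in range(N_AGENTS)]
    periods = [p + COUPLING * (h - p) for p, h in zip(periods, heard)]
    if step % 10 == 0:
        print(f"step {step:2d}: spread = {max(periods) - min(periods):.3f} s")
</code>
The printed spread between the fastest and slowest clapper shrinks toward zero: synchronization emerges from purely local adjustments, which is the defining mark of self-organization.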
\\ \\ ====Examples of Self-Organization (Emergence)==== | Group Clapping | Try this in class: Start clapping; aim for clapping at the same rate as everyone else in the room. Self-organization without centralized control. | | Conway's Game of Life | [[https://bitstorm.org/gameoflife/|Link to Applet]] | \\ \\ \\ \\ ====Belousov-Zhabotinsky Reaction==== | {{public:t-720-atai:250px-the_belousov-zhabotinsky_reaction.gif}} | | Simulated Belousov-Zhabotinsky Reaction. [[https://en.wikipedia.org/wiki/Belousov–Zhabotinsky_reaction|Source: Wikipedia]] | \\ \\ ====Belousov-Zhabotinsky Reaction==== | What it is | A chemical reaction discovered in the 1950s. | | Why it's important | A great visual example of the kind of emergent patterns that can be created through auto-catalysis (chemical, in this case). One of the first (the first?) scientifically published examples of emergence identified as such. | | Real version on Youtube | https://www.youtube.com/watch?v=IBa4kgXI4Cg \\ https://www.youtube.com/watch?v=3JAqrRnKFHo \\ https://www.youtube.com/watch?v=4y3uL5PRsZw&feature=related | \\ \\ ====How the Belousov-Zhabotinsky Reaction Works==== | {{public:t-720-atai:zhabotinsky-reaction-1.png?400|Belousov-Zhabotinsky Reaction}} | | A Belousov–Zhabotinsky reaction, or BZ reaction, is one of a class of reactions that serve as a classical example of non-equilibrium thermodynamics, resulting in the establishment of a nonlinear chemical oscillator. [[https://en.wikipedia.org/wiki/Belousov–Zhabotinsky_reaction|Wikipedia]] | \\ \\ \\ \\ ====Cellular Automata==== | What it is | An algorithmic way to program interaction between (large numbers of) rule-determined "agents" or cells. [[https://en.wikipedia.org/wiki/Cellular_automaton|Wikipedia]] | | Why it's important | A powerful method for exploring the concept of emergence. Also used for simulating the evolution of complex systems. | | Explicates | Interaction of rules. | | Typical manifestation | 1D or 2D grid with cell behavior governed by rules of interaction. Each cell has a scope of what it "sees" (its range of "causal ties"). | \\ \\ \\ \\ ====CA Example 1==== | {{public:t-720-atai:emergence-fig.jpg}} | | In this example | | **Green --> Brown IF one or more are //true://** \\ * There are more than 20 green patches around and lifetime exceeds 30 \\ * There are fewer than 12 green patches around and lifetime exceeds 20 \\ * The number of surrounding green patches > 25 \\ * Lifetime > 60 ticks \\ **Brown --> Green IF both are //true//:** \\ * Number of surrounding green patches > 8 and their combined lifetime > 80 \\ * Number of surrounding brown patches > 10 | \\ \\ \\ \\ ==== Stephen Wolfram's CA Work==== | CA | http://mathworld.wolfram.com/CellularAutomaton.html | | Book | A New Kind of Science. | | Why it's important | A major analysis of rules for 1-D CAs. The most comprehensive work on CAs to date. | | Rule 30 | [[https://en.wikipedia.org/wiki/Rule_30|Wikipedia]] |
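\\ \\ ====Example: Rule 30 in a Few Lines====
To make the CA mechanics concrete, here is a minimal sketch of Wolfram's Rule 30 (the neighborhood-to-bit encoding is the standard one; the grid width, step count, and wrap-around edges are arbitrary choices made for this illustration):
<code python>
# 1-D cellular automaton: each cell "sees" its left neighbor, itself, and
# its right neighbor; the 8 possible neighborhoods index the bits of the
# rule number (here 30 = 0b00011110).
RULE = 30
WIDTH = 64
STEPS = 32

def step(cells, rule=RULE):
    """One synchronous update of the whole row; edges wrap around."""
    n = len(cells)
    return [(rule >> (cells[(i - 1) % n] << 2 | cells[i] << 1
                      | cells[(i + 1) % n])) & 1
            for i in range(n)]

row = [0] * WIDTH
row[WIDTH // 2] = 1          # start from a single "on" cell

for _ in range(STEPS):
    print("".join("#" if c else "." for c in row))
    row = step(row)
</code>
Although the rule fits in a single byte, the printed pattern is famously complex and aperiodic - a vivid case of simple local rules producing global structure.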
\\ \\ \\ \\ ====How to Study HeLDs Scientifically==== | HeLDs | Heterogeneous, large, densely coupled systems. | | Reductionism | The method of isolating parts of a complex phenomenon or system in order to simplify and speed up our understanding of it. See also [[https://en.wikipedia.org/wiki/Reductionism|Reductionism]] on Wikipedia. | | Occam's Razor | Key principle of reductionism. See also [[https://en.wikipedia.org/wiki/Occam%27s_razor|Occam's Razor]]. | | \\ HeLD | Cannot be studied by the standard application of reductionism/Occam's Razor, because the emergent properties are lost. Instead, corollaries of the system - chosen so as to retain some commonality with the original system //in toto// - must be studied to gain insights into the target system. | | Agent & Environment | We try to characterize the agent and its task-environment as two interacting complex systems. If we keep the task-environment constant, the remaining system to study is the agent and its controller. | \\ \\ | {{public:t-720-atai:simple-system1.png}} | | How to tease apart HeLDs. | \\ \\ | {{public:t-720-atai:system-env-world-1.png}} | | Relationship between a system, its task-environment, and the world. Task-environments will always inherit the "laws" of the world; the world puts constraints on the state-space of the task-environment. | \\ \\ \\ \\ ====Constructionist Methodologies: Traditional CS Software Development Methods==== | What it is | A constructionist methodology requires an intelligent designer that manually (or via scripts) arranges selected //components// that together make up a //system of parts// (read: architecture) that can act in particular ways. Examples: automobiles, telephone networks, computers, operating systems, the Internet, mobile phones, apps, etc. | | Why it's important | Virtually all methodologies we have for creating software are of this kind. | | Fundamental CS methodology | On the theory side, for the most part mathematical methodologies (not natural science). On the practical side, hand-coding programs and manual invention and implementation of algorithms. Systems creation in CS is "co-owned" by the field of engineering. | | The main methodology/ies in CS | \\ Constructionist. | \\ \\ \\ \\ ====Constructionist AI==== | Constructionist AI | Methodology for building //cognitive agents// that relies primarily on constructionist methodologies. | | What it is | Refers to AI system development methodologies that require an intelligent designer -- the software programmer as "construction worker". | | Why it's important | All traditional software development methodologies, and by extension all traditional AI methodologies, are constructionist methodologies. | | \\ What it's good for | Works well for constructing //controllers// of Closed Problems where (a) the Solution Space can be defined fully or largely before the controller is constructed, (b) there exist clearly definable Goal hierarchies and measurements that, when used, fully implement the main purpose of the AI system, and ( c) the Task assigned to the controller will not change throughout its lifetime (i.e. the controller does not have to generate novel sub-Goals). | | Key Implementation Method | Hand-coding, using programming languages and methods created to be used by human-level intelligences. | \\ \\ \\ \\ ==== Example Constructionist Methodology: Subsumption Architecture ==== | Augmented Finite State Machines (AFSMs) | Finite State Machines, augmented with timers. | | Modules (FSMs) have internal state | The internal state includes: \\ * the clock \\ * the inputs (no history) \\ * the (current) output (no history) \\ * may include an "activation level" | | External environment consists of connections ("wires") | * Input \\ * Inhibitor \\ * Suppressor \\ * Reset \\ * Output | | Augmented Finite State Machine (AFSM) with connections | {{/public:t-720-atai:subsumption-arch-module-1.gif}} | | | Suppressor: Replaces the input to the module \\ Inhibitor: Stops the output for a given period \\ Reset: Initialization puts the module in its original state | | Augmentation | The finite state machines are augmented with timers. \\ The time period is fixed for each inhibitor (I) or reset (R) connection, per module. | | Timers | Timers enable modules to behave autonomously, based on (relative) time. | | The AFSMs are arranged in "layers" | Layers separate functional parts of the architecture from each other. | || {{/public:t-720-atai:subsumption-arch-2.jpg?700}} || || Example subsumption architecture with layers. ||
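\\ \\ ====Example: An AFSM Sketched in Code====
As a minimal sketch of the above mechanics (class, method, and variable names are illustrative assumptions, not Brooks' own notation), here is one AFSM whose output can be inhibited for a fixed period by a timer, whose input can be replaced through a suppressor wire, and which can be reset to its initial state:
<code python>
# One Augmented Finite State Machine (AFSM) in the spirit of the
# subsumption architecture: a behavior plus inhibitor/suppressor/reset wires.
import time

class AFSM:
    def __init__(self, behavior):
        self.behavior = behavior        # maps an input value to an output
        self.inhibited_until = 0.0      # timer: output blocked until this time
        self.suppressed_input = None    # pending replacement for the real input

    def inhibit(self, duration):
        """Inhibitor wire: block this module's output for `duration` seconds."""
        self.inhibited_until = time.monotonic() + duration

    def suppress(self, value):
        """Suppressor wire: replace the module's next input with `value`."""
        self.suppressed_input = value

    def reset(self):
        """Reset wire: put the module back in its original state."""
        self.inhibited_until = 0.0
        self.suppressed_input = None

    def tick(self, sensed):
        """One update cycle: produce an output unless currently inhibited."""
        if time.monotonic() < self.inhibited_until:
            return None                          # output inhibited
        value = self.suppressed_input if self.suppressed_input is not None else sensed
        self.suppressed_input = None             # suppression lasts one cycle
        return self.behavior(value)

motor = AFSM(lambda obstacle: "turn" if obstacle else "forward")
motor.inhibit(0.1)
print(motor.tick(False))   # None - output inhibited by the timer
time.sleep(0.15)
print(motor.tick(False))   # forward - the timer has expired
motor.suppress(True)
print(motor.tick(False))   # turn - the suppressor replaced the sensed input
</code>
In a full subsumption system, higher layers would drive the suppressor and inhibitor wires of lower layers - which is how "subsumption" of lower-level behavior is achieved.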
\\ \\ \\ \\ ====Key Limitations of Constructionist Methodologies==== | Static | System components are fairly static: manual construction limits the complexity that can be built into each component. | | Size | The sheer number of components that can form a single architecture is limited by what a designer or team can handle. | | Scaling | The components and their interconnections in the architecture are managed by algorithms that are themselves hand-crafted, and thus also of limited flexibility. | | Result | Together these three problems remove any hope of autonomous architectural adaptation and system growth. | | Conclusion | In the context of artificial intelligences that can handle highly **//novel//** Tasks, Problems, Situations, Environments and Worlds, no constructionist methodology will suffice. | | Key Problem | Reliance on hand-coding, using programming methods that require human-level intelligence; as a result, the system cannot program itself. \\ Another way to say it: a strong requirement for an outside designer. | | Contrast with | Constructivist AI | \\ \\ \\ \\ ====Constructivist AI Methodology (CAIM) ==== | What it is | A term for labeling a methodology for AGI based on two main assumptions: (1) the way knowledge is acquired by systems with general intelligence requires the automatic integration, management, and revision of data in a way that //infuses meaning// into information structures, and (2) constructionist approaches do not sufficiently address this or other issues of key importance for systems with high levels of general intelligence and existential autonomy. || | \\ Why We Need It | Most AI methodology to date has automatically inherited all standard software methodological principles. This approach assumes that software architectures are hand-coded and that (the majority of) the system's knowledge and skills are hand-fed to it. In sharp contrast, CAIM assumes that the system acquires the vast majority of its knowledge on its own (except for a small seed) and manages its own growth. It may also change its own architecture over time, as a result of experience and learning. || | Why it's important | It is the first and only attempt so far at explicitly proposing an alternative to the current methodologies and prevailing paradigm used throughout AI and computer science. || | What it's good for | Replacing present methods in AI, by and large, as these will not suffice for addressing the full scope of the phenomenon of intelligence, as seen in nature. || | What It Must Do | We are looking for more than a linear increase in the power of our systems to operate reliably in a variety of (unforeseen, novel) circumstances. The methodology should help meet that requirement. || | Basic tenet | That an AGI must be able to handle //new// Problems in //new// Task-Environments; that to do so it must be able to create //new// knowledge with //new// Goals (and sub-goals); that its architecture must therefore support the automatic generation of //meaning//; and that constructionist methodologies do not support the creation of such system architectures. ||
| Roots | Piaget | proposed the //constructivist// view of human knowledge acquisition, which states (roughly speaking) that cognitive agents (e.g. humans) generate their own knowledge through experience. | | | von Glasersfeld | "...‘empirical teleology’ ... is based on the empirical fact that human subjects abstract ‘efficient’ causal connections from their experience and formulate them as rules which can be projected into the future." [[http://www.univie.ac.at/constructivism/EvG/papers/225.pdf|REF]] | | Architectures built using CAIM | AERA | Autocatalytic, Endogenous, Reflective Architecture [[http://cadia.ru.is/wiki/_media/public:publications:aera-rutr-scs13002.pdf|REF]] \\ Built before CAIM was explicitly articulated, but based on many of the assumptions consolidated in CAIM; CAIM was developed in tandem with this architecture/architectural blueprint. | | | NARS | Non-Axiomatic Reasoning System [[https://sites.google.com/site/narswang/|REF]] \\ //“If the existing domain-specific AI techniques are seen as tools, each of which is designed to solve a special problem, then to get a general-purpose intelligent system, it is not enough to put these tools into a toolbox. What we need here is a hand. To build an integrated system that is self-consistent, it is crucial to build the system around a general and flexible core, as the hand that uses the tools [assuming] different forms and shapes.”// -- Wang, 2004 | | Limitations | As this is a young methodology, very little hard data is available about its effectiveness. What does exist, however, is more promising for achieving AGI than what constructionist methodologies have delivered. || \\ \\ \\ \\ ==== Constructivist AI ==== | Foundation | Constructivist AI is concerned with the operational characteristics that the system we aim to build – the AGI architecture – must have. | | \\ \\ Behavioral Characteristics | Refer back to the requirements for AGI systems; it must be able to: \\ - handle novel task-environments. \\ - handle a wide range of task-environments (in the same system), and be able to switch between and mix-and-match them. \\ - transfer knowledge between task-environments. \\ - perform reasoning: induction, deduction and abduction. \\ - handle real-time, dynamic worlds. \\ - introspect. \\ - .... and more. | | Constructivist AI: No particular architecture | Constructivist AI does not rest on, and does not need to rest on, assumptions about the particular //kind of architecture// that exists in the human and animal mind. We assume that many kinds of architectures can achieve the above AGI requirements. | \\ ====Architectural Principles of AGI Systems / CAIM==== | Self-Construction | It is assumed that a system must amass the vast majority of its knowledge autonomously. This is partly due to the fact that it is (practically) impossible for any human or team(s) of humans to construct by hand the knowledge needed for an AGI system, and even if this were possible it would still leave unanswered the question of how the system will acquire knowledge of truly novel things, which we consider a fundamental requirement for a system to be called an AGI system. | | Baby Machines | To some extent an AGI capable of growing throughout its lifetime will be what may be called a "baby machine", because relative to later stages in life, such a machine will initially seem "baby-like".
\\ While the mechanisms constituting an autonomously learning baby machine may not be complex compared to a "fully grown" cognitive system, they are nevertheless likely to result in a system that will seem large in comparison to the AI systems built today, though this perceived size may stem from the complexity of the mechanisms and their interactions, rather than from the sheer number of lines of code. | | Semiotic Opaqueness | No communication between two agents / components in a system can take place unless they share a common language, or encoding-decoding principles. Without this they are semantically opaque to each other. Without communication, no coordination can take place. | | Systems Engineering | Due to the complexity of building a large system (picture, e.g., an airplane), clear and concise bookkeeping of each part, and which parts it interacts with, must be kept so as to ensure the holistic operation of the resulting system. In a (cognitively) growing system in a dynamic world, where the system is auto-generating models of the phenomena that it sees - each of which must be tightly integrated yet easily manipulable and clearly separable - the system must itself ensure the semiotic transparency of its constituent parts. This can only be achieved by automatic mechanisms residing in the system itself; it cannot be ensured manually by a human engineer, or even a large team of them. | | Self-Modeling | Cognitive growth, in which the cognitive functions themselves improve with training, can only be supported by a self-modifying mechanism based on self-modeling. If there is no model of self there can be no targeted improvement of existing mechanisms. | | Self-Programming | The system must be able to invent, inspect, compare, integrate, and evaluate architectural structures, in part or in whole. | | Pan-Architectural Pattern Matching | To enable autonomous //holistic integration// the architecture must be capable of comparing (copies of) itself to parts of itself, in part or in whole, whether the comparison concerns structure, the effects of time, or some other aspect or characteristic of the architecture. To decide, for instance, if a new attention mechanism is better than the old one, various forms of comparison must be possible. | | The "Golden Screw" | An architecture meeting all of the above principles is not likely to be "based on a key principle" or even two -- it is very likely to involve a whole set of //new// and fundamentally foreign principles that make its realization possible! | \\ \\ ==== Some Key Requirements For a Constructivist AGI Architecture ==== | \\ Tight Integration | A general-purpose system must tightly and finely coordinate a host of skills, including their acquisition, transitions between skills at runtime, how to combine two or more skills, and transfer of learning between them over time at many levels of temporal and topical detail. | | \\ Holistic Integration | The architecture of an AGI cannot be developed in a way where each of the key requirements (see above) is addressed in isolation, or semi-isolation, due to the resulting system's whole-part semiotic opaqueness: When a system learns something new, it must relate the new knowledge to its old knowledge - to see whether it has learned it before, and to use it to improve its understanding - something we call **//integration//**.
The same mechanisms needed for integration also enable knowledge transfer; it is these same mechanisms that (in humans) are responsible for what is known as "negative transfer of training", where a previously learned skill makes it //harder// to learn something new (this happens in humans when the new task is //almost// like the old one, but deviates on some points). The more critical these points are to mastering the skill, the worse the negative transfer of training. | | Transversal Functions | The system must have pan-architectural characteristics that enable it to operate consistently as a whole, to be highly adaptive (yet robust) in its own operation across the board, including metacognitive abilities. Some functions likely to be needed to achieve this include attention, learning, analogy-making capabilities, and self-inspection. | | \\ \\ Time | Ignoring (general) temporal constraints is not an option if we want AGI. Move over Turing! Time is a semantic property, and the system must be able to understand – and be able to //learn to understand// – time as a real-world phenomenon in relation to its own skills and architectural operation. Time is everywhere, and is different from other resources in that there is a global clock which cannot, for many task-environments, be turned backwards. Energy must also be addressed, but may not be as fundamentally detrimental to ignore as time, while we are in the early stages of exploring methods for developing auto-catalytic knowledge acquisition and cognitive growth mechanisms. | | \\ Architecture Based on New Principles | An architecture that is considerably more complex than the systems being built in most AI labs today is likely unavoidable. In a complex architecture the issue of concurrency of processes must be addressed, a problem that has not yet been sufficiently resolved in present software and hardware. This scaling problem cannot be addressed by the usual “we’ll wait for Moore’s law to catch up” because the issue does not primarily revolve around //speed of execution// but rather around the //nature of the architectural principles of the system and their runtime operation//. | | Predictable Robustness | The system must be robust in light of all kinds of task-environment and embodiment perturbations; otherwise no reliable plans can be made, and thus no reliable execution of tasks can ever be reached, no matter how powerful the learning capacity. | | \\ Graceful Degradation | Part of the robustness requirement is that the system be constructed in such a way as to minimize the potential for catastrophic failure. A programmer can forget to delimit a command in a compiled program and the whole application crashes; this kind of brittleness is not an option for cognitive systems that operate in stochastic environments, where perturbations can come in any form at any time. | | Time is Integrated | Time must be a tightly integrated phenomenon in any AGI architecture - managing and understanding time cannot be "retrofitted" to a complex architecture! | \\ \\ ====What Are Methodologies Good For?==== | Applying a Methodology | results in a family of architectures: The methodology "allows" ("sets the stage") for what //should// and //may// be included when we design our architecture. The methodology is the "tool for thinking" about a design space. (Contrast with requirements, which describe the goals and constraints (negative goals).) | | Following a Methodology | results in a particular //architecture//. | | CAIM Relies on Models | CAIM takes seriously Conant & Ashby's proof that every good regulator of a system must be a model of that system (the Good Regulator Theorem), putting //models// at its center. \\ This stance was prevalent in the early days of AI (the first two decades) but fell into disfavor due to behaviorism (in psychology and AI). | | Example | The Autocatalytic Endogenous Reflective Architecture - AERA - is the only architecture to result directly from the application of CAIM. It is //model-based// and //model-driven// (in an event-driven way: a model's left-hand term is matched against the current situation to determine its relevance at any point in time; when it matches, the model's right-hand term is injected into memory - more on this below). | | In Other Words | AERA models are a way to represent knowledge. \\ But what are models, really, and what might they look like in this context? |
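\\ \\ ====Example: Event-Driven Model Matching, Sketched====
To make the left-hand/right-hand matching just described more concrete, here is a minimal sketch (the data structures and names are illustrative assumptions; AERA itself is implemented in its own executable model language, not Python): each model pairs a left-hand pattern with a right-hand prediction, and on every cycle, models whose left-hand side matches a fact in memory inject their right-hand side into memory.
<code python>
# Event-driven, model-based matching in the spirit (only) of AERA's
# left-hand/right-hand model scheme described above.
from dataclasses import dataclass

@dataclass
class Model:
    lhs: dict   # left-hand term: pattern the current situation must match
    rhs: dict   # right-hand term: prediction injected into memory on a match

def matches(pattern, fact):
    """A fact matches a pattern if it agrees on every key the pattern names."""
    return all(fact.get(k) == v for k, v in pattern.items())

def cycle(models, memory):
    """One cycle: every model whose lhs matches some fact injects its rhs."""
    new_facts = [m.rhs for m in models
                 for fact in memory if matches(m.lhs, fact)]
    memory.extend(new_facts)
    return new_facts

# A toy model: an unsupported cup is predicted to fall.
drop = Model(lhs={"object": "cup", "supported": False},
             rhs={"object": "cup", "event": "falls"})

memory = [{"object": "cup", "supported": False}]
print(cycle([drop], memory))   # -> [{'object': 'cup', 'event': 'falls'}]
</code>
The point of the sketch is the control flow: nothing "calls" a model; models become active purely because their left-hand terms match the current contents of memory, which is what makes the scheme event-driven.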
\\ \\ \\ \\ ------------ 2019(c)K. R. Thórisson //EOF//