

T-720-ATAI-2018 Main
Links to Lecture Notes

T-720-ATAI-2018

Lecture Notes, W6: AI Methodologies





Methodology

What it is The methods we use to study a phenomenon.
Why it's important Methodology directly influences the shape of our solution - our answers to scientific questions. For this reason, and equally importantly, it directly determines the speed with which we can make progress.
The main AI methodology AI never really had a proper methodology discussion as part of its mainstream scientific discourse. Only two or three approaches to AI can properly be called 'methodologies': BDI (belief, desire, intention), subsumption, and decision theory. As a result, AI inherited run-of-the-mill CS methodologies by default.
Constructionist AI Methods used to build AI systems by hand.





Software Architectures

What it is In CS: the organization of the software that implements a system.
In AI: The total system that has direct and independent control of the behavior of an Agent via its sensors and effectors.
Why it's important The system architecture determines what kind of information processing can be done, and what the system as a whole is capable of in a particular Task-Environment.
Key concepts process types; process initiation; information storage; information flow.
Graph representation Common way to represent processes as nodes, information flow as edges (see the sketch below).
Relation to AI The term “system” includes not only the processing components, the functions these implement, their input and output, and their relationships, but also temporal aspects of the system's behavior as a whole. This is important in AI because any controller of an Agent is supposed to control it in such a way that its behavior can be classified as “intelligent”. But what are the necessary and sufficient components of that behavior set?
Intelligence is in part a systemic phenomenon Thought experiment: Take any system we deem intelligent, e.g. a 10-year old human, and isolate any of his/her skills and features. A machine that implements any single one of these is unlikely to seem worthy of being called “intelligent” (viz chess programs), without further qualification (e.g. “a limited expert in a sub-field”).
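
Below is a minimal sketch in Python of the graph representation mentioned above: processes as nodes, information flow as directed edges. The process names are hypothetical, chosen only for illustration.

<code python>
# Minimal sketch: a software architecture as a directed graph.
# Nodes = processes, directed edges = information flow between them.
from collections import defaultdict


class ArchitectureGraph:
    def __init__(self):
        # Maps each process to the processes it sends information to.
        self.edges = defaultdict(list)

    def add_flow(self, sender, receiver):
        """Record that `sender` sends information to `receiver`."""
        self.edges[sender].append(receiver)

    def downstream(self, process):
        """Which processes receive information directly from `process`?"""
        return list(self.edges[process])


# Hypothetical example: a simple perceive-decide-act agent with a shared memory.
g = ArchitectureGraph()
g.add_flow("perception", "decision")
g.add_flow("perception", "memory")
g.add_flow("memory", "decision")
g.add_flow("decision", "actuation")
print(g.downstream("perception"))   # ['decision', 'memory']
</code>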





CS Architecture Building Blocks

Pipes & filters Extension of functions.
Component: Each component has a set of inputs and a set of outputs. A component reads streams of data on its inputs and produces streams of data on its outputs, delivering a complete instance of the result in a standard order.
Pipes: Connectors in a system of such components transmit outputs of one filter to inputs of others (see the sketch after this list).
Object-orientation Abstract compound data types with associated operations.
Event-based invocation Pre-defined event types trigger particular computation sequences in pre-defined ways.
Layered systems System is deliberately separated into layers, a layer being a grouping of one or more sub-functions.
Hierarchical systems System is deliberately organized into a hierarchy, where the position in the hierarchy represents one or more important (key system) parameters.
Blackboards System employs a common data store, accessible by more than a single sub-process of the system (often all).
Hybrid architectures Take two or more of the above and mix together to suit your tastes.
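
As a concrete illustration of the pipes-and-filters style above, here is a minimal Python sketch (the filters are hypothetical, for illustration only): each filter consumes a stream of items and produces a new stream, and a pipe simply connects one filter's output to the next filter's input.

<code python>
# Minimal pipes-and-filters sketch: each filter reads a stream on its input
# and yields a stream on its output; pipes connect one filter's output stream
# to the next filter's input stream.

def tokenize(lines):
    """Filter 1: split each incoming line into words."""
    for line in lines:
        for word in line.split():
            yield word

def lowercase(words):
    """Filter 2: normalize case."""
    for word in words:
        yield word.lower()

def deduplicate(words):
    """Filter 3: drop repeated words, preserving order of first occurrence."""
    seen = set()
    for word in words:
        if word not in seen:
            seen.add(word)
            yield word

# The "pipes": output of one filter feeds the input of the next.
source = ["The quick brown fox", "the lazy dog"]
pipeline = deduplicate(lowercase(tokenize(source)))
print(list(pipeline))   # ['the', 'quick', 'brown', 'fox', 'lazy', 'dog']
</code>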



Basic Network Topologies

Point-to-Point Dedicated connection between nodes, shared only by a node at each end.
Bus A message medium, shared by all nodes (or a subset).
Star A central node serves as a conduit, forwarding messages to the others; the full structure of nodes forms a kind of star.
Ring Each node is connected to exactly two other nodes, forming a closed loop.
Mesh All nodes are connected to all other nodes (fully connected graph); can be relaxed to partially-connected graph.
Tree Node connections form a hierarchical tree structure.
Pub-Sub In publish-subscribe architectures one or more “post offices” receive subscription requests from nodes for certain kinds of information and forward matching messages published by other nodes (see the sketch below).
reference Network topology on Wikipedia
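
A minimal publish-subscribe sketch in Python (the “post office” API below is hypothetical, for illustration only): nodes register interest in a topic with the post office, and published messages are forwarded to all registered subscribers.

<code python>
# Minimal publish-subscribe sketch: a "post office" (broker) records which
# callbacks subscribed to which topic and forwards published messages to them.
from collections import defaultdict


class PostOffice:
    def __init__(self):
        self.subscribers = defaultdict(list)   # topic -> list of callbacks

    def subscribe(self, topic, callback):
        """A node asks to receive all messages published on `topic`."""
        self.subscribers[topic].append(callback)

    def publish(self, topic, message):
        """Forward `message` to every subscriber of `topic`."""
        for callback in self.subscribers[topic]:
            callback(message)


# Hypothetical example: two nodes subscribe to "temperature"; a sensor publishes.
office = PostOffice()
office.subscribe("temperature", lambda m: print("logger received:", m))
office.subscribe("temperature", lambda m: print("display received:", m))
office.publish("temperature", 21.5)
</code>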





Types of Coordination Hierarchies

A functional hierarchy organizes the execution of tasks according to their functions. A product hierarchy organizes production into small units, each focused on a particular product.
Several types of markets exist - here two idealized versions are shown, without and with brokers. De-centralized markets require more intelligence to be present in the nodes, which can be alleviated by brokers. Brokers, however, present weak points in the system: if a system has only two brokers mediating between producers and consumers/buyers, failure at these two points will render the system useless.
Notice that in a basic program written in C++ every single character is such a potential point of failure, which is why bugs are so common in standard software.





Constructionist Methodologies: Traditional CS Software Development Methods

What it is A constructionist methodology requires an intelligent designer that manually (or via scripts) arranges selected components that together make up a system of parts (read: architecture) that can act in particular ways. Examples: automobiles, telephone networks, computers, operating systems, the Internet, mobile phones, apps, etc.
Why it's important Virtually all methodologies we have for creating software are of this kind.
Fundamental CS methodology On the theory side, for the most part mathematical methodologies (not natural science). On the practical side, hand-coding programs and manual invention and implementation of algorithms. Systems creation in CS is “co-owned” by the field of engineering.
The main methodology/ies in CS Constructionist.





Constructionist AI

Constructionist AI Methodology for building cognitive agents that relies primarily on Constructionist methodologies.
What it is Refers to AI system development methodologies that require an intelligent designer – the software programmer as “construction worker”.
Why it's important All traditional software development methodologies, and by extension all traditional AI methodologies, are constructionist methodologies.
What it's good for Works well for constructing controllers of Closed Problems where (a) the Solution Space can be defined fully or largely before the controller is constructed, (b) there are clearly definable Goal hierarchies and measurements that, when used, fully implement the main purpose of the AI system, and (c) the Task assigned to the controller will not change throughout its lifetime (i.e. the controller does not have to generate novel sub-Goals).
Key Implementation Method Hand-coding, using programming languages and methods created to be used by human-level intelligences.





Subsumption Architecture Basics

Augmented Finite State Machines (AFSMs) Finite State Machines, augmented with timers.
Modules (FSMs) have internal state The internal state includes:
* the clock
* the inputs (no history)
* the (current) output (no history)
* may include “activation level”
External environment consists of connections (“wires”): * Input
* Inhibitor
* Suppressor
* Reset
* Output
Augmented Finite State Machine (AFSM) with connections
Suppressor: Replaces the input to the module.
Inhibitor: Stops the output for a given period.
Reset: Puts the module back in its original (initial) state.
Augmentation The finite state machines are augmented with timers.
The time period is fixed for each I or R, per module.
Timers Timers enable modules to behave autonomously, based on (relative) time (see the code sketch below).
The AFSMs are arranged in “layers” Layers separate functional parts of the architecture from each other
[Figure: subsumption-arch-2.jpg] Example subsumption architecture with layers.
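
Below is a minimal Python sketch of a single AFSM, under the simplifying assumptions stated in the comments (the module names and the synchronous stepping are hypothetical, for illustration only). It shows the internal state listed above (clock, current input and output, no history) and the effect of the suppressor, inhibitor, and reset wires.

<code python>
# Minimal sketch of a single Augmented Finite State Machine (AFSM).
# Assumptions (for illustration only): one synchronous tick per step() call,
# a transfer function maps the current input to the current output, and the
# inhibition period is fixed per module, as described above.

class AFSM:
    def __init__(self, transfer, inhibit_period=0):
        self.transfer = transfer               # current input -> current output
        self.inhibit_period = inhibit_period   # fixed time, per module
        self.clock = 0                         # internal state: the clock
        self.inhibited_until = -1
        self.input = None                      # current input only (no history)
        self.output = None                     # current output only (no history)

    def suppress(self, value):
        """Suppressor wire: replaces the module's input."""
        self.input = value

    def inhibit(self):
        """Inhibitor wire: stops the output for the module's fixed period."""
        self.inhibited_until = self.clock + self.inhibit_period

    def reset(self):
        """Reset wire: puts the module back in its original state."""
        self.clock, self.input, self.output = 0, None, None
        self.inhibited_until = -1

    def step(self, sensed):
        """One tick: read input (unless suppressed), compute output (unless inhibited)."""
        self.clock += 1
        if self.input is None:                 # no suppressor fired this tick
            self.input = sensed
        if self.clock <= self.inhibited_until:
            self.output = None                 # output inhibited
        else:
            self.output = self.transfer(self.input)
        self.input = None                      # no history is kept
        return self.output


# Hypothetical example: a lower-layer "motor" module, manipulated by a higher layer.
motor = AFSM(transfer=lambda x: ("move", x), inhibit_period=2)
print(motor.step("forward"))    # ('move', 'forward')   normal operation
motor.suppress("turn-left")     # higher layer replaces the sensed input
print(motor.step("forward"))    # ('move', 'turn-left')
motor.inhibit()                 # higher layer blocks output for 2 ticks
print(motor.step("forward"))    # None
</code>

In a full subsumption architecture many such modules run in parallel, arranged in layers, with higher layers suppressing or inhibiting lower ones.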





Key Limitations of Constructionist Architectures (Manual Construction/Coding)

Static System components that are fairly static. Manual construction limits the complexity that can be built into each component.
Size The sheer number of components that can form a single architecture is limited by what a designer or team can handle.
Scaling The components and their interconnections in the architecture are managed by algorithms that are hand-crafted themselves, and thus also of limited flexibility.
Result Together these three problems remove hopes of autonomous architectural adaptation and system growth.
Conclusion In the context of artificial intelligences that can handle highly novel Tasks, Problems, Situations, Environments and Worlds, no constructionist methodology will suffice.
Key Problem Reliance on hand-coding using programming methods that require human-level intelligence; as a result, the system cannot program itself.
Another way to say it: Strong requirement of an outside designer.
Contrast with Constructivist AI





Constructivist AI Methodology (CAIM)

What it is A term for labeling a methodology for AGI based on two main assumptions: (1) The way knowledge is acquired by systems with general intelligence requires the automatic integration, management, and revision of data in a way that infuses meaning into information structures, and (2) constructionist approaches do not sufficiently address this, and other issues of key importance for systems with high levels of general intelligence and existential autonomy.
Why it's important It is the first and only attempt so far at explicitly proposing an alternative to the current methodologies and prevailing paradigm used throughout AI and computer science.
What it's good for Replacing present methods in AI, by and large, as these will not suffice for addressing the full scope of the phenomenon of intelligence, as seen in nature.
What It Must Do We are looking for more than a linear increase in the power of our systems to operate reliably in a variety of (unforeseen, novel) circumstances.
Basic tenet That an AGI must be able to handle new Problems in new Task-Environments; to do so it must be able to create new knowledge and new Goals (and sub-goals); to do so its architecture must support automatic generation of meaning; and that constructionist methodologies do not support the creation of such system architectures.
Roots Piaget proposed the constructivist view of human knowledge acquisition, which states (roughly speaking) that cognitive agents (e.g. humans) generate their own knowledge through experience.
von Glasersfeld “…‘empirical teleology’ … is based on the empirical fact that human subjects abstract ‘efficient’ causal connections from their experience and formulate them as rules which can be projected into the future.” REF
Architectures built using CAIM AERA (Autocatalytic Endogenous Reflective Architecture) REF: CAIM was developed in tandem with this architecture/architectural blueprint.
NARS (Non-Axiomatic Reasoning System) REF: built before CAIM emerged, but based on many of the assumptions consolidated in CAIM.
Limitations As a young methodology, very little hard data is available as to its effectiveness. What does exist, however, is more promising than for constructionist methodologies with respect to achieving AGI.





Constructivist AI

Foundation Constructivist AI is concerned with the operational characteristics that the system we aim to build – the architecture – must have.
Operational characteristics Refer back to the requirements for AGI systems; it must be able to:
- handle novel task-environments.
- handle a wide range of task-environments (in the same system), and be able to switch between / mix and match them.
- transfer knowledge between task-environments.
- perform reasoning: induction, deduction and abduction.
- handle realtime, dynamic worlds.
- introspect.
- …. and more.
Constructivist AI: No particular architecture Constructivist AI does not rest on, and does not need to rest on, assumptions about the particular kind of architecture that exists in the human and animal mind. We assume that many kinds of architectures can achieve the above AGI requirements.



Examples of Task-Environments Targeted by Constructivist AI

Diversity Earth offers great diversity. This is in large part why intelligence is even needed at all.
Desert
Ocean floor
Air
Interplanetary travel
The Same System at the Same Time These task-environments should be handled by a single system at a single period in time, without intervention from its designers.
Baby Machines While the mechanisms constituting an autonomous learning “baby” machine may not be complex compared to a “fully grown” cognitive system, they are likely to result in systems that will nevertheless seem large in comparison to the AI systems built today; this perceived size may stem from the complexity of the mechanisms and their interactions rather than from the sheer number of lines of code.



Architectural Requirements Supporting a Constructivist Approach

Tight integration A general-purpose system must tightly and finely coordinate a host of skills, including their acquisition, transitions between skills at runtime, how to combine two or more skills, and transfer of learning between them over time at many levels of temporal and topical detail.
Transversal functions The system must have pan-architectural characteristics that enable it to operate consistently as a whole, to be highly adaptive (yet robust) in its own operation across the board, including metacognitive abilities. Some functions likely to be needed to achieve this include attention, learning, analogy-making capabilities, and self-inspection.
Time Ignoring (general) temporal constraints is not an option if we want AGI. Move over Turing! Time is a semantic property, and the system must be able to understand – and be able to learn to understand – time as a real-world phenomenon in relation to its own skills and architectural operation. Time is everywhere, and is different from other resources in that there is a global clock which cannot, for many task-environments, be turned backwards. Energy must also be addressed, but may not be as fundamentally detrimental to ignore as time while we are in the early stages of exploring methods for developing auto-catalytic knowledge acquisition and cognitive growth mechanisms.
Large architecture An architecture that is considerably larger and more complex than systems being built in AI labs today is likely unavoidable. In a complex architecture the issue of concurrency of processes must be addressed, a problem that has not yet been sufficiently resolved in present software and hardware. This scaling problem cannot be addressed by the usual “we’ll wait for Moore’s law to catch up” because the issue does not primarily revolve around speed of execution but around the nature of the architectural principles of the system and their runtime operation.
Predictable Robustness The system must have a robustness in light of all kinds of task-environment and embodiment perturbations, otherwise no reliable plans can be made, and thus no reliable execution of tasks can ever be reached, no matter how powerful the learning capacity.
Graceful Degradation Part of the robustness requirement is that the system be constructed in such a way as to minimize the potential for catastrophic failure. A programmer can forget to delimit a command in a compiled program and the whole application crashes; this kind of brittleness is not an option for cognitive systems that operate in stochastic environments, where perturbations can come in any form at any time.
Time is Integrated Time must be a tightly integrated phenomenon in any AGI architecture - managing and understanding time cannot be “retrofitted” to a complex architecture!



Architectural Principles of AGI Systems / CAIM

Self-Construction It is assumed that a system must amass the vast majority of its knowledge autonomously. This is partly due to the fact that it is (practically) impossible for any human or team(s) of humans to construct by hand the knowledge needed for an AGI system, and even if this were possible it would still leave unanswered the question of how the system will acquire knowledge of truly novel things, which we consider a fundamental requirement for a system to be called an AGI system.
Holistic Integration The architecture of an AGI cannot be developed in a way where each of the key requirements (see above) is addressed in isolation, or semi-isolation, due to the resulting system's whole-part semiotic opaqueness: When a system learns something new, to see whether it has learned it before, and to use it to improve its understanding, it must relate the new knowledge to its old knowledge, something we call integration. The same mechanisms needed for integration also enable transfer of knowledge; it is these same mechanisms that (in humans) are responsible for what is known as “negative transfer of training”, where a previously learned skill makes it harder to learn something new (this happens in humans when the new task is almost like the old one, but deviates on some points; the more critical these points are for mastering the skill, the worse the negative transfer of training).
Semiotic Opaqueness No communication between two agents / components in a system can take place unless they share a common language, or encoding-decoding principles. Without this they are semantically opaque to each other. Without communication, no coordination can take place.
Systems Engineering Due to the complexity of building a large system (picture, e.g., an airplane), clear and concise bookkeeping of each part, and of which parts it interacts with, must be kept so as to ensure the holistic operation of the resulting system. In a (cognitively) growing system in a dynamic world, where the system auto-generates models of the phenomena that it sees, each of which must be tightly integrated yet easily manipulatable and clearly separable, the system must itself ensure the semiotic transparency of its constituent parts. This can only be achieved by automatic mechanisms residing in the system itself; it cannot be ensured manually by a human engineer, or even a large team of them.
Self-Modeling Cognitive growth, in which the cognitive functions themselves improve with training, can only be supported by a self-modifying mechanism based on self-modeling: if there is no model of self, there can be no targeted improvement of existing mechanisms.
Autonomous Architecting & Implementation The system must be able to invent, inspect, compare, integrate, and evaluate architectural structures, in part or in whole.
Pan-Architectural Pattern Matching To enable autonomous holistic integration the architecture must be capable of comparing (copies of) itself to parts of itself, in part or in whole, whether the comparison is contrasting structure, the effects of time, or some other aspect or characteristics of the architecture. To decide, for instance, if a new attention mechanism is better than the old one, various forms of comparison must be possible.
The “Golden Screw” An architecture meeting all of the above principles is not likely to be “based on a key principle” or even two – it is very likely to involve a whole set of new and fundamentally foreign principles that make their realization possible!




2018©K. R. Thórisson

EOF

