public:t_720_atai:atai-18:lecture_notes_knowledge_representations (edited 2018/09/26 10:47 by thorisson; current version 2024/04/29 13:33)
  
=====T-720-ATAI-2018=====
====Lecture Notes, W9: Knowledge Representation & Meaning====
\\
\\
|  Attention  | The management of processing, memory, and sensory resources.   ||
|  Meta-Cognition  | The ability of a system to reason about itself.  ||
|  \\ Understanding  | The phenomenon of "understanding" has been neglected in AI and AGI. Modern AI systems do not //understand//. \\ Yet the concept seems crucial when talking about human intelligence; the concept holds explanatory power - we do not assign responsibility for a task to someone or something with a demonstrated lack of understanding of that task. Moreover, the level of understanding can be evaluated. \\ Understanding of a particular phenomenon <m>phi</m> is the potential to perform actions and answer questions with respect to <m>phi</m>. Example: Is an automobile heavier or lighter than a human?   ||
|  | \\ Explanation  | When performed by an agent, the ability to transform knowledge about X from a formulation primarily (or only) good for execution with respect to X into a formulation good for being communicated (typically involving some form of linearization and incremental introduction of concepts and issues, in light of an intended receiving agent with particular a-priori knowledge). \\ Is it possible to explain something that you don't understand?  |
|  \\ Learning  | Acquisition of information in a form that enables more successful completion of tasks. We call information in such a form "knowledge" or "practical knowledge". (There is also the concept of "impractical knowledge" - what some dismiss as "useless trivia" - which seems useless for anything but can in fact turn out to be useful at any point, for instance to wow others with one's knowledge of trivia.)   ||
\\
\\
====So, What Are Models?====

|  Model  | A model of something is an information structure that behaves in some ways like the thing being modeled. \\ 'Model' here means exactly the same as the word in the vernacular - look up any dictionary definition and that is what it means. A model of //something// is not the thing itself; it is in some way a 'mirror image' of it, typically with some unimportant details removed, and represented in a way that allows various //manipulations// for the purpose of //making predictions// (//answering questions//), where the form of allowed manipulations is particular to the representation of the model and the questions to be answered.  |
|  Example  | A model of Earth sits on a shelf in my daughter's room. With it I can answer questions about the gross layout of continents, and the names assigned to various regions as they were around 1977 (because that's when I got it for my confirmation :-) ). A model requires a //process for using// it; in this example that process is humans who can read and manipulate smallish objects.  |
|  Computational Models  | A typical type of question to be answered with computational (mathematical) models is the what-if question, and a typical method of manipulation is running simulations (producing deductions). Along with this we need the appropriate computational machine.  |
|  \\ Model (again)  | A 'model' in this conception has a target phenomenon that it applies to, and it has a form of representation, a comprehensiveness, and a level of detail; these are the primary features that determine what a model is good for. A computational model of the world in raw machine-readable form is not very efficient for quickly identifying all the countries adjacent to Switzerland - for that a traditional globe is much better.  |
|  Model Acquisition  | The ability to create models of (observed) phenomena.  |
  
\\
\\
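The idea that a model is an information structure supporting //manipulations// for //making predictions// (answering what-if questions) can be made concrete with a small sketch. This is an illustration only, not from the notes; the free-fall dynamics, function name, and step size are this example's assumptions.

```python
# A tiny "model" of free fall: an information structure (the update rules)
# that behaves in some ways like the phenomenon being modeled. The allowed
# manipulation is stepping a simulation forward; the product is a prediction.
# Illustrative only - names and dynamics are this sketch's assumptions.

def fall_time(height_m: float, g: float = 9.81, dt: float = 0.001) -> float:
    """Answer a what-if question: how long until an object dropped from
    height_m meters hits the ground (ignoring air resistance)?"""
    t, v, y = 0.0, 0.0, height_m
    while y > 0:
        v += g * dt        # modeled dynamics: gravity accelerates the object
        y -= v * dt        # position update
        t += dt            # the simulation 'runs' the model forward in time
    return t

# "If I drop a ball from 20 m, when does it land?"
print(round(fall_time(20.0), 2))   # prints 2.02
```

Note that the globe on the shelf and this function are models of the same kind: both mirror a phenomenon well enough to answer a restricted class of questions, and each needs a process (a person, an interpreter) to use it.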
====System & Architectural Requirements for Using Models====

|  Effectiveness  | Creation of models must be effective - otherwise a system will spend too much time creating useless or bad models. \\ Making model creation effective may require e.g. parallelizing the execution of operations on models.  |
|  Efficiency  | Operations on models, such as those listed above, must be efficient, lest they interfere with the normal operation of the system / agent. \\ One way to achieve temporal efficiency is to parallelize their execution and keep the operations simple.  |
|  Scalability  | For any moderately interesting / complex environment, a vast number of models may be entertained and considered at any point in time, and thus a large set of //potential// models must be manipulatable by the system / agent.  |
  
\\
\\
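One way to meet the efficiency and scalability requirements is to evaluate many candidate models concurrently. A minimal sketch, illustrative only: the candidate models, observation data, and scoring rule are invented for this example.

```python
from concurrent.futures import ThreadPoolExecutor

# A "model" here is just a candidate prediction function; its score is how
# well it retrodicts observed data (lower squared error = better).
observations = [(1, 2.1), (2, 3.9), (3, 6.2)]   # (input, observed output)

candidate_models = {
    "linear_2x": lambda x: 2 * x,
    "linear_3x": lambda x: 3 * x,
    "square":    lambda x: x * x,
}

def score(model):
    name, f = model
    err = sum((f(x) - y) ** 2 for x, y in observations)
    return name, err

# Evaluate all candidates in parallel so model selection does not
# monopolize the agent's normal operation.
with ThreadPoolExecutor() as pool:
    scores = dict(pool.map(score, candidate_models.items()))

best = min(scores, key=scores.get)
print(best)   # prints linear_2x
```

The same pattern scales to thousands of candidate models, which is the regime the scalability requirement anticipates for any moderately complex environment.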
====Problems with Feedback-Only Controllers====
  
|  \\ \\ Thermostat  | A cooling thermostat has a built-in supersimple model of its task-environment, one that is sufficient for it to do its job. It consists of a few variables, an on-off switch, two thresholds, and two simple rules that tie these together: the sensed temperature variable, the upper threshold for when to turn the cooler on, and the lower threshold for when to turn the cooler off. The thermostat never has to decide which model is appropriate; it is "baked into it" by the thermostat's designer. It is not a predictive (forward) model; it is a strict feedback model. \\ The thermostat cannot change its model; this can only be done by the user opening it and twiddling some thumbscrews.  |
|  Limitation  | Because the system designer knows beforehand which signals cause perturbations in <m>o</m> and can hard-wire these from the get-go in the thermostat, there is no motivation to create a model-creating controller (it is much harder!).  |
|  Other "state of the art" systems  | The same is true for expert systems, subsumption robots, and general game-playing machines: their model is too tightly baked into their architecture by the designer. Yes, there are some variables in these that can be changed automatically "after the machine leaves the lab" (without designer intervention), but they are parameters inside a (more or less) already-determined //model//.  |
|  What Can We Do?  | Feed-forward control! Which requires **models**.  |
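The thermostat's baked-in feedback model - one sensed variable, two thresholds, an on-off switch, and two rules - fits in a few lines. A minimal sketch; the threshold values are invented for illustration. Note that it only reacts to temperature already sensed and predicts nothing.

```python
# A cooling thermostat's entire "model" of its task-environment, fixed by
# the designer. Thresholds here are illustrative values.
class CoolingThermostat:
    def __init__(self, lower: float = 20.0, upper: float = 24.0):
        self.lower, self.upper = lower, upper   # the designer's thumbscrews
        self.cooler_on = False

    def step(self, sensed_temp: float) -> bool:
        if sensed_temp >= self.upper:     # rule 1: too warm -> start cooling
            self.cooler_on = True
        elif sensed_temp <= self.lower:   # rule 2: cool enough -> stop
            self.cooler_on = False
        return self.cooler_on             # between thresholds: keep prior state

t = CoolingThermostat()
print([t.step(x) for x in (22.0, 25.0, 22.0, 19.0)])
# prints [False, True, True, False]
```

Everything interesting - which variable to sense, where the thresholds sit, what the rules are - was decided before the device left the lab, which is exactly the limitation the table above describes.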
  
\\
\\
  
====Benefits of Combined Feed-forward + Feedback Controllers====

|  Ability to Predict  | With the ability to predict comes the ability to deal with events that happen faster than the perception-action cycle of the controller, as well as the ability to anticipate events far into the future.  |
|  \\ Greater Potential to Learn  | A machine that is free to create, select, and evaluate models operating on observable and hypothesized variables has the potential to learn anything (within the confines of the algorithms it has been given for these operations), because as long as the range of possible models is reasonably broad and general, the topics, tasks, domains, and worlds it could (in theory) handle become vastly larger than for systems given a particular model a priori. (I say 'in theory' because there are other factors, e.g. the ergodicity of the environment and resource constraints, that must be favorable to e.g. the system's speed of learning.)  |
|  Greater Potential for Cognitive Growth  | A system that can build models of its own model creation, selection, and evaluation has the ability to improve its own nature. This is in some sense the ultimate AGI (depending on the original blueprint, original seed, and some other factors, of course), and therefore, in theory, we only need two levels of this for a self-evolving, potentially omniscient/omnipotent (as far as the universe allows) system.  |
|  Bottom Line  | AGI without both feed-forward and feedback mechanisms is fairly unthinkable.  |
  
\\
\\
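The advantage of adding a model-based feed-forward term to a feedback controller can be seen in a toy simulation: the purely reactive controller settles with a steady-state error, while the combined controller, using a model of the plant, reaches the target. Everything here - the plant, gains, and constants - is invented for illustration.

```python
TARGET = 21.0     # desired room temperature (deg C)
AMBIENT = 10.0    # temperature the room leaks heat toward
LOSS = 0.5        # heat-loss coefficient of the toy room model

def simulate(steps, controller):
    """Toy plant: each step, temperature moves by heater power minus losses."""
    temp = 15.0
    for _ in range(steps):
        u = controller(temp)
        temp += 0.1 * (u - LOSS * (temp - AMBIENT))
    return temp

def feedback_only(temp):
    # Reactive: correction proportional to the error already observed.
    return 2.0 * (TARGET - temp)

def feedforward_plus_feedback(temp):
    # Feed-forward: use the model to predict the power needed to *hold*
    # the target, then let feedback clean up residual error.
    u_ff = LOSS * (TARGET - AMBIENT)
    return u_ff + 2.0 * (TARGET - temp)

print(round(simulate(200, feedback_only), 1))              # settles below target
print(round(simulate(200, feedforward_plus_feedback), 1))  # reaches the target
```

The feed-forward term acts before any error appears in the sensed output, which is also what gives such controllers their ability to handle events faster than the perception-action cycle.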
\\
\\