====Refresher: System Architecture====
| \\ What it is | In CS: the organization of the software that implements a system. \\ In AI: The total system that has direct and independent control of the behavior of an Agent via its sensors and effectors. |
| Why it's important | The system architecture determines what kind of information processing can be done, and what the system as a whole is capable of in a particular Task-Environment. |
| Key concepts | process types; process initiation; information storage; information flow. |
| \\ Relation to AI | The term "system" not only includes the processing components, the functions these implement, their input and output, and relationships, but also temporal aspects of the system's behavior as a whole. This is important in AI because any controller of an agent is supposed to control it in such a way that its behavior can be classified as being "intelligent". But what are the necessary and sufficient components of that behavior set? |
| \\ Rationality | The "rationality hypothesis" models an intelligent agent as a "rational" agent: an agent that would always do the most "sensible" thing at any point in time. \\ The problem with the rationality hypothesis is that, given insufficient resources (including time), the concept of rationality doesn't hold up: it assumes there is time to weigh all alternatives (or, if time is limited, that one can choose to evaluate the most relevant options and choose among those). But since such decisions are always about the future, and we cannot predict the future perfectly, for most decisions in which we get a choice of how to proceed there is no such thing as a rational choice. |
| \\ Satisficing | Herbert Simon proposed the concept of "satisficing" to replace the concept of "optimizing" when talking about intelligent action in a complex task-environment. Actions that meet a particular minimum requirement in light of a particular goal 'satisfy' and 'suffice' for the purposes of that goal (a minimal sketch of this distinction follows below the table). |
| \\ Intelligence is in part a systemic phenomenon | Thought experiment: Take any system we deem intelligent, e.g. a 10-year-old human, and isolate any of his/her skills and features. A machine that implements any //single// one of these is unlikely to seem worthy of being called "intelligent" (viz chess programs) without further qualification (e.g. "a limited expert in a sub-field"). \\ //"The intelligence **is** the architecture."// - KRTh |
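To make the contrast between optimizing and satisficing concrete, here is a minimal Python sketch (an illustration only, not part of any architecture discussed on this page):

<code python>
def optimize(options, utility):
    """Examine every option and return the best one -- assumes there is time to weigh all alternatives."""
    return max(options, key=utility)

def satisfice(options, utility, threshold):
    """Return the first option that is 'good enough' -- bounded time and bounded knowledge."""
    for option in options:
        if utility(option) >= threshold:
            return option
    return None   # no option met the minimum requirement

# Example: for getting to work on time, any route under 30 minutes suffices.
routes = [("scenic", 45), ("highway", 25), ("shortcut", 20)]
best = optimize(routes, utility=lambda r: -r[1])                         # must scan all routes
good_enough = satisfice(routes, utility=lambda r: -r[1], threshold=-30)  # stops at "highway"
</code>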
\\
====Refresher: Inferred GMI Architectural Features ====
| \\ Large architecture | From the above we can readily infer that if we want GMI, an architecture that is considerably more complex than systems being built in most AI labs today is likely unavoidable. In a complex architecture the issue of concurrency of processes must be addressed, a problem that has not yet been sufficiently resolved in present software and hardware. This scaling problem cannot be addressed by the usual “we’ll wait for Moore’s law to catch up” because the issue does not primarily revolve around //speed of execution// but around the //nature of the architectural principles of the system and their runtime operation//. ||
| \\ Predictable Robustness in Novel Circumstances | The system must be robust in light of all kinds of task-environment and embodiment perturbations, otherwise no reliable plans can be made, and thus no reliable execution of tasks can ever be reached, no matter how powerful the learning capacity. This robustness must be predictable a priori at some level of abstraction -- for a wide range of novel circumstances it cannot be a complete surprise that the system "holds up". (If this were the case then the system itself would not be able to predict its chances of success in the face of novel circumstances, thus eliminating an important part of the "G" from its "AGI" label.) ||
| \\ Graceful Degradation | Part of the robustness requirement is that the system be constructed in a way that minimizes the potential for catastrophic (and unpredictable) failure. A programmer forgets to delimit a command in a compiled program and the whole application crashes; this kind of brittleness is not an option for cognitive systems operating in partially stochastic environments, where perturbations may come in any form at any time (and perfect prediction is impossible). ||
| Transversal Functions | The system must have pan-architectural characteristics that enable it to operate consistently as a whole, to be highly adaptive (yet robust) in its own operation across the board, including metacognitive abilities. Some functions likely to be needed to achieve this include attention, learning, analogy-making capabilities, and self-inspection. ||
====Features of SOAR====
| Large Architecture | Yes | Comparatively large. SOAR is as "large" as they come (or was - equally large cognitive architectures are getting more common). |
| Predictable Robustness in Novel Circumstances | Not really | Since SOAR isn't really designed to operate and learn in novel circumstances, but rather to work under variations of what it already knows, this issue hardly comes up. |
| Graceful Degradation | No | |
| \\ Transversal Functions | \\ No | //Transversal Handling of Time.// No explicit handling of time. \\ //Transversal Learning.// Learning is not a central design target of SOAR; reinforcement learning is available as an afterthought. No model-based learning; reasoning is present (but highly limited). \\ //Transversal Analogies.// No. \\ //Transversal Self-Inspection.// Hardly. \\ //Transversal Skill Integration.// We would be hard-pressed to find any such mechanisms. |
| \\ Symbolic? | CHECK | One of the main features of SOAR is being symbol-oriented. However, the symbols do not have very rich semantics, as they are limited to simple sentences; few if any mechanisms exist to manage large sets of symbols and statements: the main operations of SOAR are at the level of a dozen sentences or less. |
| \\ Models? | No \\ (but yes) | Any good controller of a system is a model of that system. It is, however, unclear what kinds of models SOAR creates. While similar to NARS in its approach, SOAR is axiomatic (i.e. it does not have obvious ways to ground its knowledge) and thus it is hard to see how it would improve or modify its knowledge over time. |
\\
====AERA====
| Description | The Auto-Catalytic Endogenous Reflective Architecture (AERA) is an AGI-aspiring self-programming system that combines reactive, predictive and reflective control in a model-based and model-driven system that is programmed with a seed. |
| {{/public:t-720-atai:aera-high-level-2018.png?600}} ||
| **FIG 1.** High-level view of the three main functions at work in a running AERA system and their interaction with its knowledge store. ||
| \\ Models | All models are stored in a central //memory//, and the three processes of //planning//, //attention// (resource management) and //learning// happen as a result of programs that operate on models by matching, activating, and scoring them. Models that predict correctly -- not just "what happens next?" but also "what will happen if I do X?" -- get a success point. Every time a model 'fires' like that it gets counted, so the ratio of successes over counts gives the "goodness" of a model. \\ Models with the lowest scores are deleted, while a model with a good score that suddenly fails triggers the generation of new versions of itself (think of them as hypotheses for why it failed this time). Over time this process increases the quality and utility of the controller's knowledge; in other words, it //learns// (a minimal sketch of this scoring loop follows below the table). |
| \\ Attention | Attention is nothing more than resource management; in the case of cognitive controllers it typically involves management of knowledge, time, energy, and computing power. Attention in AERA is the set of functions that decides how the controller uses its compute time, how long it "mulls things over", and how far into the future it allows itself to "think". It also involves which models the system works with at any point in time, and how much it explores models outside of the obvious candidate set. |
| \\ Planning | Planning is the set of operations involved with looking at alternative ways of proceeding, based on predictions into the future and the quality of the solutions found so far, at any point in time. The plans produced by AERA are of a mixed opportunistic (short time horizon) / firm-commitment (long time horizon) kind, and their stability (they may change drastically over their course) depends solely on the dependability of the models involved -- i.e. how well the models represent what is actually going on in the world (including the controller's "mind"). |
| Learning | Learning happens as a result of the accumulation of models; as they increasingly describe "reality" (i.e. their target phenomenon) better, they get better for planning and attention, which in turn improves the learning. |
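A minimal Python sketch of the scoring-and-pruning loop described in the //Models// row above. The class, function and threshold names are hypothetical illustrations; AERA itself is implemented in Replicode and its actual bookkeeping is considerably more involved.

<code python>
class Model:
    """A predictive model with success/evidence bookkeeping (illustrative only)."""
    def __init__(self, predict, parent=None):
        self.predict = predict   # callable: situation -> predicted outcome
        self.successes = 0       # number of correct predictions
        self.fired = 0           # number of times the model made a prediction
        self.parent = parent     # model this one was derived from, if any

    def score(self):
        # "goodness" = successful predictions over the number of times the model fired
        return self.successes / self.fired if self.fired else 0.0


def learning_cycle(memory, situation, actual_outcome,
                   prune_below=0.2, revise_above=0.8, min_evidence=5):
    """One update: score every model that fires, prune poor ones, and spawn
    variants of normally reliable models that unexpectedly failed."""
    new_hypotheses = []
    for m in memory:
        m.fired += 1
        if m.predict(situation) == actual_outcome:
            m.successes += 1          # the model earned a success point
        elif m.score() > revise_above:
            # A reliable model failed: generate a new version of it -- a hypothesis
            # for why it failed this time (here just a copy; a real system would
            # add a distinguishing context condition).
            new_hypotheses.append(Model(m.predict, parent=m))
    # Models with the lowest scores are deleted once enough evidence has accumulated.
    memory[:] = [m for m in memory
                 if m.fired < min_evidence or m.score() >= prune_below]
    memory.extend(new_hypotheses)
</code>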
\\
====Features of AERA====
| Large Architecture | Yes | Comparatively large. AERA is as "large" as they come (or was - equally large cognitive architectures are getting more common). |
| Predictable Robustness in Novel Circumstances | Yes | Since AERA's learning is goal-driven, its target operational environments are (semi-)novel circumstances. |
| Graceful Degradation | | |
| \\ Transversal Functions | \\ Yes | //Transversal Handling of Time.// Time is transversal. \\ //Transversal Learning.// Yes. Learning can happen at the smallest level as well as the largest, but generally learning proceeds in small increments. Model-based learning is built in; ampliative (mixed) reasoning is present. \\ //Transversal Analogies.// Yes, but this remains to be developed further. \\ //Transversal Self-Inspection.// Yes. AERA can inspect a large part of its internal operations (but not everything). \\ //Transversal Skill Integration.// Yes. This follows naturally from the fact that all models are sharable between anything and everything that AERA learns and does. |
| \\ Symbolic? | \\ CHECK | One of the main features of AERA is that its knowledge is declarable by being symbol-oriented. The symbols do not have very rich semantics: AERA can learn language in the same way it learns anything else (e.g. goal-directed, pragmatic). AERA has been implemented to handle 20k models, but so far the most complex demonstration uses only approximately 1400 models. |
| Models? | Yes | Explicit model building is the main learning mechanism. |

\\

==== General Form of AERA Models ====

| \\ \\ Bi-Directional Causal-Relational Models | {{public:t-720-atai:screenshot_2019-10-20_17.07.15.png?300}} |
| This model, Model_M, predicts that if you see variables 6 and 7 you will see variable 4 some time later (AERA models refer to specific times - simplified here for convenience). |
| \\ Deduction | Models in AERA have a left-hand side (LHS) and a right-hand side (RHS). \\ Example Model_M above: read from left to right, it states that "if you see what is in the LHS then I predict what you see in the RHS". |
| \\ Abduction | When read right to left, it says "if you want what is on the RHS, try getting what is on the LHS first". The right-to-left reading is a way to produce sub-goals via abduction; the left-to-right reading is a way to predict the future via deduction. \\ Example Model_M above: read from right to left (backward chaining - BWD), it states that if you want <m>V_4</m> you should try to obtain <m>V_6</m> and <m>V_7</m>. |
| We call such models "bi-directional causal-relational models" (CRMs) because they can be read in either direction and they model the relations (including causal relations) between variables. \\ In AERA, models can reference other models on either side and can include patterns on either side. \\ In case the //values// of variables on one side matter for the other side, we use functions belonging to the model that compute these values (in example Model_M, LHS->RHS transformation functions might take the value of <m>V_6</m> or <m>V_7</m> and use that to compute the value of <m>V_4</m>). \\ Due to the bi-directionality of CRMs we must have bi-directional functions for this purpose as well. (For instance, if you want to open the door you must push down the handle first, then pull the door towards you; if you pull the door towards you with the handle pushed down then the door will open. The amount of pulling determines how far ajar the door is - this can be computed via a function relating the LHS to the RHS.) ||
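A minimal Python sketch of a bi-directional causal-relational model, used forward for prediction (deduction) and backward for sub-goaling (abduction). The class name, the value functions and the numbers are invented for illustration; this is not AERA's Replicode representation.

<code python>
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Set

@dataclass
class CRM:
    lhs: Set[str]      # pattern of variables on the left-hand side, e.g. {"V6", "V7"}
    rhs: Set[str]      # pattern on the right-hand side, e.g. {"V4"}
    forward: Callable[[Dict[str, float]], Dict[str, float]]    # LHS values -> RHS values
    backward: Callable[[Dict[str, float]], Dict[str, float]]   # desired RHS values -> required LHS values

    def deduce(self, observed: Dict[str, float]) -> Optional[Dict[str, float]]:
        """Deduction (forward chaining): if the LHS pattern matches, predict the RHS."""
        return self.forward(observed) if self.lhs <= observed.keys() else None

    def abduce(self, goal: Dict[str, float]) -> Optional[Dict[str, float]]:
        """Abduction (backward chaining): to obtain the RHS, sub-goal on the LHS."""
        return self.backward(goal) if self.rhs <= goal.keys() else None

# Model_M from the figure: seeing V6 and V7 predicts V4 some time later (value functions are made up).
model_m = CRM(
    lhs={"V6", "V7"}, rhs={"V4"},
    forward=lambda obs: {"V4": obs["V6"] + obs["V7"]},
    backward=lambda goal: {"V6": goal["V4"] / 2, "V7": goal["V4"] / 2},
)

print(model_m.deduce({"V6": 1.0, "V7": 2.0}))  # {'V4': 3.0}  -- a prediction
print(model_m.abduce({"V4": 4.0}))             # {'V6': 2.0, 'V7': 2.0}  -- two sub-goals
</code>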
\\
====Model Acquisition Function in AERA====
| {{public:t-720-atai:agent-with-model-gen-function1.png?300}} ||
| The agent has a model generation function <m>P_M</m> implemented in its controller. The role of the function is to take observed chains of events and produce models intended to capture the events' causal relationships. ||
| {{public:t-720-atai:causal-chain_agent1.png?400}} ||
| A learning agent is situated so as to perceive the effects of the relationships between variables. \\ The agent observes the interaction between the variables for a while, rendering some data about their relations (but not enough to be certain about it, and certainly not enough to create a complete model of it). \\ This generates hypotheses about the relation between variables, in the form of candidate relational models of the observed events. ||
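A minimal Python sketch in the spirit of the model-generation function <m>P_M</m>: turn an observed chain of timestamped events into candidate "A tends to precede B" relational models, which can then be tested and refined. The function name, time window and support threshold are assumptions for illustration only.

<code python>
from collections import Counter
from itertools import combinations

def generate_hypotheses(event_log, max_lag=2.0, min_support=3):
    """event_log: time-ordered list of (timestamp, variable) observations.
    Returns candidate models as (antecedent, consequent) pairs seen often enough to be worth testing."""
    support = Counter()
    for (t1, a), (t2, b) in combinations(event_log, 2):
        if a != b and 0 < t2 - t1 <= max_lag:
            support[(a, b)] += 1          # "a was followed by b within the time window"
    return [pair for pair, count in support.items() if count >= min_support]

# Example: V6 and V7 tend to precede V4 across three observed episodes.
log = [(0.0, "V6"), (0.1, "V7"), (0.9, "V4"),
       (2.0, "V6"), (2.2, "V7"), (3.0, "V4"),
       (4.1, "V6"), (4.3, "V7"), (5.0, "V4")]
print(generate_hypotheses(log))   # [('V6', 'V7'), ('V6', 'V4'), ('V7', 'V4')]
</code>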
\\
==== Model Generation & Evaluation ====

| {{public:t-720-atai:three-models-1.png?400}} |
| Based on prior observations of the variables and their temporal execution in some context, the controller's model generation process <m>P_M</m> may have captured their causal relationship in three alternative models, <m>M_1, M_2, M_3</m>, each slightly but measurably different from the others. Each can be considered a //hypothesis of the actual relationship between the referenced variables// when in the context provided by <m>V_5, V_6</m>. \\ As an example, we could have a tennis ball's direction <m>V_1</m>, speed <m>V_2</m>, and shape <m>V_3</m> that changes when it hits a wall <m>V_5</m>, according to its relative angle <m>V_6</m> to the wall. |
| {{public:t-720-atai:agent-with-models-1.png?300}} |
| The agent's model generation mechanisms allow it to produce models of events it sees. Here it creates models (a) <m>M_1</m> and (b) <m>M_2</m>. The usefulness of these models for particular situations and goals can be tested by performing an operation on the world (c), as prescribed by the models, through backward chaining (abduction). \\ Ideally, when one wants to find out which model is best for a particular situation (goals + environment + state), the most efficient method is an (energy-preserving) intervention that can leave only one as the winner. |
| {{public:t-720-atai:model-m2-prime-1.png?150}} |
| The feedback (reinforcement) resulting from direct or indirect tests of a model may result in its deletion, rewriting, or some other modification. Here the feedback has resulted in a modified model <m>M{prime}_2</m>. |
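A minimal Python sketch of the selection step described above: given competing candidate models, pick the intervention on which their predictions diverge the most, then keep only the candidates whose prediction matched what was actually observed. Names and signatures are assumptions for illustration, not AERA internals.

<code python>
def most_discriminating_action(candidates, actions, predict):
    """candidates: competing models; actions: possible interventions;
    predict(model, action) -> that model's predicted outcome.
    Returns the action on which the candidates disagree the most."""
    def disagreement(action):
        predictions = [predict(m, action) for m in candidates]
        return len(set(predictions))   # more distinct predictions = more informative test
    return max(actions, key=disagreement)


def test_and_update(candidates, action, observed_outcome, predict):
    """Feedback step: keep the models whose prediction matched the observed outcome;
    the losers are deleted (a real system might instead rewrite or modify them)."""
    return [m for m in candidates if predict(m, action) == observed_outcome]
</code>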
\\
====AERA Demo====

| TV Interview | The agent S1 watched two humans engaged in a TV-style interview about the recycling of six everyday objects made of various materials. |
| Data | S1 received real-time timestamped data from the 3D movement of the humans (digitized via appropriate tracking methods at 20 Hz), words generated by a speech recognizer, and prosody (fundamental pitch of voice at 60 Hz, along with timestamped starts and stops). |
| Seed | The seed consisted of a handful of top-level goals for each agent in the interview (interviewer and interviewee), and a small knowledge base about entities in the scene. |
| \\ \\ What Was Given | * actions: grab, release, point-at, look-at (defined as event types constrained by geometric relationships) \\ * stopping the interview clock ends the session \\ * objects: glass-bottle, plastic-bottle, cardboard-box, wooden-cube, newspaper \\ * objects have properties (e.g. made-of) \\ * interviewee-role \\ * interviewer-role \\ * Model for interviewer \\ * top-level goal of interviewer: prompt interviewee to communicate \\ * in interruption case: an imposed interview duration time limit \\ * Models for interviewee \\ * top-level goal of interviewee: to communicate \\ * never communicate unless prompted \\ * communicate about properties of objects being asked about, for as long as there are still properties available \\ * don’t communicate about properties that have already been mentioned |
| \\ \\ What Had To Be Learned | GENERAL INTERVIEW PRINCIPLES \\ * word order in sentences (with no a-priori grammar) \\ * disambiguation via co-verbal deictic references \\ * role of interviewer and interviewee \\ * interview involves serialization of joint actions (a series of Qs and As by each participant) \\ \\ MULTIMODAL COORDINATION & JOINT ACTION \\ * take turns speaking \\ * co-verbal deictic reference \\ * manipulation as deictic reference \\ * looking as deictic reference \\ * pointing as deictic reference \\ \\ INTERVIEWER \\ * to ask a series of questions, not repeating questions about objects already addressed \\ * “thank you” stops the interview clock \\ * interruption condition: “hold on, let’s go to the next question” can be used to keep the interview within time limits \\ \\ INTERVIEWEE \\ * what to answer based on what is asked \\ * an object property is not spoken of if it is not asked for \\ * a silence from the interviewer means “go on” \\ * a nod from the interviewer means “go on” |
| \\ \\ Result | After having observed two humans interact in a simulated TV interview for some time, the AERA agent S1 takes the role of interviewee, continuing the interview in precisely the same fashion as before, answering the questions of the human interviewer (see videos HH.no_interrupt.mp4 and HH.no_interrupt.mp4 for the human-human interaction that S1 observed; see HM.no_interrupt_mp4 and HM_interrupt_mp4 for other examples of the skills that S1 has acquired by observation). In the "interrupt" scenario S1 has learned to use interruption as a method to keep the interview from going over a pre-defined time limit. \\ \\ The results are recorded in a set of three videos: \\ [[https://www.youtube.com/watch?v=SH6tQ4fgWA4|Human-human interaction]] (what S1 observes) \\ [[https://www.youtube.com/watch?v=SH6tQ4fgWA4|Human-S1 interaction]] (S1 interviewing a human) \\ [[https://www.youtube.com/watch?v=x96HXLPLORg|S1-Human Interaction]] (S1 being interviewed by a human) |
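The interviewee's learned behaviour can be summarized as a simple policy. The toy Python sketch below restates the constraints listed above ("never communicate unless prompted", "communicate about properties being asked about", "don't repeat properties already mentioned"); it is an illustration of the learned rules, not AERA code or the actual demo interface.

<code python>
def interviewee_turn(prompt, knowledge, already_mentioned):
    """Answer only when prompted, only about the property asked for,
    and never repeat a property that has already been mentioned."""
    if prompt is None:                      # never communicate unless prompted
        return None
    obj, asked_property = prompt            # e.g. ("glass-bottle", "made-of")
    value = knowledge.get(obj, {}).get(asked_property)
    if value is None or (obj, asked_property) in already_mentioned:
        return None                         # nothing (new) to say about this
    already_mentioned.add((obj, asked_property))
    return f"The {obj} is {asked_property} {value}."

knowledge = {"glass-bottle": {"made-of": "glass"}}
mentioned = set()
print(interviewee_turn(("glass-bottle", "made-of"), knowledge, mentioned))  # answers
print(interviewee_turn(("glass-bottle", "made-of"), knowledge, mentioned))  # None: already mentioned
</code>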
\\

//2020(c)K.R.Thórisson//