Both sides previous revisionPrevious revisionNext revision | Previous revision |
public:t-720-atai:atai-21:ai_architectures [2021/09/29 16:07] – [NARS] thorisson | public:t-720-atai:atai-21:ai_architectures [2024/04/29 13:33] (current) – external edit 127.0.0.1 |
---|
| \\ Large architecture | From the above we can readily infer that if we want GMI, an architecture that is considerably more complex than systems being built in most AI labs today is likely unavoidable. In a complex architecture the issue of concurrency of processes must be addressed, a problem that has not yet been sufficiently resolved in present software and hardware. This scaling problem cannot be addressed by the usual “we’ll wait for Moore’s law to catch up” because the issue does not primarily revolve around //speed of execution// but around the //nature of the architectural principles of the system and their runtime operation//. || | | \\ Large architecture | From the above we can readily infer that if we want GMI, an architecture that is considerably more complex than systems being built in most AI labs today is likely unavoidable. In a complex architecture the issue of concurrency of processes must be addressed, a problem that has not yet been sufficiently resolved in present software and hardware. This scaling problem cannot be addressed by the usual “we’ll wait for Moore’s law to catch up” because the issue does not primarily revolve around //speed of execution// but around the //nature of the architectural principles of the system and their runtime operation//. || |
| \\ Predictable Robustness in Novel Circumstances | The system must have a robustness in light of all kinds of task-environment and embodiment perturbations, otherwise no reliable plans can be made, and thus no reliable execution of tasks can ever be reached, no matter how powerful the learning capacity. This robustness must be predictable a-priori at some level of abstraction -- for a wide range of novel circumstances it cannot be a complete surprise that the system "holds up". (If this were the case then the system itself would not be able to predict its chances of success in face of novel circumstances, thus eliminating an important part of the "G" from its "AGI" label.) || | | \\ Predictable Robustness in Novel Circumstances | The system must have a robustness in light of all kinds of task-environment and embodiment perturbations, otherwise no reliable plans can be made, and thus no reliable execution of tasks can ever be reached, no matter how powerful the learning capacity. This robustness must be predictable a-priori at some level of abstraction -- for a wide range of novel circumstances it cannot be a complete surprise that the system "holds up". (If this were the case then the system itself would not be able to predict its chances of success in face of novel circumstances, thus eliminating an important part of the "G" from its "AGI" label.) || |
| \\ Graceful Degradation | Part of the robustness requirement is that the system be constructed in a way as to minimize potential for catastrophic (and upredictable) failure. A programmer forgets to delimit a command in a compiled program and the whole application crashes; this kind of brittleness is not an option for cognitive systems operating in partially stochastic environments, where perturbations may come in any form at any time (and perfect prediction is impossible). || | | \\ Graceful Degradation | No general autonomous system operating in the physical world for any length of time can perform flawlessly throughout its lifetime. Part of the robustness requirement is that the system be constructed in a way as to minimize potential for catastrophic (and upredictable, unrecoverable) failure. \\ A programmer forgets to delimit a command in a compiled program and the whole application crashes; this kind of brittleness is not an option for cognitive systems operating in partially stochastic environments, where perturbations may come in any form at any time (and perfect prediction is impossible). \\ One way for a cogntive system to achieve graceful degradation is through reflection, which enables it to learn, over time, about its own fallacies, shortcomings, and lack of knowledge. || |
| Transversal Functions | The system must have pan-architectural characteristics that enable it to operate consistently as a whole, to be highly adaptive (yet robust) in its own operation across the board, including metacognitive abilities. Some functions likely to be needed to achieve this include attention, learning, analogy-making capabilities, and self-inspection. || | | Transversal Functions | The system must have pan-architectural characteristics that enable it to operate consistently as a whole, to be highly adaptive (yet robust) in its own operation across the board, including metacognitive abilities. Some functions likely to be needed to achieve this include attention, learning, analogy-making capabilities, and self-inspection. || |
| | \\ Transversal Time | Ignoring (general) temporal constraints is not an option if we want AGI. (Move over Turing!) Time is a semantic property, and the system must be able to understand – and be able to //learn to understand// – time as a real-world phenomenon in relation to its own skills and architectural operation. Time is everywhere, and is different from other resources in that there is a global clock which cannot, for many task-environments, be turned backwards. Energy must also be addressed, but may not be as fundamentally detrimental to ignore as time while we are in the early stages of exploring methods for developing auto-catalytic knowledge acquisition and cognitive growth mechanisms. \\ Time must be a tightly integrated phenomenon in any AGI architecture - managing and understanding time cannot be retrofitted to a complex architecture! | | | | \\ Transversal Time | Ignoring (general) temporal constraints is not an option if we want AGI. (Move over Turing!) Time is a semantic property, and the system must be able to understand – and be able to //learn to understand// – time as a real-world phenomenon in relation to its own skills and architectural operation. Time is everywhere, and is different from other resources in that there is a global clock which cannot, for many task-environments, be turned backwards. Energy must also be addressed, but may not be as fundamentally detrimental to ignore as time while we are in the early stages of exploring methods for developing auto-catalytic knowledge acquisition and cognitive growth mechanisms. \\ Time must be a tightly integrated phenomenon in any AGI architecture - managing and understanding time cannot be retrofitted to a complex architecture! | |
\\ | \\ |
====Features of SOAR==== | ====Features of SOAR==== |
| Large Architecture | Yes | Comparatively large. SOAR is as "large" as they come (or was - equally large cognitive architectures are getting more common). | | |
| Predictable Robustness in Novel Circumstances | Not really | Since SOAR isn't really designed to operate and learn in novel circumstances, but rather work under variations of what it already knows, this issue hardly comes up. | | | Predictable Robustness in Novel Circumstances | Not really | Since SOAR isn't really designed to operate and learn in novel circumstances, but rather work under variations of what it already knows, this issue hardly comes up. | |
| Graceful Degradation | No | | | | Graceful Degradation | No | The knowledge representation of SOAR is not organized around safe, predictable, or trustworthy operation (SOAR is an //early// experimental architecture). Since SOAR cannot do reflection, and SOAR doesn't learn, there is no way for SOAR to get better about evaluating its own performance over time, with experience. | |
| \\ Transversal Functions | \\ No | //Transversal Handling of Time.// No explicit handling of time. \\ //Transversal Learning.// Learning is not a central design target of SOAR; reinforcement learning available as an afterthought. No model-based learning; reasoning present (but highly limited). \\ //Transversal Analogies.// No \\ //Transversal Self-Inspection.// Hardly. \\ //Transversal Skill Integration.// We would be hard-pressed to see any such mechanisms. | | | \\ Transversal Functions | \\ No | //Transversal Handling of Time.// No explicit handling of time. \\ //Transversal Learning.// Learning is not a central design target of SOAR; reinforcement learning available as an afterthought. No model-based learning; reasoning present (but highly limited). \\ //Transversal Analogies.// No \\ //Transversal Self-Inspection.// Hardly. \\ //Transversal Skill Integration.// We would be hard-pressed to see any such mechanisms. | |
| \\ Symbolic? | \\ CHECK | One of the main features of SOAR is being symbol oriented. However, the symbols do not have very rich semantics as they are limited to simple sentences; few if any mechanisms exist to manage large sets of symbols and statements: The main operations of SOAR are at the level of a dozen sentences or less. | | | \\ Symbolic? | \\ CHECK | One of the main features of SOAR is being symbol oriented. However, the symbols do not have very rich semantics as they are limited to simple sentences; few if any mechanisms exist to manage large sets of symbols and statements: The main operations of SOAR are at the level of a dozen sentences or less. | |
| How does it work | Reasoning engine does autonomous learning based on what is "experienced". Experirience must be encoded in NARSese - NARS's native knowledge representation language. | | | How does it work | Reasoning engine does autonomous learning based on what is "experienced". Experirience must be encoded in NARSese - NARS's native knowledge representation language. | |
| Recent Versions | ONA (OpenNARS for Applications) implements a version of NARS that is closer to what we might want for controlling robots ("regular" NARS is more of a "philosopher" than an "engineer"). | | | Recent Versions | ONA (OpenNARS for Applications) implements a version of NARS that is closer to what we might want for controlling robots ("regular" NARS is more of a "philosopher" than an "engineer"). | |
| Missing in Action | A clearer approach for handling continuous information. | | | Missing in Action | An approach for handling continuous information. \\ A more explicit way for reasoning control (resource management). | |
| \\ \\ General Description | TBD \\ \\ REF: TBD | | | \\ \\ General Description | NARS is fundamentally different from traditional reasoning systems, mainly because of its assumption of insufficient knowledge and resources. The mathematical logic came from the study of theorem proving in mathematics, where the domain knowledge is summarized in axioms, the inference rules are truth-preserving, and the resource cost of an inference process is ignored, as far as it is finite. The logic of NARS is named "Non-Axiomatic Logic" (NAL), because none of the knowledge it processes (as premise or conclusion) can be considered as "axiom", as with a fixed truth-value. Instead, the system's beliefs are summaries of the system's experience, and are always revisable. \\ In NARS, a "term" names a "concept" that represents a recurring pattern in the system's experience, and a "statement" represents the substitutability of one term to another. Each statement is "true" to a degree, indicating the evidential support the statement gets from available evidence. An inference rule specifies how new statements can be derived from certain existing statements. The memory and control mechanism of the system attempt to use the time-space resources of the system in the most efficient way by dynamically distributing the resources according to the experience of the system and the current context. The overall architecture and working cycle of NARS is explained here. \\ NARS is adaptive to its experience, and therefore is situated and embodied. Its beliefs summarize the system's experience (rather than describe the world as it is), and its concepts represent patterns in the experience (rather than denote the objects in the world). Its inference rules are valid, because each conclusion is supported by the evidence provided by the premises (rather than because they derive absolute truth from absolute truth). The system is rational, because its conclusions are the best the system can find under the current knowledge and resources restriction (rather than because they are always absolutely correct or optimal). \\ \\ REF: https://cis.temple.edu/~pwang/NARS-Intro.html | |
| |
\\ | \\ |
====Features of NARS==== | ====Features of NARS==== |
| Predictable Robustness in Novel Circumstances | Yes | NARS is explicitly designed to operate and learn novel things in novel circumstances. It is the only architecture (besides AERA) that is directly based on, and specifically designed to, an **assumption of insufficient knowledge and resources** (AIKR). | | | Predictable Robustness in Novel Circumstances | \\ Yes | NARS is explicitly designed to operate and learn novel things in novel circumstances. It is the only architecture (besides AERA) that is directly based on, and specifically designed to, an **assumption of insufficient knowledge and resources** (AIKR). | |
| Graceful Degradation | Yes | | | | Graceful Degradation | \\ Yes | While the knowledge representation of NARS is not specifically aimed at achieving safe, predictable, or trustworthy operation, NARS can do reflection, so NARS could learn to get better about evaluating its own performance over time, which means it would be increasingly knowledgeable about its failure modes, making it increasingly more likley to fail gracefully. | |
| \\ Transversal Functions | \\ Yes | //Transversal Handling of Time.// Time is handled in a very general and relative manner, like any other reasoning. \\ //Transversal Learning.// Learning is a central design target of NARS. While knowledge in NARS is not explicitly model-based, its knowledge is symbolic and NARSese statements can be thought of as micro-models; reasoning is a fundamental (some would say the only) principle of its operation. \\ //Transversal Analogies.// Yes \\ //Transversal Self-Inspection.// Yes. Via reasoning. \\ //Transversal Skill Integration.// Yes. Via reasoning. | | | \\ Transversal Functions | \\ Yes | //Transversal Handling of Time.// Time is handled in a very general and relative manner, like any other reasoning. \\ //Transversal Learning.// Learning is a central design target of NARS. While knowledge in NARS is not explicitly model-based, its knowledge is symbolic and NARSese statements can be thought of as micro-models; reasoning is a fundamental (some would say the only) principle of its operation. \\ //Transversal Analogies.// Yes \\ //Transversal Self-Inspection.// Yes. Via reasoning. \\ //Transversal Skill Integration.// Yes. Via reasoning. | |
| \\ Symbolic? | \\ CHECK | One of the main features of NARS is deep symbol orientation. | | | \\ Symbolic? | \\ CHECK | One of the main features of NARS is deep symbol orientation. | |
| \\ Models? | No \\ (but yes) | Any good controller of a system is model of that system. The smallest unit in NARS that could be called 'models' are NARSese statements. | | | \\ Models? | No \\ (but yes) | Any good controller of a system is model of that system. The smallest unit in NARS that could be called 'models' are NARSese statements. | |
| |
| \\ |
| \\ |
====AERA==== | ====AERA==== |
| Description | The Auto-Catalytic Endogenous Reflective Architecture is an AGI-aspiring self-programming system that combines reactive, predictive and reflective control in a model-based and model-driven system that is programmed with a seed. | | | Description | The Auto-Catalytic Endogenous Reflective Architecture is an AGI-aspiring self-programming system that combines reactive, predictive and reflective control in a model-based and model-driven system that is programmed with a seed. | |
| **FIG 1.** High-level view of the three main functions at work in a running AERA system and their interaction with its knowledge store. || | | **FIG 1.** High-level view of the three main functions at work in a running AERA system and their interaction with its knowledge store. || |
| \\ Models | All models are stored in a central //memory//, and the three processes of //planning//, //attention// (resource management) and //learning// happen as a result of programs that operate on models by matching, activating, and scoring them. Models that predict correctly -- not just "what happens next?" but also "what will happen if I do X?" -- get a success point. Every time a model 'fires' like that it gets counted, so the ratio of success over counts gives you the "goodness" of a model. \\ Models that have the lowest scores are deleted, models with a good score that suddenly fail result in the generation of new versions of itself (think of it as hypotheses for why it failed this time), and this process over time increases the quality and utility of the knowledge of the controller, in other words it //learns//. | | | \\ Models | All models are stored in a central //memory//, and the three processes of //planning//, //attention// (resource management) and //learning// happen as a result of programs that operate on models by matching, activating, and scoring them. Models that predict correctly -- not just "what happens next?" but also "what will happen if I do X?" -- get a success point. Every time a model 'fires' like that it gets counted, so the ratio of success over counts gives you the "goodness" of a model. \\ Models that have the lowest scores are deleted, models with a good score that suddenly fail result in the generation of new versions of itself (think of it as hypotheses for why it failed this time), and this process over time increases the quality and utility of the knowledge of the controller, in other words it //learns//. | |
| \\ Attention | Attention is nothing more than resource management, in the case of cognitive controllers it typically involves management of knowledge, time, energy, and computing power. Attention in AERA is the set of functions that decides how the controller uses its compute time, how long it "mulls things over", and how far into the future it allows itself to "think". It also involves which models the system works with at any point in time, how much it explores models outside of the obvious candidate set at any point in time. | | | \\ Attention | Attention is nothing more than resource management, in the case of cognitive controllers it typically involves management of knowledge, time, energy, and computing power. Attention in AERA is the set of functions that decides how the controller uses its compute time, how long it "mulls things over", and how far into the future it allows itself to "think". It also involves which models the system works with at any point in time, how much it explores models outside of the obvious candidate set at any point in time. | |
| \\ Planning | Planning is the set of operations involved with looking at alternative ways of proceeding, based on predictions into the future and the quality of the solutions found so far, at any point in time. The plans produced by AERA are of a mixed opportunistic (short time horizon)/firm commitment (long time horizon) kind, and their stability (subject to change drastically over their course) depend solely on the dependability of the models involved -- i.e. how well the models represent what is actually going on in the world (including the controllers "mind"). | | | \\ Planning | Planning is the set of operations involved with looking at alternative ways of proceeding, based on predictions into the future and the quality of the solutions found so far, at any point in time. The plans produced by AERA are of a mixed opportunistic (short time horizon)/firm commitment (long time horizon) kind, and their stability (subject to change drastically over their course) depend solely on the dependability of the models involved -- i.e. how well the models represent what is actually going on in the world (including the controllers "mind"). | |
| Learning | Learning happens as a result of the accumulation of models; as they increasingly describe "reality" better (i.e. their target phenomenon) they get better for planning and attention, which in turn improves the learning. | | | Learning | Learning happens as a result of the accumulation of models; as they increasingly describe "reality" better (i.e. their target phenomenon) they get better for planning and attention, which in turn improves the learning. | |
| \\ Memory | AREA's "global knowledge base" is in some ways similar to the idea of blackboards: AERA stores all its knowledge in a "global workspace" or memory. Unlike (Selfridge's idea of) blackboards, the blackboard contains executive functions that manage the knowledge dynamically, in addition to "the experts", which in AERA's case are very tiny and better thought of as "models with codelet helpers". | | | \\ Memory | AREA's "global knowledge base" is in some ways similar to the idea of blackboards: AERA stores all its knowledge in a "global workspace" or memory. Unlike (Selfridge's idea of) blackboards, the blackboard contains executive functions that manage the knowledge dynamically, in addition to "the experts", which in AERA's case are very tiny and better thought of as "models with codelet helpers". | |
| Pervasive Use of Codelets | A //codelet// is a piece of code that is smaller than a typical self-contained program, typically a few lines long, and can only be executed in particular contexts. Programs are constructed on the fly by the operation of the whole system selecting which codelets to run when, based on the knowledge of the system, the active goals, and the state it finds itself in at any point in time. | | | Pervasive Use of Codelets | A //codelet// is a piece of code that is smaller than a typical self-contained program, typically a few lines long, and can only be executed in particular contexts. Programs are constructed on the fly by the operation of the whole system selecting which codelets to run when, based on the knowledge of the system, the active goals, and the state it finds itself in at any point in time. | |
| \\ No "Modules" | Note that the diagram above may imply the false impression that AERA consists of these four software "modules", or "classes", or the like. Nothing could be further from the truth: All of AERA's mechanism above are a set of functions that are "welded in with" the operation of the whole system, distributed in a myriad of mechanisms and actions. \\ Does this mean that AERA is spaghetti code, or a mess of a design? On the contrary, the integration and overlap of various mechanisms to achieve the high-level functions depicted in the diagram are surprisingly clean, simple, and coherent in their implementation and operation. \\ This does not mean, however, that AERA is easy to understand -- mainly because it uses concepts and implements mechanisms and relies on concepts that are //very different// from most traditional software systems commonly recognized in computer science. | | | \\ No "Modules" | Note that the diagram above may imply the false impression that AERA consists of these four software "modules", or "classes", or the like. Nothing could be further from the truth: All of AERA's mechanism above are a set of functions that are "welded in with" the operation of the whole system, distributed in a myriad of mechanisms and actions. \\ Does this mean that AERA is spaghetti code, or a mess of a design? On the contrary, the integration and overlap of various mechanisms to achieve the high-level functions depicted in the diagram are surprisingly clean, simple, and coherent in their implementation and operation. \\ This does not mean, however, that AERA is easy to understand -- mainly because it uses concepts and implements mechanisms and relies on concepts that are //very different// from most traditional software systems commonly recognized in computer science. \\ \\ Example Demonstration \\ [[https://www.youtube.com/watch?v=2NQtEJbQCdw|Human-human interaction]] (what S1 observes and learns from) \\ [[https://www.youtube.com/watch?v=SH6tQ4fgWA4|Human-S1 interaction]] (S1 interviewing a human) \\ [[https://www.youtube.com/watch?v=x96HXLPLORg|S1-Human Interaction]] (S1 being interviewed by a human) | |
| |
\\ | \\ |
====Features of AERA==== | ====Features of AERA==== |
| Large Architecture | Yes | Comparatively large. AERA is as "large" as they come (or was - equally large cognitive architectures are getting more common). | | |
| Predictable Robustness in Novel Circumstances | \\ Yes | \\ Since AERA's learning is goal driven, its target operational environment are (semi-)novel circumstances. | | | Predictable Robustness in Novel Circumstances | \\ Yes | \\ Since AERA's learning is goal driven, its target operational environment are (semi-)novel circumstances. | |
| Graceful Degradation | | | | | Graceful Degradation | Yes | Knowledge representation in AERA is based around causal relations, which are essential for mapping out "how the world works". Because AERA's knowledge processing is organized around goals, with increased knowledge AERA will get closer and closer to "perfect operation" (i.e. meeting its top-level drives/goals, for which each instance was created). Furthermore, AERA can do reflection, so it gets better at evaluating its own performance over time, meaning it makes (causal) models of its own failure modes, increasing its chances of graceful degradation. | |
| \\ Transversal Functions | \\ Yes | //Transversal Handling of Time.// Time is transversal. \\ //Transversal Learning.// Yes. Learning can happen at the smallest level as well as the largest, but generally learning proceeds in small increments. Model-based learning is built in; ampliative (mixed) reasoning is present. \\ //Transversal Analogies.// Yes, but remains to be developed further. \\ //Transversal Self-Inspection.// Yes. AERA can inspect a large part of its internal operations (but not everything). \\ //Transversal Skill Integration.// Yes. This follows naturally from the fact that all models are sharable between anything and everything that AERA learns and does. | | | \\ Transversal Functions | \\ Yes | //Transversal Handling of Time.// Time is transversal. \\ //Transversal Learning.// Yes. Learning can happen at the smallest level as well as the largest, but generally learning proceeds in small increments. Model-based learning is built in; ampliative (mixed) reasoning is present. \\ //Transversal Analogies.// Yes, but remains to be developed further. \\ //Transversal Self-Inspection.// Yes. AERA can inspect a large part of its internal operations (but not everything). \\ //Transversal Skill Integration.// Yes. This follows naturally from the fact that all models are sharable between anything and everything that AERA learns and does. | |
| \\ Symbolic? | \\ CHECK | One of the main features of AERA is that its knowledge is declarable by being symbol-oriented. AERA can learn language in the same way it learns anything else (i.e. goal-directed, pragmatic). AERA has been implemented to handle 20k models, but so far the most complex demonstration used only approx 1400 models. | | | \\ Symbolic? | \\ CHECK | One of the main features of AERA is that its knowledge is declarable by being symbol-oriented. AERA can learn language in the same way it learns anything else (i.e. goal-directed, pragmatic). AERA has been implemented to handle 20k models, but so far the most complex demonstration used only approx 1400 models. | |