public:t-720-atai:atai-20:methodologies [2020/10/14 11:25] – [ConstructiVist AI Methodology (CAIM)] thorisson | public:t-720-atai:atai-20:methodologies [2024/04/29 13:33] (current) – external edit 127.0.0.1 |
==== What Is a Methodology? ====
| What it is | The methods - tools and techniques - we use to study a phenomenon. |
| \\ Examples | - Comparative experiments (for the answers we want Nature to ultimately give). \\ - Telescopes (for things far away). \\ - Microscopes (for all things smaller than the human eye can see unaided). \\ - Simulations (for complex interconnected systems that are hard to untangle). |
| \\ Why it Matters | Methodology directly determines our progress when studying a phenomenon -- what we do with respect to that phenomenon to figure it out. \\ Methodology affects how we think about a phenomenon, including our solutions, expectations, and imagination. \\ Methodology determines the possible scope of outcomes. \\ Methodology directly influences the shape of our solution - our answers to scientific questions. \\ Methodology directly determines the speed with which we can make progress when studying a phenomenon. \\ //Methodology is therefore a **primary determinant of scientific progress.** // |
| \\ The main AI methodology | AI never really had a proper methodology discussion as part of its mainstream scientific discourse. Only 2 or 3 approaches to AI can be properly called 'methodologies': //BDI// (belief, desire, intention), //subsumption//, //decision theory//. As a result AI inherited the run-of-the-mill CS methodology/ies by default. |
| Constructi//on//ist AI | Methods used to build AI systems by hand. |
| Constructi//v//ist AI | Methods aimed at creating AI systems that autonomously generate, manage, and use their knowledge. |
| Applying a Methodology | results in a family of architectures: The methodology "allows" ("sets the stage") for what //should// and //may// be included when we design our architecture. The methodology is the "tool for thinking" about a design space. (Contrast with requirements, which describe the goals and constraints (negative goals)). |
| Following a Methodology | results in a particular //architecture//. |
| \\ CAIM Relies on Models | CAIM takes Conant & Ashby's proof (that every good controller of a system is a model of that system - the Good Regulator Theorem) seriously, putting //models// at its center. \\ This stance was prevalent in the early days of AI (first two decades) but fell into disfavor due to behaviorism (in psychology and AI). |
| \\ Example | The Auto-Catalytic Endogenous Reflective Architecture - AERA - is the only architecture to result directly from the application of CAIM. It is //model-based// and //model-driven// (in an event-driven way: The models' left-hand terms are matched to situations to determine their relevance at any point in time; when they match, their right-hand term is injected into memory - more on this below). |
| In Other Words | AERA Models are a way to represent knowledge. \\ But what are models, really, and what might they look like in this context? |
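As a rough illustration of the event-driven matching described above, the following minimal Python sketch matches each model's left-hand term against facts in memory and injects the right-hand term when it matches. It is not AERA's actual representation or API: the names ''Fact'', ''Model'', ''step'' and the ''delay'' field are assumptions made for this sketch only.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Fact:
    """A timestamped statement held in memory (hypothetical structure)."""
    name: str
    value: str
    time: int

@dataclass(frozen=True)
class Model:
    """Hypothetical model: if the left-hand term holds, predict the right-hand term."""
    lhs: tuple   # (name, value) pattern matched against facts in memory
    rhs: tuple   # (name, value) injected into memory when lhs matches
    delay: int   # assumed time offset between lhs and rhs

def step(memory, models):
    """Match every model's left-hand term against memory; inject right-hand terms."""
    injected = []
    for fact in memory:
        for model in models:
            if (fact.name, fact.value) == model.lhs:
                prediction = Fact(model.rhs[0], model.rhs[1], fact.time + model.delay)
                if prediction not in memory and prediction not in injected:
                    injected.append(prediction)
    return memory + injected

memory = [Fact("grip", "closed", 0)]
models = [
    Model(("grip", "closed"), ("holding", "cup"), 1),
    Model(("holding", "cup"), ("can_lift", "cup"), 1),
]
memory = step(memory, models)  # first model matches, "holding" is injected
memory = step(memory, models)  # the injected fact lets the second model match
```

Calling ''step'' repeatedly chains models: a fact injected by one model can make another model relevant on the next pass.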
| \\ Reductionism | The method of isolating parts of a complex phenomenon or system in order to simplify and speed up our understanding of it. \\ In science, when you want to make a new problem tractable, you //reduce// it until it's small enough to be addressable by available methods. \\ Most of the time this reduction proceeds in light of //currently available methods and tools//. \\ Here we can call this "good old-fashioned run-of-the-mill reductionism". \\ An enormous number of problems in science have been successfully addressed through this approach. \\ See [[https://en.wikipedia.org/wiki/Reductionism|Reductionism]] on Wikipedia. |
| Does Reductionism \\ Always Work \\ The Same Way? | In short, **no**. \\ For phenomena where //current tools and methods// do not suffice, //new tools and methods// must be developed. \\ (GMI may very well be just such a phenomenon.) |
| \\ Occam's Razor | A key principle of reductionism. \\ When faced with two alternative explanations that both explain a phenomenon equally well, choose the //simpler// explanation. \\ See also [[https://en.wikipedia.org/wiki/Occam%27s_razor|Occam's Razor]]. |
| How Should Occam's Razor Cut? | How the principles of Occam's Razor and reductionism are used must be based on the phenomenon under study. \\ Using a telescope to study electricity or a voltmeter to study faraway stars may not lead to quick progress. \\ Going wild with Occam's Razor on your subject matter will lead to a bloody mess: Like any powerful tool, its careful application is key to producing beneficial results. |
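The selection rule stated for Occam's Razor above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: each explanation is reduced to a hypothetical ''(name, error, complexity)'' triple, and "equally well" is taken to mean "within a small tolerance of the best error".

```python
def occam_choice(candidates, tolerance=1e-9):
    """Among explanations that fit equally well, return the name of the simplest.

    candidates: list of (name, error, complexity) triples (illustrative format).
    """
    best_error = min(error for _, error, _ in candidates)
    # Keep only the explanations that explain the data as well as the best one.
    equally_good = [c for c in candidates if c[1] - best_error <= tolerance]
    # Of those, prefer the one with the lowest complexity.
    return min(equally_good, key=lambda c: c[2])[0]

candidates = [
    ("linear", 0.10, 2),    # fits well, two parameters
    ("cubic", 0.10, 4),     # fits equally well, four parameters
    ("constant", 0.50, 1),  # simplest, but explains the data worse
]
choice = occam_choice(candidates)
```

Note that the simplest candidate overall is not chosen here, because it does not explain the phenomenon equally well; the razor only cuts among explanations of comparable quality.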
====How to Study HeLDs Scientifically====
| \\ HeLD | Cannot be studied by the standard application of reductionism/Occam's Razor, because some emergent properties are likely to get lost. Instead, corollaries of the system -- while ensuring some commonality to the original system //in toto// -- must be studied to gain insights into the target system. For this we use models and simulations. |
| {{public:t-720-atai:simple-system1.png}} ||
| How to tease apart HeLDs: \\ //Finding the boundary between a novel //system// and its //environment// may be done by isolating the smallest number of interaction edges between the sub-systems of the two.// ||
| {{public:t-720-atai:system-env-world-1.png}} ||
| //Illustration of the relationship between a system, its task-environment, and its world. \\ Task-environments will always inherit the "laws" of the world; the world puts constraints on the state-space of the task-environment.// ||
| Agent & Environment | We try to characterize the agent and its task-environment as two interacting complex systems. If we keep the task-environment constant, the remaining system to study is the agent and its controller. Together they form a sort of "super-HeLD" because for any learning system the environment is tightly coupled with the agent's seed and learning mechanisms. |
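The idea of locating the system/environment boundary at the smallest number of interaction edges resembles a minimum cut in a graph. The sketch below is a brute-force illustration of that reading, not a method prescribed by CAIM: the node names, the edge list, and the function ''smallest_boundary'' are all hypothetical, and the exhaustive search is only practical for very small graphs.

```python
from itertools import combinations

def smallest_boundary(nodes, edges, inside, outside):
    """Find the partition with the fewest crossing edges.

    `inside` is forced onto the system side, `outside` onto the environment side.
    Returns (system_nodes, crossing_edges).
    """
    free = [n for n in nodes if n not in (inside, outside)]
    best = None
    for k in range(len(free) + 1):
        for extra in combinations(free, k):
            system = {inside, *extra}
            # An edge crosses the boundary if exactly one endpoint is in the system.
            crossing = [(a, b) for a, b in edges if (a in system) != (b in system)]
            if best is None or len(crossing) < len(best[1]):
                best = (system, crossing)
    return best

nodes = ["controller", "memory", "sensor", "actuator",
         "object", "table", "light", "floor"]
edges = [
    ("controller", "memory"), ("controller", "sensor"), ("controller", "actuator"),
    ("memory", "sensor"), ("memory", "actuator"),
    ("sensor", "object"), ("actuator", "object"),
    ("object", "table"), ("object", "light"), ("object", "floor"),
    ("table", "floor"), ("floor", "light"),
]
system, crossing = smallest_boundary(nodes, edges, "controller", "floor")
```

In this toy graph the agent's sub-systems are densely connected to each other, as are the environment's, so the sparsest cut falls on the two sensor and actuator links to the object.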
\\
==== Current Methodologies: ConstructiONist ====
| Constructionist Methods | A constructionist methodology requires an //intelligent designer// that manually (or via scripts) arranges selected //components// that together make up a //system of parts// (i.e. architecture) that can act in particular ways. \\ //Examples: automobiles, telephone networks, computers, operating systems, the Internet, mobile phones, apps, etc.// ||
| | \\ Traditional CS Software Development Methods | On the theoretical side, the majority of mathematical methodologies are of the constructionist kind (with some applied math for natural sciences counting as exceptions). On the practical side, programs are largely hand-coded, and algorithms are invented and implemented manually. \\ Systems creation in CS is "co-owned" by the field of engineering. \\ All programming languages are designed under the assumption that they will be used by a human-level programmer. |
| | \\ BDI: Belief, Desire, Intention | BDI can hardly be called a "methodology" and is more of a framework for inspiration. Picking three terms out of psychology, BDI methods emphasize goals (desire), plans (intention) and revisable knowledge (beliefs), all of which are good and fine. Methodologically speaking, however, none of the basic features of a true scientific methodology (algorithms, systems engineering principles, or strategies) are to be found in papers on this approach. |
| | \\ Subsumption Architecture | This is perhaps the best known AI-specific methodology worthy of being categorized as a 'methodology'. Presented as an "architecture" originally, it is more of an approach that results in architectures where subsumption, operating under particular principles, is a major organizational feature. |
| Why it's important | Virtually all methodologies we have for creating software are methodologies of the constructionist kind. \\ Unfortunately, few methodologies step outside of that frame. |
====Architectural Principles of a CAIM-Developed System (What CAIM Targets) ====
| \\ Self-Construction | It is assumed that a system must amass the vast majority of its knowledge autonomously. This is partly due to the fact that it is (practically) impossible for any human or team(s) of humans to construct by hand the knowledge needed for an AGI system, and even if this were possible it would still leave unanswered the question of how the system will acquire knowledge of truly novel things, which we consider a fundamental requirement for a system to be called an AGI system. |
| \\ Baby Machines | To some extent an AGI capable of growing throughout its lifetime will be what may be called a "baby machine", because relative to later stages in life, such a machine will initially seem "baby like". \\ While the mechanisms constituting an autonomous learning baby machine may not be complex compared to a "fully grown" cognitive system, they are nevertheless likely to result in what will seem large in comparison to the AI systems built today, though this perceived size may stem from the complexity of the mechanisms and their interactions, rather than the sheer number of lines of code. |
| Semiotic Opaqueness | No communication between two agents / components in a system can take place unless they share a common language, or encoding-decoding principles. Without this they are semantically opaque to each other. Without communication, no coordination can take place. |
| \\ Systems Engineering | Due to the complexity of building a large system (picture, e.g., an airplane), a clear and concise bookkeeping of each part, and which parts it interacts with, must be kept so as to ensure the holistic operation of the resulting system. In a (cognitively) growing system in a dynamic world, where the system is auto-generating models of the phenomena that it sees, each of which must be tightly integrated yet easily manipulatable and clearly separable, the system must itself ensure the semiotic transparency of its constituent parts. This can only be achieved by automatic mechanisms residing in the system itself; it cannot be ensured manually by a human engineer, or even a large team of them. |
| \\ Self-Modeling | Cognitive growth, in which the cognitive functions themselves improve with training, can only be supported by a self-modifying mechanism based on self-modeling. If there is no model of self there can be no targeted improvement of existing mechanisms. |
| Self-Programming | The system must be able to invent, inspect, compare, integrate, and evaluate architectural structures, in part or in whole. |
| Pan-Architectural Pattern Matching | To enable autonomous //holistic integration// the architecture must be capable of comparing (copies of) itself to parts of itself, in part or in whole, whether the comparison is contrasting structure, the effects of time, or some other aspect or characteristics of the architecture. To decide, for instance, if a new attention mechanism is better than the old one, various forms of comparison must be possible. |
==== How AERA Learns ====
| \\ In a Nutshell | AERA does not only perform things it knows; it can also learn //new// things. \\ And when it has learned new things it can yet again learn //more// new things. \\ And any of those new things can be //novel// things. \\ And those novel things can be //fairly different// as well as //highly similar// to what it already knows; an AERA agent can leverage this to implement what we have called //cumulative learning//. \\ **Learning //a number of diverse novel things// requires something over and beyond what is available through the traditional learning methods: //Hypothesis generation// through //analogy//.** |
| Hypothesis Generation | To deal with a new phenomenon, AERA creates //hypotheses// about it - which variables matter, how these are related, how they respond to actions, etc. \\ These hypotheses are created based on: \\ 1. Correlations between measurements taken in the context of the phenomenon. \\ 2. How similar parts of the phenomenon are to other known phenomena. 'Similarity' is another word for "analogy". |
| Analogy | Analogy is the systematic comparison of two things, where some parts of those things are ignored while others are rated on a scale as to how similar they are. |
| Model Creation | AERA creates models that capture the relations between variables and other models, esp. causal relations. \\ This makes AERA models very effective for \\ 1. Generating predictions. \\ 2. Creating plans. \\ 3. Explaining how 'things hang together', and \\ 4. Re-creating systems. |
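The description of analogy above (ignore some parts, rate the others on a similarity scale) can be illustrated with a small Python sketch. This is not how AERA computes analogies; it assumes, purely for illustration, that each phenomenon is a dictionary of features scaled to the range 0..1, and the names ''analogy_score'', ''closest_known'' and the example features are hypothetical.

```python
def analogy_score(known, novel, ignore=()):
    """Rate the similarity of two phenomena on a 0..1 scale.

    Features listed in `ignore` are skipped; the remaining shared features are
    each rated as 1 - |difference| and then averaged.
    """
    shared = [k for k in known if k in novel and k not in ignore]
    if not shared:
        return 0.0
    return sum(1.0 - abs(known[k] - novel[k]) for k in shared) / len(shared)

def closest_known(known_phenomena, novel, ignore=()):
    """Return (name, score) of the known phenomenon most analogous to the novel one."""
    scores = {name: analogy_score(features, novel, ignore)
              for name, features in known_phenomena.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]

known_phenomena = {
    "door": {"hinged": 1.0, "graspable": 1.0, "weight": 0.8, "color": 0.2},
    "ball": {"hinged": 0.0, "graspable": 1.0, "weight": 0.1, "color": 0.9},
}
novel_lid = {"hinged": 1.0, "graspable": 1.0, "weight": 0.3, "color": 0.9}

# Color is treated as irrelevant to how the thing behaves, so it is ignored.
best_name, best_score = closest_known(known_phenomena, novel_lid, ignore=("color",))
```

A hypothesis about the novel phenomenon could then start from what is already known about the closest match, to be confirmed or rejected by further observation.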
\\
====AERA Demo====
| TV Interview | The agent S1 watched two humans engaged in a "TV-style" interview about the recycling of six everyday objects made out of various materials. The results are recorded in a set of three videos: \\ [[https://www.youtube.com/watch?v=2NQtEJbQCdw|Human-human interaction]] (what S1 observes and learns from) \\ [[https://www.youtube.com/watch?v=SH6tQ4fgWA4|Human-S1 interaction]] (S1 interviewing a human) \\ [[https://www.youtube.com/watch?v=x96HXLPLORg|S1-Human Interaction]] (S1 being interviewed by a human) |
| Data | S1 received realtime timestamped data from the 3D movement of the humans (digitized via appropriate tracking methods at 20 Hz), words generated by a speech recognizer, and prosody (fundamental pitch of voice at 60 Hz, along with timestamped starts and stops). |
| Seed | The seed consisted of a handful of top-level goals for each agent in the interview (interviewer and interviewee), and a small knowledge base about entities in the scene. |