======Causation, Methodology & Architecture======

public:t-713-mers:mers-24:causation-methodology-architecture \\ Revised 2024/10/29 11:52, last modified 2024/11/05 11:35 (current) by thorisson
  
  
=====Worlds & Regularity=====
  
|  Noise  | A world with no regularity is a completely unpredictable world. \\ In such worlds, learning is impossible.   |
|  Regularity Means Logic  | Regularity - even partial - implies rules. Learning in such worlds involves extracting the rules.   |
|  The Role of Reasoning  | In such worlds, the role of reasoning is to extract the rules in a **compact and convenient form**, so that they may be used for **simulating the world** in cognition.    |
|  **Causal Models**   | The direction of the "causal arrow" is critically necessary for guiding the action of an intelligent autonomous agent. \\ Which way the arrows point in any large set of correlated variables can be found out through empirical experimentation.   |
|  What Causal Models Enable  | Causal models enable the creation of a coherent set of rules, where each new rule supports or contradicts others. The more coherent the picture of the world painted this way, the more likely it is to be a **useful model of the world**.    |
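The rule-extraction view of learning above can be illustrated with a small Python sketch (a hypothetical toy world, not part of the course material): an agent observes noisy transitions, tallies them, and keeps only the sufficiently regular ones as compact rules, which it can then use to simulate the world in cognition.

```python
import random
from collections import Counter, defaultdict

random.seed(1)

# A toy world (hypothetical): 'a'->'b'->'c'->'a' is the real regularity,
# but 10% of the time noise intervenes.
def world_step(state):
    rules = {'a': 'b', 'b': 'c', 'c': 'a'}
    if random.random() < 0.1:
        return random.choice('abc')
    return rules[state]

# Learning = extracting the regularity: tally observed transitions...
counts = defaultdict(Counter)
state = 'a'
for _ in range(3000):
    nxt = world_step(state)
    counts[state][nxt] += 1
    state = nxt

# ...and keep only the sufficiently regular ones, in compact rule form.
model = {s: c.most_common(1)[0][0]
         for s, c in counts.items()
         if c.most_common(1)[0][1] / sum(c.values()) > 0.6}

# The extracted rules now let cognition simulate the world without observing it.
def simulate(state, steps):
    for _ in range(steps):
        state = model[state]
    return state

print(sorted(model.items()))   # [('a', 'b'), ('b', 'c'), ('c', 'a')]
print(simulate('a', 3))        # a
```

In a world of pure noise the frequency threshold would never be crossed, no rules would be extracted, and simulation would be impossible - matching the Noise row above.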
\\
\\
  
=====Causation=====
  
|  Deduction = Prediction  | Logical deduction over known regularities is sufficient to produce a prediction.    |
|  Correlation Supports Prediction  | Correlation is sufficient for simple prediction: if **A** and **B** correlate highly, then it does not matter whether we see an **A** //or// a **B**; we can predict that the other is likely on the scene.    |
|  Abduction = Planning  | Abduction cannot proceed without knowledge of causal relations.    |
|  Knowledge of Causation Supports Action  | We may know that **A** and **B** correlate, but if we don't know whether **B** is a result of **A** or vice versa, and we want **B** to disappear, we don't know whether it will suffice to modify **A**. \\ //Example: The position of the light switch and the state of the light bulb correlate. Only by knowing that the light switch controls the bulb can we go directly to the switch if we want the light to turn on.//  \\ Causation allows producing chains of events (if **A** causes **B**, **A** can be used to produce **B**): if we know how to affect **A**, we can achieve **B**. \\ Causal knowledge subsumes correlational knowledge: correlation can be derived from cause-effect information, but not vice versa.   |
|  Knowledge of Causation Supports Understanding  | If we know a collection of cause-effect relations related to a phenomenon, we can predict, explain, achieve goals with, and even re-create the phenomenon.   |
|  **Causal Models:** \\ Necessary To Guide Action  | While correlation gives us an indication of causation, the direction of the "causal arrow" is critically necessary for guiding action. \\ Luckily, which way the arrows point in any large set of correlated variables is usually not too hard to find out by empirical experimentation.   |
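The light-switch example can be sketched as a toy experiment in Python (hypothetical code, for illustration only): observation alone shows a symmetric correlation between switch and bulb, while intervening on each variable in turn - empirical experimentation - reveals which way the causal arrow points.

```python
import random

random.seed(0)

# Toy causal world (an assumption for illustration): the switch causes the bulb.
def sample(do_switch=None, do_bulb=None):
    switch = (random.random() < 0.5) if do_switch is None else do_switch
    bulb = switch if do_bulb is None else do_bulb    # bulb follows the switch...
    return switch, bulb                              # ...never the other way around

# Observation alone: switch and bulb correlate perfectly, symmetrically.
obs = [sample() for _ in range(1000)]
print(all(s == b for s, b in obs))       # True

# Empirical experimentation: intervene on each variable, watch the other.
bulbs_when_switch_on = {b for s, b in (sample(do_switch=True) for _ in range(100))}
switches_when_bulb_on = {s for s, b in (sample(do_bulb=True) for _ in range(100))}

print(bulbs_when_switch_on)     # {True}: forcing the switch forces the bulb
print(switches_when_bulb_on)    # both values: forcing the bulb leaves the switch alone
```

The correlation is useless for deciding which variable to manipulate; only the asymmetry under intervention tells the agent to go to the switch if it wants the light on.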
  
\\
  
  
===== Self-Programming =====
|  \\ What it is  | //Self-programming// here means, with respect to some virtual machine **M**, the production of one or more programs created by **M** itself, whose //principles// for creation were provided to **M** at design time, but whose details were //decided by// **M** at runtime, based on its //experience//.  |
|  Self-Generated Program  | \\ Determined by some factors in the interaction between the system and its environment.   |
|  Historical note  | The concept of self-programming is old (J. von Neumann was one of the first to talk about self-replication in machines). However, few if any proposals for how to achieve this have been fielded.  [[https://en.wikipedia.org/wiki/Von_Neumann_universal_constructor|Von Neumann's universal constructor on Wikipedia]]   |
|  No guarantee  | The fact that a system has the ability to program itself is not a guarantee that it is in a better position than a traditional system. In fact, it is in a worse position, because there are more ways in which its performance can go wrong.    |
|  Why needed  | The inherent limitations of hand-coding methods make traditional manual programming approaches unlikely to reach the level of a human-grade generally intelligent system, simply because to be able to adapt to a wide range of tasks, situations, and domains, a system must be able to modify itself in more fundamental ways than a traditional software system is capable of.   |
|  Remedy  | Sufficiently powerful principles are needed to ensure against the system going rogue.    |
|  \\ The //Self// of a machine  | **C1:** The processes that act on the world and the self (via senctors), evaluate the structure and execution of code in the system and, respectively, synthesize new code. \\  **C2:** The models that describe the processes in C1, entities and phenomena in the world -- including the self in the world -- and processes in the self. Goals contextualize models and they also belong to C2. \\ **C3:** The states of the self and of the world -- past, present and anticipated -- including the inputs/outputs of the machine.  |
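The definition of self-programming above can be made concrete with a minimal Python sketch (hypothetical primitives and names, not any actual system): the //principles// for program creation - enumerate compositions of primitive actions and test them against experience - are fixed at design time, while the //details// of the resulting program are decided at runtime from observed input/output pairs.

```python
from itertools import product

# Design time: primitive actions and the principle for program creation are fixed.
PRIMITIVES = {                      # hypothetical primitives of machine M
    'inc':    lambda x: x + 1,
    'double': lambda x: x * 2,
}

def synthesize(examples, max_len=4):
    """The design-time principle: enumerate compositions of primitives and
    keep the first one that reproduces M's experience (input/output pairs)."""
    for length in range(1, max_len + 1):
        for names in product(PRIMITIVES, repeat=length):
            def run(x, names=names):
                for n in names:
                    x = PRIMITIVES[n](x)
                return x
            if all(run(i) == o for i, o in examples):
                return names, run
    return None, None

# Runtime: the details of the program are decided by M from its experience.
experience = [(1, 4), (3, 8), (0, 2)]        # consistent with (x + 1) * 2
program, run = synthesize(experience)
print(program)        # ('inc', 'double')
print(run(5))         # 12
```

Note how the "no guarantee" row applies even here: with richer primitives or sparser experience, the search can return a program that fits the examples but behaves wrongly elsewhere.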
\\
  
===== Programming for Self-Programming =====
  
|  \\ Why Self-Programming?  | Building a machine that can write (sensible, meaningful!) programs means that the machine is smart enough to **understand** (to a pragmatically meaningful level) the code it produces. If the purpose of its programming is to //become// smart, and the programming language we give to it //assumes it's smart already//, we have defeated the purpose of creating a self-programming machine that gets smarter over time, because its operation requires that it's already smart.    |
|  How Can We Program \\ for Self-Programming?   | \\ Self-programming involves automatic code writing. Code that is automatically written must be verifiable (non-axiomatically, i.e. no mathematical proofs!); therefore, only programming languages that allow reflection will work.     |
|  Can we use Python?  | Any fully reflective language (like LISP, Haskell, Prolog, Python, etc.) - i.e. one with the ability to inspect itself and turn data into code and code into data - should //in theory// be capable of sustaining a self-programming machine. (That is because no theory of intelligence exists that takes **time pressure** (limited time and energy - LTE) properly into its account of intelligence.)  |
|  \\ Theory vs. practice  | In computer science, seeing a potential solution to something "in theory" is most of the time //not good enough// if we want to see something tangible in the next decade or two, and this is the case here too. \\ In the case of self-programming, what may be great for a human programmer is not good enough for a system supposed to synthesize its own code in real-time - in a way that makes its behavior **temporally predictable**. \\ Why is that important? Because the world presents deadlines, and if the controller is not capable of temporally predictable behavior, deadlines cannot be dealt with properly by that controller.  |
|  What can we do?  | We must create a programming language with //simple enough// semantics so that a simple machine (perhaps with some clever emergent properties) can use it to bootstrap itself in learning to write programs.  |
|  Does such a language exist?  | Yes. It's called [[http://alumni.media.mit.edu/~kris/ftp/nivel_thorisson_replicode_AGI13.pdf|Replicode]].   |
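Python is one of the reflective languages named above; the code-as-data, data-as-code property the table refers to can be sketched in a few lines (the "verification" check below is a hypothetical stand-in for whatever non-axiomatic checks a real system would run, not a real safety mechanism):

```python
import ast

# A program held as data: the system can store and inspect its own code.
source = "def policy(x):\n    return x * 2 + 1"
tree = ast.parse(source)                      # data -> syntax tree

# Non-axiomatic "verification" by inspection (a stand-in for real checks):
# here we merely reject generated code that contains any function calls.
assert not any(isinstance(node, ast.Call) for node in ast.walk(tree))

# Data -> code: compile and run the inspected program.
namespace = {}
exec(compile(tree, "<generated>", "exec"), namespace)
print(namespace["policy"](10))                # 21

# Code -> data again: the running system can re-inspect what it built.
print(tree.body[0].name)                      # policy
```

This is exactly the "in theory" capability the next row cautions about: nothing here bounds how long inspection, compilation, or execution takes, so temporal predictability is not addressed.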
\\
  
=====Levels of Self-Programming=====
|  Level 1  | Level-one self-programming capability is the ability of a system to make programs that exclusively make use of the primitive actions in its action set.  |
|  Level 2  | Subsumes Level 1; additionally generates new primitives.   |
|  Infinite regress?  | Though the process of self-programming can be carried out on more than one level, eventually the regress will stop at a certain level. The more levels are involved, the more flexible the system will be, though at the same time it will be less stable and more complicated to analyze.   |
|  Likely to be many ways?  | For AGI, the set of relevant self-programming approaches is likely to be a much smaller set than that typically discussed in computer science, and in all likelihood much smaller than often implied in AGI.    |
|  \\ Architecture  | The possible solutions for effective and efficient self-programming are likely to be strongly linked to what we generally think of as the //architectural structure// of AI systems, since self-programming for AGI may fundamentally have to change, modify, or partly duplicate some aspect of the architecture of the system itself, for the purpose of being better equipped to perform some task or set of tasks.   |
|  What is Needed  | What is NOT needed is a system that spews out rule after rule after rule, filling up a giant database of rules. That misses the point, because what is needed for any //particular// situation is a //particular// reasoning chain -- in other words, we need **customized reasoning**.    |
|  Achieving **Customized Reasoning**   | What is called for is the equivalent of a just-in-time compiler, but for //reasoning//: a reasoner that produces exactly the kind of reasoning needed for the particular situation. This would be the most compact way of creating logically consistent results where trustworthiness is //part of the reasoning//.    |
|  Trustworthiness Requires Meta-Reasoning  | Assessing the trustworthiness of reasoning output can only be done with knowledge of the reliability of the rules used -- in other words, //rules about the rules//. This means that meta-reasoning is an inseparable part of the reasoning.      |
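The idea of customized, just-in-time reasoning with meta-knowledge about the rules can be sketched as a tiny backward chainer (hypothetical rules and reliability values, for illustration only): instead of filling a database with derived rules, only the chain this particular query needs is built, and each step carries the reliability of the rules it used.

```python
# Hypothetical rule base: (conclusion, premises, reliability of the rule).
# The reliabilities are the "rules about the rules" - meta-knowledge.
RULES = [
    ('wet_ground', ['rain'],         0.95),
    ('rain',       ['dark_clouds'],  0.70),
    ('wet_ground', ['sprinkler_on'], 0.90),
]
FACTS = {'dark_clouds': 1.0}

def prove(goal):
    """Build only the reasoning chain this query needs (just-in-time),
    propagating rule reliability so trustworthiness is part of the result."""
    if goal in FACTS:
        return FACTS[goal], [goal]
    best = (0.0, [])
    for head, body, reliability in RULES:
        if head != goal:
            continue
        conf, chain = reliability, []
        for premise in body:
            c, sub = prove(premise)
            conf *= c
            chain += sub
        if conf > best[0]:
            best = (conf, chain + [goal])
    return best

confidence, chain = prove('wet_ground')
print(chain)                   # ['dark_clouds', 'rain', 'wet_ground']
print(round(confidence, 3))    # 0.665
```

The sprinkler rule is never expanded into a stored conclusion; it is simply out-competed for this query - the reasoning produced is exactly, and only, what the situation called for.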
  
\\
\\
  
=====Existing Systems Which Target Self-Programming=====
^  Label  ^  What  ^  Example  ^  Description  ^
|  \\ [S]  |  \\ State-space search  |  \\ GPS (Newell et al. 1963)  | The atomic actions are state-changing operators, and a program is represented as a path from the initial state to a final state. Variants of this approach include program search (example: the Gödel Machine (Schmidhuber 2006)): given the action set A, in principle all programs formed by it can be exhaustively listed and evaluated to find an optimal one according to certain criteria.   |
\\
  
=====Design Assumptions in The Above Approaches=====
|  \\ How does the system represent a basic action?  | a) As an operator that transforms one state into another, either deterministically or probabilistically, with a goal as a state to be reached [R, S] \\ b) As a function that maps some input arguments to some output arguments [G] \\ c) As a realizable statement with preconditions and consequences [A, E, I, P] \\ Relevant assumptions: \\ Is the knowledge about an action complete and certain? \\ Is the action set discrete and finite?   |
|  \\ Can a program be used as an "action" in other programs?  | a) Yes, programs can be built recursively [A, E, G, I] \\ b) No, a program can only contain basic actions [R, S, P] \\ Relevant assumptions: \\  Do the programs and actions form a hierarchy? \\ Can these recursions have closed loops?  |
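Representation (a) - operators as state-changers, a program as a path between states - can be sketched with a classic toy problem (a two-jug water world, assumed here purely to illustrate the [S] representation, not taken from the systems above):

```python
from collections import deque

# Atomic actions as state-changing operators on a (4-litre, 3-litre) jug pair.
def operators(state):
    a, b = state
    yield 'fill A',    (4, b)
    yield 'fill B',    (a, 3)
    yield 'empty A',   (0, b)
    yield 'empty B',   (a, 0)
    yield 'pour A->B', (max(0, a - (3 - b)), min(3, a + b))
    yield 'pour B->A', (min(4, a + b), max(0, b - (4 - a)))

def search(start, goal):
    """A 'program' is a path of operators from the initial state to a final state."""
    frontier, seen = deque([(start, [])]), {start}
    while frontier:
        state, path = frontier.popleft()
        if state == goal:
            return path
        for name, nxt in operators(state):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, path + [name]))
    return None

# The synthesized program: a shortest operator sequence leaving 2 litres in jug A.
print(search((0, 0), (2, 0)))
```

Note the assumptions this representation bakes in, matching the row above: knowledge of each action's effect is complete and certain, and the action set is discrete and finite.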
  
  
===== Predictability =====
  
|  What It Is  | The ability of an outsider to predict the behavior of a controller based on some information.   |
\\
  
=====Reliability=====
  
|  What It Is  | The ability of a machine to always return the same - or similar - answer to the same input.   |
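This definition suggests a simple empirical check (a hypothetical test harness, for illustration only): call the machine repeatedly on the same inputs and measure how often the answer stays the same.

```python
import random

def reliability(controller, inputs, trials=50):
    """Fraction of inputs for which repeated calls always give the same answer."""
    stable = 0
    for x in inputs:
        answers = {controller(x) for _ in range(trials)}
        stable += (len(answers) == 1)
    return stable / len(inputs)

def deterministic(x):
    return x * x

def flaky(x):
    return x * x + (random.random() < 0.2)   # occasionally off by one

print(reliability(deterministic, range(10)))   # 1.0
print(reliability(flaky, range(10)))           # below 1.0 (almost surely)
```

An exact-match check like this is the strictest reading of the definition; the "or similar" clause would replace set equality with a tolerance on the spread of answers.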
\\
  
=====Trustworthiness=====
  
|  What It Is  | The ability of a machine's owner to trust that the machine will do what it is supposed to do.   |
