====== REQUIREMENTS FOR NEXT-GEN AI ======
//Autonomy, Cause-Effect Knowledge, Cumulative Learning, Empirical Reasoning, Trustworthiness//
\\
| Explanation Depends on Causation | No explanation is without reference to causes; discernible causal structure is a prerequisite for explainability. |
| \\ Bottom Line for \\ Human-Level AI | To grow, learn and self-inspect, an AI must be able to sort out causal chains. If it cannot, it will not only be incapable of explaining to others why it is the way it is; it will also be incapable of explaining to itself why things are the way they are, and thus incapable of determining whether one of its actions serves its own growth better than another. Explanation is the big black hole of ANNs: in principle ANNs are black boxes, and thus they are in principle unexplainable - whether to themselves or to others. \\ One way to address this is to encapsulate knowledge as hierarchical models that are built up over time and can be de-constructed at any time (as AERA does); a minimal sketch of such de-constructible models follows below. |
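
To make the idea of hierarchical, de-constructible models concrete, here is a minimal Python sketch. It is only an illustration: the class and example names are hypothetical and do not reflect AERA's actual data structures. Each cause-effect link is a small model that keeps references to the models it was built from, so an outcome can be unfolded back into the causal chain that produced it.

<code python>
# Hypothetical sketch: each cause-effect link is a small model that keeps
# references to the models it was built from, so an outcome can be
# de-constructed into the causal chain that produced it.
from dataclasses import dataclass, field

@dataclass
class CausalModel:
    """One hypothesized cause-effect link, acquired at some point in time."""
    cause: str
    effect: str
    parents: list = field(default_factory=list)  # models this one builds on

    def explain(self, depth: int = 0) -> str:
        """Unfold the hierarchy into a human-readable causal chain."""
        line = "  " * depth + f"{self.cause} -> {self.effect}"
        return "\n".join([line] + [p.explain(depth + 1) for p in self.parents])

# Models accumulate over time: low-level links first, composites later.
wet_runway   = CausalModel("rain", "runway friction drops")
long_braking = CausalModel("runway friction drops", "braking distance grows",
                           parents=[wet_runway])
overrun      = CausalModel("braking distance grows", "aircraft overruns runway",
                           parents=[long_braking])

# De-constructing the top-level model yields an explanation of the outcome.
print(overrun.explain())
</code>

Because each model retains its provenance, the same structure used for prediction can be walked backwards to answer "why?", which a monolithic ANN weight matrix does not offer.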
\\
\\
==== Self-Explaining Systems ====
| What It Is | The ability of a controller to explain, after the fact or before, why it did something or intends to do it. |
| 'Explainability' \\ ≠ \\ 'self-explanation' | If an intelligence X can explain a phenomenon Y, then Y is 'explainable' by X, through some process chosen by X. \\ \\ In contrast, if an intelligence X can explain itself (its own actions, knowledge, understanding, beliefs, and reasoning), it is capable of self-explanation. The latter is the stronger capability and subsumes the former. |
| Why It Matters | If a controller does something we don't want it to repeat - e.g. crashing an airplane full of people (in simulation mode, hopefully!) - it needs to be able to explain why it did what it did. If it can't, then neither it nor //we// can ever be sure why it did it, whether it had any other choice, whether it is an evil machine that actually meant to do it, or how likely it is to do it again. |
| Why It Matters \\ More Than You Think | The 'Explanation Hypothesis' (ExH) states that explanation is in fact a fundamental element of all advanced learning, because explanation is a way to weed out alternative (and incorrect) hypotheses about how the world works. For instance, if the knowledge to do the right thing -- for the right //reason// -- in an emergency already exists in a controller, then the //explanation// of why it does what it does //already exists embedded in its knowledge//. \\ See [[https://proceedings.mlr.press/v159/thorisson22b/thorisson22b.pdf|Thórisson 2022]] |
| \\ Human-Level AI | Even more importantly, to grow, learn and self-inspect, an AI system must be able to sort out causal chains. If it cannot, it will not only be incapable of explaining to others why it is the way it is; it will also be incapable of explaining to itself why things are the way they are, and thus incapable of determining whether one of its actions serves its own growth better than another. Explanation is the big black hole of ANNs: in principle ANNs are black boxes, and thus they are in principle unexplainable - whether to themselves or to others. \\ One way to address this is to encapsulate knowledge as hierarchical models that are built up over time and can be de-constructed at any time (as AERA does); a minimal sketch of a self-explaining controller follows after this table. |
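
As a rough illustration of what self-explanation could look like in practice, the following Python sketch (hypothetical names, not any real architecture) shows a controller that records, for every decision, the rule that justified it, the conditions that held, and the alternatives it rejected, so that "why did you do that?" can be answered from its own records.

<code python>
# Hypothetical sketch of self-explanation: the controller keeps, for every
# action, the rule and observed conditions that justified it, plus the
# alternatives it rejected, so "why?" can be answered from its own records.
from dataclasses import dataclass

@dataclass
class Rule:
    name: str
    condition: str      # what must hold for the rule to fire
    action: str         # what the controller then does
    rationale: str      # the cause-effect knowledge behind the rule

class SelfExplainingController:
    def __init__(self, rules):
        self.rules = rules
        self.history = []   # (observations, rule chosen, rules rejected)

    def decide(self, observations: set) -> str:
        applicable = [r for r in self.rules if r.condition in observations]
        rejected = [r for r in self.rules if r not in applicable]
        chosen = applicable[0] if applicable else None
        self.history.append((observations, chosen, rejected))
        return chosen.action if chosen else "no-op"

    def explain_last(self) -> str:
        observations, chosen, rejected = self.history[-1]
        if chosen is None:
            return "No rule applied to " + str(observations)
        why = (f"I did '{chosen.action}' because '{chosen.condition}' held "
               f"and {chosen.rationale}.")
        alternatives = ", ".join(r.name for r in rejected) or "none"
        return why + f" Alternatives ruled out: {alternatives}."

rules = [
    Rule("go-around", "runway occupied", "abort landing",
         "landing on an occupied runway causes a collision"),
    Rule("land", "runway clear", "continue approach",
         "a clear runway permits a safe landing"),
]
ctrl = SelfExplainingController(rules)
ctrl.decide({"runway occupied"})
print(ctrl.explain_last())
</code>

Here the record of rejected alternatives is what lets the controller say not only why it acted, but why it did not act otherwise - the hypothesis-weeding role that the ExH attributes to explanation.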
\\
\\

==== Trustworthiness ====
| What It Is | The ability of a machine's owner to trust that the machine will do what it is supposed to do. |
| \\ Why It Matters | Any machine created by humans is created for a **purpose**. The more reliably it does its job (and nothing else) and does it well, the more trustworthy it is. Trusting simple machines like thermostats mostly comes down to durability, since they have very few open variables (variables left unbound at manufacturing time), their task is well defined and well known, and their reasonably precise operation can be ensured with simple engineering. |
| AI | In contrast to simple machines, AI is supposed to handle diversity in one or more tasks. A learning AI system goes one step further by leaving the machine's **tasks** undefined at manufacturing time. The smarter an AI system is, the more diversity it can handle. A requirement should be that "trustworthiness grows with the mindpower of the machine". |
| \\ Human-Level AI | Making human-level AI trustworthy is very different from creating simple machines, because so many variables are unbound at manufacturing time. What does trustworthiness mean in this context? We can look at human trustworthiness: numerous methods exist for ensuring it (driver's licenses, air traffic controller training, certification programs, etc.). We can apply the same certification programs to all humans because their principles of operation are shared at multiple levels of detail (biology, sociology, psychology). For an AI this is different, because the variability in the makeup of the machines is enormous. This makes the trustworthiness of AI robots a complex issue. |