[[/public:t-709-aies:aies-25:main|T-709-AIES-2025 Main]] \\
[[/public:t-709-aies:aies-25:lecture_notes|Link to Lecture Notes]]

\\
==== Engineered Predictability ====

| Why It Is Important | Predicting the behavior of (semi-)autonomous machines is important if we want to ensure their safe operation, or be sure that they do what we want them to do. |
| \\ How To Do It | Predicting the future behavior of ANNs (of any kind) is easier if we switch off their learning after they have been trained, because there exists no method for predicting where their development will lead them if they continue to learn after they leave the lab. Predicting ANN behavior on novel input can be done statistically, but there is no way to be sure that novel input will not completely reverse their behavior. There are few if any methods for giving ANNs the ability to judge the "novelty" of an input, which might to some extent help with this issue. Reinforcement learning sidesteps the problem only by restricting itself to a handful of variables with known maxima and minima. |
| Correlation \\ vs. Causation | In the physical world, predictability means reliability; reliability cannot be achieved without knowledge of, and reliance on, **repeatable causal relationships**. \\ To achieve predictable behavior of engineered artifacts, correlation alone does not suffice, because it does not allow the engineer to foresee - and engineer out of the system - the potentially undesirable categories of results (i.e. effects) of certain (novel) categories of inputs (i.e. causes). |
| Predictable AI | There is no way to achieve predictable behavior of a machine that handles novel input without giving the machine itself the ability to judge the input and classify it as "novel". This cannot be done with ANN-based systems because of how their information (I hesitate to call it "knowledge" - see definitions of these concepts in prior lecture notes) is represented. |
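The idea of letting a system judge the "novelty" of its own input can be illustrated with a minimal sketch. This is one generic technique (nearest-neighbor distance to the training set), not a method proposed in these notes; the threshold and the Gaussian stand-in training data are assumptions for illustration only.

```python
import numpy as np

def novelty_score(x, train_data):
    """Distance from input x to its nearest neighbor in the training set.

    A large distance suggests x lies outside the region the model was
    trained on, so the model's output for x should not be trusted blindly.
    """
    dists = np.linalg.norm(train_data - x, axis=1)
    return dists.min()

def is_novel(x, train_data, threshold):
    # The threshold is an assumption; in practice it would be calibrated,
    # e.g. from held-out nearest-neighbor distance statistics.
    return novelty_score(x, train_data) > threshold

rng = np.random.default_rng(0)
train = rng.normal(0.0, 1.0, size=(500, 2))   # stand-in for training inputs

familiar = np.array([0.1, -0.2])   # near the training distribution
strange  = np.array([8.0,  8.0])   # far outside it

print(is_novel(familiar, train, threshold=1.0))  # False
print(is_novel(strange,  train, threshold=1.0))  # True
```

Note that such a score only flags //statistical// novelty; it says nothing about whether the model's behavior on the flagged input would be safe, which is exactly the limitation the table describes.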

\\
==== Engineered Reliability ====

| What It Is | The ability of a machine to always return the same - or categorically similar - answer to the same - and/or categorically similar - input. |
| Why It Is Important | Simple AI algorithms (e.g. reinforcement learning, auto-correlation, decision trees, etc.) are very good in this respect, delivering high reliability. Because they are simple, and their environment is oversimplified, their reliability can be engineered up front. Near-human-level AI - or adaptive autonomous machines (AAMs) - on the other hand, have the same limitations as humans and animals in this respect, i.e. reliability is a challenge, and no guarantees can be given. |
| Predictability is Hard to Achieve | In a growing, developing system that is adapting and learning (3 or 4 levels of detail of dynamical relations!), predictability can only be achieved by **abstraction**: moving up to a coarser level of detail; e.g. "I cannot be sure //what exactly// I will eat for dinner (one level of detail), but I can be pretty sure that I //will// eat dinner (a more coarse-grained level)". |
| Producing Abstraction | Can be done through hierarchy (but the hierarchy must be //dynamic// - i.e. dynamically adjusted to its intended usage, as circumstances call for - because the world's combinatorics are too complex to store precomputed hierarchies for everything). |
| Human-Level AI | Making AAMs reliable is important because without reliability they cannot be trusted, which would defeat most of the purpose of creating them in the first place. (One method for addressing this is autonomous iterative micro-model generation and refinement.) |
| Achieving Reliable AI | Requires **predictability**. Predictability requires sorting out //causal relations// (without these, neither we nor the system can ever be sure what leads to what, precluding reliability). |
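The dinner example above can be made concrete: at the fine-grained level no single outcome is certain, but the abstract event - obtained by summing the probabilities of its concrete instances - can be predicted with high confidence. The outcome set and probabilities below are purely illustrative assumptions.

```python
# Hypothetical forecast over fine-grained outcomes for tonight's dinner.
fine_grained = {
    "pasta":     0.30,
    "fish":      0.25,
    "soup":      0.20,
    "leftovers": 0.15,
    "no dinner": 0.10,
}

# Fine level of detail: the single most likely concrete outcome.
best = max(fine_grained, key=fine_grained.get)
print(best, fine_grained[best])   # pasta 0.3  -- low confidence

# Coarser level of detail: abstract the concrete outcomes into the
# single event "eats dinner" by summing over them.
p_dinner = sum(p for k, p in fine_grained.items() if k != "no dinner")
print(round(p_dinner, 2))         # 0.9  -- high confidence
```

Moving up the hierarchy trades specificity for confidence, which is why a dynamically chosen level of abstraction can make an otherwise unpredictable system predictable "enough" for a given purpose.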
\\

==== Engineered Trustworthiness ====

| What It Is | The ability of a machine's owner to trust that the machine will do what it is supposed to do. |
| Why It Is Important | Any machine created by humans is created for a purpose. The more reliably it does its job (and nothing else), the more trustworthy it is. Trusting simple machines like thermostats mostly involves durability, since they have very few open variables (variables unbound at time of manufacture). |
| Trustworthiness Methods... | ...for AI are in their infancy. |
| Human-Level AI | Making human-level AI trustworthy is very different from creating simple machines, because so many variables are unbound at manufacturing time. What does trustworthiness mean in this context? We can look at human trustworthiness: numerous methods exist for ensuring it (driver's licenses, air traffic controller training, certification programs, etc.). We can have the same certification programs for all humans because their principles of operation are shared at multiple levels of detail (biology, sociology, psychology). \\ For an AI this is different, because the variability in the makeup of the machines is enormous. This makes trustworthiness of AI a challenging issue. |
| To Achieve Trustworthiness | Requires **reliability** and **predictability** at multiple levels of operation. Trustworthiness can be ascertained through special certification programs geared directly at the **kind of robot/AI system in question** (somewhat like certifying a particular horse as safe for a particular circumstance and purpose, e.g. horseback riding for kids). |
| |
| |
\\
\\