T-713-MERS-2025 Main
Link to Lecture Notes



TRUST & EXPLANATIONS




Engineered Predictability

What It Is The ability of an outsider to predict the behavior of a controller based on some information.
Why It Is Important Predicting the behavior of (semi-) autonomous machines is important if we want to ensure their safe operation, or be sure that they do what we want them to do.

How To Do It
Predicting the future behavior of ANNs (of any kind) is easier if we switch off their learning after they have been trained, because there exists no method for predicting where their development will lead if they continue to learn after they leave the lab. Predicting ANN behavior on novel input can be done statistically, but there is no way to be sure that novel input will not completely reverse their behavior. Very few methods, if any, exist for giving ANNs the ability to judge the “novelty” of an input, which might help with this issue to some extent. Reinforcement learning sidesteps the problem by scaling only to a handful of variables with known maxima and minima.
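As a rough illustration of the novelty-judgment idea, one could score an input by its distance to the nearest training example and flag it as novel beyond some threshold. This is a minimal sketch, not an established method; the function names and the threshold are illustrative assumptions, and a real system would need a domain-appropriate distance and calibration.

```python
import math

def novelty_score(x, training_data):
    """Distance from input x to its nearest neighbor in the training set.
    A large distance means x lies far from anything the network saw during
    training, so its behavior on x is harder to predict."""
    return min(math.dist(x, t) for t in training_data)

def is_novel(x, training_data, threshold):
    """Flag x as novel when it is farther than `threshold` from every
    training example; the threshold choice is domain-specific."""
    return novelty_score(x, training_data) > threshold
```

With a training set of [(0, 0), (1, 1)], the point (10, 10) scores a distance of roughly 12.7 and would be flagged as novel under a threshold of 5, while (0.1, 0.1) would not.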
Correlation vs. Causation
In the physical world, predictability means reliability; reliability cannot be achieved without knowledge of, and reliance on, repeatable causal relationships.
To achieve predictable behavior of engineered artifacts, it is not sufficient to know about correlation.
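A toy simulation can make the correlation-versus-causation point concrete: two variables produced by a hidden common cause correlate strongly, yet intervening on one would not move the other. The variable names (heat, ice cream, sunburn) are illustrative assumptions, not from the lecture.

```python
import random

def pearson(xs, ys):
    """Sample Pearson correlation coefficient (the 1/n factors cancel)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

rng = random.Random(42)
heat = [rng.random() for _ in range(5000)]           # hidden common cause
ice_cream = [h + rng.gauss(0, 0.05) for h in heat]   # effect 1 of heat
sunburn = [h + rng.gauss(0, 0.05) for h in heat]     # effect 2 of heat

# The two effects correlate strongly, but forcing ice-cream sales up
# (an intervention) would not change sunburn rates: heat causes both.
# Only knowledge of the causal structure supports reliable prediction
# under intervention.
print(pearson(ice_cream, sunburn))
```

An engineer who relied on the ice-cream/sunburn correlation alone would make systematically wrong predictions as soon as either variable was manipulated directly.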
Predictable AI There is no way to achieve predictable behavior of a machine that handles novel input without giving the machine itself the ability to judge the input and classify it as “novel”. This cannot be done with ANN-based systems because of how their information (I hesitate to call it “knowledge” - see definitions of these concepts in prior lecture notes) is represented.



Engineered Reliability

What It Is The ability of a machine to always return the same - or categorically similar - answer to the same input.
Why It Is Important Simple machine learning algorithms are very good in this respect, delivering high reliability. Human-level AI, on the other hand, may have the same limitations as humans in this respect, i.e. not being able to give any guarantees.
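The definition above suggests a simple empirical check, sketched here under the assumption that we can re-run the machine on the same inputs and compare answers with a category predicate; the function and parameter names are made up for illustration.

```python
def reliability_check(system, inputs, trials=10,
                      same_category=lambda a, b: a == b):
    """Re-run `system` on each input `trials` times and report the fraction
    of inputs for which every answer was categorically similar to the first.
    `same_category` defaults to strict equality."""
    stable = 0
    for x in inputs:
        answers = [system(x) for _ in range(trials)]
        if all(same_category(answers[0], a) for a in answers[1:]):
            stable += 1
    return stable / len(inputs)
```

A deterministic function scores 1.0 on this check; a system whose answer drifts between calls on the same input scores lower. Note that a perfect score over sampled inputs still says nothing about novel inputs, which is the harder problem discussed above.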
Human-Level AI Making human-level AI reliable is important because a human-level AI without reliability cannot be trusted, which would defeat most of the purpose of creating it in the first place. (AERA proposes a method for this - through continuous pee-wee model generation and refinement.)
To Achieve Reliability Requires predictability. Predictability requires sorting out causal relations (without these we can never be sure what led to what).
Predictability is Hard to Achieve In a growing, developing system that is adapting and learning (3 or 4 levels of dynamics!), predictability can only be achieved through abstraction: moving up to a coarser level of detail (e.g. I cannot be sure what exactly I will eat for dinner, but I can be pretty sure that I will eat dinner).
Achieving Abstraction Can be done through hierarchy (but the hierarchy needs to be dynamic - i.e. tailored to its intended usage, as the circumstances call for - because the world's combinatorics are too complex to store precomputed hierarchies for everything).
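The dinner example above can be sketched as a comparison of prediction confidence at two levels of abstraction; the history data and helper names are made-up illustrations, not part of the lecture.

```python
from collections import Counter

# Hypothetical record of past dinners, for illustration only.
history = ["pasta", "soup", "pizza", "salad", "pasta", "rice", "pasta"]

def predict_fine(history):
    """Fine-grained prediction: the most frequent specific dish, with its
    relative frequency as confidence."""
    dish, count = Counter(history).most_common(1)[0]
    return dish, count / len(history)

def predict_abstract(history):
    """One level of abstraction up: dinner will happen (it always has in
    the record), so confidence is maximal even though the dish is not."""
    return "dinner", 1.0
```

Here the fine-grained prediction ("pasta") carries only about 43% confidence, while the abstract prediction ("dinner") is fully confident - the same trade-off a developing system must exploit to stay predictable.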



Engineered / Earned Trustworthiness

What It Is The ability of a machine's owner to trust that the machine will do what it is supposed to do.
Why It Is Important Any machine created by humans is created for a purpose. The more reliably it does its job (and nothing else) the more trustworthy it is. Trusting simple machines like thermostats involves mostly durability, since they have very few open variables (unbound variables at time of manufacture).
Human-Level AI To make human-level AI trustworthy is very different from creating simple machines because so many variables are unbound at manufacture time. What does trustworthiness mean in this context? We can look at human trustworthiness: Numerous methods exist for ensuring trustworthiness (license to drive, air traffic controller training, certification programs, etc.). We can have the same certification programs for all humans because their principles of operation are shared at multiple levels of detail (biology, sociology, psychology). For an AI this is different because the variability in the makeup of the machines is enormous. This makes trustworthiness of AI robots a complex issue.
To Achieve Trustworthiness Requires reliability, and predictability at multiple levels of operation. Trustworthiness can be ascertained through special certification programs geared directly at the kind of robot/AI system in question (kind of like certifying a particular horse as safe for a particular circumstance and purpose, e.g. horseback riding kids).
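Such a certification program could be sketched as a battery of scenario checks tailored to one specific kind of system, passing only if every scenario passes. Everything here - the scenario names, the toy controller, the `certify` helper - is a hypothetical illustration, not an existing certification scheme.

```python
def certify(system, scenarios):
    """Run a battery of scenario checks tailored to this particular kind
    of system. `scenarios` maps a scenario name to a predicate over the
    system; certification passes only if every check passes."""
    results = {name: bool(check(system)) for name, check in scenarios.items()}
    return all(results.values()), results

# Hypothetical battery for a delivery robot (names are illustrative only).
delivery_robot_battery = {
    "stops_for_obstacle": lambda s: s("obstacle") == "stop",
    "yields_to_pedestrian": lambda s: s("pedestrian") == "stop",
    "resumes_when_clear": lambda s: s("clear") == "go",
}

def cautious_robot(percept):
    """Toy controller under test: stops unless the path is clear."""
    return "go" if percept == "clear" else "stop"
```

The per-kind battery mirrors the horse analogy above: the same robot type always faces the same scenario set, but a different kind of robot would need a different battery.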
Trustworthiness Methods For AI are in their infancy.



IN-CLASS GROUP ASSIGNMENT



In groups of 2, answer the following questions:


Trust & Explanation

Can Trust Exist Without Explanation? What are the fundamentals of Trust?
What role does explanation play in creating trust?



Trust & Ethics

Can Ethics Exist Without Trust? What role does trust play in ethics?
Can an untrustworthy agent operate ethically? Why / why not?



2025©K.R.Thórisson

/var/www/cadia.ru.is/wiki/data/attic/public/t-709-aies-2025/aies-2025/trust_explanation_meaning.1756196635.txt.gz · Last modified: 2025/08/26 08:23 by thorisson