[[public:t-720-atai:atai-22:main|T-720-ATAI-2022 Main]] \\
[[public:t-720-atai:atai-22:Lecture_Notes|Links to Lecture Notes]]

====== UNDERSTANDING: Understanding, Curiosity, Creativity ======

===== Curiosity =====

==== What is Curiosity? ====

| What It Is | The tendency of a learner to seek out novel inputs that may not have any relevance to its currently active goals. |
| Why It Matters | Curiosity may be an inherent/inevitable feature of all intelligent systems that live in an uncertain environment: Because one of their top-level goals will always be self-preservation, and because they cannot fully predict what threats to their existence the future may hold, they are forced to collect information which //might// become useful at a later time. |
| Other Meanings | We sometimes call people "curious" who keep sticking their nose into things which may be relevant (or even irrelevant) to them but which societal norms consider outside their obvious "permissible" range of access. This is a different, more anthropocentric side of 'curiosity' which is less interesting for our purposes. |

==== Why Curiosity? ====

| Why Are We Talking About Curiosity? | Curiosity is a really great term for a very complex systemic phenomenon of significant importance to AGI: Motivation. |
| \\ Motivation | A learner without "internal motivation" will not have any reason to learn anything. We call it 'internal motivation' because it is a mechanism (complex or simple) of the cognitive architecture itself (without which "nothing would happen") that gives an agent a tendency to act in a certain way in certain circumstances (and possibly: in general). |
| \\ How is Motivation Programmed? \\ \\ **Drives** | Fundamental motivation is not something that a learner can learn (unless we assume that, as it is "born", there is something in the environment to program that in; assuming that something so highly specific exists and is available in the environment is not a good strategy for ensuring survival if the creature is intended to grow cognitively in a predictable way). \\ The way that nature does this is to provide newborns with some sort of impetus to act in certain ways in certain situations, e.g. cry when hungry. This works most of the time because all living creatures have parents. \\ We call such internal motivational factors //**drives**// (see the sketch at the end of this subsection). |
| \\ Baby Machines | General learners can learn over their lifetime vastly larger amounts of knowledge than they are born with. Such machines are sometimes called 'baby machines'. The drives of baby machines typically must change over their lifetime, especially if they are very good and general learners. \\ In psychology this is called //cognitive development//. \\ Very few - if any - AI systems exist that have demonstrated such a capability.[1] But some form of cognitive development is probably unavoidable in any powerful learning scheme, because the motivational mechanisms you need when you know very little are likely to be very different from those that work well when you know a lot (when you have learned most of the fundamental principles of how your world works, your old learning mechanisms are unlikely to be as efficient or relevant as they were in the beginning). |
| Footnote | [1] Mind you, it should not be too hard to create a system that //appears// to demonstrate cognitive development, just as it isn't difficult to write a for-loop called "thinking". The mechanisms demonstrated in a real cog-dev system should also demonstrate the //need// for such a capacity, and that they arise //autonomously// in the learning process that the system implements. |
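To make the notion of a //drive// a bit more concrete, below is a minimal sketch in Python of one way innate drives could be wired into an agent's control loop: each drive maps internal state to an urgency value, and the most urgent drive selects an innate behavior. This is our own illustration under assumed names (''Drive'', ''step'', the state keys) - not a mechanism taken from any particular architecture.

<code python>
# Minimal sketch: innate drives as built-in motivational signals.
# Hypothetical names throughout - for illustration only.
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Drive:
    name: str
    urgency: Callable[[Dict[str, float]], float]  # internal state -> urgency in [0, 1]
    behavior: Callable[[], str]                    # innate response when this drive wins

def step(state: Dict[str, float], drives: List[Drive]) -> str:
    """One control-loop tick: the most urgent drive selects the behavior."""
    winner = max(drives, key=lambda d: d.urgency(state))
    return winner.behavior()

drives = [
    # Innate, unlearned mappings - the newborn's "impetus to act".
    Drive("hunger",    lambda s: s["hunger"],        lambda: "cry"),
    # A curiosity drive: urgency grows with unpredicted input ("novelty");
    # the 0.8 weight is an arbitrary illustrative choice.
    Drive("curiosity", lambda s: s["novelty"] * 0.8, lambda: "explore"),
]

print(step({"hunger": 0.9, "novelty": 0.3}, drives))  # -> "cry"
print(step({"hunger": 0.1, "novelty": 0.9}, drives))  # -> "explore"
</code>

In a 'baby machine' such a fixed table of drives would have to give way, over its lifetime, to motivational mechanisms the system itself constructs - the cognitive-development problem noted above.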
===== Creativity =====

==== What is Creativity? ====

| \\ The Word | The word 'creativity' has many meanings. \\ The simplest meaning is typically that "you're creative if you think of something that nobody else thought of". \\ A better meaning in our context is the ability of an intelligent system to produce non-obvious solutions to problems. \\ Creativity is about **producing** something. |
| Why it's Important | Ultimately we want creative machines. It is difficult to tease apart the concepts of intelligence and creativity: It is hard to imagine a great intelligence that is not creative. Likewise, it is difficult to imagine a creative agent that is not intelligent. |
| Creativity Without Intelligence? | The relation between creativity and intelligence may not go both ways: While it is difficult to imagine a highly intelligent system that is not creative, it is not AS difficult to imagine an (artificial) system that is creative but not intelligent. This is especially true if we assume there are other (natural) intelligences around to make sense of what this "non-intelligent creative system" produces. |
| Are Only Artists Creative? | Short answer: No. \\ Longer answer: The word "creativity" has many meanings, and is used in everyday language in numerous ways. |
| \\ How it is Measured | Creativity is always measured with respect to some goal: If I just "do something", how could anyone tell whether I am creative? It is only when I tell you what my goal was (and even better, if I show you what others did with respect to that goal) that you can say for sure whether what I did qualifies as "creative" in your view. (Jackson Pollock was not creative because he splattered paint onto canvas; his work was creative because of the context in which it was done.) \\ Is creativity always subjective? It probably doesn't have to be, but until we have good theories of goals, actions, and tasks, it will probably remain a rather loose concept. |

==== Creativity & AI ====

| Examples of creative machines | Do good examples of creative machines exist? |
| Aaron | http://prostheticknowledge.tumblr.com/post/20734326468/aaron-the-first-artificial-intelligence-creative \\ https://www.youtube.com/watch?v=3PA-XApZkso |
| \\ Thaler's \\ Creativity Machine | http://www.imagination-engines.com \\ The CM was patented in 1994. \\ A few years later the CM made an invention that itself received a patent from the United States Patent & Trademark Office (USPTO): [[http://patft.uspto.gov/netacgi/nph-Parser?Sect2=PTO1&Sect2=HITOFF&p=1&u=%2Fnetahtml%2FPTO%2Fsearch-bool.html&r=1&f=G&l=50&d=PALL&RefSrch=yes&Query=PN%2F5659666|CM patent]] \\ What it is: an ANN that becomes "creative" by "relaxing some parameters" so that it "begins to hallucinate". |
| Are these machines creative \\ ? | Maybe - in some sense of the concept. \\ Are they //truly// creative? Probably not. \\ How so? What does it mean to be "truly creative"? \\ That would be the **full monty**: The ability to see unique, valuable solutions to a wide range of challenging problems better than others. |
| Creativity is \\ a Relative Term | It is somewhat unavoidable to interpret the concept of 'creativity' as a relative term - i.e. "person (or system) X is more creative than person (or system) Y" - as no absolute scale for it exists as of yet. (It is possible that AI / AGI research may one day develop such a scale.) |
| Intermediate Conclusion | To answer the question "Do creative machines exist?" we must inspect the concept of creativity in more detail. |

==== Theories & Definitions ====

| \\ Attempts at Definitions | **1.** In its simplest sense it is the //ability to produce solutions to problems//. This meaning treats creativity as a single continuous dimension (or many that may be collapsed into one) along which we simply put a threshold for when we classify something as "creative". \\ **2.** A more complex version references in some way the complexity of a problem, such that //solutions that address the problem in a better way (other things being equal) or achieve a similar solution with less cost// (other things being equal) are more //creative// than others. \\ **3.** In reference to some sort of "obviousness", //a solution to a problem may be more creative if it is "less obvious"//, with respect to some population, time, society, education, etc. In this case a more creative agent is one that repeatedly uncovers solutions that are "less obvious" - even to itself. This definition is relative to the knowledge it operates on (which is a good thing, because it removes the reference to background knowledge): Out of a set of processes P that can produce solutions to problems from a set of knowledge K, the process p ∈ P that reliably and repeatedly uncovers valid solutions with a small or no intersection with the output of the others is "more creative" than the others. \\ (The trouble with this approach is that the difference between these processes might also be construed as knowledge, in which case we cannot in principle keep the knowledge constant.) |
| \\ Schmidhuber's theory of creativity | [[http://people.idsia.ch/~juergen/creativity.html|Schmidhuber's theory of creativity]]. \\ The observer's //learning process causes a reduction of the subjective complexity of the data, yielding a temporarily high derivative of subjective beauty:// a temporarily steep learning curve. \\ The observer's (or data creator's) current predictor/compressor tries to compress its history of (e.g. acoustic and other) inputs where possible (whatever you can predict you can compress, as you don't have to store it separately). The action selector tries to find history-influencing actions such that the continually growing historic data allows for improving the performance of the predictor/compressor. The interesting or aesthetically rewarding (e.g. musical and other) sub-sequences are precisely those with previously unknown yet learnable types of regularities, because they lead to compressor improvements. The boring patterns are those that are either already perfectly known, or arbitrary and random, or whose structure seems too hard to understand. \\ (A minimal code sketch of this idea appears further below.) |

==== Creativity & Understanding ====

| Understanding | To consistently solve problems regarding a phenomenon X requires //understanding// X. \\ Understanding X means the ability to extract and analyze the //meaning// of any phenomenon //phi// related to X. |
| Meaning | Meaning is closely coupled with understanding - neither can exist without the other. Are they irreducible? |
| Bottom Line | We can't talk about creativity without talking about understanding, and we can't talk about understanding without talking about meaning. No good scientific theory of meaning exists (though some philosophical ones do). |
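Returning to Schmidhuber's theory above: the compression-progress idea is concrete enough to sketch in code. The following minimal Python sketch is our own illustration (not Schmidhuber's implementation), with a tiny order-0 symbol-frequency model standing in for the adaptive predictor/compressor: the intrinsic reward for an observation is how many bits of the agent's history become compressible //after// the model has learned from it.

<code python>
# Sketch of Schmidhuber-style "compression progress" as intrinsic reward.
# An order-0 frequency model stands in for the adaptive predictor/compressor.
# Illustrative only - all names are our own.
import math
from collections import Counter

def code_length(data: str, counts: Counter) -> float:
    """Shannon code length (bits) of `data` under a smoothed frequency model."""
    total = sum(counts.values())
    return -sum(math.log2((counts[c] + 1) / (total + 256)) for c in data)

class CuriousAgent:
    def __init__(self):
        self.history = ""
        self.model = Counter()          # the agent's learned "compressor"

    def intrinsic_reward(self, observation: str) -> float:
        """Compression progress: bits saved on the whole history (incl. the
        new data) by the learning step triggered by the observation."""
        before = code_length(self.history + observation, self.model)
        self.model.update(observation)  # learning step: improve the compressor
        self.history += observation
        after = code_length(self.history, self.model)
        return before - after           # > 0 iff the observation was learnable

agent = CuriousAgent()
print(agent.intrinsic_reward("abcabcabcabc"))  # novel regularity: high reward
print(agent.intrinsic_reward("abcabcabcabc"))  # now familiar: lower reward
</code>

Because the reward is the //improvement// step (a derivative), already-mastered patterns and incompressible noise both trend toward near-zero reward - exactly the "boring" cases in the quote above.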
==== Reasoning ====

| What It Is | The establishment of axioms for the world and the application of logic to them. \\ (Creating reasonable assumptions and using these as a basis for applying logic according to given rules.) |
| \\ But The World Is Non-Axiomatic ! | Yes. But there is no way to apply logic unless we hypothesize some pseudo-axioms. The only difference between this and mathematics is that in science we must accept that the so-called "laws" of physics may be only conditionally correct (or possibly even completely incorrect, in light of our goal of figuring out the "ultimate" truth about how the universe works). |
| Deduction | Deriving conclusions that follow with logical necessity from given statements. \\ //Example: If it's true that all swans are white, and Joe is a swan, then Joe must be white//. |
| \\ Abduction | Reasoning from conclusions to (likely) causes. \\ //Example: If the light is now on, but it was off just a minute ago, someone must have flipped the switch//. \\ Note that in the reverse case different abductions may be entertained, because of the way the world works: //If the light is off now, and it was on just a minute ago, someone may have flipped the switch OR a fuse may have blown.// |
| \\ Induction | Generalization from observation. \\ //Example: All the swans I have ever seen have been white, hence I hypothesize that all swans are white//. \\ Induced knowledge can always be refuted by new evidence. This is the general principle of empirical science. |
| Analogy | The ability to find similarity between even the most disparate phenomena. |
| \\ Why This Matters | Logic is one of the most effective ways to compress information. Reasoning is the process of applying logic to information according to rules. Because of the high ratio of possible states in the physical world to the storage capacity of the human (and machine) mind/memory, it is inconceivable that understanding ("//true// knowledge" - i.e. useful, reliable knowledge) of a large number of phenomena in the physical world can be achieved without the use of reasoning. |
| \\ How is Reasoning Applied in Understanding & Creativity? | Based on knowledge about objects, parts and the relations between these, as well as the transformation rules by which these behave and interact, one can \\ \\ - construct predictions for what will happen next (predictive control), \\ - abduce what may have happened before (how the world got to where it is - constructing explanations), \\ - determine what to do next (make plans), and \\ - make analogies - drawing parallels to create hypotheses about novel things. \\ \\ It is in particular this last item (not in isolation but in tandem with the others) that is a very useful tool for producing novel insights (i.e. "being creative"). |

===== Understanding =====

==== In the Vernacular ====

| What It Is | A concept that people use all the time about each other's cognition. With respect to achieving a task, given that the target of the understanding is all or some aspects of the task, more of it is generally considered better than less of it. |
| Why It Is Important | Seems to be connected to "real intelligence": When a machine does X reliably and repeatedly we say that it is "capable" of doing X, but qualify this with "... but it doesn't 'really' understand what it's doing". |
| What Does It Mean? | No well-known scientific theory exists. \\ Normally we do not hand control of anything over to anyone who doesn't understand it. All other things being equal, this is a recipe for disaster. |
| Evaluating Understanding | Understanding of any X can be evaluated along four dimensions: \\ 1. being able to predict X, \\ 2. being able to achieve goals with respect to X, \\ 3. being able to explain X, and \\ 4. being able to "re-create" X. \\ ("Re-create" here means e.g. creating a simulation that produces X and many or all of its side-effects. A code sketch of this four-way evaluation follows below.) |
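As an illustration of these four dimensions, here is a minimal Python sketch of an evaluation harness. The four dimension names come from the table above; everything else (''evaluate_understanding'', ''ToyAgent'', the probes) is a hypothetical construction of ours, not an established benchmark: an agent's understanding of a phenomenon is probed by separate tests per dimension and scored dimension by dimension.

<code python>
# Illustrative harness: scoring an agent's understanding of a phenomenon
# along the four dimensions named above. All names are hypothetical.
from typing import Callable, Dict, List, Tuple

# Each test: (dimension, probe), where probe(agent) -> score in [0, 1].
Test = Tuple[str, Callable[[object], float]]

def evaluate_understanding(agent: object, tests: List[Test]) -> Dict[str, float]:
    """Average the agent's score per dimension; untested dimensions score 0."""
    scores: Dict[str, List[float]] = {
        d: [] for d in ("predict", "achieve", "explain", "re-create")}
    for dimension, probe in tests:
        scores[dimension].append(probe(agent))
    return {d: (sum(v) / len(v) if v else 0.0) for d, v in scores.items()}

# Example: a toy agent that models free fall as constant acceleration.
class ToyAgent:
    g = 9.8
    def predict_fall(self, t: float) -> float:      # dimension 1: prediction
        return 0.5 * self.g * t * t

tests: List[Test] = [
    ("predict", lambda a: 1.0 if abs(a.predict_fall(2.0) - 19.6) < 0.5 else 0.0),
    ("explain", lambda a: 0.0),  # the toy agent cannot explain its prediction
]
print(evaluate_understanding(ToyAgent(), tests))
# {'predict': 1.0, 'achieve': 0.0, 'explain': 0.0, 're-create': 0.0}
</code>

The point of the sketch: "understanding" is not one number - a system can score high on prediction while scoring zero on explanation, which is roughly the situation of current ANN-based systems (see Self-Explaining Systems below).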
==== It Used To Be Called "Common Sense" (in AI circles) ====

| Status of Understanding in AI | Since the 70s the concept of //understanding// has been relegated to the fringes of AI research. The only AI contexts in which the term regularly appears are "language understanding", "scene understanding" and "image understanding". |
| What Took Its Place | What took the place of understanding in AI is //common sense//. Unfortunately the concept of common sense does not necessarily capture what we generally mean by "understanding". |
| Projects | The best-known project on common sense is the CYC project, which started in the 80s and is apparently still going. It is the best-funded, longest-running AI project in history. |
| Main Methodology | The foundation of CYC is formal logic, represented in predicate logic statements and compound structures. |
| Key Results | Results from the CYC project are similar to those of the expert systems of the 80s: these systems are brittle and unpredictable. \\ The state of the CYC system in 2016 is described in this nicely written essay: [[https://www.technologyreview.com/s/600984/an-ai-with-30-years-worth-of-knowledge-finally-goes-to-work/|REF]]. |
| \\ Two Problems | Upon further scrutiny, no good analysis or argument exists for why 'understanding' should be equated with 'common sense'. The two are simply not the same thing. \\ Furthermore, progress under the rubric of 'common sense' in AI has neither produced any grand results nor evidence that the methodology followed is a promising one. And it certainly doesn't seem to have inspired fresh ideas in a very long time. |

==== A Scientific Theory of Understanding ====

| \\ Why Do We Need One \\ ? | Normally we do not hand over control of anything to anyone who //doesn't understand it//. Other things being equal, this is a recipe for disaster. \\ If we want machines to control ever-more complex processes, we //must// give them some level of understanding of what they are doing. \\ We need to build systems that we can trust. \\ We cannot trust an agent that doesn't understand what it's doing or the context it operates in. |
| Cumulative Understanding | A scientific theory of understanding proposed by K. R. Thórisson et al.: \\ [[http://alumni.media.mit.edu/~kris/ftp/AGI16_understanding.pdf|About Understanding]] by Thórisson et al. |
| Why a Scientific Theory \\ ? | A scientific theory, as opposed to a philosophical one, proposes actionable principles for //how to construct the phenomenon in question//. \\ An AI system is a //control// system: It affects the world in some way. |
| What Does It Mean \\ for AI? | Understanding is a prerequisite for being trustworthy. AI that //truly understands// has met this prerequisite. With a scientific theory of understanding we can build //trustworthy AI//. |

==== Theory of Cumulative Understanding ====

| What It Is | The only theory of understanding in the field of AI and, as far as we know, the only scientific theory of understanding. |
| In a Nutshell | Understanding involves the manipulation of causal-relational models (e.g. those in the AERA AGI-aspiring architecture - see the next set of lecture notes; a code sketch of the idea follows after this table). |
| \\ Phenomenon \\ <-> \\ Model | Phenomenon Phi: Any group of inter-related variables in the world, some or all of which can be measured. \\ Models M: A set of information structures that reference the variables of Phi and their relations R such that they can be used, applying processes P that manipulate M, to \\ (a) predict, \\ (b) achieve goals with respect to, \\ (c ) explain, and \\ (d) (re-)create Phi. |
| \\ Definition of Understanding | An agent **understands** a phenomenon Phi to some level L when it possesses a set of models M and relevant processes P such that it can use M to \\ (a) predict, \\ (b) achieve goals with respect to, \\ (c ) explain, and \\ (d) (re-)create Phi. \\ \\ Insofar as the nature of the relations between variables in Phi determines their behavior, the level L to which the agent understands Phi is determined by the //completeness// and //accuracy// with which M matches the variables and their relations in Phi. |
| \\ Why \\ 'Cumulative' \\ ? | According to the theory, 'understanding' is a learning process: A dynamic process that creates knowledge of a particular functional kind, namely the kind that can be used to predict, achieve goals, explain and (re-)create (see the point above in this table). \\ Unlike 'learning', however, in the vernacular the term 'understanding' is also used to describe a //state// - the state reached when the process completes (the output of 'learning' is 'knowledge'; the output of 'understanding' is '//an// understanding'). We can see the difference if we replace 'understanding' with the term 'know': "A understands X" vs. "A //knows// X". \\ The term 'cumulative' emphasizes the 'learning' part of understanding - a //process of creation//. \\ This generation process is key to human understanding because \\ (a) it is used many times a day - every time we learn anything new: It is an integral part of the //learning process// proper, \\ (b) it is a //general// process of knowledge acquisition that builds knowledge graphs incrementally, from experience, relying on reasoning processes as well as existing knowledge, and \\ (c ) the processes that //build// the knowledge and those that //use it// are the //**same**// processes. |
| REF | [[http://alumni.media.mit.edu/~kris/ftp/AGI16_understanding.pdf|About Understanding]] by Thórisson et al. |
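To give a flavor of what "manipulating causal-relational models" might look like, here is a minimal Python sketch. It is our own illustration of the idea, not AERA's actual model format (which is far richer - see the next set of lecture notes): a model links a precondition pattern to a postcondition pattern, and the //same// structure is used forward for prediction and backward for goal pursuit and explanation.

<code python>
# Minimal sketch of a causal-relational model (CRM) used bidirectionally.
# Illustrative names only; not AERA's actual representation.
from dataclasses import dataclass

@dataclass
class CRM:
    name: str
    pre: frozenset   # precondition facts
    post: frozenset  # postcondition facts

MODELS = [
    CRM("flip-switch", frozenset({"switch off", "flip"}), frozenset({"light on"})),
    CRM("fuse-blows",  frozenset({"light on", "surge"}),  frozenset({"light off"})),
]

def predict(facts: set) -> set:
    """Forward use: which facts do the models say will hold next?"""
    return {f for m in MODELS if m.pre <= facts for f in m.post}

def explain_or_achieve(goal: str) -> list:
    """Backward use: which models could produce the goal fact, and from what?
    Serves both abduction ('what happened?') and planning ('what to do?')."""
    return [(m.name, set(m.pre)) for m in MODELS if goal in m.post]

print(predict({"switch off", "flip"}))  # predicts: light on
print(explain_or_achieve("light on"))   # flip-switch, given: switch off + flip
</code>

The key property, per the theory, is that prediction, goal achievement, explanation and (re-)creation all read off the //same// knowledge structures - the four capabilities in the definition above are different traversals of one model set.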
==== Self-Explaining Systems ====

| What It Is | The ability of a controller to explain, after the fact or before, why it did something or intends to do it. |
| 'Explainability' \\ ≠ \\ 'self-explanation' | If an intelligence X can explain a phenomenon Y, then Y is 'explainable' by X, through some process chosen by X. \\ \\ In contrast, if an intelligence X can explain itself - its own actions, knowledge, understanding, beliefs, and reasoning - it is capable of self-explanation. The latter is stronger and subsumes the former. |
| Why It Is Important | If a controller does something we don't want it to repeat - e.g. crash an airplane full of people (in simulation mode, hopefully!) - it needs to be able to explain why it did what it did. If it can't, it means that it - and //we// - can never be sure why it did what it did, whether it had any other choice, whether it's an evil machine that actually meant to do it, or how likely it is to do it again. |
| \\ Human-Level AI | Even more importantly, to grow and learn and self-inspect, the AI system must be able to sort out causal chains. If it can't, it will not only be incapable of explaining to others why it is the way it is, it will be incapable of explaining to itself why things are the way they are, and thus it will be incapable of sorting out whether something it did is better for its own growth than something else. Explanation is the big black hole of ANNs: In principle ANNs are black boxes, and thus they are in principle unexplainable - whether to themselves or to others. \\ One way to address this is by encapsulating knowledge as hierarchical models that are built up over time and can be de-constructed at any time (as AERA does) - see the sketch below. |
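To make "de-constructible knowledge" slightly more concrete, here is a minimal Python sketch (our own illustration, not AERA's actual mechanism): knowledge is stored as small named models with explicit premises, so every conclusion the system draws can later be unwound into an explanation trace.

<code python>
# Sketch: knowledge as small, named models whose use leaves a trace,
# so the system can explain its own conclusions. Illustrative only.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Model:
    name: str
    premises: List[str]   # conditions the model assumes
    conclusion: str       # what it asserts when the premises hold

@dataclass
class Agent:
    models: List[Model]
    trace: List[Model] = field(default_factory=list)

    def conclude(self, facts: List[str]) -> List[str]:
        """Fire every model whose premises are all known facts; record use."""
        derived = []
        for m in self.models:
            if all(p in facts for p in m.premises):
                derived.append(m.conclusion)
                self.trace.append(m)   # the hook that enables self-explanation
        return derived

    def explain(self) -> str:
        """Unwind the trace: which models produced the conclusions, and why."""
        return "\n".join(
            f"I concluded '{m.conclusion}' using model '{m.name}' "
            f"because {', '.join(m.premises)}." for m in self.trace)

agent = Agent(models=[
    Model("switch->light", ["light is on", "light was off"],
          "someone flipped the switch"),
])
agent.conclude(["light is on", "light was off"])
print(agent.explain())
</code>

A black-box function computing the same conclusion could not produce the explanation printed at the end; it exists only because the knowledge was stored in small, de-constructible pieces whose use was recorded.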
//2022(c)K.R.Thórisson//