[[/public:t-720-atai:atai-22:main|T-720-ATAI-2022 Main]] \\ [[/public:t-720-atai:atai-22:Lecture_Notes|Links to Lecture Notes]]
\\
\\
=====SYMBOLS, MODELS, CAUSALITY: Knowledge Representation=====
\\
\\
==== Foundational Concepts ====
| Data | Measurement. |
| Information | Data that can be / is used or formatted for a purpose. |
| Knowledge | A set of interlinked information that can be used to plan, produce action, and interpret new information. |
| Thought | The drive- and goal-driven processes of a situated knowledge-based agent. |
\\
====Representation====
| \\ What it is | A way to encode data/measurements. \\ A representation is what you have when you pick something to stand for something else, like the lines and squiggles forming the word "cup" used in particular contexts to **represent** (implicate, point to) an object with some features and properties. |
| //All knowledge \\ used for \\ intelligent action \\ must contain \\ representation// | \\ This follows from the facts that \\ (a) anything that references something but is //not// that thing itself contains some sort of //representation of// that thing; \\ (b) knowledge that does not inform action - i.e. cannot be used to //control// - is not knowledge, and \\ ( c) 'intelligence' requires consistent and persistent //informed action//. |
| \\ What it Involves | A particular process (computation, thought) is given a particular pattern (e.g. the text "cup", the word "cup" uttered, or simply the form of the light falling on a retina, at a particular time in a particular context) that acts as a "pointer" to an //internal representation//, an information structure that is rich enough to answer questions about the particular phenomenon that this "pointer" pattern points to, without having to perform any other action than to manipulate that information structure in particular ways. |
| \\ Why it is Important | **Mathematically**: \\ Because the amount of information in the physical world vastly exceeds what any system could store in a lookup table, methods for information storage and retrieval with greater compression are needed. \\ **Historically**: \\ - The founding fathers of AI spoke frequently of //representations// in the first three decades of AI research. \\ - //Skinnerian psychology// and //Brooksian AI// are both "representation-free" methodologies in that they de-emphasized representation (but failed to offer a theory that eliminated it). Brooksian AI largely "outlawed" the concept of representation from AI from the mid-80s onward. \\ - Post 2000s: The rise of ANNs has helped continue this trend. |
| \\ \\ **Good \\ Regulator \\ Theorem** | \\ Meanwhile, Conant & Ashby's //Good Regulator Theorem// proved (yes, //proved//) that \\ \\ //every good controller ("regulator") of a system **must** be a **model** of that system//. \\ \\ [[http://pespmc1.vub.ac.be/books/Conant_Ashby.pdf|Good Regulator Paper]] \\ \\ |
| Why That Matters | A //model// is by definition a //representation// (of the thing that it is a model of) - an information structure. \\ This means that any theory of intelligence //**must**// not only rely on the concept of representation, it also //**must**// rely on the concept of models (a minimal code sketch follows this table). |
| \\ Bottom Line | AGI is unlikely to be achieved without sophisticated methods for representing complex things, and sophisticated methods for their creation, manipulation, and management. \\ **This is the role of a //cognitive architecture.//** |
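\\
To make the Good Regulator point concrete, here is a minimal sketch (in Python) of a controller that can only choose sensible actions because it carries an internal model of the system it regulates; remove the model and there is nothing left to base the choice on. All class names, parameters and numbers below are invented for illustration; this is an illustration of the general idea, not an implementation from the Conant & Ashby paper.
<code python>
# Minimal sketch (hypothetical names): a regulator that works *because*
# it carries a model of the system it regulates.

class RoomModel:
    """Internal model of the controlled system: predicts the next temperature."""
    def __init__(self, heat_gain=0.8, leak_rate=0.1, outside_temp=5.0):
        self.heat_gain = heat_gain        # assumed effect of running the heater
        self.leak_rate = leak_rate        # assumed heat loss toward outside temp
        self.outside_temp = outside_temp

    def predict(self, temp, heater_on):
        """What-if question: what will the temperature be after one step?"""
        drift = self.leak_rate * (self.outside_temp - temp)
        return temp + drift + (self.heat_gain if heater_on else 0.0)


class Regulator:
    """Chooses actions by consulting its model - no model, no informed choice."""
    def __init__(self, model, goal_temp):
        self.model = model
        self.goal_temp = goal_temp

    def act(self, observed_temp):
        # Evaluate each candidate action against the model's prediction.
        candidates = {action: self.model.predict(observed_temp, action)
                      for action in (True, False)}
        # Pick the action whose predicted outcome best serves the goal.
        return min(candidates, key=lambda a: abs(candidates[a] - self.goal_temp))


if __name__ == "__main__":
    regulator = Regulator(RoomModel(), goal_temp=21.0)
    print(regulator.act(18.0))  # True: heating is predicted to move closer to the goal
    print(regulator.act(23.0))  # False: not heating is predicted to end up closer
</code>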
\\
==== Symbols ====
| \\ What are Symbols? | Peirce's Theory of Semiotics (signs) proposes three parts to a sign: \\ **a //sign/symbol//, an //object//, and an //interpretant//**. \\ Example of a symbol: an arbitrary pattern, e.g. a written word (with acceptable error ranges whose thresholds determine when it is either 'uninterpretable' or 'inseparable from other symbols'). \\ Example of an object: an automobile (a clustering of atoms in certain ways). \\ Example of an interpretant: your mind as it experiences something in your mind's eye when you read the word "automobile". The last part is the most complex, because obviously what //you// see and what //I// see when we read the word "automobile" will never be exactly the same. |
| Do Symbols Carry Meaning Directly? | No. Symbols are initially meaningless arbitrary patterns, and without an interpretant they remain meaningless. \\ What gives them the ability to //carry// meaning (see below) is a mutual //contract// between two communicators (or, more strictly, an encoding-decoding process pair). |
| \\ "Symbol" | Peirce used various terms for this, including "sign", "representamen", "representation", and "ground". Others have suggested "sign-vehicle". What is meant in all cases is a pattern that can be used to stand for something else, and thus requires an interpretation to be used as such. |
| Peirce's Innovation | Detaching the symbol/sign from the object it signifies, and introducing the interpretation process as a key entity. This makes it possible to explain why people misunderstand each other, and how symbols and meaning can grow and change in a culture. |
\\
==== Where the Symbols 'Are' ====
| Words | A word - e.g. "chair" - is not the symbol itself; the word is a **token** that can //act// as a symbol (to someone, in some circumstance). |
| An Example | "That is not a chair!" vs. "Ahh.. that is a //great// chair!" |
| Symbolic Role | Being a "symbol" means serving a **function**. In this case the token stands as a "pointer" to **information structures**. |
| Tokens as Symbols | The association of a token with a set of information structures is //arbitrary// - if we agree to call "chairs" something else, e.g. "blibbeldyblabb", well, then that's what we call "chairs" from now on. "Go ahead, take a seat on the blibbeldyblabb over there". |
| \\ Temporary vs. Permanent \\ Symbols | As you can see in the blibbeldyblabb example, we can associate arbitrary patterns with "thoughts" temporarily, just for fun. In this case we took something familiar (chair) and associated a new pattern ("blibbeldyblabb") with it, replacing an old one ("chair"). Because it was familiar, this was very easy to do. \\ When we see something completely unfamiliar and someone says "oh, that's just a blibbeldyblabb" it usually takes longer to learn the scope and usage of the new "concept" being learned - even though using the name is easy, it may take a while to learn how to use it properly. \\ A culture can use a coherent set of symbols (with compositional rules we call 'grammar') because they get established through mutual coordination over long periods of time. This is also why languages change: because people change, and their usage of language changes with them. |
| \\ Context | Using the token, these information structures can be collected and used. But their ultimate meaning depends on the **context** of the token's use. \\ When you use a token, **which information structures** are rounded up, and how they are used, depends on more than the token... |
| \\ What Are These \\ Information Structures? | They have to do with all sorts of **experience** of the world. \\ In the case of chairs this would be experience collected, compressed, abstracted and generalized from indoor environments in relation to the physical object we refer to as 'chair' (//a lot// of information could be relevant at any point in time - color, shape, size, usage, manufacturing, destruction, material properties, composition into parts, ... the list is very long! - which ones are relevant //right now// depends on the //context//, and context is determined primarily by the current state and which //goals// are currently active at this moment). |
| Models & Symbols | Both are representations - but //models contain more than symbols//: if a symbol is a kind of **pointer**, a model is **machine-manipulatable instructions**. It is the role of the cognitive machinery to read and manipulate those; it is the role of a learner to produce new ones (a small code sketch of tokens as "pointers" follows this table). |
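\\
The idea of a token acting as a "pointer" or "handle" to information structures, with context deciding which structures get rounded up, can be pictured very roughly as below. The lookup table, the context tags and the model names are all invented for illustration; a cognitive architecture would of course not use a literal dictionary - this is only a sketch of the relationship, not a design.
<code python>
# Hypothetical sketch: a token is an arbitrary "handle"; which information
# structures it rounds up depends on the current context.

MODEL_STORE = {
    # token -> list of (context tags, model name); all names are invented examples
    "chair": [
        ({"sitting", "indoors", "forest"}, "chair-affordance-model"),
        ({"thrown", "fight"},              "chair-as-projectile-model"),
        ({"design", "looks"},              "chair-morphology-model"),
    ],
}

def round_up(token, context):
    """Return the models associated with a token that match the current context."""
    entries = MODEL_STORE.get(token, [])
    return [model for tags, model in entries if tags & context]

# The token itself is arbitrary: rebinding it changes nothing about the models.
MODEL_STORE["blibbeldyblabb"] = MODEL_STORE["chair"]

print(round_up("chair", {"forest", "sitting"}))         # affordance model only
print(round_up("blibbeldyblabb", {"fight", "thrown"}))  # projectile model only
</code>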
\\
====Meaning====
| \\ Meaning | Philosophers are still grappling with the topic of "meaning", and it is far from settled. It is highly relevant to AI, especially GMI - a GMI that cannot extract the meaning of a joke, threat, promise, or explanation - to some level or extent - is hardly worthy of its label. |
| \\ Current Constructivist Approach | Meaning rests on many principles. Two main ones that could be called "pillars" are **context** (the assumed steady-state of a particular situation and the physical forces at play) and **prediction** (the implications of these for subsequent steady-states, and their relations to the involved agents' goals - esp. those of the agent doing the prediction). This is captured in models (e.g. Drescher's schemas) that form a graph. The meaning, then, is captured in two things. \\ Firstly, acquired and tested //models// that form a graph of relations; the comprehensiveness of this graph determines the level of understanding that the models can support with respect to a particular phenomenon. \\ This means that //meaning cannot be generated without (some level of) understanding//. We will get back to this later. \\ Secondly, meaning relies on the //context// of the usage of symbols, where the context is provided by (a) who/what uses the symbols, (b) in what particular task-environment, using ( c) particular //syntactic constraints//. |
| \\ Production of Meaning | Meaning is produced //on demand//, based on the //tokens// used and //contextual// data. If you are interpreting language, //syntax// also matters (syntax is a system of rules that allows serialization of tokens). \\ How important is context? Seeing a rock roll down a hill and crush people is very different if you are watching a cartoon than if you're in the physical world. This is pretty obvious when you think about it. |
| \\ Example of Meaning Production | In the case of a chair, you can use information about the **functional aspects** of the chairs you have experienced if you're, say, in the forest and your friend points to a flat rock and says "There's a chair". Your knowledge of a chair's function allows you to **understand** the **meaning** of your friend's utterance. You can access the material properties of chairs to understand what it means when your friend says "The other day the class bully threw a chair at me". And you can access the morphological properties of chairs to understand your friend when she says "Those modern chairs - they are so 'cool' you can hardly sit in them." |
| \\ Communication Process | When two agents communicate there are three parts that create meaning: the encoding process of the communicating agent, the resulting //message//, and the interpreter or //decoding communicator//. Getting the message from one agent to the other is called the //transmission//. However, the "meaning" does not "reside in the message"; it emerges from (is **generated** during) the whole **encoding-transmission-decoding process**. |
| Syntactic Constraints | Rules that a decoder-encoder pair share about how symbols are allowed to be strung together to create relational graphs, which may have particular purposes in particular situations. |
| \\ Prerequisites for using symbols | Prerequisites for communication are thus shared knowledge (the referred-to objects/concepts, i.e. //sets// of models) and shared encoding and interpretation methods: how syntax is used. And last but not least, shared //cultural// methods for how to handle //context// (background knowledge), including missing information. \\ This last point has to do with the tradeoff between compactness and the potential for misunderstanding (the more compact, the greater the danger of misinterpretation; the less compact, the longer it takes to communicate). |
| \\ What About 'Signs from the Gods'? | Is stormy ocean weather a "sign" that you should not go rowing in your tiny boat? \\ No, not directly, but the situation makes use of exactly the same machinery: the weather is a pattern. That pattern has implications for your goals - in particular, your goal to live, which would be prevented if you were to drown. Stormy weather has the potential to drown you. When someone says "this weather is dangerous" the implication is the same as looking out and seeing it for yourself, except that the //arbitrary patterns// of speech are involved in the first instance but not the second. |
| \\ Prediction Creates Meaning | Hearing the words "stormy weather" or seeing the raging storm, your models allow you to make predictions. These predictions are compared to your active goals to see if any of them will be prevented; if so, the storm may make you stay at home, in which case its meaning was a 'threat to your survival' (sketched in code after this table). \\ In the case where you //really really// want to go rowing, even stormy weather may not suffice to keep you at home - depending on your character or state of mind you may be prone to make that risk tradeoff in different ways. |
| \\ Models | When interpreting symbols, syntax and context, information structures are collected and put together to form //composite models// that can be used for computing the meaning. By //meaning// we really mean (no pun intended) the //implications// encapsulated in the //**now**//: What may come next and how can goals be impacted? |
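\\
A rough sketch of the "prediction creates meaning" idea: a handful of predictive models are chained forward from the current situation, and the resulting predictions are checked against the active goals; the goal-relevant implications are the "meaning" that gets produced. The rules, goals and threat table below are invented toy examples, loosely in the spirit of schema-like models, and only illustrate the flow of information.
<code python>
# Hypothetical sketch: meaning as the goal-relevant implications of predictions.

# Each "model" maps an observed condition to a predicted consequence (invented rules).
MODELS = [
    ("storm",         "boat capsizes"),
    ("boat capsizes", "drowning"),
    ("calm sea",      "pleasant rowing"),
]

ACTIVE_GOALS = {"stay alive", "go rowing"}

# Which predicted states threaten which goals (invented for the example).
THREATS = {"drowning": "stay alive", "boat capsizes": "go rowing"}

def predict(observations, steps=3):
    """Chain model predictions forward from the current observations."""
    states = set(observations)
    for _ in range(steps):
        states |= {consequence for condition, consequence in MODELS
                   if condition in states}
    return states

def meaning_of(observations):
    """Return the implications of the situation for the active goals."""
    predicted = predict(observations)
    return {state: THREATS[state] for state in predicted
            if state in THREATS and THREATS[state] in ACTIVE_GOALS}

print(meaning_of({"storm"}))     # both active goals are threatened
print(meaning_of({"calm sea"}))  # {} - no goal-relevant implications
</code>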
\\
====So, What Are Models?====
| \\ Model | A model of something is an information structure that behaves in some ways like the thing being modeled. \\ 'Model' here means exactly the same as the word does in the vernacular: look up any dictionary definition and that is what it means. A model of //something// is not the thing itself; it is in some way a 'mirror image' of it, typically with some unimportant details removed, and represented in a way that allows various //manipulations// for the purpose of //making predictions// (//answering questions//), where the forms of allowed manipulation are particular to the representation of the model and the questions to be answered. |
| \\ Example | A model of Earth sits on a shelf in my daughter's room. With it I can answer questions about the gross layout of continents, and the names assigned to various regions (as they were around 1977 - because that's when I got it :-) ). A model requires a //process for using// it. In this example that process is humans who can read and manipulate smallish objects. \\ The nature, design, and limitations of these processes determine in part what can be done with the models. |
| Computational Models | A typical type of question to be answered with computational (mathematical) models is the what-if question, and a typical method of manipulation is running simulations (producing deductions); see the toy example after this table. Along with this we need the appropriate computational machine. |
| \\ Model (again) | A 'model' in this conception has a target phenomenon that it applies to, and it has a form of representation, a comprehensiveness, and a level of detail; these are the primary features that determine what a model is good for. A computational model of the world in raw machine-readable form is not very efficient for a human who wants to quickly identify all the countries adjacent to Switzerland - for that a traditional globe is much better. For a machine, the exact opposite is true. Which means: the available //mechanisms for manipulating the models// matter! |
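\\
To illustrate the "Computational Models" row: a what-if question is answered by running the model forward (a simulation) rather than by looking the answer up. The falling-object model below is a toy example with made-up step sizes; any computational model plus a machine to run it plays the same role.
<code python>
# Toy computational model: answer a what-if question by simulation (deduction).

def simulate_fall(height_m, dt=0.01, g=9.81):
    """How long does an object dropped from height_m take to hit the ground?"""
    height, velocity, time = height_m, 0.0, 0.0
    while height > 0.0:
        velocity += g * dt       # model of how velocity changes each step
        height -= velocity * dt  # model of how position changes each step
        time += dt
    return time

# What if we drop it from 20 m instead of 5 m?
print(round(simulate_fall(5.0), 2))   # ~1.01 s
print(round(simulate_fall(20.0), 2))  # ~2.02 s
</code>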
\\
==== Symbols, Models, Syntax ====
| What Now? | Here comes some "glue" for connecting the above concepts, ideas, and claims in a way that unifies them into a coherent story that explains intelligence. |
| \\ \\ Knowledge | Knowledge is "actionable information" - information structures that can be used to //do stuff//, including \\ (a) predict (mostly deduce, but also abduce), \\ (b) derive potential causes (abduce - like Sherlock Holmes does), \\ ( c) explain (abduce), and \\ (d) re-create (like Einstein did with E=mc<sup>2</sup> and the Sims do in software). |
| \\ Knowledge \\ = \\ Models | Sets of models allow a thinking agent to do the above, by \\ (a) finding the relevant models for anything (given a certain situation and active goals), \\ (b) applying them according to the goals to derive predictions, \\ ( c) selecting the right actions based on these predictions such that the goals can be achieved, and \\ (d) monitoring the outcome. \\ (Learning then results from correcting the models that predicted incorrectly.) \\ A bare-bones sketch of this loop follows this table. |
| \\ What's Contained \\ in Models? | To work as building blocks for knowledge, models must, on their own or in sets, capture in some way: \\ - Patterns \\ - Relations \\ - Volitional acts \\ - Causal chains |
| Where Do The Symbols Come In? | Symbols are mechanisms for rounding up model sets - they are "handles" on the information structures. \\ In humans this "rounding up" happens subconsciously and automatically, most of the time, using similarity mapping (content-driven association). |
| \\ Syntactic Autonomy | To enable autonomous thought, the use of symbols for managing huge sets of models must follow certain rules. For determining the development of biological agents, these rules - their syntax - must exist in some form //a priori// of the developing, learning mind (encoded in DNA), because they determine what these symbols can and cannot do, from early infant life to more grown-up stages. In this sense, "syntax" means the "rules of management" of information structures (just like the use of symbols in human communication). |
| \\ Historical Note | Chomsky claimed that humans are born with a "language acquisition device". \\ What may be the case is that language simply sits on top of a more general set of "devices" for the formation of knowledge //in general//. |
| \\ Evolution & Cognition | Because thought depends on underlying biological structures, and because biological structure depends on ongoing maintenance processes, the syntax and semantics for creating a biological agent, and the syntax and semantics for generating meaningful thought in such an agent, both depend on //syntactic autonomy// - i.e. rules that determine how the referential processes of **encode-transmit-decode** work. |
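\\
The four steps (a)-(d) in the "Knowledge = Models" row, plus the learning step in parentheses, form a loop that can be sketched in a few lines. The model format (condition, action, prediction, reliability) and the toy world below are assumptions made purely for illustration; a real cognitive architecture would need far richer models and model-management machinery.
<code python>
# Bare sketch of the model-driven loop: retrieve -> predict -> act -> monitor -> correct.

# A "model": if <condition> holds and <action> is taken, predict <predicted>.
models = [
    {"condition": "hungry", "action": "sleep", "predicted": "not hungry", "reliability": 1.0},
    {"condition": "hungry", "action": "eat",   "predicted": "not hungry", "reliability": 1.0},
]

def world(state, action):
    """Toy environment (unknown to the agent): only eating removes hunger."""
    return "not hungry" if (state == "hungry" and action == "eat") else state

def step(state, goal):
    # (a) find the models relevant to the current state and the active goal
    relevant = [m for m in models if m["condition"] == state and m["predicted"] == goal]
    # (b) + (c) use their predictions to select the most reliable action
    chosen = max(relevant, key=lambda m: m["reliability"])
    outcome = world(state, chosen["action"])
    # (d) monitor the outcome; learning = penalizing models that predicted wrongly
    if outcome != chosen["predicted"]:
        chosen["reliability"] -= 0.5
    return outcome

state, goal = "hungry", "not hungry"
while state != goal:
    state = step(state, goal)

# The model that predicted incorrectly has lost reliability; the good one kept it.
print([(m["action"], m["reliability"]) for m in models])  # [('sleep', 0.5), ('eat', 1.0)]
</code>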
\\
====Closure====
| \\ Operational Closure | \\ What it is | The ability of a system to map input from the environment to output in pursuit of its purpose autonomously - i.e. its //autonomous operation//. \\ In terms of goal-directed learning controllers, it is the ability to create and appropriately apply **efficient-cause models**. |
| | Why it is important | Life and cognition depend on it. |
| Organizational Closure | What it is | The ability of a system to maintain its own structure autonomously, in light of perturbations. |
| | Why it is important | Life depends on it. |
| | Subsumed By | Operational closure. |
| Semantic Closure | What it is | The ability of a system to generate meaning autonomously for itself. \\ For a goal-directed learning controller this is the ability to model self and environment. |
| | Why it is important | Cognition depends on it. |
| | Subsumed By | Organizational closure. |
\\
====Reasoning====
| What It Is | Establishing axioms for the world and applying logic to them. |
| Depends On | Semantic closure. |
| But The World Is Non-Axiomatic! | Yes. But there is no way to apply logic unless we hypothesize some pseudo-axioms. The only difference between this and mathematics is that in science we must accept that the so-called "laws" of physics may be only conditionally correct (or possibly even completely incorrect, in light of our goal of figuring out the "ultimate" truth about how the universe works). |
| Deduction | Deriving a conclusion that necessarily follows from statements (premises) taken to be true. \\ //Example: If it's true that all swans are white, and Joe is a swan, then Joe must be white//. |
| Abduction | Reasoning from conclusions (observed effects) to likely causes. \\ //Example: If the light is on, and it was off just a minute ago, someone must have flipped the switch//. |
| Induction | Generalization from observation. \\ //Example: All the swans I have ever seen have been white, hence I theorize that all swans are white//. |
\\
====System & Architectural Requirements for Using Models====
| Model Acquisition | ≈ model generation: the ability to create models of (observed) phenomena. |
| Effectiveness | Creation of models must be effective - otherwise the system will spend too much time creating useless or bad models. \\ Making model creation effective may require e.g. parallelizing the execution of operations on them. |
| Efficiency | The operations on models listed above must be efficient, lest they interfere with the normal operation of the system / agent. \\ One way to achieve temporal efficiency is to parallelize their execution and keep the operations simple. |
| Scalability | For any moderately interesting / complex environment, a vast number of models may be entertained and considered at any point in time, and thus a large set of //potential// models must be manipulatable by the system / agent. |
\\ \\ \\ \\ \\ //2022(c)K.R.Thórisson//