[[public:t-720-atai:atai-19:main|T-720-ATAI-2019 Main]] \\
[[public:t-720-atai:atai-19:Lecture_Notes|Links to Lecture Notes]]
  
  
  
=====T-720-ATAI-2019=====
====Lecture Notes, W7: Evaluation====
\\
\\

---------------
=====Evaluation of Intelligent Systems=====

\\
\\
  
====Sources of Evaluation Methods====
|  **Psychology**  | Uses tests based on a single measure at a single point in time. Produces a single "IQ" score.  ||
|     | Method  | Create a set of test items, administer them to a sample pool of people of various ages, and measure each item's ability to distinguish individuals from each other (diversity). A subset of the test items is then selected based on the largest discriminatory power and normalized for age groups.  |
|     | Pros    | Well-established method for measuring human intelligence.    |
|     | Cons    | Present and future AI systems are very different from human intelligence. Worse, the normalization of standard psychometrics for humans isn't possible for AIs, because they are not likely to consist of populations of similar AI systems. Even if they did, these methods only provide relative measurements. Another serious problem is that they rely heavily on a subject's prior knowledge and training.   |
|  **AI**  | Board games, robo-football, a handful of toy problems (e.g. mountain car, diving for gold).   ||
|      | Method  | Standard board games that humans play, used unmodified or in simplified versions, to distinguish between the best AI systems capable of playing these games.  |
|      | Pros  | Simple tests with a single measure provide unequivocal scores that can be compared. Relatively easy to implement and administer.   |
|      | Cons  | A single dimension along which to measure intelligence is too simplistic, and subject to the same problems as IQ tests. All systems in the first 40 years of AI could only play a single board game (the General Game Playing Competition was intended to address this limitation).   |
|  **AGI**  | Turing Test, Piaget-MacGyver Room, Lovelace Test, Toy-Box Problem.   ||
|       | Method  | Human-like conditions extended to apply to intelligent machines.   |
|       | Pros    | Better than single-measure methods in many ways.     |
|       | Cons    | Measure intelligence at a single point in time. Many are difficult to implement and administer.    |
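The psychometric item-selection procedure above can be sketched in code. This is a minimal illustration, assuming binary (right/wrong) item scores and a correlation-based discrimination index; the function names and the tiny data format are invented for this sketch, not taken from any standard psychometrics package.

```python
# Sketch of psychometric test construction: pick items by discriminatory
# power, then normalize total scores within each age group.
# Assumptions (for illustration only): binary item scores; discrimination
# measured as the correlation between an item's scores and total scores.
from statistics import mean, pstdev

def discrimination(item_scores, total_scores):
    """Correlation between one item's scores and the total test scores."""
    mi, mt = mean(item_scores), mean(total_scores)
    cov = mean((i - mi) * (t - mt) for i, t in zip(item_scores, total_scores))
    si, st = pstdev(item_scores), pstdev(total_scores)
    return cov / (si * st) if si and st else 0.0

def select_items(responses, k):
    """Keep the k items with the largest discriminatory power.

    responses[s][j] = score of subject s on item j (0 or 1).
    """
    totals = [sum(r) for r in responses]
    n_items = len(responses[0])
    d = [discrimination([r[j] for r in responses], totals)
         for j in range(n_items)]
    return sorted(range(n_items), key=lambda j: d[j], reverse=True)[:k]

def normalize_by_age(scores, ages):
    """z-score each subject's total score within their own age group."""
    groups = {}
    for s, a in zip(scores, ages):
        groups.setdefault(a, []).append(s)
    stats = {a: (mean(v), pstdev(v) or 1.0) for a, v in groups.items()}
    return [(s - stats[a][0]) / stats[a][1] for s, a in zip(scores, ages)]
```

Note how the Cons row above follows directly from this sketch: `normalize_by_age` only works given a large population of comparable subjects, and it only ever yields relative (within-group) scores.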
  
\\
\\
  
====Turing Test====
|  What it is  | A test for intelligence proposed by Alan Turing in 1950.   |
|  Why it's relevant  | The first proposal for how to evaluate an intelligent machine. Proposed as a way to get a pragmatic/working definition of the concept of //intelligence//.  |
|  Method  | It is played with three people, a man (A), a woman (B), and an interrogator (C) who may be of either sex. The interrogator stays in a room apart from the other two. The object of the game for the interrogator is to determine which of the other two is the man and which is the woman. He knows them by labels X and Y, and at the end of the game he says either "X is A and Y is B" or "X is B and Y is A." We now ask the question, "What will happen when a machine takes the part of A in this game?"  |
|  Pros  | It is difficult to imagine that an honest, collaborative machine playing this game for several days or months could ever fool a human into thinking it was a grown human, unless it really understood a great deal.   |
|  Cons  | Targets evaluation at a single point in time. Anchored in human language, social convention and dialogue.  |
|  Implementations  | The Loebner Prize competition has been running for some decades, offering a large financial prize for the first machine to "pass the Turing Test". None of the competing machines has thus far offered any significant advances in the field of AI, and most certainly not to AGI.    |
|  Bottom Line  | //"It's important to note that Turing never meant for his test to be the official benchmark as to whether a machine or computer program can actually think like a human"// (- Mark Riedl) |
|  Paper  | [[https://chatbotsmagazine.com/how-to-win-a-turing-test-the-loebner-prize-3ac2752250f1|Loebner prize article]]  |
  
\\
\\
====Piaget-MacGyver Room====
|  What it is   | "[W]e define a room, the Piaget-MacGyver Room (PMR), which is such that, an [information-processing] artifact can credibly be classified as general-intelligent if and only if it can succeed on any test constructed from the ingredients in this room." No advance notice is given to the engineers of the artifact in question as to what the test is going to be.    |
|  Why it's relevant  | One of the first attempts at explicitly getting away from a specific test or test suite for testing intelligence.  |
|  REF  | [[http://kryten.mm.rpi.edu/Bringsjord_Licato_PAGI_071512.pdf|Bringsjord & Licato]]   |
  
\\
\\

====The Toy Box Problem====
|  What it is   | A proposal for evaluating the intelligence of an agent.   |
|  Why it's relevant  | One of several novel methods proposed for this purpose; focuses on variety, novelty and exploration.  |
|  Method   | A robot is given a box of previously unseen toys. The toys vary in shape, appearance and construction materials. Some toys may be entirely unique, some toys may be identical, and yet other toys may share certain characteristics (such as shape or construction materials). The robot has an opportunity to first play and experiment with the toys, but is subsequently tested on its knowledge of the toys. It must predict the responses of new interactions with toys, and the likely behavior of previously unseen toys made from similar materials or of similar shape or appearance. Furthermore, should the toy box be emptied onto the floor, it must also be able to generate an appropriate sequence of actions to return the toys to the box without causing damage to any toys (or itself).  |
|  Pros  | Includes perception and action explicitly. Specifically designed as a stepping stone towards general intelligence; a solution to the simplest instances should not require universal or human-like intelligence.   |
|  Cons  | Limited to a single instance in time. Somewhat too limited to dexterity guided by vision, missing out on reasoning, creativity, and many other factors.   |
|  REF  | [[http://agi-conf.org/2010/wp-content/uploads/2009/06/paper_54.pdf|Johnston]]  |
  
\\
\\

====Lovelace Test 2.0====
|  What it is  | A proposal for how to evaluate the creativity of an agent.  |
|  Why it's relevant  | The only test focusing explicitly on creativity.  |
|  Method  | Artificial agent <m>a</m> is challenged as follows: 1. <m>a</m> must create an artifact <m>o</m> of type <m>t</m>; 2. <m>o</m> must conform to a set of constraints <m>C</m>, where <m>c_i</m> ∈ C is any criterion expressible in natural language; 3. a human evaluator <m>h</m>, having chosen <m>t</m> and <m>C</m>, is satisfied that <m>o</m> is a valid instance of <m>t</m> and meets <m>C</m>; and 4. a human referee <m>r</m> determines the combination of <m>t</m> and <m>C</m> to not be unrealistic for an average human.  |
|  Pros  | Brings creativity to the forefront of intelligence testing.   |
|  Cons  | Narrow focus on creativity. Too restricted to human experience and knowledge (last point).  |
|  REF  | [[http://arxiv.org/pdf/1410.6142v3.pdf|Riedl]]   |
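The four Lovelace 2.0 criteria can be written down as a simple data structure and check. This is a sketch of the test's //structure// only: the evaluator <m>h</m> and referee <m>r</m> are modeled here as boolean callbacks, which is an illustrative stand-in — in the actual test these are human judgments, and the class/field names are invented.

```python
# Minimal sketch of the Lovelace 2.0 test structure.
# Assumption for illustration: the human evaluator (h) and referee (r)
# are modeled as boolean callbacks; in the real test they are humans.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class LovelaceChallenge:
    artifact_type: str                           # t: type of artifact to create
    constraints: List[str]                       # C: natural-language criteria
    evaluator_accepts: Callable[[object], bool]  # h: is o a valid t meeting C?
    referee_realistic: Callable[[], bool]        # r: is (t, C) realistic for
                                                 #    an average human?

    def passes(self, artifact: object) -> bool:
        """The agent passes iff the evaluator accepts the artifact AND the
        referee deems the (t, C) combination realistic for an average human."""
        return self.evaluator_accepts(artifact) and self.referee_realistic()
```

For example, a challenge of type "story" constrained to "mentions a boat" would pass only an artifact that the (human) evaluator accepts under that constraint.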
  
\\
\\

====Requirements for Evaluation: Features That Evaluators Should Be Able To Control====
|  Determinism  | Both full determinism and partial stochasticity (for realism regarding e.g. noise and stochastic events) must be supported.   |
|  Ergodicity  | The reachability of (aspects of) states from others determines the degree to which the agent can undo things and get second chances.  |
|  Continuity  | For evaluation to be relevant to e.g. robotics, it is critical to allow continuous variables, to appropriately represent continuous spatial and temporal features. The degree to which continuity is approximated (discretization granularity) should be changeable for any variable.  |
|  Asynchronicity  | Any action in the task-environment, including sensors and controls, may operate on arbitrary time scales and interact at any time, letting an agent respond when it can.  |
|  Dynamism  | A static task-environment's state only changes in response to the AI's actions. The most simplistic ones are step-lock, where the agent makes one move and the environment responds with another (e.g. board games). More complex environments can be dynamic to various degrees in terms of speed and magnitude; changes may be caused by interactions between environmental factors, or simply by the passage of time.  |
|  Observability  | Task-environments can be partially observable to varying degrees, depending on the type, range, refresh rate, and precision of available sensors, affecting the difficulty and general nature of the task-environment.   |
|  Controllability  | The control that the agent can exercise over the environment to achieve its goals can be partial or full, depending on the capability, type, range, inherent latency, and precision of available actuators.   |
|  Multiple Parallel Causal Chains  | Any generally intelligent system in a complex environment is likely to be trying to meet multiple objectives that can be co-dependent in various ways through any number of causal chains in the task-environment. Actions, observations, and tasks may occur sequentially or in parallel (at the same time). Needed to implement real-world clock environments.  |
|  Periodicity  | Many structures and events in nature are repetitive to some extent, and therefore contain a (learnable) periodic cycle (e.g. the day-night cycle or blocks of identical houses).   |
|  Repeatability  | Both fully deterministic and partially stochastic environments must be fully repeatable, for traceable transparency.  |
|  REF  | [[http://alumni.media.mit.edu/~kris/ftp/AGIEvaluationFlexibleFramework-ThorissonEtAl2015.pdf|Thorisson, Bieger, Schiffel & Garrett]]   |
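These controllable features can be thought of as knobs on a task-environment generator. The sketch below is a hypothetical parameterization — every field name is invented for illustration, not taken from the referenced framework — showing in particular how seeding makes even a partially stochastic environment fully repeatable (the Determinism and Repeatability rows above):

```python
# Hypothetical task-environment parameterization. Field names are invented
# for illustration and map onto the controllable features listed above.
import random
from dataclasses import dataclass

@dataclass
class TaskEnvConfig:
    seed: int = 0               # repeatability: same seed -> identical run
    noise: float = 0.0          # determinism: 0.0 = fully deterministic
    tick_seconds: float = 0.1   # continuity: discretization granularity
    observable_vars: int = 10   # observability: how much the agent senses
    controllable_vars: int = 2  # controllability: how much it can actuate
    period: float = 24.0        # periodicity: cycle length of repeating events

def run(cfg: TaskEnvConfig, steps: int):
    """Generate a trace of (possibly noisy) observations for `steps` ticks."""
    rng = random.Random(cfg.seed)   # private RNG keeps runs repeatable
    trace = []
    for t in range(steps):
        base = (t * cfg.tick_seconds) % cfg.period   # periodic base signal
        trace.append(base + rng.gauss(0, cfg.noise) if cfg.noise else base)
    return trace
```

Two runs with the same seed produce identical traces even when `noise > 0`, which is exactly the repeatability requirement: stochasticity for realism, yet full traceability for evaluation.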
  
\\
\\
====Requirements for Evaluation: Settings That Must Be Obtainable====

|  Complexity  | Environment is complex, with diverse interacting objects.   |
|  Dynamicity  | Environment is dynamic.  |
|  Regularity  | Task-relevant regularities exist at multiple time scales.  |
|  Task Diversity  | Tasks can be complex, diverse, and novel.   |
|  Interactions  | Agent/environment/task interactions are complex and limited.  |
|  Computational limitations  | Agent computational resources are limited.  |
|  Persistence  | Agent existence is long-term and continual.  |
|  REF   | [[http://www.atlantis-press.com/php/download_paper.php?id=1900|Laird et al.]]   |
  
\\
\\
  
====Example Frameworks for Evaluating AI Systems====
|  \\ \\ Merlin  | A significant problem facing researchers in reinforcement and multi-objective learning is the lack of good benchmarks. Merlin (for Multi-objective Environments for Reinforcement LearnINg) is a software tool and method for enabling the creation of random problem instances, including multi-objective learning problems, with specific structural properties. Merlin provides the ability to control task features in predictable ways, allowing researchers to build a more detailed understanding of which features of a problem interact with a given learning algorithm, improving or degrading its performance.   |  [[http://alumni.media.mit.edu/~kris/ftp/Tunable-generic-Garrett-etal-2014.pdf|Paper]] by Garrett et al.  |
|  \\ FRaMoTEC  | Framework that allows modular construction of physical task-environments for evaluating intelligent control systems. A proto-task theory, on which the framework is built, aims for a deeper understanding of tasks in general, with a future goal of providing a theoretical foundation for all resource-bounded real-world tasks. Tasks constructed in the framework can be rooted in physics, allowing the performance of control systems to be analyzed in terms of expended time and energy.   |  [[http://alumni.media.mit.edu/~kris/ftp/EGPAI_2016_paper_8.pdf|Paper]] by Thorarensen et al.   |
|  AI Gym  | Gym is a toolkit developed by OpenAI for developing and comparing reinforcement learning algorithms. It supports teaching agents everything from walking to playing games like Pong or Pinball.    |  [[https://gym.openai.com|Link]] to website.  |
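The agent-environment loop shared by Gym and similar frameworks can be sketched with a stub environment. `GuessEnv` below is invented for illustration; real Gym environments are created with `gym.make(...)` and expose the same `reset`/`step` interface (note that older Gym versions return a 4-tuple from `step`, as here, while newer releases split `done` into `terminated`/`truncated`):

```python
# Minimal Gym-style reset/step evaluation loop with a stub environment.
# GuessEnv is a made-up toy task; real environments come from gym.make(...).
import random

class GuessEnv:
    """Toy environment: reward the agent for guessing a hidden digit."""
    def __init__(self, seed=0):
        self._rng = random.Random(seed)

    def reset(self):
        self._target = self._rng.randint(0, 9)
        return 0  # initial observation

    def step(self, action):
        done = action == self._target
        reward = 1.0 if done else 0.0
        obs = 1 if action > self._target else -1  # hint: too high / too low
        return obs, reward, done, {}  # (observation, reward, done, info)

def run_episode(env, policy, max_steps=100):
    """Standard evaluation loop: total reward over one episode."""
    obs, total = env.reset(), 0.0
    for _ in range(max_steps):
        obs, reward, done, _ = env.step(policy(obs))
        total += reward
        if done:
            break
    return total
```

The same `run_episode` works unchanged for any environment honoring this interface, which is what makes such toolkits convenient for //comparing// algorithms — though, per the State of the Art discussion below, a single episode score is still a single measure at a single point in time.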
  
\\
\\

====State of the Art====
|  Summary   | Practically all proposals to date for evaluating intelligence leave out major aspects of intelligence. Virtually no proposals exist for evaluating knowledge transfer, attentional capabilities, knowledge acquisition, knowledge capacity, knowledge retention, multi-goal learning, social intelligence, creativity, reasoning, cognitive growth, and meta-learning / integrated cognitive control -- all of which are quite likely vital to achieving general intelligence on par with humans.  |
|  What is needed  | A theory of intelligence that allows us to construct adequate, thorough, and comprehensive tests of intelligence and intelligent behavior.  |
|  What can be done  | In lieu of such a theory (which is still not forthcoming after over 100 years of psychology and 60 years of AI) we could use a multi-dimensional "Lego" kit for exploring various means of measuring intelligence and intelligent performance, so as to be able to evaluate the pros and cons of various approaches, methods, scales, etc. \\ Some sort of kit meeting part or all of the requirements listed above would go a long way towards bridging the gap, and might generate ideas that could speed up theoretical development.   |
  
\\
\\
\\
2019 (c) K. R. Thórisson \\
//EOF//
  
Last modified: 2019/09/10 15:02 by thorisson