public:t_720_atai:atai-18:lecture_notes_w7
Edited 2018/10/04 12:25 by thorisson; last modified 2024/04/29 13:33.

|  What it is  | The acquisition of information in order to improve performance with respect to some Goal or set of Goals.   |
|   Learning from experience   | A method for learning. Also called "learning by doing": An Agent <m>A</m> does action <m>a</m> to phenomenon <m>p</m> in context <m>c</m> and uses the result to improve its ability to act on Goals involving <m>p</m>. All higher-level Earth-bound intelligences learn from experience.  |
|   Learning by observation   | A method for learning. An Agent <m>A</m> learns how to achieve Goal <m>G</m> by receiving realtime information about some other Agent <m>A'</m> achieving Goal <m>G</m> by doing action <m>a</m>.  |
|   Learning from reasoning   | A method for learning. Using deduction, induction and abduction to simulate, generalize, and infer, respectively, new information from acquired information. \\ Most effectively used in combination with Learning from Experience.   |
|   Multi-objective learning   | Learning while aiming to achieve more than one Goal.   |

|   Input Data   | Discrete variables.    |
|   Output Data   | Discrete variables.    |
|   Max. # I/O Vars.    | Preferably <m>card_max(I union O) <= 8</m>.    |
|   Min. Cycles   | Depends on the number of input and output variables.    |
|  Training  | On-task.   |
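Assuming the profile above describes a tabular, reinforcement-style learner (discrete inputs and outputs, a handful of variables, learning while performing the task), a minimal sketch might look as follows; the corridor environment, its size, and all learning parameters are hypothetical:

```python
import random

# Hypothetical task: a 6-state corridor; reaching the rightmost state pays +1.
# Discrete input (state) and output (action), few variables, and "on-task"
# training: the Q-table is updated while the agent performs the task,
# with no separate training phase.
N_STATES = 6
ACTIONS = (-1, +1)                        # move left / move right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1     # learning rate, discount, exploration

Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def greedy(s):
    best = max(Q[(s, a)] for a in ACTIONS)
    return random.choice([a for a in ACTIONS if Q[(s, a)] == best])

def step(s, a):
    s2 = min(max(s + a, 0), N_STATES - 1)
    return s2, (1.0 if s2 == N_STATES - 1 else 0.0)

random.seed(0)
for episode in range(200):                # each episode = one run of the task
    s = 0
    while s != N_STATES - 1:
        a = random.choice(ACTIONS) if random.random() < EPSILON else greedy(s)
        s2, r = step(s, a)
        target = r + GAMMA * max(Q[(s2, b)] for b in ACTIONS)
        Q[(s, a)] += ALPHA * (target - Q[(s, a)])   # on-task update
        s = s2

# The learned greedy policy moves right (+1) from every non-goal state.
policy = {s: greedy(s) for s in range(N_STATES - 1)}
```

Note how the minimum number of cycles needed scales with the number of state/action variables: the Q-table has one entry per state-action pair, each of which must be visited repeatedly.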

|   Input Data   | Continuous and discrete variables.    |
|   Output Data   | Continuous and discrete variables.    |
|   Max. I/O Vars.    | Very high (limited by CPU and BW).     |
|   Min. Cycles   | Typical numbers 4k-10k. \\ Depends on data complexity and number of layers in the ANN.     |
|  Training  | Off-task. Learning turned off when fully trained.    |
|  Training Style  | Training phase BILL; discrete training steps.   |
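The regime in this table -- an off-task training phase of thousands of discrete cycles, after which learning is switched off -- can be sketched with a tiny feed-forward net trained by backpropagation on XOR; the network size, learning rate, loss, and cycle count are all illustrative choices, not prescribed by the notes:

```python
import math
import random

# Illustrative sketch: a 2-4-1 feed-forward net learns XOR in a separate,
# off-task training phase made of discrete cycles; afterwards the weights
# are frozen and the net only performs.
random.seed(1)
H = 4                                         # hidden units
sig = lambda z: 1.0 / (1.0 + math.exp(-z))

W1 = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(H)]
b1 = [0.0] * H
W2 = [random.uniform(-1, 1) for _ in range(H)]
b2 = 0.0

def forward(x):
    h = [sig(W1[j][0] * x[0] + W1[j][1] * x[1] + b1[j]) for j in range(H)]
    y = sig(sum(W2[j] * h[j] for j in range(H)) + b2)
    return h, y

DATA = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]
LR = 0.5
for cycle in range(5000):                     # thousands of training cycles
    for x, t in DATA:
        h, y = forward(x)
        d_out = y - t                         # cross-entropy output delta
        for j in range(H):
            d_h = d_out * W2[j] * h[j] * (1 - h[j])
            W2[j] -= LR * d_out * h[j]
            b1[j] -= LR * d_h
            for i in range(2):
                W1[j][i] -= LR * d_h * x[i]
        b2 -= LR * d_out

# Training switched off: the frozen net is now only queried.
for x, t in DATA:
    print(x, round(forward(x)[1]))
```

Even this four-example task needs thousands of cycles to converge, illustrating the "Min. Cycles" row; larger nets and more complex data push the number higher.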

\\
====Cumulative Learning====
|  What it Is  | Unifies several separate research tracks in a coherent form easily relatable to AGI requirements: multitask learning, online learning, lifelong learning, robust knowledge acquisition, transfer learning, and few-shot learning.   |
|  Multitask Learning   | The ability to learn more than one task, either at once or in sequence. \\ The cumulative learner's ability to generalize, investigate, and reason will affect how well it implements this ability. \\ //Subsumed by cumulative learning because knowledge is contextualized as it is acquired, meaning that the system has a place and a time for every tiny bit of information it absorbs.//   |
|  Online Learning  | The ability to learn continuously, uninterrupted, and in real-time from experience as it comes, without specifically iterating over it many times. \\ //Subsumed by cumulative learning because new information, which comes in via experience, is integrated with prior knowledge at the time it is acquired, so a cumulative learner is always learning as it is doing other things.//   |
|  \\ Lifelong Learning  | Means that an AI system keeps learning and integrating knowledge throughout its operational lifetime: learning is "always on". \\ However this is measured, we expect at a minimum the "learning cycle" -- alternating learning and non-learning periods -- to be free from designer tampering or intervention at runtime. Provided this, the smaller those periods become (relative to the shortest perception-action cycle, for instance), to the point of being considered virtually or completely continuous, the better the "learning always on" requirement is met. \\ //Subsumed by cumulative learning because continuous online learning is steady and ongoing all the time -- why switch it off?//   |
|  Robust Knowledge Acquisition  | Its antithesis is brittle learning, where new knowledge results in catastrophic perturbations of prior knowledge (and behavior). \\ //Subsumed by cumulative learning because new information is integrated continuously online, so the increments are frequent and small; inconsistencies in prior knowledge are exposed in the process, and because learning is life-long there are frequent opportunities to fix them, making new information highly unlikely to result in e.g. catastrophic forgetting.//   |
|  Transfer Learning  | The ability to build new knowledge on top of old in a way that the old knowledge facilitates learning the new. While interference/forgetting should not occur, knowledge should still be defeasible: the physical world is non-axiomatic, so **//any//** knowledge could be proven incorrect in light of contradicting evidence. \\ //Subsumed by cumulative learning because new information is integrated with old information, which may expose inconsistencies, missing data, etc., which are then dealt with as a natural part of the cumulative learning operations.//   |
|  Few-Shot Learning  | The ability to learn something from very few examples or very little data. Common variants include one-shot learning, where the learner only needs to be told (or experience) something once, and zero-shot learning, where the learner infers it without needing to experience or be told. \\ //Subsumed by cumulative learning because prior knowledge is transferable to new information, meaning that (theoretically) only the delta between what has previously been learned and what the new information requires needs to be learned.//   |

\\
\\
====The Pedagogical Pentagon: A Framework for (Artificial) Pedagogy====
|  \\ What is Needed  | We need a //universal theory of learning// -- and in fact of //teaching//, //training//, //task-environments//, and //evaluation//. \\ Anything short of having complete theories for all of these means that experimentation, exploration, and blind search are the only ways to answer questions about performance, curriculum design, training requirements, etc., from which we can never get more than partial, limited answers.     |
|  {{http://cadia.ru.is/wiki/_media/pedagogical_pentagon_full1.png}}  ||

|  \\ The Learner  | Intelligent systems continually receive inputs/observations from their environment and send outputs/actions back. Some of the system’s inputs may be treated specially — e.g. as feedback or a reward signal, possibly provided by a teacher. Since intelligent action can only be called "intelligent" if it is trying to achieve something -- against which the level of intelligence can be evaluated -- we model intelligent agents as imperfect optimizers of some (possibly unknown) real-valued objective function.      |
|  \\ Tasks  | Learning systems adjust their knowledge as a result of interactions with a task-environment. Defined by (possibly a variety of) objective functions, as well as (possibly) instructions (i.e. knowledge provided at the start of the task, e.g. as a "seed", or continuously or intermittently throughout its duration). Since tasks can only be defined w.r.t. some environment, we often refer to the combination of a task and its environment as a single unit: the task-environment.   |
|  \\ Teacher  | The goal of the teacher is to influence the learner’s task-environments in such a way that progress towards the learning goal is facilitated. The teacher’s teaching task is to change the learner’s knowledge in some way (e.g. to make the learner understand something, or increase the learner’s skill on some metric).    |
|   Environment & Task   | The learner and the teacher each interact with their own view of the world (i.e. their own “environments”), which are typically different but overlapping to some degree.  |
|  \\ Training  | Viewed from a teacher’s and intentional learner’s point of view, “training” means the actions taken (repeatedly) over time with the goal of becoming better at some task, by avoiding learning erroneous skills/things and avoiding forgetting or unlearning desirable skills/things.    |

|  State of the Art  | No good scientific theory of teaching exists.    |
|  \\ Why Artificial Pedagogy?  | - Current machine teaching is ad hoc. \\ - Sophisticated teaching needed in complex domains. \\ - Sufficiently advanced learners now exist. \\ - Relevance will increase as AI field advances.   |
|  \\ Programming \\ vs. \\ Teaching  | Programming: \\ – minimal seed knowledge required \\ – precise \\ \\ Teaching: \\ – natural \\ – adaptive \\ – on-the-fly \\ – can’t program everything    |
|  \\ Teaching Methods  | - Heuristic Rewarding \\ - Decomposition \\ - Simplification \\ - Situation Selection \\ - Teleoperation \\ - Demonstration \\ - Coaching \\ - Explanation \\ - Cooperation \\ - Socratic method     |
  

==== Artificial Pedagogy Tutoring Methods ====
|  Heuristic Rewards  | Giving the learner intermediate feedback about performance. \\ Related: Reward shaping, gamification, heuristics in e.g. minimax game playing. \\ RL example: Different reward for positive/negative step.   |
|  \\ Decomposition  | Decomposition of whole, complex tasks into smaller components. \\ Related: Whole-task vs. part-task training, curriculum learning, (catastrophic interference), (transfer learning), (multitask learning). \\ RL example: Sliding puzzle at goal location on grid.    |
|  Situation Selection  | Selecting situations (or data) for the learner to focus on, e.g. simpler or more difficult situations. \\ Related: Boosting, ML application development, big data, active learning. \\ RL example: Start (or stop) in problematic states. |
|  \\ Teleoperation  | Temporarily taking control of the learner’s actions so they can experience them. \\ Applications: Tennis/golf/chess, robot ping-pong, artificial tutor. \\ RL example: Force good or random moves.     |
|  \\ Demonstration  | Showing the learner how to accomplish a task. \\ Requirements: Desire to imitate, ability to map the tutor's actions onto own actions, generalization ability. \\ Related: Apprenticeship learning, inverse reinforcement learning, imitation learning. \\ RL example: Nonexistent.    |
|  \\ Coaching  | Giving the learner instructions on what action to take during the task. \\ Requirements: Ability to map language-based instruction onto actions, generalization ability. \\ Related: Supervised learning. \\ RL example: Add input that specifies correct output.     |
|  \\ Explanation  | Explaining to the learner how to approach certain situations before the learner starts (a new instance of) the task. \\ Requirements: Language capability, generalization ability. \\ Related: Imperative programming, analogies. \\ RL example: Nonexistent.     |
|  \\ Cooperation  | Doing a task together with the learner to facilitate other tutoring techniques.   |
|  Socratic Method  | Asking questions to encourage critical thinking and guide the learner towards its own conclusions. \\ Related: Shaping, chaining. \\ RL example: Nonexistent. \\ **NARS Example:** \\ > <dog --> mammal>. \\ > <<$x --> mammal> --> <$x --> [breaths]>>. \\ > <{Spike} --> dog>. \\ > <{Spike} --> [breaths]>? // main question \\ > <{Spike} --> mammal>? // helping question   |
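The Heuristic Rewards row above can be made concrete with potential-based reward shaping, a standard way for a teacher to add intermediate feedback without changing which policies are optimal; the corridor task and the potential function below are hypothetical:

```python
# Heuristic rewarding via potential-based shaping: the teacher adds a bonus
# F(s, s') = gamma * phi(s') - phi(s) on top of the sparse task reward, where
# phi is a heuristic estimate of progress. The corridor and phi are made up.

N = 6                       # corridor states 0..5, goal at state 5
GAMMA = 1.0
phi = lambda s: s           # heuristic potential: distance covered towards goal

def task_reward(s, s2):
    return 1.0 if s2 == N - 1 else 0.0        # sparse: paid only at the goal

def shaped_reward(s, s2):
    return task_reward(s, s2) + GAMMA * phi(s2) - phi(s)

# Along any full trajectory the shaping terms telescope (here gamma = 1),
# so the total bonus depends only on the start and end potentials -- which is
# why shaping cannot change which policies are optimal.
traj = [0, 1, 0, 1, 2, 3, 4, 5]
bonus = sum(GAMMA * phi(b) - phi(a) for a, b in zip(traj, traj[1:]))
print(bonus)                # phi(5) - phi(0) = 5
```

The learner now receives informative feedback on every step (+1 for moving towards the goal, -1 for moving away) instead of a single reward at the end, which is exactly the "different reward for positive/negative step" RL example in the table.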
  