Learning
Terminology & Definitions
Key Learning Terms
What it is | Learning is a process that has the intent of acquiring actionable information, a.k.a. knowledge. |
Key Features | Inherits key features of any process: - Purpose: To adapt, to respond in rational ways to problems / to achieve foreseen goals; this factor determines how the rest of the features in this list are measured. - Speed: The speed of learning. - Data: The data that the learning (and particular measured speed of learning) requires. - Quality: How well something is learned. - Retention: The robustness of what has been learned, i.e. how well it stays intact over time. - Transfer: How general the learning is, how broadly what is learned can be employed for the purposes of adaptation or achievement of goals. - Meta-Learning: A learner may improve its learning abilities, i.e. be capable of meta-learning. - Progress Signal(s): A learner needs to know how its learning is going, and if there is improvement, how much. |
Measurements | Assessing any of the above requires measuring some parameters; each of the above factors can be measured in many ways. |
Major Caveat | Since learning interacts with (is affected by) the task-environment and world that the learning takes place in, as well as the nature of these in the learner's subsequent deployment, none of the above features can be assessed by looking only at the learner. This is addressed by the Pedagogical Pentagon. |
Learning Controllers
A Learner | Adaptive/intelligent system/controller, embodied and situated in a task-environment, that continually receives inputs/observations (measurements) from its environment and sends outputs/actions back (signals to its manipulators). Some of the learner’s inputs may be treated specially, e.g. as feedback or a reward signal, possibly provided by a teacher or a specially-rigged training task-environment. Since action can only be evaluated as “intelligent” in light of what it is trying to achieve, we model intelligent agents as imperfect optimizers of some (possibly unknown) real-valued objective function. Note that this working definition fits experience-based learning. |
Embodiment | The interface between a learning controller and the task-environment. |
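The working definition of a learner can be sketched as a perception-action loop. This is only an illustration under stated assumptions: the `TwoLeverEnv` task-environment, the `Learner` class, the `run` function, and the reward rule are hypothetical names made up for this sketch and do not come from the course material. The learner treats the reward as a special input and acts as an imperfect optimizer of it.

```python
class TwoLeverEnv:
    """Hypothetical task-environment: one of two actions yields reward 1.0."""

    def __init__(self, rewarded_action):
        self.rewarded_action = rewarded_action

    def step(self, action):
        # The observation is constant here; the reward is the input that the
        # learner treats specially.
        observation = 0
        reward = 1.0 if action == self.rewarded_action else 0.0
        return observation, reward


class Learner:
    """Keeps a running average of the reward obtained by each action."""

    def __init__(self, n_actions):
        # Optimistic initial estimates make the learner try every action once.
        self.estimates = [1.0] * n_actions
        self.counts = [0] * n_actions

    def act(self, observation):
        # Imperfect optimizer: pick the action with the best current estimate.
        return max(range(len(self.estimates)), key=lambda a: self.estimates[a])

    def learn(self, action, reward):
        self.counts[action] += 1
        # Incremental mean of the rewards seen so far for this action.
        self.estimates[action] += (reward - self.estimates[action]) / self.counts[action]


def run(env, learner, steps):
    observation, total = 0, 0.0
    for _ in range(steps):
        action = learner.act(observation)        # output sent to the environment
        observation, reward = env.step(action)   # inputs received back
        learner.learn(action, reward)            # reward used as feedback
        total += reward
    return total


env = TwoLeverEnv(rewarded_action=1)
learner = Learner(n_actions=2)
total_reward = run(env, learner, steps=10)
```

In this run the learner tries the unrewarded action once, then settles on the rewarded one for the remaining steps.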
Data, Information, Knowledge
Measurement | Sampling of a value of one or more variables over a particular temporal interval. (Often simplified by considering it coming from a “point” in time.) |
Data | Stored, committed-to Measurement. Data that can be/is formatted and used for a purpose. |
Information | A set of Data grouped for a purpose. |
Knowledge | A set of interlinked information that can be used to plan, produce action, and interpret new information. Actionable information. Information that can be used to get stuff done. |
Terms Used for Describing Learning Styles
Reflex | A reflex behavior is controlled by an (architecturally) fixed circuit (“arc”) that is not shaped by experience. This defines the lower bound of natural learners' behavior: It is not learnable. |
Reinforcement Learning | Learning proceeds through discrete steps, where each step is an A-R pair, A being an action and R being a reward. | |
Observational Learning | Many animals have been observed to learn by observation (no pun intended). In some cases by observing members of the same species (called “conspecifics”) doing that particular thing, in other cases by watching events unfold. | |
Morphological Observational Learning | A.k.a. Structural-. Learning is restricted to the morphology (structure) of the movements and/or actions being observed. May be sufficient when learning to dance, but certainly less useful when learning to conduct an orchestra. | |
Goal-Level Observational Learning | Learning includes learning the purpose for which the observed actions are performed. |
Life-Long Learning | Colloquially: Learning throughout one's lifetime. In AI: A particular focus of learning research targeting how systems can change their learning over long periods of time. “Duration” doesn't refer to a particular number of hours or years but rather indicates the expectations on the system being engineered that it learn over long periods of time, “long” relative to prior such machine learners, and “long” relative to the system's operational lifetime. |
Online Learning | A.k.a. “continuous”- or “simultaneous”-. Learning while doing (same or other) things. | |
Multi-task Learning | A.k.a. “multi-goal”-. The same system learning many tasks/things without forgetting what was learned before. | |
Transfer Learning | The ability to benefit from something already learned when learning something new. | |
Single-Shot Learning | Closely related to “few-shot”-. The ability to learn something new from one example (few-shot: from very few examples). | |
Cumulative Learning | New things learned are integrated with things learned prior. The two are fused so as to create a more coherent, more easily-verifiable knowledge set. |
Learning: The Phenomenon
Learning: Means & Methods (The How)
What it is | The acquisition of information in order to create knowledge that can improve performance with respect to some Goal or set of Goals. Learning is a meta-Goal for achieving (other) Goals that cannot yet be achieved because not enough knowledge exists. A circularity may exist: There may not be enough knowledge available to acquire the knowledge for achieving other Goals. |
Learning from experience | A method for learning. Also called “learning by doing”: An Agent <m>A</m> does action <m>a</m> to phenomenon <m>p</m> in context <m>c</m> and uses the result to improve its ability to act on Goals involving <m>p</m>. All higher-level Earth-bound intelligences learn from experience. |
Learning by observation | A method for learning. An Agent <m>A</m> learns how to achieve Goal <m>G</m> by receiving realtime information about some other Agent <m>A'</m> achieving Goal <m>G</m> by doing action <m>a</m>. |
Learning from reasoning | A method for learning. Using deduction, induction and/or abduction to simulate, generalize, and infer, respectively, new information from acquired information. Most effectively used in combination with Learning from Experience. |
Learning from Teaching | A method for learning. Use of Instructions, provided by a Teacher, to improve knowledge. Teaching is typically situated, i.e. provided on-demand (during learning/training). |
Multi-objective learning | Learning while aiming to achieve, or learning how to achieve, more than one Goal. (There is a strange concept usage in some circles, where 'multi-objective learning' means 1-5 objectives (goals) and 'many-objective learning' means >5 objectives. This is a clear example of how terminology can get twisted in the “fight” for attention between academics. It comes about in part because of the combinatorial explosion with more than 5 objectives.) |
Transfer learning | Applying already-acquired knowledge to a new or newish Problem. A method for learning faster based on similarity identification: instead of re-learning things that are highly similar to what has already been learned, existing knowledge is adapted/mapped (modified) to new problems. |
Transversal (i.e. System-Wide) Ampliative Learning | What we could call a combination of all of the above. |
Learning: Target Categories (The What)
What is Being Learned | Categories: - Tool (body) - Task-environment (the task at hand) - Domain-bound strategies - Domain-independent learning - Domain-independent learning strategies (“cognitive development”) Each one subsumes the ones above. |
Tool (body) | A controller needs to be embodied to affect the world; learning what the body does, irrespective of the task-environment, domain, or other issues. |
Task-Environment | The proverbial “task” that an agent has been assigned in a particular “environment”. Typically, 'task' is anything that would stay unchanged between environments, and 'environment' refers to anything else that still may affect task performance but should ideally stay out of the way (or be unchanged during task execution). |
Domain-Bound Strategies | Strategies related to specific issues in the task-domain; learning them may temporarily slow down learning of the task-environment. |
Domain-Independent Learning | Refers to the concept of “learning to learn” - learning that is transferrable between domains. |
Domain-Independent Learning Strategies (“cognitive development”) | Strategies that are so profound as to affect a learner's ability to learn particular domains. It involves changes to the methods of learning. |
Properties of the Learning Process
(may be set and measured by learner, task-env, and/or designer)
Purpose | For closed problems (problems with clear goals), and for which the task-environment is known, the purpose of learning can often be readily specified. In human education the purpose often gets conflated with the ways we test the learning (e.g. a child's learning of the alphabet is measured by its ability to recite the alphabet, rather than its ability to use it to search a dictionary efficiently). In school the task-environment part of this equation often gets ignored. |
Speed | Can be measured during a learning session by measuring state of knowledge at time <m>t_1</m> and again at time <m>t_2</m>. We should expect any (general) learner to exhibit various learning speeds at various times for various topics. For human education, not many methods have been developed to measure directly an individual's learning; this is mostly done implicitly (and very approximately!) by grouping learners by age. |
Data | There are many ways to classify data; too numerous to recount here. Suffice it to say that every problem and task brings with it a unique mixture of data; a general learner should be able to handle a broad spectrum of these. For present ML systems we can e.g. say that DNNs can handle big continuous data, reinforcement learning is good for discrete-valued small data, but no good methods exist for diverse datasets with a mixture of symbolic, continuous, big- and small- data. |
Quality | Quality of learning has at least two dimensions, reliability and applicability. Reliability refers to the learned behavior/material's consistent performance under repeated application; applicability refers to its correct application in relevant circumstances. |
Retention | A battery of tests administered at time <m>t_1</m> and then again at times <m>t_2</m>, <m>t_3</m>, and <m>t_4</m>, would give an indication of a function describing the learner's retention of acquired material/skills/concepts. |
Transfer | How well something that has been verified to have been learned at time <m>t_1</m> can be used by the learner subsequently (whether in identical situations, similar situations, or completely different situations). Partly a function of Retention (Transfer would be zero if Retention is zero), as well as Quality and Reliability. In human education, because the task-environment often gets ignored, transfer learning is seldom if ever evaluated. |
Meta-Learning | The rate at which Meta-Learning happens could possibly be measured by the frequency with which the learning changes its nature, either in terms of new things, concepts or phenomena that it can now handle (as a class) as opposed to before, or WRT order-of-magnitude changes (measured somehow) in knowledge acquisition on the other dimensions above (Speed, Data, Quality, Retention, and Transfer). |
Progress Signal(s) | For artificial learners these are known and thus do not have to be measured. For natural learners the main method for assessing what kinds of progress signals are allowed and/or needed, or which ones work best, is experimentation (e.g. food pellets for dogs); some methods are already well explored and documented for certain species of animals (e.g. classical conditioning and operant conditioning). |
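The Speed and Retention rows describe measurements taken at several points in time. The sketch below shows one way such measurements could be turned into numbers, assuming (this is not stated in the notes) that the learner's state of knowledge can be summarized as a single test score. The function names `learning_speed` and `retention_curve` and all scores are hypothetical.

```python
def learning_speed(score_t1, score_t2, t1, t2):
    """Average change in test score per unit time between t1 and t2."""
    if t2 <= t1:
        raise ValueError("t2 must be later than t1")
    return (score_t2 - score_t1) / (t2 - t1)


def retention_curve(scores):
    """Fraction of the first test score that survives at each later test.

    `scores` is a list of (time, score) pairs; the first pair is the baseline.
    """
    (_, baseline), later = scores[0], scores[1:]
    if baseline == 0:
        raise ValueError("baseline score must be non-zero")
    return [(t, score / baseline) for t, score in later]


# Hypothetical scores (points out of 100) during a learning session ...
speed = learning_speed(score_t1=25, score_t2=75, t1=0, t2=4)

# ... and from the same battery of tests repeated at t_1 .. t_4 afterwards.
curve = retention_curve([(1, 80), (2, 60), (3, 40), (4, 20)])
```

Because of the Major Caveat noted earlier, such numbers only mean something relative to the task-environment in which the tests are administered.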
Examples of Learning Data & Representation
DATA TYPE | Example Learning Method |
---|---|
Stimulus-Response | Reinforcement learning; association. |
Time-Series | Linked-list reinforcement learning; Markov chains. |
2D Data | ANNs of some sort. |
3D Structures | Models of some sort. |
Complex Systems | Models of some sort. |
Learning Paradigms
Learning From Input/Output Pairs “Supervised Learning” | The ability to learn a mapping from inputs to outputs based on examples of input-output pairs. This requires having a way to perceive what the output to a particular input should have been. |
No Feedback “Unsupervised Learning” | The ability to learn patterns in the input even though no external feedback is given. Examples include clustering, anomaly detection and dimensionality reduction. Note that without some sort of goal, this kind of learning is without purpose and objectives (which means it should not, strictly speaking, be called “learning”, because there is no way to answer the question 'have you learned it yet?'). |
Learning from Rewards “Reinforcement Learning” | The ability to learn from a series of (positive and negative) rewards. This is usually used to learn how to behave in multi-step control problems. It requires machinery to treat certain perceptions (i.e. the rewards) as “special” and something to be optimized for. |
Learning from Teaching | Can be done in a wide variety of ways, each of which might impose its own requirements on the AI architecture. For instance, imitation learning – the ability to learn behaviors by observing another agent carry them out – requires a deep understanding of the perceived actions to be imitated, meaning the system must not only be able to observe those actions, but also recognize those actions, map them to its own perspective and body, and possibly infer their intent. |
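The difference between the first two paradigms in the table above can be made concrete with two small functions. This is a sketch under assumptions: a nearest-neighbour rule stands in for supervised learning and two-cluster k-means for unsupervised learning, and both techniques, the function names, and the data are chosen here for illustration only. The supervised function needs the outputs ("cold"/"hot") that go with each input; the unsupervised one sees only the inputs.

```python
def nearest_neighbour(examples, x):
    """Supervised: return the output of the stored example whose input is closest to x."""
    _, label = min(examples, key=lambda pair: abs(pair[0] - x))
    return label


def two_means(values, iterations=10):
    """Unsupervised: find two cluster centres in 1-D data (k-means with k=2)."""
    low, high = min(values), max(values)       # initial cluster centres
    if low == high:
        return low, high                       # all values identical: nothing to split
    for _ in range(iterations):
        near_low = [v for v in values if abs(v - low) <= abs(v - high)]
        near_high = [v for v in values if abs(v - low) > abs(v - high)]
        low = sum(near_low) / len(near_low)
        high = sum(near_high) / len(near_high)
    return low, high


# Input-output pairs: the desired output is known for each input.
labelled = [(1.0, "cold"), (2.0, "cold"), (8.0, "hot"), (9.0, "hot")]
prediction = nearest_neighbour(labelled, 7.0)

# Inputs only: no external feedback about what the groups mean.
centres = two_means([1.0, 2.0, 8.0, 9.0])
```

Note that `two_means` returns two centres but has no way of telling whether they serve any goal, which is the point made in the “Unsupervised Learning” row.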
Reinforcement Learning
Input Data | Discrete variables. |
Output Data | Discrete variables. |
Max. # I/O Vars. | Preferably <m>card_max(I union O) <= 8</m>. |
Min. Cycles | Depends on the number of input and output variables. |
Training | On-task. |
Training Style | Turn-based; discrete time steps. |
Training Signal | Explicit, discrete, every turn (after each action). |
Hyper-Parameters | - Learning Rate. - Exploration vs. exploitation. |
Scalability | Max. Vars. limits what can be learned. Discretization of I/O vars. limits the types of vars. that can be handled. Cannot handle n-modal distributions or conditional out-of-phase vars. unless known BILL (before it leaves the lab). Task-environments must allow turn-based learning. The I/O vars. and their discretization must be fixed BILL. |
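The properties listed above (discrete inputs and outputs, turn-based on-task training, a training signal after each action, and the two hyper-parameters) can be seen in a small tabular example. The notes do not name a specific algorithm; the sketch below uses tabular Q-learning, one standard reinforcement learning method. The five-cell corridor environment, the function names `step`, `choose` and `train`, and the hyper-parameter values are all assumptions made for this illustration.

```python
import random

N_STATES = 5            # corridor cells 0..4; cell 4 is the goal
LEFT, RIGHT = 0, 1
ACTIONS = (LEFT, RIGHT)


def step(state, action):
    """Hypothetical turn-based environment: reward 1.0 on reaching the goal cell."""
    nxt = max(0, state - 1) if action == LEFT else state + 1
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward


def choose(q, state, epsilon, rng):
    """Epsilon-greedy action choice with random tie-breaking."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)                                   # explore
    best = max(q[state])
    return rng.choice([a for a in ACTIONS if q[state][a] == best])   # exploit


def train(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """alpha is the learning rate; epsilon sets exploration vs. exploitation."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]    # Q-table: q[state][action]
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            action = choose(q, state, epsilon, rng)
            nxt, reward = step(state, action)
            # The training signal arrives after every action (one A-R pair per turn).
            target = reward + gamma * max(q[nxt])
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
    return q


q_table = train()
greedy_policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
```

The Q-table has one entry per combination of discrete state and action, which is why the number of I/O variables and their discretization limit what this kind of learner can handle.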
Artificial Neural Nets (ANNs)
Input Data | Continuous and discrete variables. |
Output Data | Continuous and discrete variables. |
Max. # I/O Vars. | Very high (possibly limited by CPU and BW only). |
Min. Training Cycles | Typical numbers 4k-10k. Depends on data complexity and number of layers in the ANN. |
Training | Off-task. Learning turned off when fully trained. |
Training Style | Training phase BILL (before it leaves the lab); discrete training steps. |
Training Signal | Supervised learning: Explicit “error signal propagation” after every turn, generated from pre-categorized examples and outcomes. Unsupervised: explicit “error signal propagation” after every turn, auto-generated. |
Hyper-Parameters | - Learning Rate, Exploration/Exploitation, and many others |
Strengths | Handles complex data sets. |
Scalability | Unpredictable behavior under data drift AILL (after it leaves the lab). Must be trained BILL (unpredictable learning AILL). |
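The "train off-task, then switch learning off" pattern in the table can be illustrated with the smallest possible case. The sketch below trains a single threshold neuron with the perceptron rule, which is far simpler than the multi-layer networks and error back-propagation the table has in mind; it is meant only to show an explicit error signal generated from pre-categorized examples, followed by use with frozen parameters. The function names and the OR data set are assumptions made for this example.

```python
def predict(weights, bias, x):
    """Output of a single threshold neuron."""
    activation = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if activation > 0 else 0


def train(examples, learning_rate=1.0, max_epochs=100):
    """Supervised training phase on pre-categorized examples."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(max_epochs):
        errors = 0
        for x, target in examples:
            error = target - predict(weights, bias, x)    # explicit error signal
            if error != 0:
                errors += 1
                weights = [w + learning_rate * error * xi for w, xi in zip(weights, x)]
                bias += learning_rate * error
        if errors == 0:           # every example is classified correctly
            break
    return weights, bias


OR_EXAMPLES = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]

# Training happens once, off-task (BILL); afterwards the parameters are frozen.
weights, bias = train(OR_EXAMPLES)
outputs = [predict(weights, bias, x) for x, _ in OR_EXAMPLES]
```

After training, `predict` only reads `weights` and `bias`; nothing in deployment changes them, which is what makes the behavior under data drift a concern.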
Experience-Based Learning
What It Is | Learning is the acquisition of knowledge for particular purposes. When this acquisition happens via interaction with an environment it is experience-based. |
Why It Is Important | Any environment which cannot be fully known a-priori requires experimentation of some sort, in the form of interaction with the world. This is what we call experience. |
The Physical World | The physical world we live in, often referred to as the “real world”, is highly complex, and rarely if ever do we have perfect models of how it behaves when we interact with it, whether it is to experiment with how it works or simply achieve some goal like buying bread. |
Limited Time & Resources | An important limitation on any agent's ability to model the real world is its enormous state space, which vastly outdoes any known agent's memory capacity, even for relatively simple environments. Even if the models were sufficiently detailed, pre-computing everything beforehand is prohibitive due to memory limits. On top of that, even if memory sufficed for pre-computing everything necessary to go about our tasks, we would have to retrieve the pre-computed data in time for when it is needed; the larger the state space, the harder it is to meet these retrieval-time demands. |
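A back-of-the-envelope calculation shows how quickly the state space outgrows memory. The sketch assumes, purely for illustration, an environment described by independent binary variables and a lookup table that stores one byte per joint state; the function name `table_size_bytes` is hypothetical.

```python
def table_size_bytes(n_binary_vars, bytes_per_state=1):
    """Memory needed to store one value for every joint state of n binary variables."""
    return (2 ** n_binary_vars) * bytes_per_state


small = table_size_bytes(20)     # 1,048,576 bytes (1 MiB)
large = table_size_bytes(100)    # about 1.27e30 bytes
```

Twenty binary variables fit in a mebibyte, while one hundred already require on the order of 10^30 bytes, far beyond any existing storage.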
Cumulative Learning
What it Is | Unifies several separate research tracks in a coherent form easily relatable to AGI requirements: Multitask learning, lifelong learning, transfer learning and few-shot learning. |
Multitask Learning | The ability to learn more than one task, either at once or in sequence. The cumulative learner's ability to generalize, investigate, and reason will affect how well it implements this ability. Subsumed by cumulative learning because knowledge is contextualized as it is acquired, meaning that the system has a place and a time for every tiny bit of information it absorbs. |
Online Learning | The ability to learn continuously, uninterrupted, and in real-time from experience as it comes, and without specifically iterating over it many times. Subsumed by cumulative learning because new information, which comes in via experience, is integrated with prior knowledge at the time it is acquired, so a cumulative learner is always learning as it's doing other things. |
Lifelong Learning | Means that an AI system keeps learning and integrating knowledge throughout its operational lifetime: learning is “always on”. Whichever way this is measured we expect at a minimum the “learning cycle” – alternating learning and non-learning periods – to be free from designer tampering or intervention at runtime. Provided this, the smaller those periods become (relative to the shortest perception-action cycle, for instance), to the point of being considered virtually or completely continuous, the better the “learning always on” requirement is met. Subsumed by cumulative learning because the continuous online learning is steady and ongoing all the time – why switch it off? |
Robust Knowledge Acquisition | The antithesis of brittle learning, where new knowledge results in catastrophic perturbations of prior knowledge (and behavior). Subsumed by cumulative learning because new information is integrated continuously online, which means the increments are frequent and small. Inconsistencies in the prior knowledge get exposed in the process, and because the learning is life-long, opportunities for fixing small inconsistencies are also frequent. New information is therefore highly unlikely to result in e.g. catastrophic forgetting. |
Transfer Learning | The ability to build new knowledge on top of old in a way that the old knowledge facilitates learning the new. While interference/forgetting should not occur, knowledge should still be defeasible: the physical world is non-axiomatic so any knowledge could be proven incorrect in light of contradicting evidence. Subsumed by cumulative learning because new information is integrated with old information, which may result in exposure of inconsistencies, missing data, etc., which is then dealt with as a natural part of the cumulative learning operations. |
Few-Shot Learning | The ability to learn something from very few examples or very little data. Common variants include one-shot learning, where the learner only needs to be told (or experience) something once, and zero-shot learning, where the learner has already inferred it without needing to experience or be told. Subsumed by cumulative learning because prior knowledge is transferable to new information, meaning that (theoretically) only the delta between what has previously been learned and what is required for the new information needs to be learned. |
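The Online Learning row describes integrating each new piece of experience once, as it arrives, without iterating over stored data. A running average is about the simplest possible instance of that update pattern. It is shown here only to illustrate the mechanics of small, frequent increments and is not a model of cumulative learning itself. The class name `RunningEstimate` and the readings are assumptions made for this sketch.

```python
class RunningEstimate:
    """Estimate that is updated once per observation, with no stored history."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def update(self, value):
        self.count += 1
        # Only the difference between the new value and the current estimate
        # is integrated; earlier observations are never revisited.
        self.mean += (value - self.mean) / self.count
        return self.mean


estimate = RunningEstimate()
for reading in [4.0, 8.0, 6.0, 2.0]:
    estimate.update(reading)
```

The final estimate equals the batch mean of the four readings, although no reading was ever stored or processed twice.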
Theories of Human "Learning Styles"
VARK | Theory that (human) learners can be divided according to their “learning styles”: - Visual - Auditory - Reading - Kinesthetic. The original idea was based on skimpy observational evidence in the early 90s. |
Gardner's Multiple Intelligence Model | - Linguistic intelligence (“word smart”) - Logical-mathematical intelligence (“number/reasoning smart”) - Spatial intelligence (“picture smart”) - Bodily-Kinesthetic intelligence (“body smart”) - Musical intelligence (“music smart”) - Interpersonal intelligence (“people smart”) - Intrapersonal intelligence (“self smart”) - Naturalist intelligence (“nature smart”) |
The Good | Emphasizes that there are multiple ways of learning. | |
The Bad | Very human-centric, not very applicable to AI. | |
The Ugly | Not rooted in a well-grounded theory of learning that references key features of learning like memorization, understanding, knowledge transfer, retention, learning speed. | |
Bottom Line | Many people view learning styles theories as broadly accurate, but, in fact, scientific support for these theories is severely lacking. |
2021©K.R.Thórisson
Last modified: 2024/04/29 13:33