public:t-713-mers:mers-23:knowledge [2023/10/02 10:15] – [Probability] thorisson | public:t-713-mers:mers-23:knowledge [2024/04/29 13:33] (current) – external edit 127.0.0.1 |
---|
[[/public:t-713-mers:mers-23:main|DCS-T-713-MERS-2023 Main]] | [[/public:t-713-mers:mers-23:main|DCS-T-713-MERS-2023 Main]] \\ |
| [[http://cadia.ru.is/wiki/public:t-713-mers:mers-23:lecture_notes|Lecture Notes]] |
| |
| |
| |
====Key Learning Terms==== | ====Key Learning Terms==== |
| What it is | Learning is a //**process**// that has the intent of //acquiring actionable information//, a.k.a. **knowledge**. | | | What it is | Learning is a //**process**// that has the purpose of //generating actionable information//, a.k.a. **knowledge**. | |
| \\ \\ \\ Key Features | Inherits key features of any process: \\ - **Purpose**: To adapt, to respond in rational ways to problems / to achieve foreseen goals; this factor determines how the rest of the features in this list are measured. \\ - **Speed**: The speed of learning. \\ - **Data**: The data that the learning (and particular measured speed of learning) requires. \\ - **Quality**: How well something is learned. \\ - **Retention**: The robustness of what has been learned - how well it stays intact over time. \\ - **Transfer**: How general the learning is, how broadly what is learned can be employed for the purposes of adaptation or achievement of goals. \\ - **Meta-Learning**: A learner may improve its learning abilities - i.e. capable of meta-learning. \\ - **Progress Signal(s)**: A learner needs to know how its learning is going, and if there is improvement, how much. | | | \\ \\ \\ Key Features | Inherits key features of any process: \\ - **Purpose**: To adapt, to respond in rational ways to problems / to achieve foreseen goals; this factor determines how the rest of the features in this list are measured. \\ - **Speed**: The speed of learning. \\ - **Data**: The data that the learning (and particular measured speed of learning) requires. \\ - **Quality**: How well something is learned. \\ - **Retention**: The robustness of what has been learned - how well it stays intact over time. \\ - **Transfer**: How general the learning is, how broadly what is learned can be employed for the purposes of adaptation or achievement of goals. \\ - **Meta-Learning**: A learner may improve its learning abilities - i.e. capable of meta-learning. \\ - **Progress Signal(s)**: A learner needs to know how its learning is going, and if there is improvement, how much. | |
| Evaluation | To know any of the above some parameters have to be //measured// somehow: All of the above factors can be measured in //many ways//. | | | Evaluation | To know any of the above some parameters have to be //measured// somehow: All of the above factors can be measured in //many ways//. | |
| |
====The Pedagogical Pentagon==== | ====The Pedagogical Pentagon==== |
| \\ What is Needed | There exists no //universal theory of learning// -- nor of //teaching//, //training//, //task-environments//, and //evaluation//. \\ Anything short of having complete theories for these means that experimentation, exploration, and blind search are the only ways to answer questions about performance, curriculum design, training requirements, etc., from which we can never get more than partial, limited answers. | | | What is Needed | There exists no //universal theory of learning// -- nor of //teaching//, //training//, //task-environments//, and //evaluation//. \\ This means that experimentation, exploration, and blind search are the only ways to answer questions about a learner's performance, curriculum design, training requirements, etc., and that we can never get more than partial, limited answers to such questions. | |
| That Said... | The Pedagogical Pentagon captures the five pillars of education: Learning, Teaching, Training, Environments, and Testing. \\ It's not a theory, but rather, a conceptual framework for capturing all key aspects of education. | | | That Said... | The Pedagogical Pentagon captures the five pillars of education: Learning, Teaching, Training, Environments, and Testing. \\ It's not a theory, but rather, a conceptual framework for capturing all key aspects of education. | |
| {{http://cadia.ru.is/wiki/_media/pedagogical_pentagon_full1.png}} || | | {{http://cadia.ru.is/wiki/_media/pedagogical_pentagon_full1.png?850}} || |
| The Pedagogical Pentagon (left) captures the five main pillars of any learning/teaching situation. The relationships between its contents can be seen from various perspectives: (a) As information flow between processes. (b) As relations between systems. (c) As dependencies between (largely missing!) theories. [[http://alumni.media.mit.edu/~kris/ftp/AGI17-pedagogical-pentagon.pdf|REF]] || | | The Pedagogical Pentagon (left) captures the five main pillars of any learning/teaching situation. The relationships between its contents can be seen from various perspectives: (a) As information flow between processes. (b) As relations between systems. (c) As dependencies between (largely missing!) theories. [[http://alumni.media.mit.edu/~kris/ftp/AGI17-pedagogical-pentagon.pdf|REF]] || |
| \\ Tasks | Learning systems adjust their knowledge as a result of interactions with a task-environment. A task is defined by (possibly a variety of) objective functions, as well as (possibly) instructions (i.e. knowledge provided at the start of the task, e.g. as a "seed", or continuously or intermittently throughout its duration). Since tasks can only be defined w.r.t. some environment, we often refer to the combination of a task and its environment as a single unit: the task-environment. | | | \\ Tasks | Learning systems adjust their knowledge as a result of interactions with a task-environment. A task is defined by (possibly a variety of) objective functions, as well as (possibly) instructions (i.e. knowledge provided at the start of the task, e.g. as a "seed", or continuously or intermittently throughout its duration). Since tasks can only be defined w.r.t. some environment, we often refer to the combination of a task and its environment as a single unit: the task-environment. | |
| \\ Teacher | The goal of the teacher is to help a learner learn. This is done by influencing the learner’s task-environment in such a way that progress towards the learning goals is facilitated. Teaching, as opposed to training, typically involves information about the //What, Why & How:// \\ - What to pay attention to. \\ - Relationships between observables (causal, part-whole, etc.). \\ - Sub-goals, negative goals and their relationships (strategy). \\ - Background-foreground separation. | | | \\ Teacher | The goal of the teacher is to help a learner learn. This is done by influencing the learner’s task-environment in such a way that progress towards the learning goals is facilitated. Teaching, as opposed to training, typically involves information about the //What, Why & How:// \\ - What to pay attention to. \\ - Relationships between observables (causal, part-whole, etc.). \\ - Sub-goals, negative goals and their relationships (strategy). \\ - Background-foreground separation. | |
| Environment & Task | The learner and the teacher each interact with their own view of the world (i.e. their own “environments”) which are typically different, but overlapping to some degree. | | | Environment & Task | The learner and the teacher each interact with their own view of the world (i.e. their own “environments”) which are typically different, but overlapping to some degree. | |
| \\ Training | Viewed from a teacher’s and intentional learner’s point of view, “training” means the actions taken (repeatedly) over time with the goal of becoming better at some task, by avoiding learning erroneous skills/things and avoiding forgetting or unlearning desirable skills/things. | | | Training | Viewed from a teacher’s and intentional learner’s point of view, “training” means the actions taken (repeatedly) over time with the goal of becoming better at some task, by avoiding learning erroneous skills/things and avoiding forgetting or unlearning desirable skills/things. | |
| \\ Test | Testing - or //evaluation// - is meant to obtain information about the structural, epistemic and emergent properties of learners, as they progress on a learning task. Testing can be done for different purposes: e.g. to ensure that a learner has good-enough performance on a range of tasks, to identify strengths and weaknesses for an AI designer to improve or an adversary to exploit, or to ensure that a learner has understood a certain concept so that we can trust it will use it correctly in the future. | | | \\ Test | Testing - or //evaluation// - is meant to obtain information about the structural, epistemic and emergent properties of learners, as they progress on a learning task. Testing can be done for different purposes: e.g. to ensure that a learner has good-enough performance on a range of tasks, to identify strengths and weaknesses for an AI designer to improve or an adversary to exploit, or to ensure that a learner has understood a certain concept so that we can trust it will use it correctly in the future. | |
| Source | [[http://alumni.media.mit.edu/~kris/ftp/AGI17-pedagogical-pentagon.pdf|The Pedagogical Pentagon: A Conceptual Framework for Artificial Pedagogy]] by Bieger et al. | | | Source | [[http://alumni.media.mit.edu/~kris/ftp/AGI17-pedagogical-pentagon.pdf|The Pedagogical Pentagon: A Conceptual Framework for Artificial Pedagogy]] by Bieger et al. | |
| |
| \\ What It Is | Probability is a concept that is relevant to a situation where information is missing, which means it is a concept relevant to //knowledge//. \\ A common conceptualization of probability is that it is a measure of the likelihood that an event will occur [[https://en.wikipedia.org/wiki/Probability|REF]]. \\ If it is not known whether event **X** will be (or has been) observed in situation **Y** or not, the //probability// of **X** is the percentage of the time **X** would be observed if the same situation **Y** occurred an infinite number of times. | | | \\ What It Is | Probability is a concept that is relevant to a situation where information is missing, which means it is a concept relevant to //knowledge//. \\ A common conceptualization of probability is that it is a measure of the likelihood that an event will occur [[https://en.wikipedia.org/wiki/Probability|REF]]. \\ If it is not known whether event **X** will be (or has been) observed in situation **Y** or not, the //probability// of **X** is the percentage of the time **X** would be observed if the same situation **Y** occurred an infinite number of times. | |
| Conceptualization | It is useful, in the context of this course, to think about probability as 'that which is not fully known': \\ The World contains a mechanism, **M**, whose operation is not fully known, **K(M)<n** (where **n∈[0,1]** and **n=1** is full knowledge). The part **m** which is unknown about **M** implies that predictions, statements, and actions that involve **M**, if repeated, may be reliable only part of the time. As repetitions tend to infinity, **m** tends to the percentage of **M** that is unknown. It is the **probability** that statements, predictions and actions about **M** are reliable. | | |
| \\ Why It Is Important \\ in AI | Probability enters into our knowledge of anything for which the knowledge is //**incomplete**//. \\ As in, //everything that humans do every day in every real-world environment//. \\ With incomplete knowledge it is in principle //impossible to know what may happen//. However, if we have very good models for some //limited// (small, simple) phenomenon, we can expect our prediction of what may happen to be pretty good, or at least //**practically useful**//. This is especially true for knowledge acquired through the scientific method, in which empirical evidence and human reason are systematically brought to bear on the validity of the models. | | | \\ Why It Is Important \\ in AI | Probability enters into our knowledge of anything for which the knowledge is //**incomplete**//. \\ As in, //everything that humans do every day in every real-world environment//. \\ With incomplete knowledge it is in principle //impossible to know what may happen//. However, if we have very good models for some //limited// (small, simple) phenomenon, we can expect our prediction of what may happen to be pretty good, or at least //**practically useful**//. This is especially true for knowledge acquired through the scientific method, in which empirical evidence and human reason are systematically brought to bear on the validity of the models. | |
| How To Compute Probabilities | The most common method is Bayesian networks, which encode the Bayesian interpretation of probability, in which probability is a reasonable expectation representing a state of knowledge or a quantification of a personal belief [[https://en.wikipedia.org/wiki/Bayesian_probability|REF]]. This makes them useful for representing an (intelligent) agent's knowledge of some environment, task or phenomenon. | | | How To Compute Probabilities | The most common method is Bayesian networks, which encode the Bayesian interpretation of probability, in which probability is a reasonable expectation representing a state of knowledge or a quantification of a personal belief [[https://en.wikipedia.org/wiki/Bayesian_probability|REF]]. This makes them useful for representing an (intelligent) agent's knowledge of some environment, task or phenomenon. | |
| \\ Bayes' Theorem | {{public:t-713-mers:bayes_theorem.svg?200}} \\ A,B:=events \\ P(A/B):=probability of A given that B is true \\ P(B/A):=probability of B given that A is true \\ P(A),P(B):=the marginal probabilities of A and B (each considered without regard to the other) | | | \\ Bayes' Theorem | {{public:t-713-mers:bayes_theorem.svg?200}} \\ A,B:=events \\ P(A/B):=probability of A given that B is true \\ P(B/A):=probability of B given that A is true \\ P(A),P(B):=the marginal probabilities of A and B (each considered without regard to the other) | |
| Judea Pearl | Most fervent advocate (and self-proclaimed inventor) of Bayesian Networks in AI [[http://ftp.cs.ucla.edu/pub/stat_ser/R246.pdf|REF]]. | | | Judea Pearl | Most fervent advocate (and self-proclaimed inventor) of Bayesian Networks in AI [[http://ftp.cs.ucla.edu/pub/stat_ser/R246.pdf|REF]]. | |
| | \\ Conceptualization of Probability in This Course | It is useful, in the context of this course, to think about probability as 'that which is not fully known': \\ The World contains a mechanism, **M**, whose operation is not fully known, **K(M)<n** (where **n∈[0,1]** and **n=1** is full knowledge). The part **m** which is unknown about **M** implies that predictions, statements, and actions that involve **M**, if repeated, will be reliable only part of the time. As such repetitions tend to infinity, **m** tends to the percentage of **M** that is unknown, representing the **probability** that any single statement, prediction or action about **M** is reliable. | |
| | Determinism | The idea of Platonic cause-effect relations; a deterministic relationship between A and B means that there exists a universal guarantee for this relationship, for all eternity, of the inevitable and unbreakable kind. | |
| | "Adequate Determinism" | A philosophical label for the kind of 'determinism' found in the physical world. \\ See the [[https://www.informationphilosopher.com/freedom/adequate_determinism.html|Adequate Determinism]] blurb on The Information Philosopher | |
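As a rough illustration of the rows above, the following sketch computes P(A/B) with Bayes' theorem and compares it with the long-run frequency reading of probability from the "What It Is" row, i.e. the fraction of B-observations in which A also held over many repetitions. The function names and all numbers (''P_A'', ''P_B_GIVEN_A'', ''P_B_GIVEN_NOT_A'') are hypothetical and chosen only for illustration; they do not come from the course material.

```python
import random


def bayes_posterior(p_a, p_b_given_a, p_b_given_not_a):
    """Return P(A/B) from Bayes' theorem: P(A/B) = P(B/A) * P(A) / P(B)."""
    # P(B) via the law of total probability over A and not-A.
    p_b = p_b_given_a * p_a + p_b_given_not_a * (1.0 - p_a)
    return p_b_given_a * p_a / p_b


def simulated_posterior(p_a, p_b_given_a, p_b_given_not_a, trials, seed=0):
    """Estimate P(A/B) as the fraction of B-observations in which A also held."""
    rng = random.Random(seed)
    b_count = 0
    a_and_b_count = 0
    for _ in range(trials):
        # Draw A first, then draw B with a probability that depends on A.
        a = rng.random() < p_a
        b = rng.random() < (p_b_given_a if a else p_b_given_not_a)
        if b:
            b_count += 1
            if a:
                a_and_b_count += 1
    return a_and_b_count / b_count


# Hypothetical numbers, chosen only for illustration.
P_A = 0.01
P_B_GIVEN_A = 0.9
P_B_GIVEN_NOT_A = 0.05

exact = bayes_posterior(P_A, P_B_GIVEN_A, P_B_GIVEN_NOT_A)
estimate = simulated_posterior(P_A, P_B_GIVEN_A, P_B_GIVEN_NOT_A, trials=200_000)
print(f"exact P(A/B)     = {exact:.4f}")
print(f"simulated P(A/B) = {estimate:.4f}")
```

With a finite number of trials the simulated value only approximates the exact one; the two agree more closely as the number of repetitions grows, which is the sense in which probability is defined over an infinite number of repetitions.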
| |
\\ | \\ |
| Spurious Correlation | Non-zero correlation due to complete coincidence. | | | Spurious Correlation | Non-zero correlation due to complete coincidence. | |
| \\ Causation & Correlation | What is the relation between causation and correlation? \\ There is no (non-spurious) correlation without causation. \\ There is no causation without correlation. \\ However, correlation between two variables does not necessitate one of them to be the cause of the other: They can have a shared (possibly hidden) //common cause//. | | | \\ Causation & Correlation | What is the relation between causation and correlation? \\ There is no (non-spurious) correlation without causation. \\ There is no causation without correlation. \\ However, correlation between two variables does not necessitate one of them to be the cause of the other: They can have a shared (possibly hidden) //common cause//. | |
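The common-cause case in the row above can be sketched with a small simulation, assuming a hypothetical hidden variable ''z'' that drives both ''x'' and ''y'' while neither of the two influences the other. The noise levels and sample size are arbitrary choices for illustration only.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)
N = 20_000  # arbitrary sample size

# Hidden common cause z drives both x and y; x and y never influence each other.
z = [rng.gauss(0.0, 1.0) for _ in range(N)]
x = [zi + rng.gauss(0.0, 0.5) for zi in z]
y = [zi + rng.gauss(0.0, 0.5) for zi in z]
corr_xy = pearson(x, y)

# Setting x by an independent process (an intervention) leaves y untouched,
# because y depends only on z. The correlation with y then disappears.
x_forced = [rng.gauss(0.0, 1.0) for _ in range(N)]
corr_forced = pearson(x_forced, y)

print(f"correlation of x and y via common cause: {corr_xy:.2f}")
print(f"correlation after forcing x:             {corr_forced:.2f}")
```

The observed ''x'' and ''y'' are strongly correlated even though neither causes the other, while forcing ''x'' independently of ''z'' removes the correlation. This is one way to distinguish a shared cause from a direct causal link.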
| |
| |
\\ | \\ |
| |
====Cumulative Learning Through Reasoning==== | ====Cumulative Learning Through Reasoning==== |
| \\ What it Is | Unifies several separate research tracks in a coherent form easily relatable to AGI requirements: Multitask learning, lifelong learning, transfer learning and few-shot learning. | | | What it Is | Unifies several separate research tracks in a coherent form easily relatable to AGI requirements: Multitask learning, lifelong learning, transfer learning and few-shot learning. | |
| \\ Multitask Learning \\ + | The ability to learn more than one task, either at once or in sequence. \\ The cumulative learner's ability to generalize, investigate, and reason will affect how well it implements this ability. \\ //Subsumed by cumulative learning because knowledge is contextualized as it is acquired, meaning that the system has a place and a time for every tiny bit of information it absorbs.// | | | \\ Multitask Learning \\ + | The ability to learn more than one task, either at once or in sequence. \\ The cumulative learner's ability to generalize, investigate, and reason will affect how well it implements this ability. \\ //Subsumed by cumulative learning because knowledge is contextualized as it is acquired, meaning that the system has a place and a time for every tiny bit of information it absorbs.// | |
| \\ Online Learning \\ + | The ability to learn continuously, uninterrupted, and in real-time from experience as it comes, and without specifically iterating over it many times. \\ //Subsumed by cumulative learning because new information, which comes in via experience, is //integrated// with prior knowledge at the time it is acquired, so a cumulative learner is //always learning// as it's doing other things.// | | | \\ Online Learning \\ + | The ability to learn continuously, uninterrupted, and in real-time from experience as it comes, and without specifically iterating over it many times. \\ //Subsumed by cumulative learning because new information, which comes in via experience, is //integrated// with prior knowledge at the time it is acquired, so a cumulative learner is //always learning// as it's doing other things.// | |